e-Research

German Science Priority Initiative – Digital Information and e-infrastructure

I have been tracking national research initiatives in Australia, Canada, the UK, and the USA in various blog posts over recent months. Another potentially very important national initiative, this time from Germany, can now be added to the list.

An alliance of scientific organisations in Germany, which includes all the major players such as the Deutsche Forschungsgemeinschaft (DFG, the German Research Foundation), the Fraunhofer Society, the Helmholtz Association of German Research Centres, and the Max Planck Society, has signed a joint national e-infrastructure policy initiative with six priority areas focusing on:

  • National licensing of e-journals;
  • Open Access;
  • National hosting strategy for preservation of e-journals;
  • Preservation and re-use of primary research data;
  • Virtual research environments; and
  • Legal frameworks (focusing on copyright law and equalising VAT treatment on print and electronic publications).

The Alliance agreed to coordinate the activities of the individual partner organisations and to expand on the ideal of the innovative information environment by means of a Joint Priority Initiative from 2008 to 2012 with the following goals:

  • to guarantee the broadest possible access to digital publications, digital data and other source materials;
  • to utilise digital media to create the ideal conditions for the distribution and reception of publications related to German research;
  • to ensure the long-term availability of the digital media and contents that have been acquired from around the world, and their integration into the digital research environment;
  • to support collaborative research by means of innovative information technologies.

Further information on the initiative is now available to download as a PDF in English, or you can brush up your language skills (as I did, or at least tried to) and read it in the original German 🙂

Interim Report – UK Research Data Feasibility Study

I have previously blogged on UKRDS, the major consultancy work the company has been undertaking with their lead partner SERCO Consulting over the last six months on a UK Research Data Service feasibility study for the Higher Education Funding Council for England (HEFCE).

The interim report of the study has just been released. The report analyses the current situation in the UK, with a detailed review of relevant literature and funders’ policies, and data drawn from four major case study universities (Bristol, Leeds, Leicester, and Oxford). It describes the emerging trends of local data repositories and national facilities in the UK and also looks internationally at Australia, the US, and the EU. Finally, it presents possible ways forward for UKRDS. Preliminary findings from a UKRDS survey of over 700 UK researchers are presented in an appendix. The study has now moved into its second phase, building on the interim report and developing the business case.

Luis Martinez-Uribe, Digital Repositories Research Co-ordinator at Oxford University, has written on the interim report in his blog: “I highly recommend everyone with an interest in research data management to have a look at this report as not only it captures the current state of affairs in the UK and elsewhere but also offers possible ways forward.”

Research Data Canada

Readers of the blog may be interested in work underway in Canada via Research Data Canada which is running in parallel to work on UKRDS here in the UK, Datanets in the USA, and ANDS in Australia.

Research Data Canada has established the Research Data Strategy Working Group – a collaborative effort by a multi-disciplinary group of universities, institutes, libraries, granting agencies, and individual researchers to address the challenges and issues surrounding access to and preservation of data arising from Canadian research.

The group is currently working on a draft report, “Stewardship of Research Data in Canada: Gap Analysis”, which provides a statement of the ideal state of research data stewardship in Canada and a description of the current state, as determined by examining a number of indicators. The purpose is to provide evidence of gaps between the current and ideal states in order to begin filling them. The indicators of the state of the stewardship of research data in Canada are as follows: policies; funding; roles and responsibilities; [trusted digital] data repositories; standards; skills and training; reward and recognition systems; research and development; accessibility; and preservation. The final version will be available in September.

Information on the working group and other Research Data Canada activities is available from its website.

UK Research Data Service Feasibility Study

The blog has been very quiet over August and the holidays. This is just a brief first entry (more to come next month) to flag up the major consultancy work the company has been undertaking with SERCO Consulting over the last six months on a UK Research Data Service feasibility study for the Higher Education Funding Council for England (HEFCE).

The study has been initiated and led by the consortium of Research Libraries in the UK and Ireland (RLUK) and the Russell Group [of UK Universities] IT Directors (RUGIT) and aims to assess the feasibility and costs of developing and maintaining a national shared digital research data service for the UK Higher Education sector. There is more background information on the UKRDS website.

A major part of the study has involved a feasibility and requirements stage working with the universities of Bristol, Leeds, Leicester and Oxford to survey over 700 academics in disciplines across the universities on their research data use and requirements. You will find further information on the Oxford results on the Oxford Scoping Digital Repository Services for Research Data Management Project website. Further information on the overall survey and findings will be available soon and a link and commentary will be posted on the blog.

Just published: A Comparative Study of e-Journal Archiving Solutions

I am pleased to announce that the JISC-funded report A Comparative Study of e-Journal Archiving Solutions has just been published and is now available to download as a PDF from the JISC Collections website. It has been a great pleasure to work with Julia Chruszcz, Maggie Jones and Terry Morrow on this study over the last few months.

The report is the result of a call by the JISC, issued in January 2008, for a Comparative Study of e-Journal Archiving Solutions. The Invitation to Tender asked for a report that ‘will be published for wide use by institutions to inform policies and investment in e-journal archiving solutions.’ The ITT also stated that the report should ‘also inform negotiations undertaken by JISC Collections and NESLi2 when seeking publishers’ compliance to deposit content with at least one e-journal archiving solution.’

The report contains chapters covering: Approaches to e-journal preservation, Publisher licensing and legal deposit, Comparisons of Six Current e-Journal Archiving Programmes (LOCKSS, CLOCKSS, Portico, the KB e-depot, OCLC’s Electronic Collections Online, and the British Library’s e-journal Digital Archive), Practical experience of e-journal archiving solutions, Evaluation of four common scenarios/trigger events, and Criteria for judging relevance and value of new archiving initiatives. There are two appendices on Publisher Participation in different programmes.

The report has the following recommendations:

  1. When negotiating NESLi2 agreements, JISC’s negotiators should take the initiative by specifying archiving requirements, including a short-list of approved archiving solutions.
  2. To help quantify the insurance risk and the necessary appropriate investment, bodies representing publishers and other trade organisations should gather and share statistical information on the likelihood of the trigger events outlined in this report.
  3. Post cancellation access conditions should be defined in the licensing agreement between libraries and publishers. Publishers should be strongly encouraged to cooperate with one or more external e-journal archiving solutions as well as provide their own post-cancellation service (at minimal cost).
  4. The publisher (or subscription agent) should state their policy on perpetual access under the four scenarios described in section 9.
  5. When titles are sold on to other publishers, the Transfer Code of Practice (see section 9.3.) should be followed.
  6. Archiving service providers and publishers should work together to develop standard cross-industry definitions of trigger events and protocols on the conditions for release of archived content. Project Transfer is a potential exemplar. The ground rules for any post-trigger event negotiation should be clear and transparent and established in advance.
  7. Archive service providers must provide greater clarity on coverage details, including not only publishers and titles, but also the years and issues included in the archive.
  8. Using the scenarios outlined in this report, libraries should carry out a risk assessment on the impact of loss of access to e-journals by their institution, and a cost/benefit analysis, in order to judge the value and relevance of the archiving solutions on offer.
  9. Relevant UK bodies and institutions should use whatever influence they can bring to bear to ensure that archiving solutions cover publishers and titles of particular value to UK libraries.
  10. The findings of this study should be reviewed and updated at regular intervals to reflect continuing developments in the field of e-journal archiving and preservation.

Its publication comes hot on the heels of two related studies: the Portico/Ithaka e-journal archiving survey of US Library Directors and the JISC-funded UK LOCKSS Pilot Programme Evaluation Report. A further blog entry will follow!

Just published: Research Data Preservation Costs Report

I have posted two previous entries to the blog in March and January detailing progress with the JISC-funded research data preservation costs study. I am pleased to report that the online executive summary and full report (pdf file) titled “Keeping Research Data Safe: a cost model and guidance for UK Universities” is now published and can be downloaded from the JISC website.

It has been a very intensive piece of work over four months and I am extremely grateful to the many colleagues who contributed and made this possible. We have uncovered a lot of valuable data and approaches and hope these can be built on by future studies, implementation, and testing. We have attempted to “show our workings” as far as possible to facilitate this, so the text of the report is accompanied by extensive appendices.

We have made 10 recommendations on future work and implementation. For further information see the Executive Summary online.

The report itself has chapters covering the Introduction, Methodology, Benefits of Research Data Preservation, Describing the Cost Framework and its Use, Key Cost Variables and Units, the Activity Model and Resources Template, Overviews of the Case Studies, Issues Universities Need to Consider, Different Service Models and Structures, and Conclusions and Recommendations. There are also four detailed case studies covering the University of Cambridge, King’s College London, the University of Southampton, and the Archaeology Data Service (University of York).

Although focused on the UK and UK universities in particular, it should be of interest to anyone involved with research data or interested generally in the costs of digital preservation.

Comments and Feedback welcome!

Research Data and the Computing Cloud: NSF/Google and IBM

Research in the Cloud: Providing Cutting Edge Computational Resources to Scientists is an interesting recent post to the Google Research Blog. It provides Google’s take on its participation in the National Science Foundation/Google/IBM collaboration within The Cluster Exploratory Program (CluE).

The NSF solicitation for proposals was released last week. To quote from the call:
‘In addition to the widespread societal impact of data-intensive computing, this computational paradigm also promises significant opportunities to stimulate advances in science and engineering research, where large digital data collections are increasingly prevalent. Well-known examples include the Sloan Digital Sky Survey, the Visible Human, the IRIS Seismology Data Base, the Protein Data Bank and the Linguistic Data Consortium, however other valuable data collections or federations of data collections are being assembled on an ongoing basis. In many fields, it is now possible to pose hypotheses and test them by looking in databases of already collected information. Further, the possibility of significant discovery by interconnecting different data sources is extraordinarily appealing. In data-intensive computing, the sheer volume of data is the dominant performance parameter. Storage and computation are co-located, enabling large-scale parallelism over terabytes of data. This scale of computing supports applications specified in high-level programming primitives, where the run-time system manages parallelism and data access. Supporting architectures must be extremely fault-tolerant and exhibit high degrees of reliability and availability.
The Cluster Exploratory (CluE) program has been designed to provide academic researchers with access to massively-scaled, highly-distributed computing resources supported by Google and IBM. While the main focus of the program is the stimulation of research advances in computing, the potential to stimulate simultaneous advances in other fields of science and engineering is also recognized and encouraged.’
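The ‘high-level programming primitives’ the call refers to are, in my reading, exemplified by the MapReduce style of programming associated with clusters like the Google/IBM one: the researcher supplies a map function and a reduce function, and the runtime system handles parallelism, data placement, and fault tolerance. A toy single-machine sketch in Python (purely illustrative, nothing to do with the actual CluE infrastructure):

```python
from collections import defaultdict

def map_phase(documents):
    """Map step: emit (word, 1) pairs; on a cluster each document
    chunk would be processed on the node holding that data."""
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    """Reduce step: sum counts per key; the runtime would shard
    keys across reducer nodes rather than use one dictionary."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["data intensive computing", "data at terabyte scale"]
print(reduce_phase(map_phase(docs)))
```

The point of the paradigm is that neither function mentions threads, nodes, or file locations: the same two functions scale from a laptop to terabytes of co-located storage and computation.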

It should be interesting to see how this collaboration evolves and the datasets it includes. For more information see The Cluster Exploratory (CluE) program call text.

OR2008 – Presentations available

The Open Repositories conference (OR2008) repository is available at http://pubs.or08.ecs.soton.ac.uk/ as a permanent record of the conference activities.

The repository contains papers, presentations and poster artwork for 144 different conference contributions from the main conference sessions (Interoperability, Legal, Models, Architectures & Frameworks, National Perspectives, Scientific Repositories, Social Networking, Sustainability, Usage, Web 2.0), the Poster session, User Group sessions (DSpace, EPrints, Fedora), Birds of a Feather sessions, the Repository Managers session and the ORE Information day.

My PowerPoint presentation from the plenary keynote for the Fedora International Users’ Meeting is also available there. Titled “Keeping alert: issues to know today for long-term digital preservation with repositories”, it focussed on research data and sustainability. It drew heavily on the forthcoming JISC Research Data Preservation Costs study and the draft final report titled ‘Keeping Research Data Safe: A Cost Model and Guidance for UK Universities’, and concluded by outlining tentative findings and implications for repositories from that report.

Digital Preservation Cost Models

I blogged back in January on the JISC Research Data Preservation Costs study and promised an update at the end of March. Well, the draft final report titled ‘Keeping Research Data Safe: A Cost Model and Guidance for UK Universities’ is now with JISC and being peer-reviewed.

It’s been a significant effort and I think it should be a major contribution to thinking on digital preservation cost models and costs in general; hopefully the final report will be out later this spring. In short, we have produced:

  • A cost framework consisting of:
      • A list of key cost variables divided into economic adjustments (inflation/deflation, depreciation, and costs of capital) and service adjustments (volume and number of deposits, user services, etc.);
      • An activity model divided into pre-archive, archive, and support services;
      • A resources template including major cost categories in TRAC (a methodology for Full Economic Costing used by UK universities), divided into the major phases from our activity model and by duration of activity.

Typically the activity model will help identify resources required or expended, the economic adjustments help spread and maintain these over time, and the service adjustments help identify and adjust resources to specific requirements. The resources template provides a framework to draw these elements together so that they can be implemented in a TRAC-based cost model. Normally the cost model will implement these as a spreadsheet, populated with data and adjustments agreed by the institution.

The three parts of the cost framework can be used in this way to develop and apply local cost models. The exact application may depend on the purpose of the costing, which might include: identifying current costs; identifying former or future costs; or comparing costs across different collections and institutions which have used different variables. These are progressively more difficult. The model may also be used to develop a charging policy or to determine appropriate archiving costs to be charged to projects.
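To make the mechanics concrete, here is a minimal sketch of how such a local cost model might combine per-phase activity costs with economic adjustments in a spreadsheet-style calculation. All figures and rates below are invented for illustration; they are not taken from the report:

```python
# Illustrative only: invented figures, not drawn from the KRDS report.
# Annual costs per activity-model phase (pre-archive, archive, support).
phase_costs = {"pre-archive": 12000.0, "archive": 30000.0, "support": 8000.0}

INFLATION = 0.03        # economic adjustment: uprate future-year costs
CAPITAL_RATE = 0.035    # economic adjustment: cost of capital
CAPITAL_SPEND = 50000.0 # one-off infrastructure investment

def year_cost(year):
    """Total cost in a given year: phase costs inflated from year 0,
    plus the annual cost of capital on the infrastructure spend."""
    inflated = sum(phase_costs.values()) * (1 + INFLATION) ** year
    return inflated + CAPITAL_SPEND * CAPITAL_RATE

# A ten-year total of the kind a charging policy might be built on.
total = sum(year_cost(y) for y in range(10))
print(round(total, 2))
```

Service adjustments (deposit volumes, user-service levels, and so on) would scale the individual `phase_costs` entries before this calculation, and the resources template would break each phase figure down into TRAC cost categories.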

In addition to the cost framework there are:

  • A series of case studies from Cambridge University, King’s College London, Southampton University, and the Archaeology Data Service at the University of York, illustrating different aspects of costs for research data within HEIs;

  • A cost spreadsheet based on the study, developed by the Centre for e-Research at King’s College London for its own forward planning and provided as a confidential supplement to its case study in the report;

  • Recommendations for future work and the use/adaptation of software costing tools to assist implementation.

Watch this space (well, blog) for a future announcement of the final report and the URL for the download.

First African Digital Curation Conference

Most digital curation and preservation news seems to come from Europe and North America, so it is interesting to see emerging interest in digital curation and digital preservation issues in the developing economies. With that in mind, I’m flagging up the first African Digital Curation Conference, held in Pretoria on 12-13 February, which concluded today. The conference was organised under the auspices of the South African Department of Science and Technology, three science councils (the CSIR, the Human Sciences Research Council and the National Research Foundation), the University of Pretoria and the Academy of Science of South Africa.

The conference programme looks interesting.

During the first day, international speakers shared perspectives mainly from the UK, the European Union and the USA, whilst also looking at new roles and opportunities. The South African Minister of Science and Technology, Mr Mosibudi Mangena, talked on the implications of the OECD declaration on Data Sharing for Publicly Funded Research Data for African and South African policy on research data and information management.

Curation of African digital content and practices in specific science domains was the focus of day two of the event. Proceedings concluded with discussion on a formalised network of African data and information curation centres.

I hope there will be conference proceedings or reports, and perhaps some colleagues who attended will blog the event; if so, I will add a future post to the blog.
