Digital Curation

Research Data Canada

Readers of the blog may be interested in work underway in Canada via Research Data Canada, which is running in parallel to work on UKRDS here in the UK, Datanets in the USA, and ANDS in Australia.

Research Data Canada has established The Research Data Strategy Working Group – a collaborative effort by a multi-disciplinary group of universities, institutes, libraries, granting agencies, and individual researchers to address the challenges and issues surrounding the access and preservation of data arising from Canadian research.

The group is currently working on a draft report “Stewardship of Research Data in Canada: Gap Analysis” which provides a statement of the ideal state of research data stewardship in Canada and a description of the current state, as determined by examining a number of indicators. The purpose is to provide evidence of gaps between current and ideal state in order to begin filling in the gaps. The indicators of the state of the stewardship of research data in Canada are as follows: policies; funding; roles and responsibilities; [trusted digital] data repositories; standards; skills and training; reward and recognition systems; research and development; accessibility; and preservation. The final version will be available in September.

Information on the working group and other Research Data Canada activities is available from its website.


UK Research Data Service Feasibility Study

The blog has been very quiet over August and the holidays. This is just a brief first entry (more to come next month) to flag up major consultancy work the company has been undertaking with SERCO Consulting over the last six months on a UK Research Data Service feasibility study for the Higher Education Funding Council for England (HEFCE).

The study has been initiated and led by the consortium of Research Libraries in the UK and Ireland (RLUK) and the Russell Group [of UK Universities] IT Directors (RUGIT) and aims to assess the feasibility and costs of developing and maintaining a national shared digital research data service for the UK Higher Education sector. There is more background information on the UKRDS website.

A major part of the study has involved a feasibility and requirements stage working with the universities of Bristol, Leeds, Leicester and Oxford to survey over 700 academics in disciplines across the universities on their research data use and requirements. You will find further information on the Oxford results on the Oxford Scoping Digital Repository Services for Research Data Management Project website. Further information on the overall survey and findings will be available soon and a link and commentary will be posted on the blog.

Just published: Research Data Preservation Costs Report

I have posted two previous entries to the blog in March and January detailing progress with the JISC-funded research data preservation costs study. I am pleased to report that the online executive summary and full report (pdf file) titled “Keeping Research Data Safe: a cost model and guidance for UK Universities” is now published and can be downloaded from the JISC website.

It has been a very intensive piece of work over four months and I am extremely grateful to the many colleagues who contributed and made this possible. We have uncovered a lot of valuable data and approaches and hope this can be built on by future studies and implementation and testing. We have attempted to “show our workings” as far as possible to facilitate this, so the text of the report is accompanied by extensive appendices.

We have made 10 recommendations on future work and implementation. For further information see the Executive Summary online.

The report itself has chapters covering the Introduction, Methodology, Benefits of Research Data Preservation, Describing the Cost Framework and its Use, Key Cost Variables and Units, the Activity Model and Resources Template, Overviews of the Case Studies, Issues Universities Need to Consider, Different Service Models and Structures, Conclusions and Recommendations. There are also four detailed case studies covering the University of Cambridge, King’s College London, the University of Southampton, and the Archaeology Data Service (University of York).

Although focused on the UK and UK universities in particular, it should be of interest to anyone involved with research data or interested generally in the costs of digital preservation.

Comments and Feedback welcome!

OR2008 – Presentations available

 

The Open Repositories conference (OR2008) repository is available at http://pubs.or08.ecs.soton.ac.uk/ as a permanent record of the conference activities.

The repository contains papers, presentations and poster artwork for 144 different conference contributions from the main conference sessions (Interoperability, Legal, Models, Architectures & Frameworks, National Perspectives, Scientific Repositories, Social Networking, Sustainability, Usage, Web 2.0), the Poster session, User Group sessions (DSpace, EPrints, Fedora), Birds of a Feather sessions, the Repository Managers session and the ORE Information day.

My PowerPoint presentation from the Plenary keynote for the Fedora International Users’ Meeting is also available there. Titled “Keeping alert: issues to know today for long-term digital preservation with repositories”, it focussed on research data and sustainability. It drew heavily from the forthcoming JISC Research Data Preservation Costs study and the draft final report titled ‘Keeping Research Data Safe: A Cost Model and Guidance for UK Universities’. It concludes by outlining tentative findings and implications for repositories from that report.

SLAIS C21st Curation public lectures 30 April 2008

Now in its fourth year, the annual C21st Curation lecture series is held at the School of Library, Archive and Information Studies (SLAIS) in University College London.

The 2008 C21st Curation public evening lectures will be on 30 April 2008. Come hear two speakers, Roy Clare (Chief Executive, Museums, Libraries and Archives Council) and Carole Souter (Chief Executive, Heritage Lottery Fund), talk about the impact of the recent Government Comprehensive Spending Review on their respective organisations. This seminar is open to students, professionals and the general public in the JZ Young lecture theatre at UCL from 6.00-7.15pm, followed by a reception to which the speakers and the audience are invited. Attendance is free, but please email slais-admin@ucl.ac.uk to reserve a place.

I will be chairing the session and look forward to the lectures and seeing colleagues at the reception afterwards.

For further information and directions see the SLAIS C21st Curation lectures webpage.

Digital Preservation Cost Models

I blogged back in January on the JISC Research Data Preservation Costs study and promised an update at the end of March. Well the draft final report titled ‘Keeping Research Data Safe: A Cost Model and Guidance for UK Universities’ is now with JISC and being peer-reviewed.

It’s been a significant effort and I think it should be a major contribution to thinking on digital preservation cost models and costs in general: hopefully the final report will be out later this Spring. In short we have produced:

• A cost framework consisting of:

o A list of key cost variables divided into economic adjustments (inflation/deflation, depreciation, and costs of capital), and service adjustments (volume and number of deposits, user services, etc);

o An activity model divided into pre-archive, archive, and support services;

o A resources template including major cost categories in TRAC (a methodology for Full Economic Costing used by UK universities), and divided into the major phases from our activity model and by duration of activity.

Typically the activity model will help identify resources required or expended, the economic adjustments help spread and maintain these over time, and the service adjustments help identify and adjust resources to specific requirements. The resources template provides a framework to draw these elements together so that they can be implemented in a TRAC-based cost model. Normally the cost model will implement these as a spreadsheet, populated with data and adjustments agreed by the institution.

The three parts of the cost framework can be used in this way to develop and apply local cost models. The exact application may depend on the purpose of the costing which might include: identifying current costs; identifying former or future costs; or comparing costs across different collections and institutions which have used different variables. These are progressively more difficult. The model may also be used to develop a charging policy or appropriate archiving costs to be charged to projects.
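As a rough illustration of how the three parts might fit together, here is a minimal Python sketch. It is not the model from the report: the class and function names, the choice to model only inflation among the economic adjustments, and all figures are hypothetical placeholders. Depreciation, costs of capital, service adjustments and the TRAC cost categories are left out for brevity.

```python
from dataclasses import dataclass

# Phases taken from the activity model described above.
PHASES = ("pre-archive", "archive", "support services")


@dataclass
class Activity:
    name: str
    phase: str          # one of PHASES
    annual_cost: float  # cost per year at base-year prices (placeholder values)
    years: int          # duration of the activity


def inflated_cost(annual_cost: float, years: int, inflation_rate: float) -> float:
    """Total cost over `years`, with the annual cost rising by the
    inflation rate each year (the first year is at base-year prices)."""
    return sum(annual_cost * (1 + inflation_rate) ** y for y in range(years))


def resources_template(activities, inflation_rate: float) -> dict:
    """Sum inflation-adjusted costs by activity-model phase."""
    totals = {phase: 0.0 for phase in PHASES}
    for activity in activities:
        if activity.phase not in totals:
            raise ValueError(f"unknown phase: {activity.phase}")
        totals[activity.phase] += inflated_cost(
            activity.annual_cost, activity.years, inflation_rate
        )
    return totals


if __name__ == "__main__":
    # Hypothetical activities and costs, for illustration only.
    activities = [
        Activity("depositor guidance", "pre-archive", 5000.0, 1),
        Activity("ingest", "archive", 8000.0, 2),
        Activity("storage", "archive", 3000.0, 5),
        Activity("administration", "support services", 4000.0, 5),
    ]
    for phase, total in resources_template(activities, 0.03).items():
        print(f"{phase}: {total:.2f}")
```

In practice an institution would hold the equivalent of this in a spreadsheet, as noted above, with its own agreed data and adjustments.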

In addition to the cost framework there are:

• A series of case studies from Cambridge University, King’s College London, Southampton University, and the Archaeology Data Service at York University, illustrating different aspects of costs for research data within HEIs;

• A cost spreadsheet based on the study, developed by the Centre for e-Research, King’s College London for its own forward planning and provided as a confidential supplement to its case study in the report;

• Recommendations for future work and use/adaptation of software costing tools to assist implementation.

Watch this space (well, blog) for a future announcement of the final report and URL for the download.

First African Digital Curation Conference

Most digital curation and preservation news seems to come from Europe and North America, so it is interesting to see emerging interest in digital curation and digital preservation issues in the developing economies. With that in mind I’m flagging up the first African Digital Curation Conference, held in Pretoria on 12-13 February, which concluded today. The conference was organised under the auspices of the South African Department of Science and Technology, three science councils (the CSIR, the Human Sciences Research Council and the National Research Foundation), the University of Pretoria and the Academy of Science of South Africa.

The conference programme looks interesting.

During the first day, international speakers shared perspectives mainly from the UK, the European Union and the USA, whilst also looking at new roles and opportunities. The South African Minister of Science and Technology, Mr Mosibudi Mangena, talked on the implications of the OECD declaration on Data Sharing for Publicly Funded Research Data for African and South African policy on research data and information management.

Curation of African digital content and practices in specific science domains was the focus of day two of the event. Proceedings concluded with discussion on a formalised network of African data and information curation centres.

I hope there will be conference proceedings or reports, and perhaps some colleagues who attended will blog the event: if so I will add a future post to the blog.

Digital Special Collections in Libraries

It’s still quite rare to see research library webpages covering the issues of how to manage and curate contemporary special collections in digital formats, so I would like to flag up two particularly good examples here.
The first is the Wellcome Trust Library’s Digital Curation webpages, which I came across recently. They are an excellent ‘how to’ guide and sharing of practical experience in dealing with digital special collections built up over the last couple of years at Wellcome. They include links to the Library Strategy, a ‘Digital Curation Toolbox’, and a useful glossary and links.
The second is the Workbook on Digital Private Papers produced by the Paradigm project. The Personal Archives Accessible in Digital Media (paradigm) project funded by JISC involved the research libraries of the Universities of Oxford and Manchester. The workbook captures the project’s experience in accessioning and ingesting digital private papers into their digital repositories, and processing these in line with archival and digital preservation requirements.

Both are highly recommended.

Archaeology Data Service Charging Policy

I’m currently looking closely at various efforts by different organisations to capture and model digital preservation costs as part of our work for JISC on developing a preservation cost model for research data.

As part of desk research for that work I have re-visited the Archaeology Data Service (ADS) Charging Policy now in its 4th edition (November 2007). I remember its first edition 10 years ago and being invited to comment on it when I was at the Arts and Humanities Data Service. It has continued to develop over the last 10 years but lost none of its accessibility and (professional) interest.

In short, it is a very user-friendly, concise and informative document aimed at its depositors in the archaeological data community, but its treatment of digital preservation costs and the thorny issue of charging is likely to make it of much wider interest, hence this blog entry!

Digital Preservation costs are categorised and briefly explained under four headings:

  • Management and administration
  • Ingest
  • Dissemination
  • Storage and refreshment

The document identifies charges for standard deposits and levels of service and indicates potential variants and additional costs. There is an accompanying webpage on refreshment costs.

It’s a fascinating (honest) and short read – highly recommended.

For those following the aftermath of the AHRC decision to stop funding the AHDS the following snippet from the charging policy may also be of interest:
“The ADS currently receives some core funding from the Arts and Humanities Research Council (AHRC). The AHRC have indicated that the ADS should investigate a move toward a responsive mode funding for archives created by AHRC funded projects in the long term. In the past the ADS has waived deposit charges for researchers based in UK Higher Education Institutions. Due to the change in our core funding arrangements, from 1st January 2008 ALL deposits, whether from projects created within or outwith UK Higher Education will be subject to some level of charge.”

Google to host research datasets

The Wired Blog gives advance notice that the domain, http://research.google.com, will soon provide a home for terabytes of open-source scientific datasets. The storage will be free to scientists and access to the data will be free for all. The project, known as Palimpsest, missed its original launch date this week, but will debut soon. It is suggested that Palimpsest will fill a major need for scientists who want to openly share their data, and will allow public access to an unprecedented amount of data. For example, two planned datasets are all 120 terabytes of Hubble Space Telescope data and the images from the 10th century manuscript the Archimedes Palimpsest.

Those with long memories (hopefully prevalent amongst digital preservationists!) will also remember the Google/NASA memorandum of understanding signed in September 2005 that ‘outlines plans for cooperation on a variety of areas, including large-scale data management, massively distributed computing, bio-info-nano convergence, and encouragement of the entrepreneurial space industry’, so perhaps we should expect more major announcements along these lines from Google and NASA in months to come.
