Digital Curation

Digital Preservation Handbook Update February 2016

Originally published in 2001 as a paper edition, ‘Preservation and Management of Digital Materials: a Handbook’ was the first attempt in the UK to synthesise the diverse and burgeoning sources of advice on digital preservation.  Demand was so great that in 2002, a free online edition of the Handbook was published by the newly established Digital Preservation Coalition.

After more than a decade, in which digital preservation has been transformed, the Handbook remains among the most heavily used areas of the DPC website.

Funders and organisations are collaborating on re-designing, expanding and updating the Handbook so it can continue to grow as a major open-access resource for digital preservation. The DPC and Charles Beagrie Ltd have been engaged on a major re-working of the Digital Preservation Handbook for release as a new edition over 2015/2016. The National Archives (our Gold Sponsor), working together with other stakeholders including Jisc, the British Library, and The Archives and Records Association (our Silver Sponsors), and the National Records of Scotland (our Bronze Sponsor), is supporting the Digital Preservation Coalition in updating and revamping the Handbook. Many individuals and organisations are also contributing to this work through book sprints, peer review, and project and advisory boards.

The revision, guided by user feedback and consultation (see Report on the Preparatory User Consultation on the 2nd Edition of the Digital Preservation Handbook), is modular and is being undertaken over a two-year period to March 2016.

We have provided updates at regular intervals to inform the community on progress with the project and with this final February update we are delighted to announce a number of key developments.

 

Publication Schedule

The 2nd edition of the Handbook had a partial “soft launch” in October 2015 and approximately two-thirds is online and publicly accessible at http://www.dpconline.org/advice/preservationhandbook

This partial release will be further enhanced by additional functionality when a new platform for the website focused on ‘responsive design’ is brought on stream by the DPC in 2016. This will provide an updated design and improved user experience on mobile and tablet devices, compared to the current site templates that are optimised for viewing on a desktop screen. We will also add the facility to generate PDFs. In the interim some functionality and content will remain “works in progress” but the community have gained early access to a significant new resource.

The remaining 14 sections to complete the Handbook have now been written and edited and are in peer review (see Handbook contents page for coming soon sections). We are aiming to complete this work and revise content for publication by the end of March 2016. The Handbook is now live so we will need to close and update section by section for these 14 remaining updates, hopefully in the final week of March and/or early April 2016. Watch this space for future announcements!

NRS joins funding group

The Digital Preservation Coalition was delighted to announce this month that The National Records of Scotland (NRS) had come on board as a ‘Bronze Sponsor’ for the eagerly anticipated second edition of the ‘Digital Preservation Handbook’. As of February 2016, with the addition of the NRS we have raised 93% of estimated funding required for the Handbook revision. We have prioritised content creation, scaled back some events, and adjusted budgets to ensure completion within a very tight funding profile.

Slideshare from Handbook Workshop at DCDC15

A workshop on the Digital Preservation Handbook was run at the DCDC15 conference in early October. PowerPoint slides from the Handbook presentation are now available on Slideshare. They provide a detailed overview of the new edition of the Handbook and of work in progress. To date, there have been over 2,000 views of the slides.

European Bioinformatics Institute economic impact slideshare

A short set of 4 PowerPoint slides summarising the findings on the economic impact of the European Bioinformatics Institute, with extensive accompanying slide notes, all CC-BY licensed, has been placed on Slideshare.

The European Bioinformatics Institute (EMBL-EBI), located on the Wellcome Genome Campus in Hinxton, UK, manages public life-science data on a very large scale, making a rich resource of information freely available to the global life science community. EMBL-EBI is one of a handful of organisations in the world involved in global efforts to exchange information, set standards, develop new methods, and curate complex genome information.

We published a full report this week with the results of a quantitative and qualitative study of the Institute, examining the value and impact of its work. Our focus is the economic impact, which can be seen as complementary to traditional academic measures such as citation counts.

The summary slides show that the quantitative economic approaches used included: estimates of access and use value, contingent valuation using stated preference techniques, an activity-costing approach to estimating the efficiency impacts of EMBL-EBI data and services, and a macro-economic approach that seeks to explore the impacts of EMBL-EBI use on returns to investment in research. These approaches allowed us to develop a picture, beginning with estimates of minimum direct values for the EMBL-EBI’s user community and moving progressively toward approaches that measure wider social and economic value.

New report: The Value and Impact of the European Bioinformatics Institute

We are pleased to announce a new report: The Value and Impact of the European Bioinformatics Institute.

In 2015, Charles Beagrie Ltd was commissioned by the European Bioinformatics Institute (EMBL-EBI) to study and analyse its economic and social impact.

The EMBL-EBI, located on the Wellcome Genome Campus in Hinxton, near Cambridge in the UK, manages public life science data on a very large scale, making a rich resource of genome information freely available to the global life science community.

The full report published today presents the results of the quantitative and qualitative study of the Institute, examining the value and impact of its work. The report highlights key findings, including that EMBL-EBI data and services made commercial and academic R&D significantly more efficient. This benefit to users and their funders is estimated, at a minimum, to be worth £1 billion per annum worldwide – equivalent to more than 20 times the direct operational cost of EMBL-EBI.
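As a back-of-envelope illustration of the ratio quoted above (our own arithmetic here, not a figure taken from the report): if a minimum benefit of £1 billion per annum is more than 20 times the direct operational cost, then that cost must be below about £50 million per annum. A minimal Python sketch follows; the function name is hypothetical and used only for illustration.

```python
def implied_max_cost(benefit_gbp: float, multiple: float) -> float:
    """Upper bound on annual cost when the benefit exceeds `multiple` times the cost."""
    return benefit_gbp / multiple


# Figures quoted in the post: a minimum benefit of £1 billion per annum,
# described as "more than 20 times" the direct operational cost.
benefit = 1_000_000_000
multiple = 20

bound = implied_max_cost(benefit, multiple)
print(f"Implied direct operational cost: under £{bound:,.0f} per annum")
```

The bound is only as precise as the rounded headline figures, so it should be read as an order of magnitude rather than an actual budget figure.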

A press release with further information is available on the EMBL-EBI website at http://www.ebi.ac.uk/about/news/press-releases/value-and-impact-of-the-european-bioinformatics-institute

The Full Report is available online in printable format at http://www.beagrie.com/EBI-impact-report.pdf

A short Executive Summary version of the report is available online in printable format at http://www.beagrie.com/EBI-impact-summary.pdf

12 slideshares for Xmas: 20 years in digital preservation

I have just posted the final instalment of a personal selection of 12 presentations drawn from events and topics over the last 20 years in digital preservation, which I hope will be of interest.

They are taken from events on four different continents including the first iPres conference and cover themes such as personal archiving, research data management, e-journals, the digital preservation lifecycle model, national and institutional strategies and collaboration, costs/benefit/economic impacts of digital preservation, the establishment of the Digital Preservation Coalition, and the development of the online Digital Preservation Handbook. I hope there will be something in there for everyone.

There are accompanying blog narratives which set the presentations into context and the PowerPoint presentations themselves on Slideshare. Details and web links to them are as follows:

2014 – The Value and Impact of Research Data Infrastructure (economic impact), presentation to the Preservation and Archiving Special Interest Group (PASIG), Karlsruhe, Germany (slides, narrative)

2013 – Maintaining a Vision: how mandates and strategies are changing with digital content (changes and responses), keynote presentation to Screening the Future conference, London, UK (slides, narrative)

2010 – Keeping Research Data Safe (digital preservation costs and benefits), presentation to KB Experts Workshop on Digital Preservation Costs, The Hague, Netherlands (slides, narrative)

2007 – Digital Preservation: Setting the Course for a Decade of Change (evolution or revolution?), keynote presentation to the Belgian Association for Documentation (ABD-BVD), Brussels, Belgium (slides, narrative)

2005 – Digital Preservation and Curation Summing up + Next Steps (setting curation and research agenda for 2005-2015), conclusions to Warwick II Workshop, Warwick, UK (slides, narrative)

2005 – Plenty of Room at the Bottom? Personal Digital Libraries and Collections, keynote presentation to European Conference on Research and Advanced Technology for Digital Libraries (ECDL), Vienna, Austria (slides, narrative)

2004 – eScience and Digital Preservation, presentation to Association for Information Science and Technology (ASIST) conference, Rhode Island, USA (slides, narrative)

2004 – The JISC Continuing Access and Digital Preservation Strategy 2002-5 (covering UK Higher Education sector and partners), presentation to the JISC-CNI conference, Brighton, UK (slides, narrative)

2004 – Digital Preservation, e-journals and e-prints, presentation at private workshop, 1st iPres conference, Beijing, China (slides, narrative)

2004 – The Digital Preservation Coalition (DPC), Its History, Programme, Rationale, and Structure, set of 4 linked presentations to DPC Forum, London, UK (slides, narrative)

2001 – Preservation Management of Digital Materials (the Digital Preservation Handbook), presentation to Digital Preservation Workshop/State Library, Melbourne, Australia (slides, narrative)

1998 – Preserving Digital Collections: current methods and research (digital preservation lifecycle model), presentation to the Society of Archivists annual conference, Sheffield, UK (slides, narrative)

This is a baker’s dozen as there is also a bonus presentation from 2015 on Slideshare covering the latest work on The Digital Preservation Handbook (new edition for full release in March 2016).

The background and narrative blog for this personal selection of presentations is also available.

SlideShare: The Value and Impact of Research Data Infrastructure

This slideshare, The Value and Impact of Research Data Infrastructure, was given at the Preservation and Archiving Special Interest Group (PASIG) meeting in September 2014 held at Karlsruhe, Germany. It is the final instalment of 12 presentations I have selected to mark 20 years in Digital Preservation. It demonstrates the value of preservation and re-use of research data.

Between 2011 and 2014, Charles Beagrie Ltd and John Houghton completed three major studies on the economic value and impact of the Archaeology Data Service, the British Atmospheric Data Centre, and the Economic and Social Research Data Service, and a synthesis of the three studies. In these studies, we developed and refined qualitative and quantitative methodologies to measure the value and impact of research data and associated services and tools.

This combination of methods has broken new ground in approaches to assessing the value and impact of major research data services and provided a strong evidence base and compelling outcomes. In a recent review of the international state of the art as regards the relationships between large-scale science facilities and innovation performance, our work was one of 3 studies highlighted to the UK Department for Business, Innovation and Skills as being particularly good examples of ‘good practice’ in the measurement of economic impacts.

The presentation focuses on these studies, with the study of the Archaeology Data Service given as a detailed example. It has a UK focus but the research and lessons are international. These studies are also three of the few quantitative studies of the value and impact of digital preservation currently available.

A fourth study on the value and impact of the EMBL European Bioinformatics Institute has since been completed by Charles Beagrie Ltd and John Houghton and should be available in 2016.

New Resources page on Charles Beagrie Website

We have produced a new resources page on our website describing all the outputs we have produced which are publicly available and accessible on open access to students and practitioners interested in our work. Areas described include Cost/Benefit, Impact, Technology Watch, Digital Preservation Policies and Strategies, conference presentations, and other digital preservation resources. These are linked either to outputs on our website or on the websites of clients and partners. An extract of the page is shown below.

Keeping Research Data Safe (KRDS)

Keeping Research Data Safe (KRDS), a workshop presentation from 2010 available now on Slideshare, is the tenth of 12 presentations I have selected to mark 20 years in Digital Preservation. The remaining two will be published at monthly intervals over November and December 2015.

This presentation was given as part of the KB Experts Workshop on Digital Preservation Costs, held at The Hague in the Netherlands in 2010.

Although very small in terms of budget, the KRDS projects were terrific examples of collaboration to achieve influential results and of the pleasure and value of working with colleagues from many disparate fields and organisations. I’ve selected it as an example of doing great things on small budgets if you have the right people, and for its influence on subsequent work both by me (e.g. impact studies) and on the field generally. For me, in terms of personal follow-up and later projects, the costs element of KRDS has been less important than the benefits side, which has led to a series of projects on impact with John Houghton (more on this in the final Slideshare in December).

The KB requested a briefing document on each cost model presented at the workshop in the form of responses to their set questions. I have reproduced mine for the KRDS presentation below – it captures lots of interesting context for the slides. I have added links to the KRDS Factsheet and KRDS costs data survey to it.

THE KEEPING RESEARCH DATA SAFE MODEL

Outline:

1. General presentation of the cost model

What is the purpose of the cost model? – The KRDS model aims to support the costing of digital preservation of research datasets and assessment of the benefits of preservation. A significant proportion of its work is also focussed on identification of preservation cost data sources and methods which could support any model. It is currently primarily a set of tools and methods to construct a localised model rather than a pre-developed generic costing tool. Further information on findings from the KRDS projects is available in the KRDS Factsheet.

Who are the users? – The primary audience is research organisations in the UK but organisations in other countries and sectors can adopt parts of the model and its methodologies.

What preservation strategies does it handle? – It can accommodate any preservation strategy or service strategy (e.g. outsourcing or shared services as well as preservation in-house).

What is the target data? – Research data from the sciences, social sciences, or arts and humanities.

What time perspective does it cover? – Any time period.

2. What method is the cost model based on?

What reference is the model based on? – The model uses OAIS with extensions and adaptations by the project team.

What financial principles is it based on? – It is modelled to adopt the Transparent Approach to Costing (TRAC), a full economic costs (FEC) model approved by UK research funders and universities.

Which costing approach have you adopted? – We use an activity-based costing approach supported by a Benefits Taxonomy for assessing benefits.

What implementation have you chosen? – N/A
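To illustrate what an activity-based costing approach looks like in practice, here is a minimal Python sketch. It is not the KRDS model itself: the activity names are borrowed from OAIS functional entities, and every figure, class name and function name below is a hypothetical placeholder chosen for illustration. The idea is simply that costs are attributed to activities (staff time plus other direct costs) and then totalled.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """One costed preservation activity (all values hypothetical)."""
    name: str
    staff_hours: float
    hourly_rate: float       # cost of staff time per hour, in GBP
    other_costs: float = 0.0  # e.g. hardware, licences, in GBP

    def cost(self) -> float:
        # Activity cost = staff time cost + other direct costs
        return self.staff_hours * self.hourly_rate + self.other_costs


def cost_by_activity(activities):
    """Return a mapping of activity name to its cost."""
    return {a.name: a.cost() for a in activities}


# Illustrative activities only; names follow OAIS functional entities.
activities = [
    Activity("Ingest", staff_hours=100, hourly_rate=30.0, other_costs=500.0),
    Activity("Archival Storage", staff_hours=20, hourly_rate=30.0, other_costs=2000.0),
    Activity("Access", staff_hours=50, hourly_rate=30.0),
]

costs = cost_by_activity(activities)
total = sum(costs.values())

for name, value in costs.items():
    print(f"{name}: £{value:,.2f} ({value / total:.0%} of total)")
print(f"Total: £{total:,.2f}")
```

A real model would also need to apportion overheads under full economic costing and track how costs change over time, which this sketch deliberately leaves out.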

3. Which challenges do you currently see in relation to cost modelling?

Special issues – General cost model challenges? –

Primarily a lack of good quality preservation cost data from a range of different types of archive and data types (see our KRDS costs data survey) which can be used to underpin and develop models.

Secondly, an excessive focus on costs (rather than cost/benefits) and also sometimes too limited a focus on the costs of preservation strategies rather than preservation service costs as a whole.

Occasional over-reliance on research project or start-up cost data which will not be representative of operational preservation costs.

The degree of confidence that can be placed in results from cost models. How reliable is any cost prediction for a model and how does that change over time or other variables?

4. What are the opportunities for standardisation of cost models and collaboration between projects?

Possible standardisation and alignment of cost models? – I think cost models always need to be tailored to some degree to different audiences/sectors and prospects for standardisation and alignment may be variable. Some areas e.g. digital storage costs may be more promising than others.

Collaboration? – I can see beneficial opportunities for both formal and informal partnerships between projects and organisations. There may be opportunities for European and international collaboration.

5. What are your initial comments and feedback on the draft decision tree appended below?

A decision tree could start much earlier and involve different decisions on the cost model itself e.g. scope of activities, level of detail, and sources of data.

6. Please provide a short one paragraph biography for yourself

Neil Beagrie is director of consultancy at Charles Beagrie and principal investigator for the JISC Keeping Research Data Safe project which has investigated the costs and benefits of digital preservation for research data. He is an experienced senior consultant and an internationally recognised expert with extensive experience in information management, digital preservation, and developing access to digital collections.

Digital Curation and Preservation: Defining the Research Agenda for the Next Decade [2005-2015]. How did we do?

The Warwick3 Workshop: Digital Preservation and Curation Summing up + Next Steps, available now on Slideshare, is the eighth of 12 presentations I have selected to mark 20 years in Digital Preservation. The remainder will be published at monthly intervals over 2015.

I have chosen it as it briefly allows us to look back at aspirations and achievements in Digital Preservation over a 20-year period from the very first (and seminal) Warwick 1 workshop held in 1995 to today. The first Warwick workshop considered the Long Term Preservation of Electronic Materials and a UK response to the final report of the RLG/CPA Task Force on Digital Archiving. Two further Warwick workshops followed in 1999 and 2005 to review progress and set a forward agenda.

The two-day workshop that took place over 7 – 8 November 2005 at the University of Warwick aimed for the first time to address digital preservation issues for both scientific data and cultural heritage and to map out a future research agenda for them. Sponsored by JISC, the Digital Curation Centre (DCC), the British Library and the Council for the Central Laboratory of the Research Councils (CCLRC), the invitation-only event drew a wide range of national and international experts to explore the current state of play with a view to shaping future strategy. The slides are from my summing up and conclusions at the workshop close.

Part of my conclusions (slides 12-13) outlined the recommendations of the previous Warwick workshop held in 1999 and reviewed the progress that had been made in implementing them over the subsequent five years, with a very subjective level of achievement rated from √ (some) to √ √ √ (good) as follows:

Raise awareness

√ √ √ DPC advocacy, EU council, UNESCO, CODATA, ICSTI, NSF, RCUK

Encourage cross-sectoral communication

√ √ Established Digital Preservation Coalition 2001 – now 27 members

Develop guidelines

√ √ Preservation Management Handbook, Curation Manual, Cornell tutorial

Preservation Centre/Network of centres

√ √ Digital Curation Centre, British Library, The National Archives

Certification criteria

RLG/NARA checklist (TRAC)

Checklist to determine complexity and cost

JISC 04/04 funding programme (LIFE project, assessment tool project)

New research – emulation, dynamic data

Camileon project, JISC 04/04 programme, DCC research agenda

So how have we done 10 years further on? Overall, OK I think, with the caveat that progress in digital preservation can take a long time. Perhaps I would raise the achievement levels if doing this exercise again in 2015 for “Encourage cross-sectoral communication”, “Checklist to determine complexity and cost”, and “New research”. However, I would probably move “Raise awareness” down one level. The others would probably be about the same. How about you?

20 years in DP: eScience and Digital Preservation 2004

eScience and Digital Preservation, a presentation to the Association for Information Science and Technology (ASIST) conference in November 2004, Rhode Island USA, available now on Slideshare, is the sixth of 12 presentations I’ve selected to mark 20 years in Digital Preservation. The remainder will be published at monthly intervals over 2015.

It is closely related to the previous slideshare for May on the Jisc continuing access and digital preservation strategy but focuses just on the science component.

This is one I wasn’t able to present in person but it was kindly delivered by Gail Hodge.

My brief for the presentation was “thoughts or citations you have for the impact of e-science, particularly the GRID, on information management, particularly archiving, preservation and long-term access.”

It is a short presentation of 15 slides covering collection-based science, the Grid, data publishing, and the background and rationale for the Digital Curation Centre (just launched two weeks before in the UK).

It is a snapshot in time and of key issues in 2004 – interesting to contrast with what one would write 10 years on and ponder on progress made.

20 years in DP: The JISC Continuing Access and Digital Preservation Strategy 2002-5

The JISC Continuing Access and Digital Preservation Strategy 2002-5, a presentation to the 2004 JISC-CNI conference, Brighton UK, available now on Slideshare, is the fifth of 12 presentations I’ve selected to mark 20 years in Digital Preservation. The remainder will be published at monthly intervals over 2015 (however, due to the sheer volume of work over May this year, including the EBI Impact Survey and the 2nd Handbook sprint, two monthly selections are appearing together this time!).

For those outside the UK, an important context is that Jisc’s role as a national body for digital infrastructure and content on behalf of UK universities and colleges gave the Strategy considerable influence at the time, not just within HE but in other sectors through partnership activities.

This presentation from 2004 is important largely for the legacy of the Strategy that helped establish bodies such as the Digital Preservation Coalition and the Digital Curation Centre, which still have a major influence today.

The presentation sets out the context and rationale for the Strategy including the predicted growth of electronic publications, scientific data, and data curation. The implications of that growth were seen as:

  • Core funding for institutions would not grow in line with information growth;
  • A need for more automation and tools;
  • A need for new shared services and information infrastructure;
  • A significant need for R&D and investment to prepare for this.

Therefore the objectives of the Strategy were:

  • Serve as an advocacy document to secure additional funding of £6m over 3 years (2002-5) for new programmes in electronic records management and digital preservation;
  • Justify the accompanying implementation plan;
  • Provide a longer-term framework and rationale for activity extending beyond 2005.

Fortunately, activity in these areas did continue beyond 2005 under a series of very able Jisc programme directors and managers.
