Digital Curation

Reflections on the Digital Preservation Handbook Book Sprint 28-29 October 2014

What a terrific couple of days! We completed a two-day book sprint in London last week focussing on developing new content for the first release of the next edition of the Digital Preservation Handbook, which is being funded by The National Archives, the British Library, and Jisc. We are really pleased with the outputs and the progress we made.

A group of 11 people met up over two days to progress sections of the content for the new “Technical Solutions and Tools” chapter of the Handbook (as identified in the Draft Outline of the 2nd Edition of the Digital Preservation Handbook): Matthew Addis (Arkivum), Neil Beagrie (Charles Beagrie Ltd), Stephanie Davidson (West Yorkshire Archive Service), Michael Day (British Library), Matt Faber (Jisc), Chris Fryer (Parliamentary Archives), Anna Henry (the Tate Gallery), William Kilbride (DPC), Ed Pinsent (ULCC), Virginia Power (Jisc), and Susan Thomas (Bodleian Library Oxford). Accommodation for the sprint was kindly provided by Jisc in their central London offices via the good offices of Neil Grindley.

We have completed draft sections for:

  • Tools (including guidance on Tool Registries)
  • Media and Storage
  • File Formats
  • Digital Forensics

In addition, a content outline was agreed for the “Getting Started” sub-section of the Introduction. Alongside this work, other sections including the Background, How to Use the Handbook, Definitions and Concepts, Acronyms and Initials, and References were partially revised as we went.

The revision has been guided by user feedback and consultation (see Report on the Preparatory User Consultation on the 2nd Edition of the Digital Preservation Handbook). In short, the aim is to keep the Handbook text practical, concise, and accessible, with more detail available in the case studies and further reading.

This was the first book sprint for all bar one of the participants. We learnt a lot about the strengths and weaknesses of “Booktype”, the open source software we used, which was developed to support this type of activity. We eventually settled on using it in parallel with collaborative text tools such as Google Docs to get the best from each approach. A two-day book sprint was very intense, but few could have spared more time away from the workplace and, as one participant said, a tight deadline helped everyone focus on the tasks in hand.

At the end of the sprint the challenge was set to make the new content available within 3 months. We hope that sufficient additional sections to create a critical mass, potentially the complete Tools and Solutions Chapter of the Handbook, can be readied, transferred to the DPC website, and reviewed for release in the New Year.

Survey results and the contents outline for the new edition of the Digital Preservation Handbook just published

A big thank-you from Neil Beagrie and William Kilbride to everyone who contributed to the recent audience research survey or who commented on the potential contents outline for the new edition of the Digital Preservation Handbook.

Following that work, the DPC and Charles Beagrie Ltd are delighted to announce the release of two important documents which will form the foundations of the new edition of the DPC Digital Preservation Handbook: the results of a major survey into audience needs, and the first full outline of content.

‘We are very keen to make sure that the new edition of the handbook fits with people’s actual needs so we were very encouraged by the substantial response to the consultation document which we sent out before summer’ explained Neil Beagrie who is editor and lead author of the new edition of the handbook. ‘We estimate that the digital preservation community represented on the JiscMail list numbers around 1500 people in total: and there were 285 responses to the survey.’

‘It’s a very large sample of the community but it’s also reassuringly diverse. There’s a strong representation from higher education and public sector agencies but there’s also a sizeable group from industry, from charities as well as museums and community interest groups. When asked if they would use the handbook, not a single respondent said no.’

‘The survey has directly informed the contents of the new handbook’, explained William Kilbride, Executive Director of the DPC. ‘We started with an idea of the gaps and the many parts that had become outdated since the original handbook was published. So we invited users to tell us what they wanted and how they wanted it – both in terms of content and presentation. The project team has responded thoughtfully to these requests so I am confident that the resulting list of content is tailored to people’s needs. But we remain open to suggestions and comments.’

‘This will help ensure that the handbook remains relevant for many years to come.’

The two documents are available as follows:

Trending: The Value and Impact of Data Sharing and Curation

A colleague has pointed out that our synthesis report for Jisc on the Value and Impact of Data Sharing and Curation has had over 3,900 downloads since April 2014. You can see the stats and access the report here on the Jisc Repository.

It is great to see that there is a very high level of interest in the topic and report. I’m not sure how that figure compares with other reports, but if you have done work for Jisc you should now be able to search or browse the Jisc repository and see the download stats for your own work. Potentially, access to the Jisc repository stats is going to be very useful for those involved in REF or needing to demonstrate their impact to their institutions and other stakeholders.

New Research: The value and impact of data curation and sharing

Substantial resources are being invested in the development and provision of services for the curation and long-term preservation of research data. It is a high priority area for many stakeholders, and there is strong interest in establishing the value and sustainability of these investments.

A 24-page synthesis report published today aims to summarise and reflect on the findings from a series of recent studies, conducted by Neil Beagrie of Charles Beagrie Ltd. and Prof. John Houghton of Victoria University, into the value and impact of three well-established research data centres – the Economic and Social Data Service (ESDS), the Archaeology Data Service (ADS), and the British Atmospheric Data Centre (BADC). It provides a summary of the key findings from new research and reflects on: the methods that can be used to collect data for such studies; the analytical methods that can be used to explore value, impacts, costs and benefits; and the lessons learnt and recommendations arising from the series of studies as a whole.

The data centre studies combined quantitative and qualitative approaches in order to quantify value in economic terms and present other, non-economic, impacts and benefits. Uniquely, the studies cover both users and depositors of data, and we believe the surveys of depositors undertaken are the first of their kind. All three studies show a similar pattern of findings, with data sharing via the data centres having a large measurable impact on research efficiency and on return on investment in the data and services. These findings are important for funders, both for making the economic case for investment in data curation and sharing and research data infrastructure, and for ensuring the sustainability of such research data centres.

The quantitative economic analysis indicates that:

  • The value to users exceeds the investment made in data sharing and curation via the centres in all three cases – with the benefits from 2.2 to 2.7 times the costs;
  • Very significant increases in work efficiency are realised by users as a result of their use of the data centres – with efficiency gains from 2 to 20 times the costs; and
  • By facilitating additional use, the data centres significantly increase the returns on investment in the creation/collection of the data hosted – with increases in returns from 2 to 12 times the costs.

The qualitative analysis indicates that:

  • Academic users report that the centres are very or extremely important for their research, with between 53% and 61% of respondents across the three surveys reporting that it would have a major or severe impact on their work if they could not access the data and services; and
  • For depositors, having the data preserved for the long-term and its dissemination being targeted to the academic community are seen as the most beneficial aspects of depositing data with the centres.

An important aim of the studies was to contribute to the further development of impact evaluation methods that can provide estimates of the value and benefits of research data sharing and curation infrastructure investments. This synthesis reflects on lessons learnt and provides a set of recommendations that could help develop future studies of this type.

The synthesis report

Beagrie, N. and Houghton J.W. (2014) The Value and Impact of Data Sharing and Curation: A synthesis of three recent studies of UK research data centres, Jisc. PDF (24 pages)

What is the Impact of Research Data in the Arts and Humanities?

The AHRC periodically commissions case studies to investigate the impact and value of AHRC-funded research. Across the series as a whole, impact has been defined in its broadest sense to include economic, social, and cultural elements. The latest AHRC case study, Safeguarding our heritage for the future, focuses on the impact of data sharing and curation through the Archaeology Data Service.

It cites some of the findings from the Jisc-funded study “The Value and Impact of the Archaeology Data Service: A study and methods for enhancing sustainability” by ourselves and John Houghton.

The headline message on research efficiency impact appears on page 1 of the case study, with the relevant detail on page 2 as follows:

“JISC commissioned research carried out in 2012 found that the ADS has a broad user group which goes well beyond academia: whilst 38% of users are conducting academic research, 19% use ADS for private research; 17% for general interest enquiries; 11% are Heritage Management users and 8% are commercial users; 6% use it to support teaching and learning activities; and 1% use it for family history research. The ADS is respected as an invaluable resource, saving users time and therefore money, and providing security for those who use the service to deposit their data. A significant increase in research efficiency was reported by users as a result of using the ADS, worth at least £13 million per annum – five times the costs of operation, data deposit and use. A potential increase in return on investment resulting from the additional use facilitated by ADS may be worth between £2.4 million and £9.7 million over thirty years in net present value from one-year’s investment – a 2-fold to 8-fold return on investment.”

The PDF version of the Safeguarding our heritage for the future case study is available for download on the AHRC website.

AHRC Case Study

New Study Shows Availability of Research Data Declines Rapidly with Article Age

A Nature news item, “Scientists losing data at a rapid rate”, reports and provides a valuable commentary on a research article by Timothy Vines et al., published today in Current Biology, that looked at the availability of research data for Ecology articles published between 2 and 22 years ago.

The researchers requested data sets from a relatively homogeneous set of 516 Ecology articles published between 2 and 22 years ago, and found that availability of the underlying data was strongly affected by article age. For papers where the authors gave the status of their data, the odds of a data set being extant fell by 17% per year over that period. Availability dropped to as little as 20% for research data from the early 1990s. In addition, the odds that they could find a working e-mail address for the first, last, or corresponding author fell by 7% per year.
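To get a feel for what a 17% annual fall in odds means, here is a minimal sketch in Python. It assumes the decline compounds at a constant rate each year, and the starting odds of 9 to 1 are purely hypothetical (they are not taken from the article). The function names are our own. Note that the figure applies to odds, not probabilities, so the sketch converts between the two.

```python
def odds_after(initial_odds: float, years: float, annual_decline: float = 0.17) -> float:
    """Odds remaining after `years`, assuming a constant proportional fall each year."""
    return initial_odds * (1.0 - annual_decline) ** years


def odds_to_probability(odds: float) -> float:
    """Convert odds (e.g. 9 means 9 to 1 in favour) to a probability between 0 and 1."""
    return odds / (1.0 + odds)


# Hypothetical starting point: odds of 9 to 1 that a data set is still extant.
starting_odds = 9.0
for age in (2, 10, 22):
    remaining = odds_after(starting_odds, age)
    print(f"{age:>2} years: odds {remaining:.2f}, probability {odds_to_probability(remaining):.0%}")
```

The same function can be used with `annual_decline=0.07` to explore the fall in the odds of finding a working e-mail address.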

Although solely focussed on Ecology, this is an interesting addition to a growing body of research on data sharing and availability, and to the case for archiving initiatives such as Dryad, Figshare, and institutional data repositories when no international or disciplinary archive exists.

Measuring the Value and Impact of Research Data Curation and Sharing

My colleague John Houghton gave an excellent 20-minute presentation at the October 2013 Open Access Research Conference in Brisbane on recent studies conducted by Charles Beagrie Ltd and Victoria University covering the value and impact of sharing research data via three UK research data centres. I highly recommend it as an accessible, concise overview. The video of the presentation is now available at https://vimeo.com/82043019

It summarises recent studies exploring the impact and value of the Economic and Social Data Service (ESDS), the Archaeology Data Service (ADS), and the British Atmospheric Data Centre (BADC). The aim of the studies was to both assess the costs, benefits, value and impacts of the data centres, and to test a range of economic methods in order to ascertain which methods might work across three very different fields, with very different data production and use practices, and very different user communities. The presentation focuses on the methods used and lessons learned, as well as the headline findings.

As blogged previously, the three reports for the ESDS, ADS, and BADC are all available now as individual open-access publications. A short synthesis of all three reports is being published by Jisc in the New Year.

The Value and Impact of The Archaeology Data Service: findings released on research data sharing and curation

Neil Beagrie of Charles Beagrie Ltd and Professor John Houghton of the Centre for Strategic Economic Studies (CSES) are pleased to announce the release of their final report from the Jisc study which examined the value and impact of the Archaeology Data Service (ADS). The aim of this study is to explore and attempt to measure the value and impact of the ADS. A range of economic approaches was used to analyse data gathered through online surveys, and user and depositor statistics, to supplement and extend other non-economic perceptions of value.

The study reveals the benefits of integrating qualitative approaches exploring user perceptions and non-economic dimensions of value with quantitative economic approaches to measuring the value and impacts of research data services. Such a mix of methods is important in capturing and presenting the full range and dimensions of value. The approaches are complementary and mutually reinforcing, with stakeholder perceptions matching the economic findings. For example, both the qualitative and quantitative analyses highlight the important contribution of ADS data and services to research efficiency.

The study has changed stakeholder perceptions, increasing recognition of the value of the ADS and digital archiving and data sharing generally. Most stakeholders already valued ADS highly, but felt the study had extended their understanding of the scope of that value, and the degree of its value to other stakeholders. They were positive about seeing value expressed in economic terms, as this was something they had not previously considered or seen presented.

The report is available for download as a PDF file at: http://repository.jisc.ac.uk/5509/1/ADSReport_final.pdf

This report forms part of a series of independent studies produced by the authors on the value and impact of three UK research data centres. The other data centres already reported upon are the Economic and Social Data Service (ESDS) and the British Atmospheric Data Centre (BADC). To summarise and facilitate dissemination of the key findings from all three data centre studies, a separate synthesis is currently being prepared by Jisc.

New study released: the Value and Impact of the British Atmospheric Data Centre (BADC)

Jisc in partnership with the Natural Environment Research Council (NERC) have commissioned work by Neil Beagrie of Charles Beagrie Ltd and Professor John Houghton of Victoria University to examine the value and impact of the British Atmospheric Data Centre (BADC).  We are pleased to announce publication today of the study report.

The key findings

The study shows the benefits of integrating qualitative approaches exploring user perceptions and non-economic dimensions of value with quantitative economic approaches to measuring the value and impacts of research data services.

The measurable economic benefits of BADC substantially exceed its operational costs. A very significant increase in research efficiency was reported by users as a result of their using BADC data and services, estimated to be worth at least £10 million per annum.

The value of the increase in return on investment in data resulting from the additional use facilitated by the BADC was estimated to be between £11 million and £34 million over thirty years (net present value) from one year’s investment – effectively, a 4-fold to 12-fold return on investment in the BADC service.

The qualitative analysis also shows strong support for the BADC, with many users and depositors aware of the value of the services for them personally and for the wider user community.

For example, the user survey showed that 81% of the academic users who responded reported that BADC was very or extremely important for their academic research, and 53% of respondents reported that it would have a major or severe impact on their work if they could not access BADC data and services.

Surveyed depositors cited having the data preserved for the long term and its dissemination being targeted to the academic community as the most beneficial aspects of depositing data with the BADC, both rated as a high or very high benefit by around 76% of respondents.

The study report

The study report is available for download as a PDF file at: http://www.jisc.ac.uk/whatwedo/programmes/di_directions/strategicdirections/badc.aspx

The British Atmospheric Data Centre (BADC)
The BADC, based at the STFC Rutherford Appleton Laboratory in the UK, is the Natural Environment Research Council’s (NERC) Designated Data Centre for the Atmospheric Sciences. Its role is to assist UK atmospheric researchers to locate, access, and interpret atmospheric data and to ensure the long-term integrity of atmospheric data produced by NERC projects. There is also considerable interest from the international research community in BADC data holdings.

Public release of guidance document Research Data Management and REF2014

The Research360 project is pleased to announce the public release of its guidance document Research Data Management and REF2014 prepared by staff at the University of Bath and Charles Beagrie Ltd. It is being disseminated and shared with the research community in Bath and other universities.

Many universities are still in the process of enhancing and formalising strategies for research data management at this time, so this paper may contribute to planning for future assessment exercises beyond REF2014, as well as business cases for further development of strategies and procedures for research data in research-intensive universities.

With the results from the REF determining institutional quality-related (QR) funding allocations (just over £1.3 billion in 2012/13), the research element of QR funding is one of the key funding streams for research in UK universities. Support for future assessment exercises is therefore a potential element in any business case for research data management.

The Research Data Management and REF2014 document can be downloaded in Word or PDF formats from: http://opus.bath.ac.uk/35518/.

The REF guidance document follows on from the previous release of the summary stakeholder benefits analysis (based on the KRDS Benefits Framework) from the Research Data Management business case for the University of Bath. The stakeholder benefits analysis is also still available separately to download in PDF format from http://opus.bath.ac.uk/32509.

The Research360 project is funded by Jisc and is developing the technical and human infrastructure for research data management at the University of Bath, as an exemplar research-intensive university.
