Libraries and Archives

Scholarly Journals introduce Supplementary Data Archiving Policy

An important editorial has just appeared online in the February issue of The American Naturalist.
To promote the preservation and fuller use of data, The American Naturalist, Evolution, the Journal of Evolutionary Biology, Molecular Ecology, Heredity, and other key journals in evolution and ecology will soon introduce a new data archiving policy. The policy has been enacted by the Executive Councils of the societies owning or sponsoring the journals. For example, the policy of The American Naturalist will state:

This journal requires, as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive, such as GenBank, TreeBASE, Dryad, or the Knowledge Network for Biocomplexity. Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future. Authors may elect to have the data publicly available at time of publication, or, if the technology of the archive allows, may opt to embargo access to the data for a period up to a year after publication. Exceptions may be granted at the discretion of the editor, especially for sensitive information such as human subject data or the location of endangered species.

This policy will be introduced approximately a year from now, after a period when authors are encouraged to voluntarily place their data in a public archive. Data that have an established standard repository, such as DNA sequences, should continue to be archived in the appropriate repository, such as GenBank. For more idiosyncratic data, the data can be placed in a more flexible digital data library such as the National Science Foundation–sponsored Dryad Archive.

The authors of the editorial, Michael C. Whitlock, Mark A. McPeek, Mark D. Rausher, Loren Rieseberg, and Allen J. Moore, present the case for the importance of data archiving in science. This is the first of several coordinated editorials soon to appear in major journals.

US Scholarly Publishing Roundtable calls for Open Access and Digital Preservation

The Association of American Universities and the American Institute of Physics have issued the following press release:

WASHINGTON, D.C., January 12, 2010 — An expert panel of librarians, library scientists, publishers, and university academic leaders today called on federal agencies that fund research to develop and implement policies that ensure free public access to the results of the research they fund “as soon as possible after those results have been published in a peer-reviewed journal.”

The Scholarly Publishing Roundtable was convened last summer by the U.S. House Committee on Science and Technology, in collaboration with the White House Office of Science and Technology Policy (OSTP). Policymakers asked the group to examine the current state of scholarly publishing and seek consensus recommendations for expanding public access to scholarly journal articles.

The various communities represented in the Roundtable have been working to develop recommendations that would improve public access without curtailing the ability of the scientific publishing industry to publish peer-reviewed scientific articles.

The Roundtable’s recommendations, endorsed in full by the overwhelming majority of the panel (12 out of 14 members), “seek to balance the need for and potential of increased access to scholarly articles with the need to preserve the essential functions of the scholarly publishing enterprise,” according to the report.

“I want to commend the members of the Roundtable for reaching broad agreement on some very difficult issues,” said John Vaughn, executive vice president of the Association of American Universities, who chaired the group. “Our system of scientific publishing is an indispensable part of the scientific enterprise here and internationally. These recommendations ensure that we can maintain that system as it evolves and also ensure full and free public access to the results of research paid for by the American taxpayer.”

The Roundtable identified a set of principles viewed as essential to a robust scholarly publishing system, including the need to preserve peer review, the necessity of adaptable publishing business models, the benefits of broader public access, the importance of archiving, and the interoperability of online content.

In addition, the group affirmed the high value of the “version of record” for published articles and of all stakeholders’ contributions to sustaining the best possible system of scholarly publishing during a time of tremendous change and innovation.

To implement its core recommendation for public access, the Roundtable recommended the following:

  • Agencies should work in full and open consultation with all stakeholders, as well as with OSTP, to develop their public access policies.
  • Agencies should establish specific embargo periods between publication and public access.
  • Policies should be guided by the need to foster interoperability.
  • Every effort should be made to have the Version of Record as the version to which free access is provided.
  • Government agencies should extend the reach of their public access policies through voluntary collaborations with non-governmental stakeholders.
  • Policies should foster innovation in the research and educational use of scholarly publications.
  • Government public access policies should address the need to resolve the challenges of long-term digital preservation.
  • OSTP should establish a public access advisory committee to facilitate communication among government and nongovernment stakeholders.

In issuing its report, the Roundtable urged all interested parties to move forward, beyond “the too-often acrimonious” past debate over access issues towards a collaborative framework wherein federal funding agencies can build “an interdependent system of scholarly publishing that expands public access and enhances the broad, intelligent use of the results of federally-funded research.”

The report, as well as a list of Roundtable members, member biographies, and the House Science and Technology Committee’s charge to the group, can be found here.

Keeping Research Data Safe2: Data Survey added to project website

The Keeping Research Data Safe2 project (KRDS2) commenced on 31 March 2009 and will complete in December 2009. The project is identifying long-lived datasets for the purpose of cost analysis (including social sciences and humanities research) and is building on the work of the first “Keeping Research Data Safe” study completed in 2008.

We are currently undertaking detailed analysis of available cost information from 3 of our project partners and aim to use it to develop guidance on how cost metrics can be captured and applied in future.

In addition we have now added a survey proforma to the project website to help us identify other research data collections with information on preservation costs and issues. We invite you to contribute to the data survey if you have research datasets and associated cost information that you feel may be of interest to the study.

We anticipate that no organisation will have complete information on costs but most will have cost information in some areas. The aim of the survey is to compile an overview of what preservation cost information is collected.

The Survey proforma is available to download as an Acrobat form (requires Adobe Reader 8+ installed) or a Word form (requires Microsoft Word installed). It should take less than 30 minutes to complete and we are seeking responses (to info@beagrie.com) by the end of October 2009. The Survey proforma is available as a single main questionnaire; alternatively, if you have multiple cost datasets, you can complete a separate organisational cover sheet and multiple collection details as required. Please do not hesitate to contact us at info@beagrie.com if you have any difficulty or questions.

Just Published: Survey of Researchers’ Views on Research Data Preservation and Access

The latest issue of Ariadne (Issue 60, July 2009) publishes an article based on recent work by Charles Beagrie Limited and Serco Consulting for the UK Research Data Service (UKRDS) Feasibility Study. It should be of interest to an international as well as a UK audience, as many of the issues addressed apply to research and research data issues in any national context.

Research Data Preservation and Access: The Views of Researchers presents findings from a UKRDS survey of researchers’ views on and practices for preservation and dissemination of research data in four UK universities (Bristol, Leeds, Leicester, and Oxford) and places them in the wider UK and international context.

A preliminary report from the Survey was included in the UKRDS Interim Report. Elements of the Survey and its findings were also incorporated in the Final Report of the UKRDS Feasibility Study submitted to HEFCE. However, space constraints precluded presentation of all the data and findings in full in these reports, and they were mainly included in a separate unpublished appendix. This article therefore aims to publish more of this material and set it in its context, with updates from more recent published studies.

UK Ministry of Justice issues updated code of practice to support digital preservation

The following recent press announcement from the UK Ministry of Justice may be of interest to readers of the blog:

The government has today [16 July] set out plans to make sure that more public information is made available and is preserved for future generations.

Justice Minister, Michael Wills, has today announced the publication of a new Code of Practice on managing digital and other records, and the government’s plans to extend the Freedom of Information Act.

Freedom of Information depends on good record keeping and the preservation of information is important if we are to further increase transparency in public life. The updated Code of Practice is a significant step in ensuring that key records remain accessible to public bodies for day to day business and are preserved for future generations. The Code recommends public bodies across the country introduce a strategy for the preservation of digital records to ensure that they can continue to be accessed and used and are resilient to future changes in technology.

The government has also published its response to the consultation on extending the Freedom of Information Act. The government’s response reflects the considerable support for extending the Act. A further consultation will now be undertaken with those proposed for inclusion within the scope of the Act: Academies, the Association of Chief Police Officers (ACPO), the Financial Ombudsman Service and the Universities and Colleges Admissions Service (UCAS).

This is an initial step and further consultations with Network Rail and utility companies will examine how the Freedom of Information Act could apply to other bodies.

These publications support the government’s plans to increase the accessibility of public information and promote the culture of openness and transparency in public life. On 10 June the Prime Minister committed to a reduction of the 30 year rule to 20 years in response to the 30 Year Rule Review. The government is considering carefully the practical details of implementing a new rule and aims to publish its full response in late summer.

Michael Wills, Justice Minister, said:

‘The introduction of the Freedom of Information Act has significantly increased transparency in public life and the right to access information has become a cornerstone of our democracy.

‘The steps we are taking today – to keep and preserve public information for the future and extend the Freedom of Information Act – are significant if we are to truly promote the culture of openness in public life.’

The Code is an updated 2009 version of the Lord Chancellor’s Code of Practice on the management of records issued under section 46 of the Freedom of Information Act 2000.

New Project – study on a National Hosting Strategy for electronic Resources in Germany

The Alliance of German Science Organisations has established a priority initiative for digital information. The digital information initiative is focusing on six major areas: national licensing; open access; a national hosting strategy for electronic resources; primary research data; virtual research environments; and legal frameworks.

I am pleased to announce that Charles Beagrie Limited, in association with Globale Informationstechnik GmbH, has been awarded the consultancy on behalf of the Alliance of German Science Organisations to develop recommendations for a national hosting strategy for electronic resources in Germany.

Neil Beagrie will lead the consultancy with Prof Matthias Hemmje. Charles Beagrie Associates working on the project are Mary Auckland, Julia Chruszcz, Diana Leitch, Terry Morrow, and Najla Rettberg.

Keeping Research Data Safe 2 – Project webpage and project plan now available

The project plan and project webpage for the JISC-funded Keeping Research Data Safe 2 project (KRDS2) are now available on the Charles Beagrie website. The webpage has been set up to support dissemination of information on the project and provide the background to the work, details of the project partners, and the project plan.

The first Keeping Research Data Safe study funded by JISC made a major contribution to the study of preservation costs by developing a cost model and identifying cost variables for preserving research data in UK universities.

KRDS2 aims to extend this previous work on digital preservation costs. It is identifying long-lived datasets for the purpose of cost analysis and building on the work of the first “Keeping Research Data Safe” study completed in 2008.

The KRDS2 project commenced on 31 March 2009 and will complete in December 2009. For further information see the project plan.

Fedora and DSpace Merge to Create DuraSpace Organisation

A landmark development has been announced: the merger of the DSpace Foundation and Fedora Commons. Both are major players in digital preservation and open source content management systems, particularly in the Higher Education sector. They have been collaborating closely in recent years and have now merged to form the new organisation DuraSpace.

DuraSpace will continue to support its existing software platforms, DSpace and Fedora, but is also planning a number of new developments. The first new technology to emerge will be a Web-based service named “DuraCloud”: a hosted service that takes advantage of the cost efficiencies of cloud storage and cloud computing, while adding value to help ensure longevity and re-use of digital content. The DuraSpace organisation is developing partnerships with commercial cloud providers who offer both storage and computing capabilities to deliver this service.

I agree wholeheartedly with Cliff Lynch, Executive Director of the Coalition for Networked Information (CNI), who is quoted in the press release as follows:

“This is a great development. It will focus resources and talent in a way that should really accelerate progress in areas critical to the research, education, and cultural memory communities. The new emphasis on distributed reliable storage infrastructure services and their integration with repositories is particularly timely.”

For further information on DuraSpace see the new website and press release.

New Project: Keeping Research Data Safe 2

I am pleased to report that Charles Beagrie Ltd will be the lead contractor for Keeping Research Data Safe 2: a new JISC-funded study to identify long-lived digital datasets for the purposes of cost analysis.

The study aims to build on the work from the original Keeping Research Data Safe consultancy and is being undertaken by a consortium consisting of 4 partners involved in the original study (University of Cambridge, Charles Beagrie Ltd, OCLC, and University of Southampton) and 4 new partners (the Archaeology Data Service, University of Oxford, UK Data Archive, and University of London Computer Centre) with significant data collections and interests in preservation costs. All the partners bring considerable relevant expertise, knowledge and resources to the project.

The new study will identify and analyse sources of long-lived data and develop longitudinal data on associated preservation costs and benefits. We believe these outcomes will be critical to developing preservation costing tools and cost benefit analyses for justifying and sustaining major investments in repositories and data curation.

The project will utilise the Keeping Research Data Safe cost framework as a tool for organising and scoping its work. We will undertake a combination of desk research; a data survey; analytical work with national and disciplinary digital archives that have existing historic cost data for preservation of digital research data; and interaction with digital archives in research universities that have little or no historic cost data but a strong interest in this study and in identifying criteria and metrics for capturing cost data going forward.

A project website will be available shortly and regular updates on the study will be posted to this blog.

A Future Combination of PRONOM and GDFR?

An interesting emerging digital preservation development is the Unified Digital Formats Registry (UDFR), combining efforts from the UK National Archives’ PRONOM service and Harvard University’s Global Digital Formats Registry (GDFR).

The GDFR website notes that in April 2009 the GDFR initiative joined forces with the UK National Archives’ PRONOM registry initiative under a new name, the Unified Digital Formats Registry (UDFR). The UDFR will support the requirements and use cases compiled for GDFR and will be seeded with PRONOM’s software and formats database. A new website is being constructed for the UDFR and will be available at www.udfr.org.

To quote from the UDFR Proposal and Roadmap:

“There are two major efforts underway to create a format registry with complementary strengths and weaknesses. PRONOM, created by The National Archives (TNA) in the UK, has a strong technological base, and has been building a database of original information about various digital formats. PRONOM at this point however is owned and maintained by a single organization, making it vulnerable to changes in that institution. The Global Digital Formats Registry (GDFR) effort, hosted by Harvard University, has developed a model for a registry based on shared governance, cooperative data contribution, and distributed data hosting. However, GDFR is technically less far along in development, and has not yet begun database building.

Given the paucity of resources in the digital preservation community it would be highly unfortunate if these efforts were to compete for resources. Therefore a group of involved and interested institutions have agreed to join together to create a single shared formats registry drawing on the individual strengths of the two existing efforts. The initiative would:

  • be technically based on the existing PRONOM system and database;
  • create a community governance model for the registry involving all institutions willing to contribute to its development;
  • develop a mechanism for the distribution of the registry data in such a way as to support local extensions and additions to the database;
  • develop both technical and organizational support for distributed input to the registry, including some form of quality vetting of contributed data.”

Further details of the proposal are available from the GDFR website.
