Digital Preservation

Preserving Moving Pictures and Sound Technology Watch Report: preview released for DPC members

I am delighted to announce that the second title in the new Technology Watch Reports Series that Charles Beagrie Ltd has been producing for the Digital Preservation Coalition has just been released as a preview to DPC members.

The report ‘Preserving Moving Pictures and Sound’ is authored by Richard Wright, formerly of the BBC. It discusses issues of moving digital content from carriers (such as CD and DVD, digital videotape, DAT and minidisc) into files. This digital to digital ‘ripping’ of content is an area of digital preservation unique to the audio-visual world, and has unsolved problems of control of errors in the ripping and transfer process. It goes on to consider digital preservation of the content within the files that result from digitization or ripping, and the files that are born digital. While much of this preservation has problems and solutions in common with other content, there is a specific problem of preserving the quality of the digitized signal that is again unique to audio-visual content. Managing quality through cycles of ‘lossy’ encoding, decoding and reformatting is one major digital preservation challenge for audio-visual content, as are issues of managing embedded metadata.

Neil Beagrie, Director of Consultancy at Charles Beagrie Ltd, was commissioned to act as principal investigator and managing editor of the new series in 2011.  The managing editor has been further supported by an Editorial Board drawn from DPC members and peer reviewers who have commented on the text prior to release.  The Editorial Board comprises William Kilbride (Chair), Neil Beagrie (Series Editor), Janet Delve (University of Portsmouth), Tim Keefe (Trinity College Dublin), Andrew McHugh (University of Glasgow), Dave Thompson (Wellcome Library).

The full text of the report is available now to DPC members (from the DPC member web pages – accessible by DPC members only) but it is expected to have its wider public release on the DPC website in April or early May 2012. The URL for the public version will be announced on release.

Happy 10th Anniversary to the Digital Preservation Coalition

Organisations can be a bit like the Queen with a selection of birth dates to choose from but the DPC is perhaps 10 years old today.

It was incorporated earlier but formally launched as an organisation at a reception in the House of Commons on 27th February 2002. Speakers at the event were Rosie Winterton MP, Parliamentary Secretary in the Lord Chancellor’s Department; author and broadcaster Loyd Grossman; and Lynne Brindley, then chair of the DPC.

At the time of the launch the DPC had 9 full members and 8 associate members: there are over 40 members and allied organisations today.

I came across the launch reception invitation recently so here is a scan.

Happy birthday!

Preserving Email DPC Technology Watch Report released

We are delighted to announce that the Preserving Email technology watch report has now been published by the DPC. It is published electronically as a PDF and is now free to download from the DPC website at: http://dx.doi.org/10.7207/twr11-01.  It was previously available as a preview to DPC members only from December 2011. Charles Beagrie are managing editors for the Technology Watch Series and have worked closely with the DPC in the production of this report. The full press release for the report is copied below and you are welcome to forward it to interested colleagues.


17/02/2012

For immediate release

Email tomorrow … and next year … and forever

Preserving Email, a new report from the DPC, gives practical advice on how to ensure email remains accessible

Email is a defining feature of our age and a critical element in all manner of transactions. Industry and commerce depend upon email; families and friendships are sustained by it; government and economies rely upon it; communities are created and strengthened by it.  Voluminous, pervasive and proliferating, email fills our days like no other technology.  Complex, intangible and essential, email manifests important personal and professional exchanges.  The jewels are sometimes hidden in massive volumes of ephemera, and even greater volumes of trash. But it is hard to remember how we functioned before the widespread adoption of email in public and private life.

Institutions, organizations and individuals have a considerable investment in – and legal requirements to safeguard – large collections of email.  IT managers and archivists have long recognised that email requires careful management if it is to be available in the long term but practical advice about how to do this is surprisingly sparse.  So a new ‘Technology Watch Report’ from the Digital Preservation Coalition (DPC) will be of wide interest.

‘The first email was probably sent by a researcher at the Massachusetts Institute of Technology in 1965’, explained Chris Prom of the University of Illinois, the report’s author. ‘It has long since gone missing, deemed too trivial to be worth preserving.’

‘Since then email has become a valuable documentary form because people typically use it to write things that were not intended for wide revelation at the time. So it can contain material which researchers – and high court judges – find incredibly useful.’

‘Users normally shoulder the ultimate responsibility for managing and preserving their own email.  This exposes important records to needless risks and is counterproductive in many cases. But it doesn’t have to be like this.  Individuals and organizations can lay the foundation for long term access so long as they understand the technical standards that underlie email systems. Based on this understanding, they can implement sensible preservation strategies.’

‘The Preserving Email report provides a comprehensive advanced introduction to the topic for anyone who has to manage a large email archive in the long term: and in the long term that will be most of us.’

Gareth Knight of King’s College London welcomed the report.  ‘Preserving Email provides an excellent overview of the topic, drawing together observations made in a number of research projects to provide a succinct overview of the legal, technical, and cultural issues that must be addressed to ensure that these digital assets can be curated and preserved in the long-term. Its conclusion, providing a set of pragmatic, easy-to-understand recommendations that individuals and institutions may apply to better manage their email archive, highlights the complexity of email preservation.  It also sends a clear message that it is something that everyone can perform.’

The British Library is among the agencies currently working on new strategies to preserve email.  Maureen Pennock of the British Library welcomed in particular the two short case studies which are included in the report. ‘The report includes case studies from the Bodleian Library and the Medical Research Council which are really useful in making sense of the practical problems which we face, and how to resolve them in practice not just theory.  They show what can be achieved  and underline just how useful the core email standards are.’

Neil Beagrie of Charles Beagrie Ltd, managing editor and principal investigator of the Technology Watch Series highlighted the plans for more reports in the series in the near future.  ‘Preserving Email is the first of five planned publications from leading experts in the new DPC Technology Watch Series.  The format of the new reports has had a major redesign, and ISSN and DOI identifiers have been added.  We hope these features will enhance the use, citation and impact of the reports. Further reports on Preservation of Moving Picture and Sound, Intellectual Property Rights for Digital Preservation, Digital Forensics and Preservation, and Preservation Trust and Continuing Access for e-Journals will be released later in 2012. The DPC and Charles Beagrie hope the new series will be a significant contribution to encouraging digital preservation and best practice worldwide.’

Richard Ovenden, Deputy Director of the Bodleian Libraries at Oxford University and Chair of the DPC, welcomed the report.  ‘This is the tenth anniversary of the Coalition, which was launched in the House of Commons in February 2002.  One of the ways we are marking this year is by releasing a new set of reports to update and extend the advice we offer.  The Technology Watch Reports are a popular and lasting help to anyone interested in ensuring that their digital memory is available in the long term, and we work hard to ensure they are accessible as well as authoritative.  This new report on Preserving Email will be particularly relevant to a wide readership so it’s a great way to kick off our tenth anniversary year.’

The report is online at: http://dx.doi.org/10.7207/twr11-01

Notes for editors

1.    Preserving Email (DPC Technology Watch Report 11-01, ISSN 2048-7916, Digital Preservation Coalition 2011) was written by Chris Prom of the University of Illinois.  It is published electronically as a PDF and is now free to download from the DPC website at: http://www.dpconline.org / … It was previously available as a preview to DPC members only from December 2011.
2.    Chris Prom is the Assistant University Archivist at the University of Illinois, Urbana USA.  During 2009–10, as part of his Fulbright Distinguished Scholar Award, Prom directed a research project at the Centre for Archive and Information Studies at the University of Dundee, Scotland, on ‘Practical Approaches to Identifying, Preserving, and Providing Access to Electronic Records’. This included a major focus on the preservation of email.
3.    The report is published by the DPC in association with Charles Beagrie Ltd. Neil Beagrie, Director of Consultancy at Charles Beagrie Ltd, was commissioned to act as principal investigator for and managing editor of this Series in 2011. He has been further supported by an Editorial Board drawn from DPC members and peer reviewers who comment on texts prior to release.
4.    The Digital Preservation Coalition (DPC) is an advocate and catalyst for digital preservation, enabling our members to deliver resilient long-term access to content and services, and helping them derive enduring value from digital collections.  We raise awareness of the importance of the preservation of digital material and the attendant strategic, cultural and technological issues. We are a not-for-profit membership organisation and we support our members through knowledge exchange, capacity building, assurance, advocacy and partnership.  Our vision is to make our digital memory accessible tomorrow. For more information about the DPC see: http://www.dpconline.org/
5.    The Technology Watch Report series was established in 2002 and has been one of the Coalition’s most enduring contributions to the wider digital preservation community.  The reports exist to provide authoritative support and foresight to those engaged with digital preservation or having to tackle digital preservation problems for the first time. These publications support members’ workforces; they identify, disseminate and discuss best practice; and they lower the barriers to participation in digital preservation. Each ‘Technology Watch Report’ analyses a particular topic pertinent to digital preservation and presents an evaluation of workable solutions, a review of the potential of emerging solutions, and posits solutions that might be appropriate for different contexts.  The reports are written by leaders-in-the-field and are peer-reviewed prior to publication.
6.    Future reports in the series include:
·    Preserving Moving Picture and Sound, Richard Wright (BBC Research and Development)
·    Digital Forensics for Preservation, Jeremy Leighton-John (British Library)
·    Intellectual Property Rights for Digital Preservation, Andrew Charlesworth (Bristol University)
·    Preservation, Trust and E-Journals, Neil Beagrie (Charles Beagrie Ltd)

###

Merger of KB (Dutch National Library) and the Dutch National Archives

I heard recently from a colleague  that the Dutch Ministry of Education, Culture and Sciences has announced that the Dutch National Archives are to merge with the Dutch National Library (KB). Both institutions have been major players in digital preservation and in development of appropriate systems and practices for preserving digital records and publications respectively. The Dutch National Archives has implemented Tessella’s Safety Deposit Box whereas the KB has been working with IBM’s DIAS system since 2003 and is now developing a new architecture to replace it. The merger follows on from a series of mergers of national libraries and archives including Canada and more recently New Zealand.

Data Storage: Top Five Trends for 2012 from IBM (Data Preservation and Data Curation are up there!)

eWeek has a very interesting presentation, Data Storage: IBM and Storage: Top Five Trends for 2012, from Steve Wojtowecz, vice president of storage software development at IBM. Wojtowecz outlined five storage trends that will emerge in 2012: Data Preservation, Data Curation, Storage Analytics, Mass storage in Entertainment and Healthcare industries and Data Records Management (“Data Hoarders”). These are all major topics of interest to this blog, with data preservation, data curation and even digital lives getting a mention. The article suggests “As storage becomes a key business driver in 2012, IBM officials said the industry will see new breakthroughs in storage research and business models coming from sectors such as entertainment and health care”. Worth a look.

Preserving Email report marks launch of new DPC Tech Watch Series

I am delighted to announce that the first of the new Technology Watch Reports Series that Charles Beagrie Ltd has been producing for the Digital Preservation Coalition has just been released as a preview to DPC members. It is also the first report in the new series design and report format. I’m just back from the winter solstice at Stonehenge this morning, so I feel we have done all the rites and it is an auspicious time to launch the first report in the series!

The report ‘Preserving Email’ is authored by Chris Prom, Assistant University Archivist at the University of Illinois. During 2009-10, as part of the Fulbright Distinguished Scholar Award, Chris directed a research project at the Centre for Archive and Information Studies at the University of Dundee concerning “Practical Approaches to Identifying, Preserving, and Providing Access to Electronic Records”. This included a major focus on preservation of email.

Neil Beagrie, Director of Consultancy at Charles Beagrie Ltd, was commissioned to act as principal investigator and managing editor of the new series in 2011.  The managing editor has been further supported by an Editorial Board drawn from DPC members and peer reviewers who have commented on the text prior to release.  The Editorial Board comprises William Kilbride (Chair), Neil Beagrie (Series Editor), Janet Delve (University of Portsmouth), Tim Keefe (Trinity College Dublin), Andrew McHugh (University of Glasgow), Dave Thompson (Wellcome Library).

The full text of the report is available now to DPC members (from the DPC member web pages – accessible by DPC members only) but it is expected to have its wider public release on the DPC website in late January or early February 2012. The URL for the public version will be announced on release.

JISC Collections conclude a UK consortium membership rate for the Portico archive

I missed a significant announcement back in the summer holidays so I will blog a (belated) update this month. In July JISC Collections and Portico concluded a UK consortium membership rate for the Portico archive. This builds on a previous consortium agreement with Portico for Scottish university libraries in SHEDL.

The JISC agreement offers a substantial discount to consortium members on the individual library rate that would apply. As of 12 October 2011, 57 UK university libraries have joined the Portico e-journal archive (20 of these are since July 2011 and as part of the new JISC Collections agreement). Take-up of the e-books archive has been much lower so far: only the SHEDL libraries (where e-journal and e-book archive membership was bundled) and 5 out of the 20 new UK library members have taken this option.

Membership of Portico acts as an “insurance policy” should post-cancellation access arrangements with a participating publisher fail and provides a fully out-sourced service. The Portico archive service is discussed in more detail in a JISC Collections Guide to e-Archiving Solutions.

New Projects for 2011-2013

It is a busy time of year with very little time to update the blog but a short update on current and future projects for 2011-2013 may be of interest:

Economic Evaluation of Research Data Infrastructure – a study for the Economic and Social Research Council in the UK. This is being conducted jointly by Charles Beagrie Ltd with Prof John Houghton of the Centre for Strategic Economic Studies at Victoria University and is looking at the economic impact of the Economic and Social Data Service in the UK. Such studies on the impact of research data services are rare and we have the opportunity to test some experimental approaches. Already we have interesting data and I think this is going to be a very significant study. We are about half-way through, having started mid-July 2011, and will finish in January 2012.

Smart Research Framework (SRF) and Biomedical Research Infrastructure Software Service kit (BRISSkit). We are junior partners in two of the four Research Data Management projects in the JISC University Modernisation Fund shared services programme. In both we are supporting their work on developing cost/benefit and return on investment cases. Both are great projects so I would encourage you to take a look. They will complete in the first half of next year.

Research 360 – just starting up at the University of Bath and will run until March 2013. The Project addresses the long-tail of high quality small science characterised by applied research and faculty-industry partnerships. We will contribute to building on and applying the I2S2/KRDS Benefits Toolkit with a focus on faculty research data drivers for the Research Excellence Framework (REF).

DPC Technology Watch Series – work is also progressing on the five titles in the new DPC Technology Watch Series. I’m really enjoying working as series editor with William Kilbride at the DPC and the authors, and keeping up to date on cutting-edge developments. Look out for the first release in the New Year (or from December if you are a DPC member).


KRDS Digital Preservation Benefits Analysis Toolkit and KRDS Updates now available

The KRDS-I2S2 Digital Preservation Benefits Analysis Project is pleased to announce the release of the KRDS Digital Preservation Benefits Analysis Toolkit. Development of the toolkit has been funded by JISC. The worksheets, guidance documentation and exemplar test cases can be downloaded from the project website.

The Toolkit consists of two tools: the KRDS Benefits Framework (Tool 1); and the Value-chain and Benefits Impact tool (Tool 2). Each tool consists of a detailed guide and worksheet(s). Both tools have drawn on partner case studies and previous work on benefits and impact for digital curation/preservation. This experience has provided a series of common examples of generic benefits that are employed in both tools for users to modify or add to as required.

The KRDS Benefits Framework (Tool 1) is the “entry-level” tool requiring less experience and effort to implement and can be used as a stand-alone tool in many tasks. It can also be the starting point and provide input to the use of the Value-chain and Impact analysis.

The Value-chain and Benefits Impact analysis (Tool 2) is the more advanced tool in the Toolkit and requires more experience and effort to implement. It is likely to be most useful in a smaller sub-set of longer-term and intensive activities such as evaluation and strategic planning.

The combined Toolkit provides a very flexible set of tools, worksheets, and lists of examples of generic benefits and potential metrics. These are available for use in different combinations appropriate to needs and level of expertise.

Guides for the toolkit and each individual tool and case studies of completed examples of the worksheets provide documentation and support for your own implementation.

In addition we have updated the KRDS Factsheet (new version 2 July 2011) and the KRDS User Guide (new version 2 July 2011) on the KRDS web site. The benefits toolkit is also linked from there. For future reference please bookmark the KRDS web site as all the latest KRDS tools and materials and updates are/will be accessible from that access point.

Report and Presentations from the JISC Digital Curation/Preservation Benefits Tools Project Dissemination Workshop

There was a very successful end of project dissemination workshop and lively discussion last week on implementing the toolkit with funders and other attendees. A full report of the workshop and links to the presentations are provided below. The Benefits Analysis Toolkit will be released on 31 July from the project web site and the KRDS web site.

Tools Background

This is a six month project funded by JISC, testing, developing and documenting a toolkit consisting of two evolving tools, the KRDS Benefits Framework and the Value Chain and Benefits Impact tool. The Benefits Framework is the entry level tool and the Value Chain and Benefits Impact tool is more advanced with a narrower range of applicable activities. Any benefit from digital curation should fit within the Framework and can be reworded and adapted to fit with the local application. From the funders’ perspective the easily tailored benefits offer a consistent and powerful way of stimulating thinking. The toolkit’s official release date is July 31st.

Welcome and Project Background (Liz Lyon UKOLN) [Presentation]

The Toolset (Neil Beagrie, Charles Beagrie Ltd) [Presentation]

Case Studies

Dipak Kalra (Centre for Health Informatics and Multi-professional Education (CHIME) at UCL)

The toolkit was used in an MRC data support service investigation to understand how data sharing takes place. He presented results via a ‘virtual study’ that took all six studies into account to be more comprehensive. Generic benefits were taken from the tool and given a localised expression etc. He summarised that the tool should work for these kinds of studies though some parts are more applicable. Working through a toolkit could be of value for studies and particularly useful for putting forward a case for funding or prioritising resource utilisation within a study. Completing the spreadsheet and working out weightings might be nicely undertaken in a team workshop.

[Presentation]

Catherine Hardman (Archaeology Data Service)

In this case the toolkit was used from the point of view of a repository (more a macro level than micro), for looking in particular at issues of cost in the lifecycle. Archives often have to help justify costs/effort associated with digital preservation even if they are well established. This can be used to address a range of audiences and with different levels of complexity: in individual projects or within project teams to boost cases for support. The value chain can help with identifying different values for different audiences. Quantification of impact can help in a number of ways: in research bid terms it helps justify resources; in archive preparation terms it helps with selection and retention decisions. The tool can be used as a light touch to help persuade stakeholders of benefits or for deeper insight into project planning decisions.

[Presentation]

Monica Duke (SageCite Project)

Here the tools were used to assess the benefits of data citation, an undertaking with a project perspective based on an organisation whose main business is science. It showed direct benefits as well as indirect ones such as better discovery of network models and better access. The Benefits Framework was easy to apply and helped to articulate benefits, although an intermediary may be required to facilitate the process.

[Presentation]

Matthew Woollard (UK Data Archive (UKDA) at University of Essex)

The tools were put into practice at the UK Data Archive and used to emphasise benefits to stakeholders. They helped to prioritise internal activities, justify costs to stakeholders and give an understanding of the service impact. They showed where value added is needed, where value is added, and who can benefit and when. The framework for activities seems to be where it will be of most use. It is important to note that generic benefits may have impacts on more than one stakeholder.

[Presentation]

Discussion

Q: What is the ongoing support for the tool?

  • It will be present on the project website with a persistent web archive copy. There is a commitment to make it continuously available and it may be updated in future in light of future projects and applications. There is extensive documentation and if the need were felt for more support there is the potential for consultancy and assistance from Charles Beagrie Ltd as required.

Q: Do you see it incorporated within an outline data management plan?

  • I can definitely see an advantage in the benefits framework. You can also use the value chain in a data heavy project, possibly when sitting down as a project group.

Q: Are we going to get too many statements on value, many of which are blatantly obvious rather than generically just true? If expressions are generic it would be better to cut them out.

  • The tool is for focussing the mind and the generic examples are only a starting point for what should be customised specific statements of value. In terms of presentation in the user guide we present an alternative version of the completed Framework with more specific examples and level of detail for the benefits. The user should have the ability to select those points of greatest impact for specific stakeholders and develop them, i.e. not presenting a generic benefit.
  • We are interested in ensuring researchers can do research and explaining value to the government and other stakeholders. If you’re promoting data sharing benefits then also promoting them to an internal audience is important and powerful for motivation.
  • Research Councils can be deluged with metrics: it is better to have a few, simple and powerfully chosen. Case studies are incredibly powerful though not sufficient on their own. You spend little time discussing them compared to the time taken to create them. A case study should actually illustrate a metric.
  • The Benefits Framework looks helpful in learning and preparation; in evaluation it should then be less necessary.
  • Part of what we are doing is upping the game with studies and funders. Funders will need to respond proactively. There is a need for a more forceful tool, such as a planning tool allowing the user to take up to three actions and work up a two or three year action plan, but it is premature to deliver it now. This is a possible direction of travel in the future.

Q: Homogenisation? What does it mean for funders when there is a long checklist of benefits? If established where will it lead us? Funders will have to look closely and make sure they are used carefully to draw out where we want benefits to accrue. Who is this making life easier for?

  • Hopefully for researchers. They often have a box to fill in anyway but generally it is not well structured. The framework is fast to use, not a heavyweight commitment. We would hope it could make filling in of a benefits statement richer without much extra work as it provides a more advanced starting point for brainstorming the benefits.
  • This has to be looked at against administrative burden. If it enables the user to identify and realise benefits they otherwise would not have realised then it has an advantage. If it isn’t repaid by better realisation of benefits then it deserves to fail.
  • The process of using the tool can have valuable results in itself.
  • Many benefits feed into other benefits. The funder should ask for requirements, it’s not necessary to show everything. This is a platform that everyone can use in a way that benefits them.
  • There is value in prioritisation and communication. If we can work with researchers to highlight the key impacts of what they’re doing then the tool when simply done is of real value.

Q: We are talking about potential benefits: they haven’t actually been achieved. I can see the theoretical value but am worried these benefits could be three or four years ahead of what we can actually achieve.

  • The time element of the framework does bring that in. We’re trying to think not just of the long term but how to get there and any benefits along the way. The element is there but you must have some degree of caution with how you apply this tool in the same way as any other.

Q: You mentioned OCLC is a partner in the project. Are they involved because of their interest in cataloguing and metadata?

  • Our partner is the research division within OCLC which has a broad range of interests within digital library research. Brian Lavoie who has had an important input to this project and KRDS is a research scientist there. As an economist he has taken a close interest in the economics and benefits of digital preservation and this has been an important theme within OCLC Research – hence their interest and active participation in the project.

Q: I’m not sure who’s going to use the tool. What audience are you promoting it to? Will it mean a generic standard will be adopted?

  • Different sectors or disciplines are different. There aren’t homogenous states so there will always be bespoke relevant next steps in working from the tools.
  • Examples of all the common benefits listed are not seen in every project. There should always be a degree of selection so you wouldn’t end up with homogenous benefits in every case. Generic benefits should also be customised and expressed in ways specific to a particular project or service.
  • What you may have to do is demonstrate your benefits to a wider audience not just a primary beneficiary so sometimes the wider list of benefits is also helpful. It is good to put in front of workers to demonstrate why things are done in a certain way. The audience is not as wide as we would like it to be but the value to those who can use it is great.
  • The case studies presented are tailored towards particular projects/services but it can be adapted without too much additional tailoring. There may be elements which need to be tweaked.
  • I think there is value beyond the digital preservation community particularly for the Framework and maybe other versions could be needed tailored to those other audiences.
  • Arguments for further funding are always made on the science; informatics communities to some extent are disenfranchised. Anything we can do that supports honesty and helps to get discussion going within studies by linking benefits to science and data management must be good. If the Framework can accommodate different perspectives of benefits and allows them to be joined up in the story then we should try it out with more people.
  • There can be reciprocal benefits or benefits with clear knock on effects to each other. Actions may give benefits to the user, which give benefits to the creator.
  • Could there be eventual development of a web/matrix of benefits? Not one-to-one or even two-way but a network with flow going around it.

Q: Good ideas, unless heavily marketed, don’t take off. Even if there is a benefit to a tool, it won’t be realised unless people know to use it. Are there steps funders would advise to encourage researchers and services to be proactive in seizing benefits and using the tools?

  • Will people be persuaded to use the tool to compete? You only compete if a competition is created.
  • Once certain good policies are floating around everyone uses them to tick the box whether they are applicable or not.

Q: Will presentations from the workshop be available later?

  • Yes, we intend to make them available later, along with a short write-up of the day and key areas of discussion.

 
