Scholarly Communication

Datanomics: the value of research data


Twenty years ago format obsolescence was seen as the greatest long-term threat to digital information.  Arguably, experience to date has shown that funding and organisational challenges are perhaps more significant threats. I hope this presentation helps those grappling with these challenges and shows some key advances in how to use knowledge of costs, benefits and value to support long-term sustainability of digital data and services.

These are the slides from my keynote presentation to the joint Digital Preservation Coalition / Jisc workshop on Digital Assets and Digital Liabilities – the Value of Data held in Glasgow in February 2018. The slides summarise work over the last decade in the key areas of exploring costs, benefits and value for data. The slides posted here have additional slide notes and references to new publications since the workshop and some modifications such as removal of animations. One day I hope to have time to synthesise this presentation in an accessible way as a more extensive article, but I hope this slide deck on Slideshare at https://www.slideshare.net/Nbeagrie is a useful interim resource.

Datanomics

New “What To Keep” research data report published by Jisc

What to Keep

“What To Keep?”, a new research data report by Charles Beagrie Ltd, has just been published by Jisc. You can access the full report directly at: https://repository.jisc.ac.uk/7262/

What to keep in terms of research data has been a recognised issue for some time, but research data management, and in particular appraisal and selection (i.e. “what to keep and why”), has become a more significant focus in recent years as volumes and diversity of data have grown, and as the available infrastructure for ‘keeping’ has become more diverse.

The purpose of the What to Keep report is to provide new insights that will be useful to institutions, research funders, researchers, publishers, and Jisc on what research data to keep and why, the current position, and suggestions for improvement.

The analysis of emerging themes and mappings is available as a set of tables. Seven mini case studies illustrate in more detail the approaches and rationale for what to keep for different repositories, stakeholders and disciplinary areas.

The report provides insights on how “what to keep” decisions can be guided and supported, and the ten study recommendations, together with potential implementations for them, provide practical suggestions for future development.

What to Keep Recommendations

European Open Science Cloud


Charles Beagrie Ltd have been providing additional expert resource in Open Science and Open Scholarship to Jisc, a partner in the EOSCpilot project funded by the EC’s Horizon 2020 Research & Innovation programme. The EOSC – European Open Science Cloud – aims to create a trusted environment for hosting and processing research data to support EU science.

We helped to support the finalisation of draft policy recommendations aimed at encouraging implementation and take-up of the EOSC. This involved supporting consultation on the draft policy recommendations, and helping to prioritise and develop them in more detail, to produce a coherent policy proposition.

We look forward to seeing the final public recommendations and future development of EOSC.

Research Data: What to Keep?

Charles Beagrie Ltd has started a new research data study for Jisc and UK institutions.

Jisc is working to develop shared infrastructure, influence policy and provide guidance to support institutions with the growing need for robust research data management. There is a wide range of needs and existing provision for creation, collection, storage and preservation, and reuse of data within UK Higher Education.

What research data should be kept?

Researchers, data curators and policy makers all need to answer the question: what research data should be kept? We can’t keep it all, because that would be too expensive and time-consuming. However, we have to keep data that is irreplaceable and unique in its value for future research; to enable it to be reused and validated; to enable peer review to be informed; and to enable there to be trust in research findings. Types of data needing to be retained vary and may include related materials such as software and documentation. But how much and what is enough? Obviously, there is no single answer to that: it depends on many factors, but what are those factors, and how should we weight them? These remain difficult and open questions, but this year Jisc is working with us to take a step toward answering them.

How can we identify what to keep?

We are setting out to explore what data is actually optimal to keep from research projects conducted at UK institutions. Over the course of the rest of 2018, our project will work with a small number of research areas to find out. What conditions, such as openness or timescales, might be ideal? We will seek the views of researchers (as data creators and data users), research funders, ethics professionals, archivists, research data managers, peer reviewers, other research users, and others on these questions. We will dig into the reasons for their views, and into whether research data is currently kept in line with those views, or not.

Why are we carrying out this investigation now?

This work comes at a critical time in the evolution of research data management and sharing. At the policy level, the recommendations from the UK Open Research Data Taskforce are expected shortly. These may take into account both the recommendations to Government of the 2017 report by Dame Wendy Hall and Jérôme Pesenti into the future of the UK artificial intelligence industry and the recent Government announcements around this, where research data can be a key input into AI tools. The availability of research data is also a matter of concern to those interested in research integrity and reproducibility. Relevant infrastructure investments include both the Jisc research data shared service and the increasing activity around the European Open Science Cloud.

Both policy and infrastructure investments need better information about the extent and nature of the research data that needs to be kept, under what conditions, and for how long. Our 2018 project will not provide all this information, but it will explore current practices and take the next step.

Digital Preservation Handbook Update February 2016

Originally published in 2001 as a paper edition, ‘Preservation and Management of Digital Materials: a Handbook’ was the first attempt in the UK to synthesise the diverse and burgeoning sources of advice on digital preservation.  Demand was so great that in 2002, a free online edition of the Handbook was published by the newly established Digital Preservation Coalition.

After more than a decade, in which digital preservation has been transformed, the Handbook remains among the most heavily used areas of the DPC website.

Funders and organisations are collaborating on re-designing, expanding and updating the Handbook so it can continue to grow as a major open-access resource for digital preservation. The DPC and Charles Beagrie Ltd have been engaged on a major re-working of the Digital Preservation Handbook for release as a new edition over 2015/2016. The National Archives (our Gold Sponsor) working together with other stakeholders including Jisc, the British Library, and The Archives and Records Association (our Silver Sponsors), and the National Records of Scotland (our Bronze Sponsor) is supporting the Digital Preservation Coalition in updating and revamping the Handbook. Many individuals and organisations are also contributing to this work through book sprints, peer review, project and advisory boards.

The revision, guided by user feedback and consultation (see Report on the Preparatory User Consultation on the 2nd Edition of the Digital Preservation Handbook), is modular and being undertaken over a two-year period to March 2016.

We have provided updates at regular intervals to inform the community on progress with the project and with this final February update we are delighted to announce a number of key developments.

 

Publication Schedule

The 2nd edition of the Handbook had a partial “soft launch” in October 2015 and approximately two-thirds is online and publicly accessible at http://www.dpconline.org/advice/preservationhandbook

This partial release will be further enhanced by additional functionality when a new platform for the website focused on ‘responsive design’ is brought on stream by the DPC in 2016. This will provide an updated design and improved user experience on mobile and tablet devices, compared to the current site templates that are optimised for viewing on a desktop screen. We will also add the facility to generate PDFs. In the interim some functionality and content will remain “works in progress” but the community have gained early access to a significant new resource.

The remaining 14 sections to complete the Handbook have now been written and edited and are in peer review (see Handbook contents page for coming soon sections). We are aiming to complete this work and revise content for publication by the end of March 2016. The Handbook is now live, so we will need to close and update section by section for these 14 remaining updates, hopefully in the final week of March and/or early April 2016. Watch this space for future announcements!

NRS joins funding group

The Digital Preservation Coalition was delighted to announce this month that The National Records of Scotland (NRS) had come on board as a ‘Bronze Sponsor’ for the eagerly anticipated second edition of the ‘Digital Preservation Handbook’. As of February 2016, with the addition of the NRS we have raised 93% of estimated funding required for the Handbook revision. We have prioritised content creation, scaled back some events, and adjusted budgets to ensure completion within a very tight funding profile.

Slideshare from Handbook Workshop at DCDC15

A workshop on the Digital Preservation Handbook was run at the DCDC15 conference in early October. PowerPoint slides from the Handbook presentation are now available on Slideshare. They provide a detailed overview of the new edition Handbook and work in progress. To date, there have been over 2,000 views of the slides.

New report: The Value and Impact of the European Bioinformatics Institute

We are pleased to announce a new report: The Value and Impact of the European Bioinformatics Institute.

In 2015, Charles Beagrie Ltd was commissioned by the European Bioinformatics Institute (EMBL-EBI) to study and analyse its economic and social impact.

EMBL-EBI, located on the Wellcome Genome Campus in Hinxton, near Cambridge in the UK, manages public life science data on a very large scale, making a rich resource of genome information freely available to the global life science community.

The full report published today presents the results of the quantitative and qualitative study of the Institute, examining the value and impact of its work. The report highlights key findings, including that EMBL-EBI data and services made commercial and academic R&D significantly more efficient. This benefit to users and their funders is estimated, at a minimum, to be worth £1 billion per annum worldwide – equivalent to more than 20 times the direct operational cost of EMBL-EBI.

A press release with further information is available on the EMBL-EBI website at http://www.ebi.ac.uk/about/news/press-releases/value-and-impact-of-the-european-bioinformatics-institute

The Full Report is available online in printable format at http://www.beagrie.com/EBI-impact-report.pdf

A short Executive Summary version of the report is available online in printable format at http://www.beagrie.com/EBI-impact-summary.pdf

12 slideshares for Xmas: 20 years in digital preservation

I have just posted the final instalment of a personal selection of 12 presentations drawn from events and topics over the last 20 years in digital preservation, which I hope will be of interest.

They are taken from events on four different continents including the first iPres conference and cover themes such as personal archiving, research data management, e-journals, the digital preservation lifecycle model, national and institutional strategies and collaboration, costs/benefit/economic impacts of digital preservation, the establishment of the Digital Preservation Coalition, and the development of the online Digital Preservation Handbook. I hope there will be something in there for everyone.

There are accompanying blog narratives which set the presentations into context, and the PowerPoint presentations themselves are on Slideshare. Details and web links to them are as follows:

2014 – The Value and Impact of Research Data Infrastructure (economic impact), presentation to the Preservation and Archiving Special Interest Group (PASIG), Karlsruhe Germany (slides, narrative)

2013 – Maintaining a Vision: how mandates and strategies are changing with digital content (changes and responses), keynote presentation to Screening the Future conference, London UK (slides, narrative)

2010 – Keeping Research Data Safe (digital preservation costs and benefits), presentation to KB Experts Workshop on Digital Preservation Costs, The Hague Netherlands (slides, narrative)

2007 – Digital Preservation: Setting the Course for a Decade of Change (evolution or revolution?), keynote presentation to the Belgian Association for Documentation (ABD-BVD), Brussels Belgium (slides, narrative)

2005 – Digital Preservation and Curation Summing up + Next Steps (setting curation and research agenda for 2005-2015), conclusions to Warwick II Workshop, Warwick UK (slides, narrative)

2005 – Plenty of Room at the Bottom? Personal Digital Libraries and Collections, keynote presentation to European Conference on Research and Advanced Technology for Digital Libraries (ECDL), Vienna Austria (slides, narrative)

2004 – eScience and Digital Preservation, presentation to Association for Information Science and Technology (ASIST) conference, Rhode Island USA (slides, narrative)

2004 – The JISC Continuing Access and Digital Preservation Strategy 2002-5 (covering UK Higher Education sector and partners), presentation to the JISC-CNI conference, Brighton UK (slides, narrative)

2004 – Digital Preservation, e-journals and e-prints, presentation at private workshop, 1st iPres conference, Beijing China (slides, narrative)

2004 – The Digital Preservation Coalition (DPC), Its History, Programme, Rationale, and Structure, set of 4 linked presentations to DPC Forum, London UK (slides, narrative)

2001 – Preservation Management of Digital Materials (the Digital Preservation Handbook), presentation to Digital Preservation Workshop/State Library, Melbourne Australia (slides, narrative)

1998 – Preserving Digital Collections: current methods and research (digital preservation lifecycle model), presentation to the Society of Archivists annual conference, Sheffield UK (slides, narrative)

This is a baker’s dozen as there is also a bonus presentation from 2015 on Slideshare covering the latest work on The Digital Preservation Handbook (new edition for full release in March 2016).

The background and narrative blog for this personal selection of presentations is also available.

Breaking News: Digital Preservation Handbook Update October 2015

Originally published in 2001 as a paper edition, ‘Preservation and Management of Digital Materials: a Handbook’ was the first attempt in the UK to synthesise the diverse and burgeoning sources of advice on digital preservation. Demand was so great that in 2002, a free online edition of the Handbook was published by the newly established Digital Preservation Coalition.

After more than a decade, in which digital preservation has been transformed, the Handbook remains among the most heavily used areas of the DPC website.

Funders and organisations are collaborating on re-designing, expanding and updating the Handbook so it can continue to grow as a major open-access resource for digital preservation. The DPC and Charles Beagrie Ltd have been engaged on a major re-working of the Digital Preservation Handbook for release as a new edition over 2015/2016. The National Archives (our Gold Sponsor) working together with other stakeholders including Jisc, the British Library, and The Archives and Records Association (our Bronze sponsors), is supporting the Digital Preservation Coalition in updating and revamping the Handbook. Many individuals and organisations are also contributing to this work through book sprints, peer review, project and advisory boards.

The revision, guided by user feedback and consultation (see Report on the Preparatory User Consultation on the 2nd Edition of the Digital Preservation Handbook), is modular and being undertaken over a two-year period to March 2016.

We have provided updates at regular intervals to inform the community on progress with the project and with this October update we are delighted to announce a number of key developments.

Publication Schedule

We are pleased to share the news that a critical mass of content has been prepared and peer reviewed and the project board has agreed we should release a majority of the Handbook.  DPC members have already seen the emerging revised 2nd Edition of the Handbook on the members’ private area and this has been switched to the public side of the DPC website. This partial release will be further enhanced by additional functionality when a new platform for the website focused on ‘responsive design’ is brought on stream by the DPC early in 2016. This will provide an updated design and improved user experience on mobile and tablet devices, compared to the current site templates that are optimised for viewing on a desktop screen. We will also add the facility to generate PDFs. We hope to complete remaining sections of the Handbook for a formal full publication release of the Handbook by March 2016. In the interim some functionality and content will remain “works in progress” but the community will gain early access to a significant new resource.

ARA joins funding group

The Digital Preservation Coalition was delighted to announce in September that The Archives and Records Association (ARA) had come on board as a ‘Bronze Sponsor’ for the eagerly anticipated second edition of the ‘Digital Preservation Handbook’. As of Oct 2015, with the addition of the ARA we have raised 87% of estimated funding required for the Handbook revision and continue working to complete it.

Section Illustrations and icons

We are using graphics available from digitalbevaring.dk (http://digitalbevaring.dk/about-us/) for main sections of the Handbook. They have kindly worked in collaboration with us to develop new illustrations when we have identified topics in the Handbook requiring new graphics for illustrations or icons.

New resources icon designs were received over the summer from digitalbevaring.dk  and the interim versions have been replaced in the Handbook. These are the new set:            

 

They are embedded now in all the Resources and Case Studies sections of the Handbook. It means there is now a consistent style to the Handbook with the icons and section heading illustrations sharing the same design, something we all felt was desirable. We are very pleased with the results and overall look that is now in place, and with the collaboration with digitalbevaring.dk that has added a lot to the visual appeal of the Handbook.

Multi-media

Multi-media resources where relevant have been selected and embedded in the Handbook. Selection has focussed on short, high-quality videos that can add significant value to experience and content.

Handbook Workshop at DCDC15

A workshop on the Digital Preservation Handbook was run at the DCDC15 conference in early October. PowerPoint slides from the Handbook presentation are now available on Slideshare. They provide a detailed overview of the new edition Handbook and work in progress.

 

20 years in DP: eScience and Digital Preservation 2004

eScience and Digital Preservation, a presentation to the Association for Information Science and Technology (ASIST) conference in November 2004, Rhode Island USA, is available now on Slideshare. It is the sixth of 12 presentations I’ve selected to mark 20 years in Digital Preservation. The remainder will be published at monthly intervals over 2015.

It is closely related to the previous slideshare for May on the Jisc continuing access and digital preservation strategy but focuses just on the science component.

This is one I wasn’t able to present in person but it was kindly delivered by Gail Hodge.

My brief for the presentation was “thoughts or citations you have for the impact of e-science, particularly the GRID, on information management, particularly archiving, preservation and long-term access.”

It is a short presentation of 15 slides covering collection-based science, the Grid, data publishing, and the background and rationale for the Digital Curation Centre (just launched two weeks before in the UK).

It is a snapshot in time and of key issues in 2004 – interesting to contrast with what one would write 10 years on and ponder on progress made.

Reflections on the 2nd Digital Preservation Handbook Book Sprint 18-19 May 2015

Another rewarding but exhausting couple of days! We completed a two-day book sprint in Kew earlier this week focussing on developing more new content for the release of the next edition of the Digital Preservation Handbook that is being funded by The National Archives, the British Library, and Jisc. Really pleased with the outputs and progress we made.

This is now the second book sprint we have held and we have been able to build on the sterling work at the first sprint held in October last year.

A group of 9 people, Neil Beagrie (Charles Beagrie Ltd), Glenn Cumisky (British Museum), Matt Faber (Jisc), Stephen Grace (University of East London), Alex Green (The National Archives), William Kilbride (DPC), Gareth Knight (London School of Hygiene & Tropical Medicine), Sharon McMeekin (DPC), and Paul Wheatley (DPC), met up over two days to progress sections of the content for the new “Getting Started” and “Organisational Activities” sections of the Handbook (as identified in the Draft Outline of the 2nd Edition of the Digital Preservation Handbook). We also progressed some sub-sections of “Technical Solutions and Tools” left over from Book Sprint 1. The venue for the sprint was kindly provided by The National Archives in their Kew building.

We completed draft sections for:

Getting Started

Creating digital materials

Acquisition and appraisal

Retention and review

Preservation

Metadata and documentation

Access

Information Security

Persistent Identifiers

We covered more topics than the first sprint so were occasionally thinly spread. As a cautionary note, we may need to review our draft content carefully to ensure the final outputs have the breadth and depth of perspective we aim for. What I have read so far has been terrific, although inevitably it will need some more content adding and final polishing.

The revision has been guided by the user feedback and consultation (see Report on the Preparatory User Consultation on the 2nd Edition of the Digital Preservation Handbook): in short, to keep the Handbook text practical, concise, and accessible, with more detail available in the case studies and further reading.

We used a different tool from book sprint 1 and successfully adopted Google Docs for our collaborative writing.

A two-day book sprint was very intense but few could have spared more time away from the workplace, and a tight deadline helped everyone focus on the tasks in hand.

We followed a process of scoping contents for a specific section, brainstorming key points for inclusion, writing, and then review.

Participants were also able to see the substantial emerging Handbook content that is already in the DPC content management system together with the excellent illustrations re-used with permission from digitalbevaring.dk. In addition, Google Docs was pre-populated with any relevant text from the previous Handbook, marked in red so it was easily identifiable for review, retention, deletion, amendment or addition/replacement as needed. The Google Docs were also pre-populated with all case studies and external resources relevant to those sections identified during desk research for the new edition of the Handbook.

The after-work drinks in the Tap on the Line and group dinner at Café Mamma were enjoyed by all and allowed everyone to relax and socialise outside the event itself. Next time I will try to remember to take photos for the report!

In June the draft text will be the focus for detailed editorial review, additions, arrangement, proof-reading and input to the DPC content management system. Based on the 1st book sprint, that will be at least a two-month process, after which we will look for peer review to be completed by around the end of September.

It is great to see so much more of the new Handbook there in preliminary form after the sprint. With the contents of the first sprint, supplementary work, and its peer review, there is now substantial draft content emerging for the 2nd edition of the Handbook.
