Universities

Datanomics: the value of research data

Twenty years ago, format obsolescence was seen as the greatest long-term threat to digital information. Experience to date has arguably shown that funding and organisational challenges are more significant threats. I hope this presentation helps those grappling with these challenges and shows some key advances in how to use knowledge of costs, benefits and value to support the long-term sustainability of digital data and services.

These are the slides from my keynote presentation to the joint Digital Preservation Coalition / Jisc workshop on Digital Assets and Digital Liabilities – the Value of Data, held in Glasgow in February 2018. The slides summarise work over the last decade in the key areas of exploring costs, benefits and value for data. The slides posted here have additional slide notes and references to new publications since the workshop, and some modifications such as the removal of animations. One day I hope to have time to synthesise this presentation in an accessible way as a more extensive article, but I hope this slide deck on SlideShare at https://www.slideshare.net/Nbeagrie is a useful interim resource.

New “What To Keep” research data report published by Jisc

“What To Keep?”, a new research data report by Charles Beagrie Ltd, has just been published by Jisc. You can access the full report directly at: https://repository.jisc.ac.uk/7262/

What to keep in terms of research data has been a recognised issue for some time. However, research data management, and in particular appraisal and selection (i.e. “what to keep and why”), has become a more significant focus in recent years as volumes and diversity of data have grown, and as the available infrastructure for ‘keeping’ has become more diverse.

The purpose of the What to Keep report is to provide new insights that will be useful to institutions, research funders, researchers, publishers, and Jisc on what research data to keep and why, the current position, and suggestions for improvement.

The analysis of emerging themes and mappings is available as a set of tables. Seven mini case studies illustrate in more detail the approaches and rationale for what to keep for different repositories, stakeholders and disciplinary areas.

The report provides insights on how “what to keep” decisions can be guided and supported. The ten study recommendations, and the potential implementations for them, provide practical suggestions for future development.

What to Keep Recommendations

European Open Science Cloud

Charles Beagrie Ltd have been providing additional expert resource in Open Science and Open Scholarship to Jisc, a partner in the EOSCpilot project funded by the EC’s Horizon 2020 Research & Innovation programme. The EOSC – European Open Science Cloud – aims to create a trusted environment for hosting and processing research data to support EU science.

We helped to support the finalisation of draft policy recommendations aimed at encouraging implementation and take-up of the EOSC. This involved supporting consultation on the draft policy recommendations, and helping to prioritise and develop them in more detail, to produce a coherent policy proposition.

We look forward to seeing the final public recommendations and future development of EOSC.

Research Data: What to Keep?

Charles Beagrie Ltd has started a new research data study for Jisc and UK institutions.

Jisc is working to develop shared infrastructure, influence policy and provide guidance to support institutions with the growing need for robust research data management. There is a wide range of needs and existing provision for creation, collection, storage and preservation, and reuse of data within UK Higher Education.

What research data should be kept?

Researchers, data curators and policy makers all need to answer the question: what research data should be kept? We can’t keep it all, because that would be too expensive and time-consuming. However, we have to keep data that is irreplaceable and unique in its value for future research, so that it can be reused and validated, peer review can be informed, and research findings can be trusted. Types of data needing to be retained vary and may include related materials such as software and documentation. But how much and what is enough? Obviously, there is no single answer to that: it depends on many factors, but what are those factors, and how should we weight them? These remain difficult and open questions, but this year Jisc is working with us to take a step toward answering them.

How can we identify what to keep?

We are setting out to explore what the optimal data to keep from research projects conducted at UK institutions actually is. Over the rest of 2018, our project will work with a small number of research areas to find out. What conditions, such as openness or timescales, might be ideal? We will consult the views of researchers (as data creators and data users), research funders, ethics professionals, archivists, research data managers, peer reviewers, other research users, and others on these questions. We will dig into the reasons for their views, and into whether research data is currently kept in line with those views, or not.

Why are we carrying out this investigation now?

This work comes at a critical time in the evolution of research data management and sharing. At the policy level, the recommendations from the UK Open Research Data Taskforce are expected shortly. These may take into account both the recommendations to Government of the 2017 report by Dame Wendy Hall and Jérôme Pesenti into the future of the UK artificial intelligence industry and the recent Government announcements around this, where research data can be a key input into AI tools. The availability of research data is also a matter of concern to those interested in research integrity and reproducibility. Relevant infrastructure investments include both the Jisc research data shared service and the increasing activity around the European Open Science Cloud.

Both policy and infrastructure investments need better information about the extent and nature of the research data that needs to be kept, under what conditions, and for how long. Our 2018 project will not provide all this information, but it will explore current practices and take the next step.

Digital Past 2018

I spent two days last week at the excellent Digital Past 2018 conference in Aberystwyth. It was my first time at the conference.

Organised by the Royal Commission on the Ancient and Historical Monuments of Wales (RCAHMW), it showcased innovative digital technologies and techniques for data capture, interpretation and dissemination of the heritage of Wales, the UK and beyond. An image from one of the digital projects featured, “Commemorating the Forgotten U-boat War around the Welsh Coast 1914-18”, is used in this blog.

Fortunately for a first-time attendee, it was also the 10th anniversary of the conference, so there were several outstanding keynotes that looked back over developments in the last decade, current emerging trends, and more speculatively into the future.

I had my first attempt at doing a conference summing up at Digital Past this year. I have always been a great admirer of Cliff Lynch’s conference summings up at the Coalition for Networked Information and elsewhere. It is a very difficult job to do well. I think I still have a lot to learn from Cliff but it was an interesting challenge!

I would highly recommend the conference to colleagues. Keep an eye out for the next one.

CESSDA SaW Final Conference in Dublin

The final conference of the CESSDA SaW project was held in Dublin, Ireland on 19th October 2017 and summarised the project’s results in strengthening and widening the European infrastructure of social science data archives. Organized by the Irish Social Science Data Archive (ISSDA) and CESSDA ERIC, the event was very successful, hosting representatives from 28 countries. CESSDA members, non-members and aspiring members gathered to present the outcomes of a two-year project which has helped to enlarge the consortium and strengthen its members.

It has been an extremely productive and collaborative project with many valuable and interesting outputs. Charles Beagrie Ltd has led on the development of the cost-benefit advocacy toolkit (released in April 2017) in CESSDA-SaW and we covered this in a previous blog post – but there are many other project outputs now available that will be of interest to the research data management community.

There is a fuller report, presentations and photos from the conference available here.

Public Release of New PDF/A Technology Watch Report

The Digital Preservation Coalition (DPC) and Charles Beagrie Ltd have released to the public Preservation with PDF/A by Betsy Fanning, the latest in their series of Technology Watch Reports. This is now the 14th Technology Watch Report produced over the last 5 years by Charles Beagrie Ltd and the DPC. It provides a comprehensive review of the PDF/A standard and its use.

An update to the original Technology Watch Report, Preserving the Data Explosion: Using PDF, published in 2008, the report begins with a history of the PDF/A standard and its development, before moving on to an examination of conformance levels, validation methods and considerations to be made when choosing to use PDF/A for long-term preservation.

“Conformance to the standard is not a simple ‘yes/no’ binary state, in part because there are now four variants of PDF/A,” explains author Betsy Fanning. “One question that is often asked is: ‘When should I use PDF/A, and which version should I use?’ This report attempts to answer that question and to provide some guidance about the strengths, weaknesses, opportunities and threats associated with each.”

Preservation with PDF/A examines each of the four variants and lays out the conditions under which it might be beneficial to use PDF/A-3 rather than PDF/A-1, and vice versa, before presenting a range of practical considerations to make the most effective use of the file format.

Neil Beagrie, managing editor of the Technology Watch Report series on behalf of the DPC, added: “The choice of file format is a component of a wider technical and organizational infrastructure which comprises a comprehensive digital preservation solution. This report will make interesting reading for anyone putting together their digital preservation strategy.”

Note the new style cover design!

Read ‘Preservation with PDF/A’ now

Presentation on the Value and Impact of Social Science Data Archives and the CESSDA SaW Toolkit

A set of 38 slides, now on SlideShare, was used for the Focus Group Cost-Benefit Funding Advocacy Program (Task 4.6) session at the CESSDA SaW Workshop in The Hague on 16/17 June 2016.

This was an interactive focus group repeated over two parallel sessions. It was aimed at European social science data archive staff with responsibility for bidding for funding or promotion and advocacy of the archive to key stakeholders. The presentation covers some of the key ideas on how the CESSDA SaW funding advocacy toolkit will be structured, its components, and key facts and approaches it will include.

We expect the cost-benefit funding advocacy toolkit under development to support the negotiation with ministries and funding organisations across Europe.

The results of the toolkit user requirements survey with responses from 24 European social science archives were presented and discussed, together with suggested approaches and content for the toolkit. 22 people attended the two sessions overall, representing a mix of countries at different stages on the development path for social science archives (none, new/emerging, mature). There was strong interest and support for the emerging toolkit together with open discussion of how it can be applied in the specific political and administrative context of different European countries.

The slide set presented here is an extended version including a number of hidden background/reference slides not used in the presentation. The focus group is one of a series guiding further development of the toolkit and its adoption. Each is being given to either: (a) social science data archive staff or (b) their key stakeholders (senior management in their universities, research councils and academies, funding ministries, national statistics offices, research users and depositors).

CESSDA is the Consortium of European Social Science Data Archives. The CESSDA SaW project “Strengthening and widening the European infrastructure for social science data archives” is funded by the European Commission as part of its Horizon 2020 programme.

New project to transform the user experience of social science data in Europe

We are pleased to be working with partners in the Consortium of European Social Science Data Archives (CESSDA) on the CESSDA SaW project, “Strengthening and widening the European infrastructure for social science data archives”, funded by the European Commission in the framework of its Horizon 2020 programme. After the successful launch of CESSDA in 2013, the aim is now to achieve full European coverage, to strengthen the network and to ensure sustainability of its data for the widened network.

“The CESSDA SaW project will build strength and sustainability into the CESSDA infrastructure” comments Ivana Ilijasic Versic of CESSDA. “We will begin by building on what we have already established across the data archives within our membership. The widened CESSDA network which will result from this project should become a strong infrastructure with global best practice in-built. This will translate into a greater body of work in the social sciences, in turn providing evidence for policy making at a greater scale than today”.

The project runs for two years from August 2015 and brings together partners from across Europe.

Charles Beagrie Ltd are leading task 4.6 in the project, which focuses on developing a funding and cost-benefit advocacy toolkit for social science data archives. The toolkit being developed will draw on a range of projects and studies looking at benefits, costs, return on investment and advocacy including inter alia 4C, Keeping Research Data Safe (KRDS), and a range of economic impact studies.

Charles Beagrie Ltd is leading on the development of core documents and materials for the Toolkit with support from CESSDA SaW partners for the gathering of information and user testing. A survey is currently in progress to help shape the toolkit and a set of focus groups will further refine it. The completed toolkit will be available by June 2017.

For further information and to keep up to date with the CESSDA SaW project visit: www.cessda.net or follow CESSDA on Twitter @CESSDA_Data.

Digital Preservation Handbook Update February 2016

Originally published in 2001 as a paper edition, ‘Preservation and Management of Digital Materials: a Handbook’ was the first attempt in the UK to synthesise the diverse and burgeoning sources of advice on digital preservation.  Demand was so great that in 2002, a free online edition of the Handbook was published by the newly established Digital Preservation Coalition.

After more than a decade, in which digital preservation has been transformed, the Handbook remains among the most heavily used areas of the DPC website.

Funders and organisations are collaborating on re-designing, expanding and updating the Handbook so it can continue to grow as a major open-access resource for digital preservation. The DPC and Charles Beagrie Ltd have been engaged on a major re-working of the Digital Preservation Handbook for release as a new edition over 2015/2016. The National Archives (our Gold Sponsor) working together with other stakeholders including Jisc, the British Library, and The Archives and Records Association (our Silver Sponsors), and the National Records of Scotland (our Bronze Sponsor) is supporting the Digital Preservation Coalition in updating and revamping the Handbook. Many individuals and organisations are also contributing to this work through book sprints, peer review, project and advisory boards.

The revision, guided by user feedback and consultation (see Report on the Preparatory User Consultation on the 2nd Edition of the Digital Preservation Handbook), is modular and is being undertaken over a two-year period to March 2016.

We have provided updates at regular intervals to inform the community on progress with the project, and with this final February update we are delighted to announce a number of key developments.

Publication Schedule

The 2nd edition of the Handbook had a partial “soft launch” in October 2015 and approximately two-thirds is online and publicly accessible at http://www.dpconline.org/advice/preservationhandbook

This partial release will be further enhanced by additional functionality when a new platform for the website focused on ‘responsive design’ is brought on stream by the DPC in 2016. This will provide an updated design and improved user experience on mobile and tablet devices, compared to the current site templates that are optimised for viewing on a desktop screen. We will also add the facility to generate PDFs. In the interim some functionality and content will remain “works in progress” but the community have gained early access to a significant new resource.

The remaining 14 sections to complete the Handbook have now been written and edited, and are in peer review (see Handbook contents page for coming soon sections). We are aiming to complete this work and revise content for publication by the end of March 2016. The Handbook is now live so we will need to close and update section by section for these 14 remaining updates, hopefully in the final week of March and/or early April 2016. Watch this space for future announcements!

NRS joins funding group

The Digital Preservation Coalition was delighted to announce this month that The National Records of Scotland (NRS) had come on board as a ‘Bronze Sponsor’ for the eagerly anticipated second edition of the ‘Digital Preservation Handbook’. As of February 2016, with the addition of the NRS we have raised 93% of estimated funding required for the Handbook revision. We have prioritised content creation, scaled back some events, and adjusted budgets to ensure completion within a very tight funding profile.

Slideshare from Handbook Workshop at DCDC15

A workshop on the Digital Preservation Handbook was run at the DCDC15 conference in early October. PowerPoint slides from the Handbook presentation are now available on SlideShare. They provide a detailed overview of the new edition of the Handbook and work in progress. To date, there have been over 2,000 views of the slides.
