…

Facilitating the sharing of open weather data

Leigh Dodds

Ensuring that data can be easily accessed, used and shared requires the use of data standards. If you are currently working on a data project you should take time to consider what standards might be available to you to help to achieve the goals of your project.

Data such as statistics, maps and real-time sensor readings help people to make decisions, build services and gain insight. A strong data infrastructure is critical to fostering business innovation, driving better public services and creating healthy, sustainable communities. Local, national and global data infrastructure will only become more vital as populations grow, and economies and societies become more reliant on getting value from data.

Data infrastructure consists of data assets, the organisations that operate and maintain them, and the guidance and processes that describe how to organise, use and manage that data. Sustainable publishing of high-quality data in standard forms is an important element of a strong data infrastructure. The value of data comes from its use. It is important to reduce friction that may stop data from being accessed, used and shared. A recent survey reported that data scientists spend the majority of their time cleaning and tidying data.

Open licenses give permission for anyone to access, use and share data. But it is by publishing data using common, consistent language, identifiers and formats that it becomes easier for people and machines to create value from it.

What are open standards for data?

Data standards are documented, useable agreements that help organisations to publish and exchange data in consistent ways. There are many different types of data standard, reflecting the different types of agreement that are needed to support the consistent representation of data.

For example, we can standardise the words and concepts that we use to describe things in a dataset. By using consistent language, it becomes possible to confirm that datasets are measuring and describing the same things. It is also possible to standardise the ways in which individual data items are measured, and how data can be organised to make it machine-readable.

Take, for example, the reporting of temperature observations from a weather station. There are many things that can be standardised in this dataset alone. Obvious starting points are the means by which temperature readings are carried out (e.g. the frequency with which observations are made), the units in which temperature is reported (e.g. Celsius or Fahrenheit) and the format in which dates and times are recorded against each measurement. But standards can also define how the temperature readings and timestamps are organised in data files, like spreadsheets, or how specific data formats (e.g. CSV, XLS or JSON) are used to exchange the machine-readable data. The metadata used to describe the dataset, including an identifier for the weather station and information about the data publisher, can also be standardised.
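To make this concrete, here is a minimal sketch in Python of what agreeing on units, timestamp format and file layout might look like for a handful of readings. The column names, the station identifier and the choice of Celsius and UTC are assumptions made for illustration; they are not taken from any published weather data standard.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical column layout, chosen for illustration only.
FIELDNAMES = ["station_id", "observed_at", "temperature_celsius"]


def fahrenheit_to_celsius(value):
    """Convert a Fahrenheit reading to Celsius, rounded to one decimal place."""
    return round((value - 32) * 5 / 9, 1)


def standardise(station_id, observed_at, temperature, unit):
    """Return one observation as a row in the agreed layout."""
    if unit == "F":
        temperature = fahrenheit_to_celsius(temperature)
    elif unit != "C":
        raise ValueError(f"unsupported unit: {unit!r}")
    # Assumption: every timestamp is reported in UTC using ISO 8601.
    timestamp = observed_at.astimezone(timezone.utc).isoformat()
    return {
        "station_id": station_id,
        "observed_at": timestamp,
        "temperature_celsius": temperature,
    }


def to_csv(rows):
    """Serialise standardised rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Two readings reported in different units end up in one consistent form.
rows = [
    standardise("station-001", datetime(2016, 6, 1, 12, 0, tzinfo=timezone.utc), 68.0, "F"),
    standardise("station-001", datetime(2016, 6, 1, 13, 0, tzinfo=timezone.utc), 21.5, "C"),
]
print(to_csv(rows))
```

Each decision in the sketch (unit, timestamp format, column order, station identifier) corresponds to one of the agreements described above. A real standard would document these choices so that every publisher makes them in the same way.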

These standards may be developed at different times, by different communities. Standards development is a collaborative process, involving an international community. Open standards are those that are created in ways that conform to the OpenStand principles: using open processes and published under terms that allow them to be freely used for any purpose.

By building on the work of other communities it is easier to create new standards that help to improve how data is published to support new applications and services.

How do standards create impact?

The most immediate benefit of creating and using open standards for data is in ensuring that data can be easily accessed and used. Once data is published in standard formats, it reduces the costs of working with new data, because existing code and analysis can be applied to the new datasets.
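As a rough illustration of that reuse, the Python sketch below applies one analysis function, unchanged, to data from two publishers. It assumes both publish CSV files that share the same column names; the publishers, station identifiers, column names and values are all hypothetical.

```python
import csv
import io
from statistics import mean


def mean_temperature(csv_text):
    """Average the temperature_celsius column of a CSV in the shared layout."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return mean(float(row["temperature_celsius"]) for row in reader)


# Hypothetical data from two different publishers using the same layout.
publisher_a = """station_id,observed_at,temperature_celsius
a-1,2016-06-01T12:00:00+00:00,20.0
a-1,2016-06-01T13:00:00+00:00,22.0
"""

publisher_b = """station_id,observed_at,temperature_celsius
b-7,2016-06-01T12:00:00+00:00,10.0
b-7,2016-06-01T13:00:00+00:00,14.0
"""

# The same function works on both datasets without modification.
print(mean_temperature(publisher_a))
print(mean_temperature(publisher_b))
```

If the second publisher used a different column name or reported Fahrenheit, the function would need to be adapted for that dataset. That adaptation is the cost a shared standard removes.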

Standards can also encourage the creation of new tools that are designed to help support those who are publishing and using datasets that conform to that standard. But the technical impacts are just the most obvious benefits that a standard can bring.

Standards can help to create ecosystems and drive innovation. The availability of collections of consistently organised data can enable start-ups to create new tools and services that will help others to create value from that data. For example, the adoption of the GTFS standard by transport authorities around the world has enabled a range of new services that help millions of people to make the most of public transport. Transport for London have reported that their publication of open data using this standard may be contributing up to £130 million a year to the local economy.

Standards can also help to enact policy or legislation, helping to change markets and improve delivery of public services. For example, the UK’s open banking standard was imposed by the competition regulator to create a more competitive and innovative banking sector. 

Standards can also be used to drive social change by encouraging governments and private sector companies to publish data in consistent ways to help create transparency. The Aid Transparency, EITI and Open Contracting Partnerships are each using standards as an important tool to enable change.

Challenges in standards development and adoption

However, creating good data standards is hard. There are often many different competing standards that could help support the publication of a dataset. But they are often poorly adopted, and it is not clear to users which standard might be the best to use.

At the Open Data Institute, we have been researching the factors that contribute to the failure to create and adopt data standards, and some approaches towards addressing those challenges. We have been conducting user and desk research, and have been collaborating with other organisations involved in data standards to better understand the challenges faced in standards development.

As part of the GODAN Action project, the ODI has also been working to understand the range of standards used in the agricultural sector, e.g. to share weather data. The goal is to identify some useful interventions that may help encourage greater adoption of standards in the sector.

The project has already identified several needs. Firstly, developers need better tools to help them discover relevant standards. The recently launched map of agri-food data standards will help to address this issue in the agricultural sector, while the open data standards directory has a wider scope. Secondly, the project has highlighted that more guidance is needed on the process of standards development to help organisations collaborate to create well-designed standards. Finally, standards developers need to think more about how their standard will be adopted and the types of tools, documentation and engagement that will ensure their standard is successful.

There are also potential benefits in building better peer networks between organisations involved in standards development, and between those people involved in developing standards and the organisations and communities that might benefit from them.

The ODI will be working with partners in the GODAN Action project to implement solutions to some of these issues, whilst also working to publish a new guidebook to support the development of new standards. 

Copyright © 2016, CTA. Technical Centre for Agricultural and Rural Cooperation

CTA is a joint international institution of the African, Caribbean and Pacific (ACP) Group of States and the European Union (EU). CTA operates under the framework of the Cotonou Agreement and is funded by the EU.