An ever-increasing volume of data, projects, and publications from different disciplines is hosted across many platforms. On the one hand, funding agencies require that as many results as possible remain accessible even after a project has ended. On the other hand, scientists should be intrinsically motivated to describe their data as well as possible and to make it easy to find. Thorough metadata is key to findability and, in turn, to reusability, which can increase the impact of one’s own research. Such metadata can be assessed against the FAIR principles. However, fewer than half of all scientists have ever heard of these principles, and even fewer have used them to describe their research data. Consequently, metadata in scientific repositories often varies strongly in completeness or is simply faulty. Various guidelines and initiatives, led for example by research institutes, aim to improve metadata quality, since metadata really is, as famously noted, “a love letter to the future”. We, the National Centre for Environmental and Nature Conservation Information based at the German Environment Agency, aim to integrate Germany’s scattered data and information landscape on environmental and nature knowledge into one central access point. We have encountered a wide …
Meeting rooms
📢 You're Invited: Advancing FAIR Data with NetCDF – Join the Conversation!
Ensuring that scientific data is Findable, Accessible, Interoperable, and Reusable (FAIR) is more important than ever. In Earth System Science, NetCDF has become the quasi-standard for storing multidimensional data. But to truly unlock its potential, we need rich, standardized metadata.
Join us for an insightful discussion in which we will explore three questions. What are the key challenges in metadata compatibility and completeness? What tools do we need to improve the metadata of our scientific output? And how can we guarantee seamless metadata integration, AI-readiness, and improved data discoverability?
🔍 What to Expect:
- An overview of the HMG NetCDF Initiative and its goals
- A first look at the NetCDF metadata attribute guidelines
- Discussion on aligning metadata fields across disciplines
- Discussion of tools for machine-readable templates and user-friendly metadata entry
🌍 This is a collaborative effort across German research centers and contributes to broader Helmholtz initiatives like HMC.
Let’s shape the future of geoscientific data together. We look forward to your participation and insights!
The proliferation of digital portals for accessing environmental and geoscientific data has significantly enhanced the ability of researchers, policymakers, and the public to retrieve and utilize critical information.
Furthermore, metadata content can be harvested, which adds the benefit that information is collected only once and then presented on different websites. This makes it possible to tailor the presentation of metadata attributes to each user group according to its priorities.
The Earth Data Portal (https://earth-data.de/), a collaborative effort of the Helmholtz centers of the research field Earth and Environment, enables querying data from multiple repositories, particularly from the Helmholtz research centers.
In addition, umwelt.info (https://umwelt.info/de), operated by the German Environment Agency, offers a user-friendly interface to openly available environmental and nature protection data, tailored for seamless access by the general public.
The respective metadata content, on the other hand, is certainly best presented on the website of the source repository.
This poster provides a general overview of the highlighted portals and delves into the specifics of their implementations, with a focus on the use of Persistent Identifiers (PIDs). PIDs play a crucial role in ensuring the long-term accessibility and citability of data, …
At the Helmholtz Association, we aim to establish a well-structured and harmonized data space that connects information across distributed data infrastructures. Achieving this goal requires the standardization of dataset descriptions using appropriate metadata and the definition of a single source of truth for much of this metadata, from which different systems can draw. Persistent Identifiers (PIDs) in metadata enable the reuse of common information from shared sources. Broad adoption of PID types enhances interoperability and supports machine-actionable data. As a first step, we recommend implementing ROR, ORCID, IGSN, PIDINST, DataCite DOI, and Crossref DOI in our data systems.
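As an illustration of what machine-actionable PIDs can look like in practice, the following minimal sketch checks two of the recommended identifier types before they are written into a metadata record. It is not part of the presented work: the function names are hypothetical, the ORCID check uses the publicly documented ISO 7064 MOD 11-2 check digit, and the DOI pattern is only a loose syntactic plausibility test that does not resolve the identifier.

```python
import re

# Hypothetical helper names; patterns are simple syntactic checks only.
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")
DOI_PATTERN = re.compile(r"^10\.[0-9]{4,9}/\S+$")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the ORCID iD is well-formed and its check digit matches."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


def is_plausible_doi(doi: str) -> bool:
    """Return True if the string looks like a DOI (prefix '10.', then a suffix)."""
    return bool(DOI_PATTERN.match(doi))


if __name__ == "__main__":
    print(is_valid_orcid("0000-0002-1825-0097"))
    print(is_plausible_doi("10.1038/sdata.2016.18"))
```

A check of this kind only confirms that an identifier is well-formed. Whether it actually points to the intended person or dataset still has to be confirmed against the respective PID registry.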
However, to practically record and integrate this information into our repositories, we must first identify the specific locations and stakeholders within institutions where this data is generated and maintained. We must also assess what kinds of tools and services the Association needs to provide to support seamless data management for its users.
In this presentation, we highlight several tools we propose to implement across the organization, based on envisioned workflows. These include, for example, repository software, electronic lab notebooks (ELNs), terminology services, and other infrastructure components. Implementing these tools will support the various stakeholder groups in fulfilling their roles and will contribute …
The Helmholtz Metadata Collaboration (HMC) Hub Earth and Environment seeks to create a framework for semantic interoperability across the diverse research data platforms within the Helmholtz research area Earth and Environment (E&E). Standardizing metadata annotations and aligning the use of semantic resources are essential for overcoming barriers in data sharing, discovery, and reuse. To foster a unified, community-driven approach, HMC, together with the DataHub, has established the formal "Metadata-Semantics" Working Group, which brings together engaged data stewards from major Helmholtz research data platforms within the E&E domain.
As part of its strategy to standardize metadata annotation in collaboration with the community, the working group will begin by harmonizing device-type denotations across two Helmholtz sensor registries: the O2A REGISTRY, developed at AWI, and the Sensor Management System (SMS), maintained by UFZ, GFZ, KIT, and FZJ. This harmonization involves the development of a shared FAIR controlled vocabulary and the implementation of a peer-reviewed curation process for it.
The common vocabulary will support the creation of referenceable ad-hoc terms when needed, incorporate versioning and quality assurance measures, and establish links with existing terminologies in the field (e.g., NERC L05, L06, ODM2, GCMD). Its development will involve experts from various disciplines within Helmholtz E&E …
To ensure FAIR data (Wilkinson et al., 2016: https://doi.org/10.1038/sdata.2016.18), well-described datasets with rich metadata are essential for interoperability and reusability. In Earth System Science, NetCDF is the quasi-standard for storing multidimensional data, supported by metadata conventions such as Climate and Forecast (CF, https://cfconventions.org/) and Attribute Convention for Data Discovery (ACDD, https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3).
While NetCDF can be self-describing, metadata often lacks compatibility and completeness for repositories and data portals. The Helmholtz Metadata Guideline for NetCDF (HMG NetCDF) Initiative addresses these issues by establishing a standardized NetCDF workflow. This ensures seamless metadata integration into downstream processes and enhances AI-readiness.
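A completeness check of the kind such a workflow might include can be sketched as follows. This is an assumption-laden illustration, not the HMG NetCDF guideline itself: it tests only the four global attributes that ACDD 1.3 lists as "highly recommended", the function name is hypothetical, and the example record is invented. The attributes are passed as a plain dictionary so that the sketch runs without a NetCDF library.

```python
# Global attributes that ACDD 1.3 classifies as "highly recommended".
# The actual HMG NetCDF attribute list may differ; this set is an assumption.
ACDD_HIGHLY_RECOMMENDED = ("title", "summary", "keywords", "Conventions")


def missing_global_attributes(global_attrs, required=ACDD_HIGHLY_RECOMMENDED):
    """Return the required attribute names that are absent or empty."""
    missing = []
    for name in required:
        value = global_attrs.get(name)
        # Treat missing keys and whitespace-only values as incomplete.
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Invented example record, for illustration only.
    example_attrs = {
        "title": "Hypothetical sea surface temperature grid",
        "summary": "",
        "Conventions": "CF-1.8, ACDD-1.3",
    }
    print(missing_global_attributes(example_attrs))  # ['summary', 'keywords']
```

With the netCDF4 Python package, the dictionary could be built from an open dataset `ds` as `{name: ds.getncattr(name) for name in ds.ncattrs()}`.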
A consistent metadata schema benefits the entire processing chain. We demonstrate this by integrating enhanced NetCDF profiles into selected clients like the Earth Data Portal (EDP, https://earth-data.de). Standardized metadata practices support repositories such as PANGAEA (https://www.pangaea.de/) and WDCC (https://www.wdc-climate.de) in ensuring compliance with established norms.
The HMG NetCDF Initiative is a collaborative effort across German research centers, supported by the Helmholtz DataHub. It contributes to broader Helmholtz efforts (e.g., HMC) to improve research data management, discoverability, and interoperability.
Key milestones include:
- Aligning metadata fields across disciplines,
- Implementing guidelines,
- Developing machine-readable templates and validation tools,
- Supporting user-friendly metadata …