CHEMeDATA Initiative

Main website of the CHEMeDATA umbrella organization

Foreword

There is no clear or authoritative definition of a chemistry object (CO), nor a mapping into any specific ontology. This may come at a later stage (a few words on it are given below).

The current approach is based on use-case needs, in particular the need to “identify” chemistry stuff in archived chemistry data stored in .zip files.

Digital Objects

Compound (Meta)Data

Spectroscopy Data

Assignment Data

Digital Entities

TODO: Map these ideas to the above

Chemistry objects

This is a tentative list of objects and relations.

Compound

A full NMReDATA record could create:

NMReDATA includes multiple ontologies:

- The core assignment: chemical shifts and (aggregated) couplings linked to atom numbers in the molecule.
- An ontology of the single-spectrum description.
- An ontology of a 1D signal.
- An ontology of a 2D signal.

In this context, the word chemistry refers to any piece of information relevant to chemistry.

The word object has a special meaning in computer science: it indicates that the information has an electronic form that a computer can understand. A .jpg image of a spectrum, for example, cannot be understood by a computer to be a spectrum. It is an image object, not a chemistry or spectrum object.

Ontology of chemistry object

Objects will have a schema representation. We would probably use RDF to link elements (as part of an OWL ontology).

Preliminary list of chemistry objects (to be refined by the chemistry community):

For each type of object, we should state which properties are mandatory and which are optional. We should consider whether to have our own schemas (inherited from “official” ones, with XSLT). Each main schema should have flavors (in parentheses) …

Sets of objects will use existing schemas

Use cases

chemistry data in ZIP files

By nature, the XML format allows the XML of a set of objects to be combined. For a .zip file that includes chemistry files, the list of objects should be included in a manifest file (say chemObjmanifest.xml) listing all the identified objects in the archive, with a link to each file.
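As a rough illustration of the manifest idea, the Python sketch below scans an archive, picks out files by extension, and appends a chemObjmanifest.xml to it. The extension-to-type mapping, the element and attribute names (chemObjManifest, object, type, href), and the function names are assumptions made for this example only; no manifest schema has been defined yet.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to object type (not an official list).
OBJECT_TYPES = {
    ".mol": "Compound",
    ".sdf": "Compound",
    ".jdx": "Spectrum",
    ".dx": "Spectrum",
}

MANIFEST_NAME = "chemObjmanifest.xml"


def build_manifest(names):
    """Return manifest XML (as a string) listing the recognised chemistry files."""
    root = ET.Element("chemObjManifest")  # element name is an assumption
    for name in sorted(names):
        obj_type = OBJECT_TYPES.get(PurePosixPath(name).suffix.lower())
        if obj_type is None:
            continue  # not recognised as a chemistry object
        # "href" is the link from the manifest entry to the file in the archive.
        ET.SubElement(root, "object", {"type": obj_type, "href": name})
    return ET.tostring(root, encoding="unicode")


def add_manifest(zip_bytes):
    """Return a copy of the archive (as bytes) with a manifest appended."""
    buffer = io.BytesIO(zip_bytes)
    with zipfile.ZipFile(buffer, "a") as archive:
        names = [n for n in archive.namelist() if n != MANIFEST_NAME]
        archive.writestr(MANIFEST_NAME, build_manifest(names))
    return buffer.getvalue()
```

Identifying objects by file extension alone is the simplest possible rule; a real tool would probably need to inspect file contents as well.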

chemistry objects uploaded from database

When chemists download data (say a spectrum, a chemical structure, etc.), a manifest file could be associated with the object (in the zip file that includes the payload). If the file is downloaded “alone”, the metadata about the object could be included in the file itself: for JCAMP, one could include fields corresponding to the metadata of the object (its origin, in particular); PDF also allows metadata to be included. More generally, an alternative would be for databases to calculate the MD5 of all downloadable files (spectra, mol files, etc.) and keep a registry of files, so that the origin of a file could be traced back using a RESTful API, or for a centralized service (DataCite?) to offer such a lookup.
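The checksum registry idea can be sketched in a few lines of Python. This is only an illustration: the FileRegistry class is hypothetical, an in-memory dictionary stands in for the database or REST service, and the example.org URL is a placeholder.

```python
import hashlib


def md5_of(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a file's content."""
    return hashlib.md5(data).hexdigest()


class FileRegistry:
    """Hypothetical in-memory stand-in for a database-side file registry."""

    def __init__(self):
        self._origins = {}  # maps MD5 digest -> origin of the file

    def register(self, data: bytes, origin: str) -> str:
        """Record where a downloadable file comes from; return its digest."""
        digest = md5_of(data)
        self._origins[digest] = origin
        return digest

    def lookup(self, data: bytes):
        """Return the recorded origin of a file, or None if it is unknown."""
        return self._origins.get(md5_of(data))


registry = FileRegistry()
registry.register(b"abc", "https://example.org/record/1")  # placeholder origin
print(registry.lookup(b"abc"))    # origin recorded above
print(registry.lookup(b"other"))  # None: file not in the registry
```

A lookup of this kind only works for byte-identical copies: any re-saving or conversion of the file changes its digest.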

Key aspects

E.g., a hand-made drawing is not “readable” by a computer program, a .cdx file is readable by commercial software, and .mol files can be read by multiple types of free software and web tools.
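To make the point about readability concrete, the sketch below reads the atoms and bonds of a small V2000 molfile using nothing but fixed-width string slicing. The sample molecule and the helper name read_molfile are made up for this illustration, and the parser ignores everything except atom symbols and bond entries.

```python
# Build a tiny V2000 molfile (methanol without hydrogens) with explicit
# fixed-width formatting, so the column positions are guaranteed.
def atom_line(x, y, z, symbol):
    return f"{x:10.4f}{y:10.4f}{z:10.4f} {symbol:<3}" + " 0" + "  0" * 11


SAMPLE_MOLFILE = "\n".join([
    "methanol (hydrogens omitted)",                 # line 1: title
    "  hand-written sample",                        # line 2: program / timestamp
    "",                                             # line 3: comment
    f"{2:3d}{1:3d}" + "  0" * 8 + "999 V2000",      # line 4: counts line
    atom_line(0.0, 0.0, 0.0, "C"),
    atom_line(1.4, 0.0, 0.0, "O"),
    f"{1:3d}{2:3d}{1:3d}{0:3d}",                    # bond: atom 1 - atom 2, single
    "M  END",
])


def read_molfile(text):
    """Return (atom symbols, bonds) from a V2000 molfile given as a string."""
    lines = text.splitlines()
    counts = lines[3]
    n_atoms = int(counts[0:3])   # first 3 columns: number of atoms
    n_bonds = int(counts[3:6])   # next 3 columns: number of bonds
    atoms = [line[31:34].strip() for line in lines[4:4 + n_atoms]]
    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        # first atom index, second atom index, bond type
        bonds.append((int(line[0:3]), int(line[3:6]), int(line[6:9])))
    return atoms, bonds


atoms, bonds = read_molfile(SAMPLE_MOLFILE)
print(atoms, bonds)
```

Nothing comparable is possible for a scanned drawing without image recognition, which is the distinction drawn above.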

Relevance

The (chemistry) objects carry information that can be searched for and used by

Facilitating use is the key aspect of open and FAIR data.

Note that Research Object (RO) has no relevance to the chemistry-specific ontology presented here.