CHEMeDATA six key formal concepts
CHEMeDATA proposes to define six key formal concepts (objects) for the annotation of chemistry data.
- 1 Substance
- 2 Chem. Equation formalizes the transformation of reactant Substance(s) into product Substance(s)
- 3 Sample includes Substances
- 4 Process transforms a Sample into another Sample. Reaction, extraction, purification, etc. are Processes.
- 5 Analytical data provides information about a Sample
- 6 Assignment combines one (or a set of) Analytical data with properties of the other concepts.
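The relationships between the six concepts can be sketched as plain data classes. This is only an illustrative sketch: the class and field names below (for example `initial_sample`, `property_name`) are assumptions chosen for readability and are not defined by CHEMeDATA.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Substance:
    name: str
    formula: Optional[str] = None  # e.g. "C2H6O"


@dataclass
class ChemEquation:
    # Reactant Substance(s) are transformed into product Substance(s).
    reactants: List[Substance]
    products: List[Substance]


@dataclass
class Sample:
    label: str
    substances: List[Substance] = field(default_factory=list)


@dataclass
class Process:
    # A Process transforms one Sample into another Sample.
    category: str  # e.g. "reaction", "extraction", "purification"
    initial_sample: Sample
    final_sample: Sample
    equation: Optional[ChemEquation] = None  # for reactions only


@dataclass
class AnalyticalData:
    technique: str  # e.g. "NMR"
    sample: Sample


@dataclass
class Assignment:
    # Links one or more analytical data to a property of another object.
    data: List[AnalyticalData]
    target: object
    property_name: str


# Hypothetical usage: an esterification followed by an NMR analysis.
ethanol = Substance("ethanol", "C2H6O")
acetic_acid = Substance("acetic acid", "C2H4O2")
ethyl_acetate = Substance("ethyl acetate", "C4H8O2")
water = Substance("water", "H2O")

equation = ChemEquation([acetic_acid, ethanol], [ethyl_acetate, water])
start = Sample("reaction mixture", [acetic_acid, ethanol])
end = Sample("crude product", [ethyl_acetate, water])
process = Process("reaction", start, end, equation)
nmr = AnalyticalData("NMR", end)
assignment = Assignment([nmr], ethyl_acetate, "structure")
```

Keeping Chem. Equation separate from Process mirrors the distinction made in the list: the equation relates Substances, while the Process relates Samples.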
The objects found in a dataset are listed in a manifest file together with their metadata and properties. A selection of objects, key metadata, and properties can be registered with the DOI of the dataset using the DataCite “subject” field to make them “Findable”.
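As a rough illustration of this idea, the sketch below builds a small manifest and derives DataCite-style subject entries from it. The manifest layout, the `to_datacite_subjects` helper, and the `subjectScheme` value are all assumptions made for this example; only the general shape of a DataCite subject entry (`subject` plus optional `subjectScheme`) is taken from the DataCite metadata schema.

```python
import json

# Hypothetical manifest: the key names are illustrative, not a CHEMeDATA spec.
manifest = {
    "objects": [
        {
            "type": "Substance",
            "id": "substance-1",
            "formula": "C2H6O",
            "inchikey": "LFQSCWFLJHTTHZ-UHFFFAOYSA-N",  # ethanol
        },
        {
            "type": "Sample",
            "id": "sample-1",
            "sample_type": "s",
            "substances": ["substance-1"],
        },
    ]
}


def to_datacite_subjects(manifest):
    """Select the InChIKeys of Substances and wrap them as subject entries."""
    subjects = []
    for obj in manifest["objects"]:
        if obj.get("type") == "Substance" and "inchikey" in obj:
            subjects.append(
                {"subject": obj["inchikey"], "subjectScheme": "InChIKey"}
            )
    return subjects


subjects = to_datacite_subjects(manifest)
print(json.dumps({"subjects": subjects}, indent=2))
```

Only the Substance contributes a subject here, since the Sample carries no identifier that would be useful for search.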
- Substance
- Formula (XnYm)
- properties
- color code:
- Image / red (low quality - red - requires drawing the structure from the image)
- Ambiguous / orange (not clear which part of the file is relevant - requires curation)
- OK / green (has an InChI, InChIKey, …)
- Sample
- Types:
- (s) Solution sample: simple solution with a single pure solvent
- (m) Mixed-solvent solution sample: solution in a mixture of solvents
- (w) Water solution: for aqueous solutions with buffer, salt, etc.
- (c) Crystalline sample: for crystalline solid samples
- (a) Powder sample: amorphous solid sample
- etc.
- properties
- composition (Substances, quantity)
- MInChI
- etc.
- color code:
- No description / red
- OK / green
- Chem Equation
formalizes the transformation of reactant Substance(s) into product Substance(s)
- Specify the category of the reaction?
- Properties
- category (name…)
- reactants (Substance, stoichiometry)
- products (Substance, stoichiometry)
- conditions (…)
- RInChI
- Color code:
- No equation or image / red
- Ambiguous / orange (not clear which part of the file is relevant - requires curation)
- OK / green (has a RInChI, InChI of reactant…)
- Process transforms a Sample into another Sample
- categories
- Chromatography (HPLC, etc.)
- Purification? (Recrystallisation, etc.)
- Reaction (RInChI)
- Plant extraction?
- etc. to be worked on!
- Properties
- Category (name…)
- Conditions
- Initial Sample
- Final Sample
- Chem Equation (for reactions only)
- Analysis provides information about a Sample
- Assignment combines one (or a set of) Analysis with one (or more) properties of the other 5 object types
- IR
- MS? categories…
- etc.
- color code for assignmentNMRspectra:
- No spectral analysis / orange (no peak picking, no extraction of chemical shifts…)
- OK / green (has assignmentNMRdata)
- color code for assignmentNMRdata:
- No assignment / orange (has a list of peaks, otherwise it would not exist, but no assignment of peaks to parts of the substance)
- OK / green (has assignment of all reasonably assignable peaks)
- Validation / gold (has validation)
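The color codes listed above amount to small decision rules, which can be sketched as functions. The function names, boolean parameters, and the order in which conditions are checked are assumptions made for illustration; in particular, the sketch assumes that a validated assignment is rated gold regardless of the other flag.

```python
def substance_color(has_identifier: bool, image_only: bool) -> str:
    """Color code for a Substance (assumed rule order)."""
    if has_identifier:  # has an InChI, InChIKey, ...
        return "green"
    if image_only:      # structure must be redrawn from an image
        return "red"
    return "orange"     # ambiguous, requires curation


def sample_color(has_description: bool) -> str:
    """Color code for a Sample."""
    return "green" if has_description else "red"


def nmr_spectra_color(has_assignment_nmr_data: bool) -> str:
    """Color code for assignmentNMRspectra."""
    return "green" if has_assignment_nmr_data else "orange"


def nmr_data_color(has_assignment: bool, validated: bool) -> str:
    """Color code for assignmentNMRdata."""
    if validated:
        return "gold"
    return "green" if has_assignment else "orange"


# Hypothetical usage on a few cases.
print(substance_color(has_identifier=False, image_only=True))
print(nmr_data_color(has_assignment=True, validated=False))
```

Encoding the rules this way makes the implicit priorities explicit, which is useful when an object matches more than one description.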
Goal 1: Provide a linked-data consolidation of the CHEMeDATA format
Move from the NMR-only assignment format of the NMReDATA initiative (based on .sdf files) towards linked data.
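To give a feel for what a linked-data representation could look like, the sketch below serializes a Substance and a Sample as JSON-LD using only the standard library. The namespace `https://example.org/chemedata#`, the term names, and the `urn:example:` identifiers are placeholders invented for this example; they are not the actual CHEMeDATA vocabulary.

```python
import json

# Placeholder context: maps short terms to IRIs in a hypothetical namespace.
context = {
    "cd": "https://example.org/chemedata#",
    "Substance": "cd:Substance",
    "Sample": "cd:Sample",
    "formula": "cd:formula",
    # "includes" points at other nodes, so its values are treated as IRIs.
    "includes": {"@id": "cd:includes", "@type": "@id"},
}

doc = {
    "@context": context,
    "@graph": [
        {
            "@id": "urn:example:substance-1",
            "@type": "Substance",
            "formula": "C2H6O",
        },
        {
            "@id": "urn:example:sample-1",
            "@type": "Sample",
            "includes": ["urn:example:substance-1"],
        },
    ],
}

serialized = json.dumps(doc, indent=2)
print(serialized)
```

Unlike a record embedded in an .sdf file, each object here has its own identifier, so a Sample can refer to a Substance described elsewhere.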
Goal 2: Provide linked data to identify chemistry objects in archive files
Exported OWLDoc can be found in the chemedata/playground folder.
Diverging implementation
