CHEMeDATA Initiative

Main website of the CHEMeDATA umbrella organization

How to generate good chemical data?

In most cases supplementary data are combined into .zip files.

Requirement    Type of data
Must do        Save the data as generated by the instruments or by the software used to process them.
Optional       Add .pdf files generated by the software.
Good idea      Add a file in an open format (e.g. .mol when saving a chemical structure).
To be avoided  Providing printed spectra only. This forces you to scan the spectra as PDFs or images, which makes them extremely difficult to recognize as spectra.

Scans of hard copies should be provided only when no other data are available. In some favorable cases, data can be extracted from .pdf files generated by software, but this is difficult to generalize, and important metadata and parameters will certainly be lost.

If a compound/sample was characterized using several methods (IR, NMR, etc.), keep all these data in the same compound/sample folder so that the link between structure and spectra is obvious.

Schematically:

Don’t group spectra by method (IR, NMR, etc.).

┬method1┬compound1
│       ├compound2
│       └compound3
└method2┬compound1
        ├compound2
        └compound4

Prefer to group them by compound/sample …

… and include a structure file (.cdx, .mol, .sdf, etc.):

┬compound1┬method1
│         ├method2 
│         └structure.mol (provide the structure file!)
├compound2┬method1
│         ├method2 
│         └structure.mol (provide the structure file!)
└compound3┬method1
          ├method2 
          └structure.mol (provide the structure file!)
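
If existing data are already grouped by method, the compound-first layout shown above can be produced with a short script. The following is only a minimal sketch: the function name regroup_by_compound and the folder names used in the usage note are hypothetical, and it assumes a two-level method/compound layout on disk. Structure files still have to be added to each compound folder by hand.

```python
import shutil
from pathlib import Path


def regroup_by_compound(source_root: Path, target_root: Path) -> None:
    """Copy a method/compound folder tree into a compound/method tree.

    Assumes source_root contains one folder per method, each holding
    one folder per compound (hypothetical layout for this sketch).
    """
    for method_dir in sorted(p for p in source_root.iterdir() if p.is_dir()):
        for compound_dir in sorted(p for p in method_dir.iterdir() if p.is_dir()):
            destination = target_root / compound_dir.name / method_dir.name
            # Copy rather than move, so the original instrument data stay untouched.
            shutil.copytree(compound_dir, destination)
```

For example, regroup_by_compound(Path("by_method"), Path("by_compound")) would copy by_method/method1/compound1 to by_compound/compound1/method1.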

The number following “compound” should correspond to the compound number in the publication/thesis. This is better than naming folders after compound names.
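
Before zipping the supplementary data, the two recommendations above (numbered compound folders, one structure file per folder) can be checked automatically. This is a hedged sketch, not an official CHEMeDATA tool: the function name check_layout, the folder-name pattern and the list of accepted structure-file extensions are all assumptions made for illustration and should be adapted to your own data.

```python
import re
from pathlib import Path

# Assumed list of structure-file extensions; extend it as needed.
STRUCTURE_EXTENSIONS = {".cdx", ".mol", ".sdf"}
# Assumed naming pattern: "compound" followed by the number used in the publication.
FOLDER_PATTERN = re.compile(r"compound\d+")


def check_layout(root: Path) -> list[str]:
    """Return a list of human-readable problems found directly under root."""
    problems = []
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        if not FOLDER_PATTERN.fullmatch(folder.name):
            problems.append(f"{folder.name}: name is not 'compound<number>'")
        # A structure file must sit directly inside the compound folder.
        has_structure = any(
            f.is_file() and f.suffix.lower() in STRUCTURE_EXTENSIONS
            for f in folder.iterdir()
        )
        if not has_structure:
            problems.append(f"{folder.name}: no structure file found")
    return problems
```

An empty list means that every folder under root follows the assumed convention.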

Demo of inline molecules…
