graphdot.dataset.qm7 module

graphdot.dataset.qm7.QM7(download_url='http://quantum-machine.org/data/qm7.mat', local_filename='qm7.mat', overwrite=False, ase=False)

A 7165-molecule subset of the GDB-13 dataset. Each molecule has at most 23 atoms in total, of which at most 7 are heavy (non-hydrogen) atoms. Atomization energies are computed at the Perdew-Burke-Ernzerhof hybrid functional (PBE0) level.

References:
  • L. C. Blum, J.-L. Reymond. 970 Million Druglike Small Molecules for Virtual Screening in the Chemical Universe Database GDB-13. J. Am. Chem. Soc., 131:8732, 2009.
  • M. Rupp, A. Tkatchenko, K.-R. Müller, O. A. von Lilienfeld. Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning. Physical Review Letters, 108(5):058301, 2012.

Parameters:
  • download_url (str) – URL to download the qm7.mat data file.
  • local_filename (str) – Name for local storage of the data file.
  • overwrite (bool) – Whether to overwrite the local file if one already exists.
  • ase (bool) – Whether to create ASE Atoms objects from the dataset.
Returns:
  qm7 – A dataframe containing the data from QM7.
Return type:
  DataFrame
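
Example:

The sketch below illustrates how the documented parameters might fit together. It is not graphdot's implementation. fetch_if_needed is a hypothetical helper showing one plausible reading of overwrite (download only when the local file is missing or overwrite is set), and load_qm7 is a hypothetical thin wrapper that forwards the documented keyword arguments to QM7. The wrapper assumes graphdot is installed and that the download URL is reachable.

```python
import os
import urllib.request


def fetch_if_needed(download_url, local_filename, overwrite=False,
                    retrieve=urllib.request.urlretrieve):
    """Hypothetical helper, not part of graphdot.

    Downloads download_url to local_filename only when the file is absent
    or overwrite is True. Returns True if a download was performed and
    False if the existing local file was reused.
    """
    if overwrite or not os.path.exists(local_filename):
        retrieve(download_url, local_filename)
        return True
    return False


def load_qm7(local_filename="qm7.mat", overwrite=False, ase=False):
    """Hypothetical wrapper that calls QM7 with its documented keywords."""
    # Imported lazily so this sketch can be read and loaded without graphdot.
    from graphdot.dataset.qm7 import QM7
    return QM7(local_filename=local_filename, overwrite=overwrite, ase=ase)
```

Calling load_qm7(ase=True) would then return the dataframe described under Returns, with ASE Atoms objects created from the dataset. Passing overwrite=True should replace an existing local copy of qm7.mat.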