Dataset

PropMolFlow_QM9_CNOFH_2025





Name PropMolFlow_QM9_CNOFH_2025
Extended ID PropMolFlow_QM9_CNOFH_2025__Zeng-Jin-Karypis-Transtrum-Tadmor-Hennig-Roitberg-Martiniani-Liu__DS_6qqf55wad1mv_0
Description This DFT dataset was curated in response to the growing interest in property-guided molecule generation with generative AI models. The properties of generated molecules are typically evaluated with machine learning (ML) property predictors trained on fully relaxed datasets. Because generated molecules may deviate significantly from relaxed structures, these predictors can be highly unreliable for assessing their quality. This dataset provides DFT-evaluated properties, energies, and forces for generated molecules. The structures are unrelaxed and can serve as a validation set for ML property predictors used in conditional molecule generation. The dataset includes 10,773 molecules generated using PropMolFlow, a state-of-the-art conditional molecule generation model. PropMolFlow employs a flow matching process parameterized with an SE(3)-equivariant graph neural network, and its models are trained on the QM9 dataset. Molecules are generated by conditioning on six properties (polarizability, HOMO-LUMO gap, HOMO, LUMO, dipole moment, and heat capacity at room temperature, 298 K) across two tasks: in-distribution and out-of-distribution generation. Full details are available in the corresponding paper.
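The validation use case described above amounts to comparing an ML predictor's output against the DFT reference values for the same unrelaxed structures. A minimal sketch of that comparison follows, using mean absolute error as the metric. The function name and the numbers are hypothetical placeholders and are not taken from the dataset or the paper; the choice of MAE is also an assumption.

```python
from statistics import fmean


def mean_absolute_error(predicted, reference):
    """Mean absolute error between predicted values and reference values."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one value is required")
    return fmean(abs(p - r) for p, r in zip(predicted, reference))


# Hypothetical placeholder values, NOT taken from the dataset.
# In practice, dft_values would come from the DFT labels of the generated
# molecules and ml_values from the property predictor under test.
dft_values = [-40.5, -56.6, -76.4]
ml_values = [-40.3, -56.9, -76.4]

mae = mean_absolute_error(ml_values, dft_values)
print(f"MAE: {mae:.3f}")
```

A large error on these unrelaxed structures, relative to the predictor's error on relaxed QM9 structures, would indicate that the predictor is unreliable for scoring generated molecules.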
Authors Cheng Zeng
Jirui Jin
George Karypis
Mark Transtrum
Ellad B. Tadmor
Richard G. Hennig
Adrian Roitberg
Stefano Martiniani
Mingjie Liu
DOI None

Cite as: Zeng, C., Jin, J., Karypis, G., Transtrum, M., Tadmor, E. B., Hennig, R. G., Roitberg, A., Martiniani, S., and Liu, M. "PropMolFlow QM9 CNOFH 2025." ColabFit, 2025.
Calculated Property Types atomic_forces
energy
Elements
C (34.87%)
F (0.18%)
H (54.17%)
N (4.48%)
O (6.3%)
Number of Configurations 10,773
Number of Atoms 205,304
Publication Link https://arxiv.org/abs/2505.21469
ColabFit ID DS_6qqf55wad1mv_0
Downloads 204
Files colabfitspec.json
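
The counts listed above can be cross-checked with a little arithmetic. The sketch below derives the average number of atoms per configuration and approximate per-element atom counts. It assumes the element percentages are shares of the total atom count, which the page does not state explicitly; the variable names are illustrative only.

```python
# Values copied from the metadata above.
n_configurations = 10_773
n_atoms = 205_304
element_percent = {"C": 34.87, "F": 0.18, "H": 54.17, "N": 4.48, "O": 6.3}

# Average molecule size, hydrogens included.
atoms_per_configuration = n_atoms / n_configurations

# The rounded percentages should add up to roughly 100.
percent_total = sum(element_percent.values())

# Approximate atom count per element, ASSUMING the percentages are
# fractions of all atoms in the dataset (rounding makes these inexact).
approx_atom_counts = {
    element: round(n_atoms * percent / 100)
    for element, percent in element_percent.items()
}

print(f"atoms per configuration: {atoms_per_configuration:.2f}")
print(f"sum of element percentages: {percent_total:.2f}")
for element, count in approx_atom_counts.items():
    print(f"{element}: ~{count} atoms")
```

Because the published percentages are rounded to two decimals, the per-element counts are estimates and their sum may differ from the total by a few atoms.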

No uploaded content is transferred in ownership from the original creators to ColabFit. All content is distributed under the license specified by its contributor who has stated that he or she has the authority to share it under the specified license.