Dataset

rQM9




Name :
rQM9
ColabFit ID :
Description :
133,885 molecular structures from QM9 with revised bonds and charges, in SDF format. Bond information is stored in the metadata column of the parquet files: a map in which the key bonds holds the bond indices as they appear in the final rows of an SDF molecule block. Any additional charges are stored under the key charge_info. rQM9 is derived from DeepChem's QM9 SDF dataset and corrects the original dataset's net-charge discrepancies and invalid bond orders by enforcing correct valency-charge configurations. Nevertheless, a subset of molecules remains problematic: they either fail RDKit sanitization or fragment into multiple components. The zero-based indices of these unresolved molecules are provided in a NumPy file in the original data file.
Authors :
Cheng Zeng, Jirui Jin, George Karypis, Mark Transtrum, Ellad B. Tadmor, Richard G. Hennig, Adrian Roitberg, Stefano Martiniani, Mingjie Liu
DOI :
None
Cite as: Zeng, C., Jin, J., Karypis, G., Transtrum, M., Tadmor, E. B., Hennig, R. G., Roitberg, A., Martiniani, S., and Liu, M. "rQM9." ColabFit, 2025.
Num. Configurations :
133,885
Num. Atoms :
2,407,753
Downloads :
26
Calculated Property Types :
Not specified
Elements :
C (35.16%), F (0.14%), H (51.09%), N (5.8%), O (7.81%)
Methods :
DFT-B3LYP
Software :
Gaussian 09
Configuration Sets by Name :
Configuration Sets by ID :
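
The description above says that bonds and extra charges are stored in each row's metadata map under the keys bonds and charge_info, and that the zero-based indices of the unresolved molecules are listed separately. The sketch below shows one way that information might be consumed. It rests on several assumptions that the description does not confirm: that the metadata value is a dict or a JSON string, that each entry under bonds is an [atom_i, atom_j, order] triple using the 1-based atom numbering of an SDF bond block, and that the unresolved indices refer to row positions in dataset order. The helper names and example rows are hypothetical. In practice the rows would probably come from the parquet files (for example via pandas.read_parquet) and the indices from the NumPy file (via numpy.load); the sketch uses only the standard library so that it runs on its own.

```python
import json


def _as_map(metadata):
    """Accept a dict or a JSON string (assumed encodings of the metadata map)."""
    return json.loads(metadata) if isinstance(metadata, str) else metadata


def bonds_from_metadata(metadata):
    """Return (atom_i, atom_j, order) tuples read from the "bonds" key.

    Assumption: each bond entry is a triple laid out like a row of an
    SDF bond block, with 1-based atom numbering.
    """
    return [tuple(int(v) for v in bond[:3]) for bond in _as_map(metadata).get("bonds", [])]


def charges_from_metadata(metadata):
    """Return the "charge_info" value, or None when the key is absent."""
    return _as_map(metadata).get("charge_info")


def drop_unresolved(rows, unresolved_indices):
    """Keep rows whose zero-based position is not listed as unresolved."""
    skip = {int(i) for i in unresolved_indices}
    return [row for i, row in enumerate(rows) if i not in skip]


# Hypothetical rows standing in for records read from the parquet files.
rows = [
    {"metadata": json.dumps({"bonds": [[1, 2, 1], [1, 3, 1]]})},
    {"metadata": json.dumps({"bonds": [[1, 2, 2]], "charge_info": {"1": 1, "2": -1}})},
    {"metadata": {"bonds": [[1, 2, 3]]}},
]
unresolved = [1]  # hypothetical stand-in for the indices in the NumPy file

clean = drop_unresolved(rows, unresolved)
for row in clean:
    print(bonds_from_metadata(row["metadata"]), charges_from_metadata(row["metadata"]))
```

Filtering by position only works if the rows are kept in the same order as the source the indices were computed against, so the filter should be applied before any shuffling or re-sorting.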

No uploaded content is transferred in ownership from the original creators to ColabFit. All content is distributed under the license specified by its contributor who has stated that he or she has the authority to share it under the specified license.