Dataset
OMol25_test
Download Original Data Files
8.3 GB
Species content of dataset
Name :
OMol25_test
ColabFit ID :
Files :
Description :
The test set of OMol25. OMol25 (Open Molecules 2025) is a large dataset of structures with up to 350 atoms, calculated at a high level of density functional theory (ωB97M-V/def2-TZVPD). This dataset is intended to provide a broad sampling of chemical complexity and structural diversity. OMol25 includes biomolecules, metal complexes, electrolytes, and community datasets that have been recalculated at this higher level of theory. Included community datasets are: ANI-2X, Transition-1X, ANI-1xBB, OrbNet Denali, SPICE2, and Solvated Protein Fragments. OMol25 also includes 30% of the GEOM dataset; these systems were optimized, and a fraction of them had their initial positions randomly perturbed.
Authors :
Daniel S. Levine, Muhammed Shuaibi, Evan Walter Clark Spotte-Smith, Michael G. Taylor, Muhammad R. Hasyim, Kyle Michel, Ilyes Batatia, Gábor Csányi, Misko Dzamba, Peter Eastman, Nathan C. Frey, Xiang Fu, Vahe Gharakhanyan, Aditi S. Krishnapriyan, Joshua A. Rackers, Sanjeev Raja, Ammar Rizvi, Andrew S. Rosen, Zachary Ulissi, Santiago Vargas, C. Lawrence Zitnick, Samuel M. Blau, Brandon M. Wood
DOI :
None
Cite as: Levine, D. S., Shuaibi, M., Spotte-Smith, E. W. C., Taylor, M. G., Hasyim, M. R., Michel, K., Batatia, I., Csányi, G., Dzamba, M., Eastman, P., Frey, N. C., Fu, X., Gharakhanyan, V., Krishnapriyan, A. S., Rackers, J. A., Raja, S., Rizvi, A., Rosen, A. S., Ulissi, Z., Vargas, S., Zitnick, C. L., Blau, S. M., and Wood, B. M. "OMol25 test." ColabFit, 2025.
For other citation formats, see the DataCite Fabrica page for this dataset.
Num. Configurations :
2,766,167
Num. Atoms :
342,021,649
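As a quick sanity check on the two totals above, the mean structure size can be computed directly. This is only a sketch of the arithmetic; the variable names are illustrative and not part of any ColabFit or OMol25 API.

```python
# Totals copied from the dataset summary above.
num_configurations = 2_766_167
num_atoms = 342_021_649

# Mean number of atoms per configuration (structure).
mean_atoms_per_configuration = num_atoms / num_configurations
print(f"Mean atoms per configuration: {mean_atoms_per_configuration:.1f}")
```

The result is roughly 124 atoms per configuration, which is consistent with the stated upper bound of 350 atoms per structure.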
Downloads :
0
Calculated Property Types :
No ColabFit properties included
Elements :
Ag (0.01%)
Al (0.01%)
Ar (0.0%)
As (0.01%)
Au (0.0%)
B (0.09%)
Ba (0.0%)
Be (0.01%)
Bi (0.0%)
Br (0.29%)
C (29.71%)
Ca (0.01%)
Cd (0.0%)
Ce (0.0%)
Cl (0.7%)
Co (0.0%)
Cr (0.0%)
Cs (0.01%)
Cu (0.01%)
Dy (0.0%)
Er (0.0%)
Eu (0.0%)
F (1.17%)
Fe (0.0%)
Ga (0.0%)
Gd (0.0%)
Ge (0.0%)
H (50.47%)
He (0.0%)
Hf (0.0%)
Hg (0.0%)
Ho (0.0%)
I (0.06%)
In (0.01%)
Ir (0.0%)
K (0.01%)
Kr (0.0%)
La (0.0%)
Li (0.01%)
Lu (0.0%)
Mg (0.02%)
Mn (0.0%)
Mo (0.0%)
N (6.29%)
Na (0.01%)
Nb (0.0%)
Nd (0.0%)
Ne (0.0%)
Ni (0.0%)
O (9.93%)
Os (0.0%)
P (0.35%)
Pb (0.0%)
Pd (0.0%)
Pm (0.0%)
Pr (0.0%)
Pt (0.0%)
Rb (0.0%)
Re (0.0%)
Rh (0.0%)
Ru (0.0%)
S (0.67%)
Sb (0.0%)
Sc (0.0%)
Se (0.01%)
Si (0.03%)
Sm (0.0%)
Sn (0.0%)
Sr (0.0%)
Ta (0.0%)
Tb (0.0%)
Tc (0.0%)
Te (0.0%)
Ti (0.0%)
Tl (0.0%)
Tm (0.0%)
V (0.01%)
W (0.0%)
Xe (0.0%)
Y (0.0%)
Yb (0.0%)
Zn (0.0%)
Zr (0.0%)
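The element entries above follow a regular "Symbol (percent%)" pattern, so they can be parsed mechanically. The sketch below assumes the percentages are fractions of the total atom count (the page does not say so explicitly) and uses only four of the listed entries; the helper name `parse_element_line` is hypothetical. Because the percentages are rounded to two decimals, the resulting counts are estimates only.

```python
import re

# A few entries copied from the Elements list; the remaining lines parse the same way.
element_lines = ["H (50.47%)", "C (29.71%)", "O (9.93%)", "N (6.29%)"]
total_atoms = 342_021_649  # "Num. Atoms" from the dataset summary

# Element symbol, a space, then a percentage in parentheses.
ELEMENT_PATTERN = re.compile(r"^([A-Z][a-z]?) \(([\d.]+)%\)$")


def parse_element_line(line):
    """Return (symbol, percent) for a line such as 'C (29.71%)'."""
    match = ELEMENT_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"Unrecognized element line: {line!r}")
    return match.group(1), float(match.group(2))


percentages = dict(parse_element_line(line) for line in element_lines)

# Assumption: each percentage is a share of all atoms in the dataset.
estimated_counts = {
    symbol: round(total_atoms * percent / 100)
    for symbol, percent in percentages.items()
}

for symbol, count in estimated_counts.items():
    print(f"{symbol}: ~{count:,} atoms")
```

Entries shown as 0.0% are below the rounding threshold rather than absent, so this approach cannot estimate counts for those elements.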
Methods :
DFT-ωB97M-V
Software :
ORCA 6.0.0
Publication Link :
Data Source Link :
Configuration Sets by Name :
Configuration Sets by ID :
Name: OMol25_test
Extended ID: OMol25_test__Levine-Shuaibi-Spotte-Smith-Taylor-Hasyim-Michel-Batatia-Csanyi-Dzamba-Eastman-Frey-Fu-Gharakhanyan-Krishnapriyan-Rackers-Raja-Rizvi-Rosen-Ulissi-Vargas-Zitnick-Blau-Wood__DS_7s21i0qyoj7p_0
Publication Link: https://arxiv.org/abs/2505.08762
Data Source Link: https://huggingface.co/facebook/OMol25
No uploaded content is transferred in ownership from the original creators to ColabFit. All content is distributed under the license specified by its contributor who has stated that he or she has the authority to share it under the specified license.