Dataset
OMol25_validation
Download Original Data Files
21.3 GB
Download Dataset Parquet Files
14.2 GB
Species content of dataset
Name :
OMol25_validation
ColabFit ID :
Files :
Description :
The validation set from OMol25. From the dataset creator: OMol25 represents the largest high quality molecular DFT dataset spanning biomolecules, metal complexes, electrolytes, and community datasets. OMol25 was generated at the ωB97M-V/def2-TZVPD level of theory.
Authors :
Daniel S. Levine, Muhammed Shuaibi, Evan Walter Clark Spotte-Smith, Michael G. Taylor, Muhammad R. Hasyim, Kyle Michel, Ilyes Batatia, Gábor Csányi, Misko Dzamba, Peter Eastman, Nathan C. Frey, Xiang Fu, Vahe Gharakhanyan, Aditi S. Krishnapriyan, Joshua A. Rackers, Sanjeev Raja, Ammar Rizvi, Andrew S. Rosen, Zachary Ulissi, Santiago Vargas, C. Lawrence Zitnick, Samuel M. Blau, Brandon M. Wood
DOI :
10.60732/8baea040
https://commons.datacite.org/doi.org/10.60732/8baea040
https://doi.datacite.org/dois/10.60732%2F8baea040
https://doi.org/10.60732/8baea040
Cite as: Levine, D. S., Shuaibi, M., Spotte-Smith, E. W. C., Taylor, M. G., Hasyim, M. R., Michel, K., Batatia, I., Csányi, G., Dzamba, M., Eastman, P., Frey, N. C., Fu, X., Gharakhanyan, V., Krishnapriyan, A. S., Rackers, J. A., Raja, S., Rizvi, A., Rosen, A. S., Ulissi, Z., Vargas, S., Zitnick, C. L., Blau, S. M., and Wood, B. M. "OMol25 validation." ColabFit, 2025. https://doi.org/10.60732/8baea040.
For other citation formats, see the DataCite Fabrica page for this dataset.
Num. Configurations :
2,762,021
Num. Atoms :
283,298,012
Downloads :
0
Calculated Property Types :
atomic_forces
energy
Elements :
Ag (0.01%)
Al (0.02%)
Ar (0.0%)
As (0.02%)
Au (0.0%)
B (0.16%)
Ba (0.0%)
Be (0.01%)
Bi (0.0%)
Br (0.53%)
C (29.41%)
Ca (0.01%)
Cd (0.0%)
Ce (0.0%)
Cl (1.16%)
Co (0.0%)
Cr (0.01%)
Cs (0.01%)
Cu (0.01%)
Dy (0.0%)
Er (0.0%)
Eu (0.0%)
F (1.63%)
Fe (0.0%)
Ga (0.01%)
Gd (0.0%)
Ge (0.0%)
H (50.43%)
He (0.0%)
Hf (0.0%)
Hg (0.0%)
Ho (0.0%)
I (0.11%)
In (0.01%)
Ir (0.0%)
K (0.01%)
Kr (0.0%)
La (0.0%)
Li (0.02%)
Lu (0.0%)
Mg (0.02%)
Mn (0.0%)
Mo (0.0%)
N (5.14%)
Na (0.01%)
Nb (0.01%)
Nd (0.0%)
Ne (0.0%)
Ni (0.0%)
O (9.78%)
Os (0.0%)
P (0.49%)
Pb (0.01%)
Pd (0.01%)
Pm (0.0%)
Pr (0.0%)
Pt (0.01%)
Rb (0.0%)
Re (0.0%)
Rh (0.0%)
Ru (0.0%)
S (0.77%)
Sb (0.01%)
Sc (0.0%)
Se (0.01%)
Si (0.07%)
Sm (0.0%)
Sn (0.01%)
Sr (0.0%)
Ta (0.0%)
Tb (0.0%)
Tc (0.0%)
Te (0.0%)
Ti (0.0%)
Tl (0.0%)
Tm (0.0%)
V (0.01%)
W (0.0%)
Xe (0.0%)
Y (0.0%)
Yb (0.0%)
Zn (0.0%)
Zr (0.0%)
Methods :
DFT-ωB97M-V
Software :
ORCA
Name: OMol25_validation
Extended ID: OMol25_validation__Levine-Shuaibi-Spotte-Smith-Taylor-Hasyim-Michel-Batatia-Csanyi-Dzamba-Eastman-Frey-Fu-Gharakhanyan-Krishnapriyan-Rackers-Raja-Rizvi-Rosen-Ulissi-Vargas-Zitnick-Blau-Wood__DS_lcjsp7ctc1hy_0
Description: The validation set from OMol25. From the dataset creator: OMol25 represents the largest high quality molecular DFT dataset spanning biomolecules, metal complexes, electrolytes, and community datasets. OMol25 was generated at the ωB97M-V/def2-TZVPD level of theory.
Authors:
Daniel S. Levine
Muhammed Shuaibi
Evan Walter Clark Spotte-Smith
Michael G. Taylor
Muhammad R. Hasyim
Kyle Michel
Ilyes Batatia
Gábor Csányi
Misko Dzamba
Peter Eastman
Nathan C. Frey
Xiang Fu
Vahe Gharakhanyan
Aditi S. Krishnapriyan
Joshua A. Rackers
Sanjeev Raja
Ammar Rizvi
Andrew S. Rosen
Zachary Ulissi
Santiago Vargas
C. Lawrence Zitnick
Samuel M. Blau
Brandon M. Wood
DOI: 10.60732/8baea040
Number of Configurations: 2,762,021
Number of Atoms: 283,298,012
Publication Link: https://doi.org/10.48550/arXiv.2505.08762
Data Source Link: https://huggingface.co/facebook/OMol25
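The totals and element percentages listed on this page can be combined into a few rough summary figures. The sketch below is illustrative only: it assumes each percentage is that element's share of all atoms in the dataset (the page does not say so explicitly, although the listed values sum to roughly 100%), it copies only the six most abundant elements, and the names `parse_element_line`, `mean_atoms_per_configuration`, and `approx_atom_counts` are hypothetical helpers, not part of any ColabFit tooling.

```python
import re

# Totals as listed on this page.
NUM_CONFIGURATIONS = 2_762_021
NUM_ATOMS = 283_298_012

# The six most abundant entries, copied from the Elements list
# in the page's "Symbol (percent%)" form.
ELEMENT_LINES = [
    "H (50.43%)",
    "C (29.41%)",
    "O (9.78%)",
    "N (5.14%)",
    "F (1.63%)",
    "Cl (1.16%)",
]

ELEMENT_PATTERN = re.compile(r"([A-Z][a-z]?) \((\d+(?:\.\d+)?)%\)")


def parse_element_line(line):
    """Split a line such as 'Cl (1.16%)' into ('Cl', 1.16)."""
    match = ELEMENT_PATTERN.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognized element line: {line!r}")
    return match.group(1), float(match.group(2))


# Percentage of atoms per element (ASSUMPTION: share of all atoms).
percentages = dict(parse_element_line(line) for line in ELEMENT_LINES)

# Average system size: total atoms divided by total configurations.
mean_atoms_per_configuration = NUM_ATOMS / NUM_CONFIGURATIONS

# Approximate atom counts; the published percentages are rounded to two
# decimals, so these are estimates rather than exact counts.
approx_atom_counts = {
    symbol: round(NUM_ATOMS * pct / 100.0)
    for symbol, pct in percentages.items()
}

if __name__ == "__main__":
    print(f"mean atoms per configuration: {mean_atoms_per_configuration:.1f}")
    for symbol, count in approx_atom_counts.items():
        print(f"{symbol}: ~{count:,} atoms")
```

Under these assumptions the average configuration holds a little over 100 atoms, and hydrogen and carbon together account for about 80% of all atoms.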
No uploaded content is transferred in ownership from the original creators to ColabFit. All content is distributed under the license specified by its contributor who has stated that he or she has the authority to share it under the specified license.