Dataset

Alex_MP-20_train




Name Alex_MP-20_train
Extended ID Alex_MP-20_train__Zeni-Pinsler-Zugner-Fowler-Horton-Fu-Wang-Shysheya-Crabbe-Ueda-Sordillo-Sun-Smith-Nguyen-Schulz-Lewis-Huang-Lu-Zhou-Yang-Hao-Li-Yang-Li-Tomioka-Xie__DS_uluw9723f2n4_0
Description The train split of the Alex_MP-20 dataset, which contains structures from the Alexandria (Schmidt et al. 2022) and MP-20 (Materials Project 2020) datasets. The data were modified as follows: (1) structures containing Tc, Pm, or any element with atomic number 84 or higher were excluded; (2) structures were relaxed with DFT using a PBE functional so that energies are consistent; (3) for the training set, structures with more than 20 atoms in the unit cell were removed; (4) for the training set, structures with an energy above the hull greater than 0.1 eV/atom were removed.
Authors Claudio Zeni
Robert Pinsler
Daniel Zügner
Andrew Fowler
Matthew Horton
Xiang Fu
Zilong Wang
Aliaksandra Shysheya
Jonathan Crabbé
Shoko Ueda
Roberto Sordillo
Lixin Sun
Jake Smith
Bichlien Nguyen
Hannes Schulz
Sarah Lewis
Chin-Wei Huang
Ziheng Lu
Yichi Zhou
Han Yang
Hongxia Hao
Jielan Li
Chunlei Yang
Wenjie Li
Ryota Tomioka
Tian Xie
DOI None

Cite as: Zeni, C., Pinsler, R., Zügner, D., Fowler, A., Horton, M., Fu, X., Wang, Z., Shysheya, A., Crabbé, J., Ueda, S., Sordillo, R., Sun, L., Smith, J., Nguyen, B., Schulz, H., Lewis, S., Huang, C., Lu, Z., Zhou, Y., Yang, H., Hao, H., Li, J., Yang, C., Li, W., Tomioka, R., and Xie, T. "Alex MP-20 train." ColabFit, 2025.
Calculated Property Types electronic_band_gap
energy_above_hull
Elements
Ag (1.84%)
Al (1.41%)
As (0.89%)
Au (1.85%)
B (0.39%)
Ba (1.35%)
Be (0.36%)
Bi (0.96%)
Br (2.34%)
C (0.36%)
Ca (1.18%)
Cd (1.79%)
Ce (1.06%)
Cl (2.75%)
Co (0.8%)
Cr (0.3%)
Cs (1.17%)
Cu (1.62%)
Dy (1.7%)
Er (1.6%)
Eu (0.17%)
F (3.0%)
Fe (0.59%)
Ga (1.98%)
Gd (0.04%)
Ge (0.99%)
H (1.08%)
Hf (0.47%)
Hg (1.9%)
Ho (1.68%)
I (1.81%)
In (1.93%)
Ir (0.86%)
K (1.37%)
La (2.0%)
Li (2.01%)
Lu (0.24%)
Mg (1.27%)
Mn (0.61%)
Mo (0.24%)
N (0.83%)
Na (1.55%)
Nb (0.32%)
Nd (1.9%)
Ni (1.41%)
O (6.11%)
Os (0.3%)
P (0.77%)
Pb (1.2%)
Pd (2.1%)
Pr (1.93%)
Pt (1.41%)
Rb (1.31%)
Re (0.14%)
Rh (1.45%)
Ru (0.64%)
S (3.01%)
Sb (0.88%)
Sc (1.31%)
Se (2.96%)
Si (0.92%)
Sm (1.54%)
Sn (1.39%)
Sr (1.16%)
Ta (0.3%)
Tb (1.77%)
Te (2.19%)
Ti (0.59%)
Tl (2.23%)
Tm (1.61%)
V (0.42%)
W (0.17%)
Y (1.61%)
Yb (0.16%)
Zn (1.79%)
Zr (0.65%)
Number of Configurations 540,162
Number of Atoms 5,184,565
Publication Link https://doi.org/10.1038/s41586-025-08628-5
Data Source Link https://github.com/microsoft/mattergen
ColabFit ID DS_uluw9723f2n4_0
Downloads 4
Files colabfitspec.json

No uploaded content is transferred in ownership from the original creators to ColabFit. All content is distributed under the license specified by its contributor who has stated that he or she has the authority to share it under the specified license.