Dataset
Alex_MP-20_test
Name :
Alex_MP-20_test
Description :
The test split of the Alex_MP-20 dataset, which combines structures from the Alexandria (Schmidt et al. 2022) and MP-20 (Materials Project 2020) datasets. The source data were modified as follows: (1) structures containing Tc, Pm, or any element with atomic number 84 or higher were excluded; (2) structures were relaxed with DFT using the PBE functional so that energies are consistent; (3) for the training set, structures with more than 20 atoms in the unit cell were removed; (4) for the training set, structures with energy above the hull greater than 0.1 eV/atom were removed.
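The element and training-set filters in the description can be sketched as simple predicates. This is a minimal illustration, not the authors' pipeline: the `Structure` record, its field names, and the function names are hypothetical, and the DFT relaxation step is not represented. The only values taken from the description are the excluded elements (Tc, Pm, and atomic number 84 or higher), the 20-atom limit, and the 0.1 eV/atom threshold.

```python
from dataclasses import dataclass
from typing import List

# Atomic numbers of Tc (43) and Pm (61), the two elements excluded by name.
EXCLUDED_Z = {43, 61}
# Elements with atomic number 84 or higher are excluded.
FIRST_EXCLUDED_HEAVY_Z = 84
# Training-set limits quoted in the description.
MAX_ATOMS_PER_CELL = 20
MAX_ENERGY_ABOVE_HULL = 0.1  # eV/atom


@dataclass
class Structure:
    """Hypothetical record: one atomic number per atom in the unit cell."""
    atomic_numbers: List[int]
    energy_above_hull: float  # eV/atom


def has_allowed_elements(s: Structure) -> bool:
    """True if the structure contains no Tc, no Pm, and no element with Z >= 84."""
    return all(
        z not in EXCLUDED_Z and z < FIRST_EXCLUDED_HEAVY_Z
        for z in s.atomic_numbers
    )


def passes_training_filters(s: Structure) -> bool:
    """Element filter plus the two limits the description applies to the training set."""
    return (
        has_allowed_elements(s)
        and len(s.atomic_numbers) <= MAX_ATOMS_PER_CELL
        and s.energy_above_hull <= MAX_ENERGY_ABOVE_HULL
    )
```

For example, `passes_training_filters(Structure([11, 17], 0.05))` accepts a two-atom NaCl-like cell, while a cell containing atomic number 43 (Tc) is rejected by `has_allowed_elements`.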
Authors :
Claudio Zeni, Robert Pinsler, Daniel Zügner, Andrew Fowler, Matthew Horton, Xiang Fu, Zilong Wang, Aliaksandra Shysheya, Jonathan Crabbé, Shoko Ueda, Roberto Sordillo, Lixin Sun, Jake Smith, Bichlien Nguyen, Hannes Schulz, Sarah Lewis, Chin-Wei Huang, Ziheng Lu, Yichi Zhou, Han Yang, Hongxia Hao, Jielan Li, Chunlei Yang, Wenjie Li, Ryota Tomioka, Tian Xie
DOI :
10.60732/4df848c7
https://commons.datacite.org/doi.org/10.60732/4df848c7
https://doi.datacite.org/dois/10.60732%2F4df848c7
https://doi.org/10.60732/4df848c7
Cite as: Zeni, C., Pinsler, R., Zügner, D., Fowler, A., Horton, M., Fu, X., Wang, Z., Shysheya, A., Crabbé, J., Ueda, S., Sordillo, R., Sun, L., Smith, J., Nguyen, B., Schulz, H., Lewis, S., Huang, C., Lu, Z., Zhou, Y., Yang, H., Hao, H., Li, J., Yang, C., Li, W., Tomioka, R., and Xie, T. "Alex MP-20 test." ColabFit, 2025. https://doi.org/10.60732/4df848c7.
For other citation formats, see the DataCite Fabrica page for this dataset.
Num. Configurations :
67,521
Num. Atoms :
647,769
Downloads :
112
Calculated Property Types :
electronic_band_gap
energy_above_hull
Elements :
Ag (1.8%)
Al (1.42%)
As (0.86%)
Au (1.87%)
B (0.4%)
Ba (1.4%)
Be (0.35%)
Bi (0.98%)
Br (2.34%)
C (0.36%)
Ca (1.19%)
Cd (1.82%)
Ce (1.03%)
Cl (2.75%)
Co (0.83%)
Cr (0.3%)
Cs (1.16%)
Cu (1.63%)
Dy (1.74%)
Er (1.62%)
Eu (0.16%)
F (3.08%)
Fe (0.61%)
Ga (2.02%)
Gd (0.05%)
Ge (1.02%)
H (1.1%)
Hf (0.48%)
Hg (1.9%)
Ho (1.64%)
I (1.81%)
In (1.88%)
Ir (0.91%)
K (1.35%)
La (2.01%)
Li (1.96%)
Lu (0.23%)
Mg (1.31%)
Mn (0.62%)
Mo (0.24%)
N (0.86%)
Na (1.54%)
Nb (0.34%)
Nd (1.87%)
Ni (1.42%)
O (5.97%)
Os (0.29%)
P (0.79%)
Pb (1.21%)
Pd (2.12%)
Pr (1.97%)
Pt (1.41%)
Rb (1.28%)
Re (0.13%)
Rh (1.48%)
Ru (0.65%)
S (3.02%)
Sb (0.85%)
Sc (1.29%)
Se (2.95%)
Si (0.88%)
Sm (1.58%)
Sn (1.39%)
Sr (1.13%)
Ta (0.3%)
Tb (1.8%)
Te (2.1%)
Ti (0.62%)
Tl (2.22%)
Tm (1.59%)
V (0.4%)
W (0.16%)
Y (1.62%)
Yb (0.15%)
Zn (1.75%)
Zr (0.66%)
Methods :
DFT-PBE
Software :
VASP
Name: Alex_MP-20_test
Extended ID: Alex_MP-20_train_test__Zeni-Pinsler-Zugner-Fowler-Horton-Fu-Wang-Shysheya-Crabbe-Ueda-Sordillo-Sun-Smith-Nguyen-Schulz-Lewis-Huang-Lu-Zhou-Yang-Hao-Li-Yang-Li-Tomioka-Xie__DS_5rjlk0wubpsf_0
Publication Link: https://doi.org/10.1038/s41586-025-08628-5
Data Source Link: https://github.com/microsoft/mattergen
No uploaded content is transferred in ownership from the original creators to ColabFit. All content is distributed under the license specified by its contributor who has stated that he or she has the authority to share it under the specified license.