Dataset

Massive_Atomic_Diversity_MAD_bench_mad




[Chart: Species content of dataset]

Name Massive_Atomic_Diversity_MAD_bench_mad
Extended ID Massive_Atomic_Diversity_MAD_bench_mad__Mazitov-Chorna-Fraux-Bercx-Pizzi-De-Ceriotti__DS_dc4lwyrm55p4_0
Description The MAD benchmark dataset, containing a selection of MAD test, MPtrj, Alexandria, SPICE, MD22 and OC2020 datasets, computed with MAD DFT settings. Part of the MAD (Massive Atomic Diversity) dataset family. From the creators: Starting from relatively small sets of stable structures, the dataset is built to contain “massive atomic diversity” (MAD) by aggressively distorting these configurations, with near-complete disregard for the stability of the resulting configurations. The electronic structure details, on the other hand, are chosen to maximize consistency rather than to obtain the most accurate prediction for a given structure, or to minimize computational effort. The MAD dataset we present here, despite containing fewer than 100k structures, has already been shown to enable training universal interatomic potentials that are competitive with models trained on traditional datasets with two to three orders of magnitude more structures.
Authors Arslan Mazitov
Sofiia Chorna
Guillaume Fraux
Marnik Bercx
Giovanni Pizzi
Sandip De
Michele Ceriotti
DOI 10.60732/b1f21e20
https://commons.datacite.org/doi.org/10.60732/b1f21e20
https://doi.datacite.org/dois/10.60732%2Fb1f21e20
https://doi.org/10.60732/b1f21e20

Cite as: Mazitov, A., Chorna, S., Fraux, G., Bercx, M., Pizzi, G., De, S., and Ceriotti, M. "Massive Atomic Diversity MAD bench mad." ColabFit, 2025. https://doi.org/10.60732/b1f21e20.
For other citation formats, see the DataCite Fabrica page for this dataset.
Calculated Property Types atomic_forces
cauchy_stress
energy
Elements
Ag (0.6%)
Al (1.58%)
As (0.71%)
Au (0.5%)
B (0.73%)
Ba (0.49%)
Be (0.15%)
Bi (0.21%)
Br (0.64%)
C (18.74%)
Ca (0.75%)
Cd (0.66%)
Ce (0.18%)
Cl (1.07%)
Co (0.55%)
Cr (0.37%)
Cs (0.27%)
Cu (0.96%)
Dy (0.29%)
Er (0.21%)
Eu (0.04%)
F (1.33%)
Fe (0.67%)
Ga (0.93%)
Gd (0.04%)
Ge (0.67%)
H (20.78%)
He (0.0%)
Hf (0.72%)
Hg (0.4%)
Ho (0.29%)
I (0.55%)
In (1.07%)
Ir (0.64%)
K (0.52%)
Kr (0.0%)
La (0.26%)
Li (0.6%)
Lu (0.14%)
Mg (0.61%)
Mn (0.54%)
Mo (0.32%)
N (6.04%)
Na (0.72%)
Nb (0.35%)
Nd (0.09%)
Ni (1.04%)
O (10.92%)
Os (0.31%)
P (1.19%)
Pb (0.43%)
Pd (1.02%)
Pm (0.01%)
Pr (0.07%)
Pt (0.8%)
Rb (0.26%)
Re (0.46%)
Rh (0.95%)
Ru (0.66%)
S (2.09%)
Sb (0.58%)
Sc (0.44%)
Se (1.44%)
Si (0.91%)
Sm (0.09%)
Sn (0.99%)
Sr (0.65%)
Ta (0.39%)
Tb (0.13%)
Tc (0.22%)
Te (0.8%)
Ti (1.24%)
Tl (0.52%)
Tm (0.25%)
V (0.54%)
W (0.17%)
Xe (0.01%)
Y (0.69%)
Yb (0.19%)
Zn (0.98%)
Zr (0.59%)
Number of Configurations 1,884
Number of Atoms 44,748
Publication Link https://doi.org/10.48550/arXiv.2506.19674
Data Source Link https://doi.org/10.24435/materialscloud:vd-e8
ColabFit ID DS_dc4lwyrm55p4_0
Downloads 0
Files colabfitspec.json
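
The configuration and atom counts in this record can be turned into a couple of quick sanity figures. The sketch below is illustrative only: it assumes that the percentages in the Elements list are shares of the total number of atoms (the record does not say this explicitly), and the variable names are hypothetical, not part of any ColabFit tooling.

```python
# Values copied from the record: Number of Configurations and Number of Atoms.
n_configurations = 1884
n_atoms = 44748

# A few entries from the Elements list.
# ASSUMPTION: each percentage is that element's share of all atoms in the dataset.
element_percent = {"H": 20.78, "C": 18.74, "O": 10.92, "N": 6.04}

# Average structure size across the dataset.
mean_atoms_per_configuration = n_atoms / n_configurations

# Approximate atom count per element, rounded to the nearest whole atom.
# The listed percentages are rounded to two decimals, so these are estimates.
approx_atom_counts = {
    element: round(n_atoms * percent / 100)
    for element, percent in element_percent.items()
}

print(f"Mean atoms per configuration: {mean_atoms_per_configuration:.2f}")
for element, count in approx_atom_counts.items():
    print(f"{element}: ~{count} atoms")
```

Under that assumption, the four elements shown account for a little over half of all atoms, which is consistent with the organic-molecule sources (SPICE, MD22) named in the description.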

No uploaded content is transferred in ownership from the original creators to ColabFit. All content is distributed under the license specified by its contributor who has stated that he or she has the authority to share it under the specified license.