Dataset

Massive_Atomic_Diversity_MAD-1.5_r2SCAN_Test




[Figure: species content of dataset]

Name :
Massive_Atomic_Diversity_MAD-1.5_r2SCAN_Test
Authors :
Cesare Malosso, Filippo Bigi, Paolo Pegolo, Joseph W. Abbott, Philip Loche, Mariana Rossi, Michele Ceriotti, Arslan Mazitov
Description :
Test split of the MAD-1.5 (Massive Atomic Diversity version 1.5) dataset, a highly curated collection designed for training broadly applicable atomistic machine-learning models across the full periodic table. MAD-1.5 extends the original MAD dataset with targeted enrichment strategies covering 102 chemical elements (all isotopes with half-life above one day). All 216,803 MAD-1.5 structures are computed with a single standardized all-electron DFT workflow using the r2SCAN meta-GGA functional in FHI-aims (version 250806), with tight basis sets, a k-point density of 8 Angstrom^-1, Gaussian smearing of 0.05 eV, and SCF convergence thresholds of 1e-6 eV (energy), 1e-4 eV/Angstrom (forces), and 1e-5 e*a0^-3 (electron density). The dataset spans molecules (monomers, dimers, trimers, molecular crystals), bulk crystals, surfaces, nanoclusters, and low-dimensional structures, organized into 14 subsets. Quality is ensured by two-step outlier removal: heuristic filtering of structures with forces above 100 eV/Angstrom, followed by uncertainty-based filtering with LLPR (last-layer prediction rigidity). The test split (~10% of the cleaned data, excluding monomers, dimers, and trimers, which are fixed in the training split) uses the same stratified splitting method as the training and validation splits. Subset-resolved MAE for PET-MAD-1.5-S on this test set is 11.09 meV/atom (energy) and 36.81 meV/Angstrom (forces). A companion PBE-functional dataset (Massive_Atomic_Diversity_MAD-1.5_PBE) was used during model training with separate prediction heads.
Cite As :
Malosso, C., Bigi, F., Pegolo, P., Abbott, J. W., Loche, P., Rossi, M., Ceriotti, M., and Mazitov, A. "Massive Atomic Diversity MAD-1.5 r2SCAN Test." ColabFit, 2026.
ColabFit ID :
Date Added :
2026-05-21
License :
CC-BY-4.0
Downloads :
0
Num. Configurations :
18,314
Num. Atoms :
321,704
Calculated Property Types :
atomic_forces, atomization_energy, cauchy_stress, energy
Elements :
Ac (0.08%) Ag (0.4%) Al (0.9%) Am (0.08%) Ar (0.09%) As (0.67%) At (0.08%) Au (0.3%) B (1.13%) Ba (0.46%) Be (0.34%) Bi (0.48%) Bk (0.08%) Br (0.88%) C (11.5%) Ca (0.78%) Cd (0.52%) Ce (0.16%) Cf (0.08%) Cl (2.16%) Cm (0.08%) Co (0.58%) Cr (0.38%) Cs (0.74%) Cu (0.77%) Dy (0.07%) Er (0.09%) Es (0.08%) Eu (0.07%) F (3.85%) Fe (0.69%) Fm (0.08%) Fr (0.07%) Ga (0.57%) Gd (0.06%) Ge (0.85%) H (16.64%) He (0.09%) Hf (0.44%) Hg (0.31%) Ho (0.06%) I (0.92%) In (0.53%) Ir (0.3%) K (1.24%) Kr (0.1%) La (0.21%) Li (0.81%) Lu (0.17%) Md (0.08%) Mg (0.55%) Mn (0.49%) Mo (0.56%) N (5.82%) Na (1.05%) Nb (0.58%) Nd (0.08%) Ne (0.1%) Ni (0.84%) No (0.07%) Np (0.08%) O (18.69%) Os (0.19%) P (1.63%) Pa (0.08%) Pb (0.54%) Pd (0.46%) Pm (0.07%) Po (0.1%) Pr (0.09%) Pt (0.4%) Pu (0.07%) Ra (0.08%) Rb (0.54%) Re (0.26%) Rh (0.34%) Rn (0.09%) Ru (0.28%) S (2.94%) Sb (0.66%) Sc (0.44%) Se (1.41%) Si (1.11%) Sm (0.07%) Sn (0.65%) Sr (0.74%) Ta (0.44%) Tb (0.07%) Tc (0.17%) Te (0.76%) Th (0.11%) Ti (0.53%) Tl (0.44%) Tm (0.09%) U (0.1%) V (0.54%) W (0.35%) Xe (0.12%) Y (0.63%) Yb (0.13%) Zn (0.7%) Zr (0.58%)
Methods :
DFT-r2SCAN
Software :
FHI-aims v250806
Spec File :
Configuration Sets by Name :
Configuration Sets by ID :
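
The two curation steps named in the Description (removal of structures with forces above 100 eV/Angstrom, and a stratified test split that leaves monomers, dimers, and trimers in training) can be illustrated with a minimal sketch. This is not the authors' code. It assumes that the threshold applies to the per-atom force norm, that each structure carries a subset label, and that the labels "monomers", "dimers", and "trimers" mark the training-only subsets. The function names, labels, and random seed are all hypothetical. The second, LLPR-based filtering step is not shown.

```python
import numpy as np

FORCE_LIMIT = 100.0  # eV/Angstrom, the threshold quoted in the Description
# Hypothetical subset labels for structures that stay in the training split.
ALWAYS_TRAIN = {"monomers", "dimers", "trimers"}


def passes_force_filter(forces, limit=FORCE_LIMIT):
    """Return True if no atom has a force norm above `limit`.

    Assumption: the threshold is applied to per-atom force norms;
    the source does not say whether norms or components are used.
    """
    forces = np.asarray(forces, dtype=float).reshape(-1, 3)
    return bool(np.all(np.linalg.norm(forces, axis=1) <= limit))


def stratified_test_indices(subsets, test_fraction=0.1, seed=0):
    """Draw roughly `test_fraction` of every subset for the test split.

    Sampling within each subset keeps the subset proportions of the
    test split close to those of the full data.
    """
    rng = np.random.default_rng(seed)
    subsets = np.asarray(subsets)
    test = []
    for name in np.unique(subsets):
        if str(name) in ALWAYS_TRAIN:
            continue  # these subsets are never sampled for testing
        members = np.flatnonzero(subsets == name)
        n_test = int(round(test_fraction * len(members)))
        test.extend(rng.choice(members, size=n_test, replace=False).tolist())
    return sorted(test)
```

In use, the force filter would be applied to every structure first, and the split would then be drawn from the labels of the structures that remain.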

No uploaded content is transferred in ownership from the original creators to ColabFit. All content is distributed under the license specified by its contributor who has stated that he or she has the authority to share it under the specified license.