Dataset

AIMNet2





Name :
AIMNet2
Authors :
Kamal Singh Nayal, Ilkwon Cho, Runtian Nick Gao, Peikun Zheng, Olexandr Isayev
Description :
AIMNet2(2025) is the extended training dataset for the AIMNet2 (second-generation atoms-in-molecules network) neural network interatomic potential, curated to improve the model's description of noncovalent interactions (NCIs), including hydrogen bonding, pi-pi stacking, dispersion, sigma-hole, ionic, and electrostatic contacts. The dataset covers neutral and charged closed-shell molecular systems composed of up to 14 non-metal elements (H, B, C, N, O, F, Si, P, S, Cl, As, Se, Br, I), with up to 193 atoms per system. Structures were drawn from three complementary sources: (a) molecular geometries from SPICE v2.0.1 (solvated systems, amino acid-ligand pairs, water clusters) and the CREMP dataset (macrocyclic peptides); (b) small neutral and charged molecules from PubChem, sampled via normal mode sampling and metadynamics-guided geometry exploration; (c) dimer geometries assembled from Cambridge Structural Database (CSD) monomers (restricted to the 14 supported elements and fewer than 200 atoms) and pre-optimized with AIMNet2-wB97M-D3(2023) to remove steric clashes while preserving configurational diversity. All quantum chemical calculations were run in ORCA 6.0.1 with the composite B97-3c DFT method within the restricted Kohn-Sham (RKS) formalism. SCF convergence was enforced with the TightSCF and SlowConv settings; RIJCOSX integral acceleration and the DEFGRID2 integration grid were applied throughout. The AIMNet2(2025) model was initialized from AIMNet2(2023) weights and continually pretrained on this dataset without weight freezing or regularization, using a multi-task loss over energy (w=1.0), forces (w=0.2), and Hirshfeld partial charges (w=0.5).
Cite As :
Nayal, K. S., Cho, I., Gao, R. N., Zheng, P., and Isayev, O. "AIMNet2." ColabFit, 2026.
ColabFit ID :
Date Added :
2026-05-26
License :
MIT
Downloads :
0
Num. Configurations :
3,764,666
Num. Atoms :
130,288,462
Calculated Property Types :
atomic_forces, energy
Elements :
As (0.02%), B (0.16%), Br (0.46%), C (36.77%), Cl (0.91%), F (1.2%), H (43.68%), I (0.34%), N (6.94%), O (7.69%), P (0.21%), S (1.37%), Se (0.11%), Si (0.15%)
Methods :
DFT-B97-3c
Software :
ORCA 6.0.1
Spec File :
Configuration Sets by Name :
Configuration Sets by ID :
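
The description mentions a multi-task loss with weights of 1.0 (energy), 0.2 (forces), and 0.5 (Hirshfeld partial charges). The following is a minimal sketch of how such a weighted sum could be formed. It is not the authors' training code: the use of mean squared error for each task, the function names `mse` and `multitask_loss`, and the toy numbers are all assumptions made for illustration. Only the three weights come from the description.

```python
from typing import Sequence

# Task weights quoted in the dataset description.
W_ENERGY, W_FORCES, W_CHARGES = 1.0, 0.2, 0.5


def mse(pred: Sequence[float], ref: Sequence[float]) -> float:
    """Mean squared error (assumed per-task loss; the source does not specify it)."""
    if len(pred) != len(ref) or len(pred) == 0:
        raise ValueError("pred and ref must be non-empty and of equal length")
    return sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred)


def multitask_loss(
    e_pred: Sequence[float], e_ref: Sequence[float],
    f_pred: Sequence[float], f_ref: Sequence[float],
    q_pred: Sequence[float], q_ref: Sequence[float],
) -> float:
    """Weighted sum of energy, force, and charge errors.

    Forces are passed as flat lists of Cartesian components.
    """
    return (
        W_ENERGY * mse(e_pred, e_ref)
        + W_FORCES * mse(f_pred, f_ref)
        + W_CHARGES * mse(q_pred, q_ref)
    )


# Toy values for one hypothetical two-atom system (not taken from the dataset).
loss = multitask_loss(
    e_pred=[-1.0], e_ref=[-1.5],
    f_pred=[1.0] * 6, f_ref=[0.0] * 6,
    q_pred=[0.5, -0.5], q_ref=[0.0, 0.0],
)
print(f"weighted loss: {loss:.3f}")
```

With these weights, a unit of squared energy error counts five times as much as a unit of squared force error and twice as much as a unit of squared charge error. How the individual terms are normalized in practice (per atom, per component, or per system) is not stated in the description.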

No uploaded content is transferred in ownership from the original creators to ColabFit. All content is distributed under the license specified by its contributor who has stated that he or she has the authority to share it under the specified license.