Name solvated_protein_fragments_JCTC_2019
Extended ID solvated_protein_fragments_JCTC_2019_UnkeMeuwly__DS_ctjgc03xdauc_0
Description The solvated protein fragments dataset was generated as a partner benchmark dataset, along with SN2, for measuring the performance of machine learning models, in particular PhysNet, at describing chemical reactions, long-range interactions, and condensed phase systems. The dataset contains structures for all possible "amons" (hydrogen-saturated covalently bonded fragments) of up to eight heavy atoms (C, N, O, S) that can be derived from chemical graphs of proteins containing the 20 natural amino acids connected via peptide bonds or disulfide bridges. For amino acids that can occur in different charge states due to (de)protonation (i.e., carboxylic acids that can be negatively charged or amines that can be positively charged), all possible structures with up to a total charge of ±2e are included. In total, the dataset provides reference energies, forces, and dipole moments for 2,731,180 structures calculated at the revPBE-D3(BJ)/def2-TZVP level of theory using ORCA 4.0.1.
Authors Oliver T. Unke
Markus Meuwly
DOI 10.60732/c4731f07

Cite as: Unke, O. T., and Meuwly, M. "solvated protein fragments JCTC 2019." ColabFit, 2023.
For other citation formats, see the DataCite Fabrica page for this dataset.
Elements C (19.27%)
H (63.04%)
N (4.88%)
O (11.69%)
S (1.13%)
Number of Data Objects 2,731,180
Number of Configurations 2,731,180
Number of Atoms 58,395,272
Configuration Sets by Name (None)
Configuration Sets by ID (None)
Data Objects Too many to display
ColabFit ID DS_ctjgc03xdauc_0
Files colabfitspec.json
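
The summary figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the dataset or of any ColabFit tooling: the constant and function names are hypothetical, and the numbers are copied by hand from this page. The per-element counts are only estimates, because the listed percentages are rounded to two decimals.

```python
# Figures copied by hand from the metadata on this page (not read from the dataset).
N_CONFIGURATIONS = 2_731_180
N_ATOMS = 58_395_272
ELEMENT_PERCENT = {"C": 19.27, "H": 63.04, "N": 4.88, "O": 11.69, "S": 1.13}


def mean_atoms_per_configuration(n_atoms: int, n_configurations: int) -> float:
    """Average number of atoms in one configuration."""
    return n_atoms / n_configurations


def approximate_element_counts(n_atoms: int, percent: dict) -> dict:
    """Estimate atom counts per element from the rounded percentages.

    The result is approximate: the percentages carry only two decimals,
    so the estimates need not sum exactly to n_atoms.
    """
    return {element: round(n_atoms * p / 100.0) for element, p in percent.items()}


if __name__ == "__main__":
    mean = mean_atoms_per_configuration(N_ATOMS, N_CONFIGURATIONS)
    print(f"mean atoms per configuration: {mean:.2f}")
    print(f"sum of listed percentages: {sum(ELEMENT_PERCENT.values()):.2f}")
    for element, count in approximate_element_counts(N_ATOMS, ELEMENT_PERCENT).items():
        print(f"{element}: about {count:,} atoms")
```

The listed percentages add up to slightly more than 100 because each was rounded independently, so the estimated counts are best read as order-of-magnitude checks rather than exact totals.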

No uploaded content is transferred in ownership from the original creators to ColabFit. All content is distributed under the license specified by its contributor who has stated that he or she has the authority to share it under the specified license.