Dataset

solvated_protein_fragments_JCTC_2019




Name :
solvated_protein_fragments_JCTC_2019
Authors :
Oliver T. Unke, Markus Meuwly
Description :
The solvated protein fragments dataset was generated as a partner benchmark dataset, along with SN2, for measuring the performance of machine learning models, in particular PhysNet, at describing chemical reactions, long-range interactions, and condensed phase systems. The dataset contains structures for all possible "amons" (hydrogen-saturated covalently bonded fragments) of up to eight heavy atoms (C, N, O, S) that can be derived from chemical graphs of proteins containing the 20 natural amino acids connected via peptide bonds or disulfide bridges. For amino acids that can occur in different charge states due to (de)protonation (i.e., carboxylic acids that can be negatively charged or amines that can be positively charged), all possible structures with up to a total charge of ±2e are included. In total, the dataset provides reference energies, forces, and dipole moments for 2,731,180 structures calculated at the revPBE-D3(BJ)/def2-TZVP level of theory using ORCA 4.0.1.
Cite As :
Unke, O. T., and Meuwly, M. "solvated protein fragments JCTC 2019." ColabFit, 2023. https://doi.org/10.60732/c4731f07.
ColabFit ID :
Date Added :
2023-10-20
License :
CC-BY-4.0
Downloads :
39
Num. Configurations :
2,730,942
Num. Atoms :
58,390,211
Calculated Property Types :
atomic_forces, energy
Elements :
C (19.27%), H (63.04%), N (4.88%), O (11.69%), S (1.13%)
Methods :
DFT-revPBE+D3(BJ)
Software :
ORCA 4.0.1
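The headline numbers in the metadata above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it copies the configuration count, atom count, and element percentages from this page, and it assumes the percentages are fractions of the total atom count (the page does not say so explicitly). The function names are hypothetical helpers, not part of any ColabFit tooling.

```python
# Values copied from the dataset metadata on this page.
NUM_CONFIGURATIONS = 2_730_942
NUM_ATOMS = 58_390_211
ELEMENT_PERCENT = {"C": 19.27, "H": 63.04, "N": 4.88, "O": 11.69, "S": 1.13}


def mean_atoms_per_configuration(num_atoms: int, num_configurations: int) -> float:
    """Average number of atoms in one configuration."""
    return num_atoms / num_configurations


def estimate_element_counts(num_atoms: int, element_percent: dict) -> dict:
    """Approximate per-element atom counts.

    Assumption: each percentage is that element's share of all atoms.
    The percentages are rounded to two decimals, so the counts are estimates.
    """
    return {
        element: round(num_atoms * percent / 100.0)
        for element, percent in element_percent.items()
    }


if __name__ == "__main__":
    mean_size = mean_atoms_per_configuration(NUM_ATOMS, NUM_CONFIGURATIONS)
    print(f"Mean atoms per configuration: {mean_size:.2f}")
    for element, count in estimate_element_counts(NUM_ATOMS, ELEMENT_PERCENT).items():
        print(f"{element}: ~{count:,} atoms")
```

Because the published percentages are rounded, the estimated counts will not sum exactly to the reported atom total; the result is a sanity check, not an exact census.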

No uploaded content is transferred in ownership from the original creators to ColabFit. All content is distributed under the license specified by its contributor, who has stated that he or she has the authority to share it under that license.