A dataset defines a collection of configurations together with its metadata: creators, relevant links, a DOI, and aggregated data.
The following table describes the columns found in untarred dataset parquet downloads (<dataset-id>/ds.parquet).
Key | Explanation | Column Type |
---|---|---|
id | Unique identifier for the dataset. | string |
hash | Hash over dataset values. | string |
name | Name of the dataset. | string |
last_modified | Date when this dataset was last modified in the database. | timestamp |
software | Software used for property calculations. | Array of strings |
methods | Level(s) of theory used for property calculations. | Array of strings |
nconfigurations | Count of configurations in the dataset. | integer |
nproperty_objects | Count of property objects in the dataset. | long |
nsites | Sum of atomic site counts across all configurations. | long |
nelements | Count of distinct elements in the dataset. | integer |
elements | Elemental symbols of distinct atomic species present in the dataset. | Array of strings |
total_elements_ratios | Ratio of each atomic species across the entire dataset. Order matches elements. | Array of doubles |
nperiodic_dimensions | undefined | Array of integers |
dimension_types | undefined | Nested arrays of integers |
energy_count | Count of configurations with energy calculations. | long |
energy_mean | Mean energy value across configurations. | double |
energy_variance | Variance of energy values across configurations. | double |
atomization_energy_count | Count of configurations with atomization energy calculations. | long |
adsorption_energy_count | Count of configurations with adsorption energy calculations. | long |
energy_above_hull_count | Count of configurations with energy above hull calculations. | long |
formation_energy_count | Count of configurations with formation energy calculations. | long |
atomic_forces_count | Count of configurations with atomic forces calculations. | long |
electronic_band_gap_count | Count of configurations with electronic band gap calculations. | long |
cauchy_stress_count | Count of configurations with Cauchy stress calculations. | long |
authors | List of authors who contributed to the dataset. | Array of strings |
description | Description of the dataset. | string |
extended_id | Extended identifier for the dataset. | string |
license | License under which the dataset is distributed. | string |
links | Links related to the dataset, including original publications and data repositories. | string |
publication_year | Year the dataset was published to ColabFit. | string |
doi | Digital Object Identifier for the dataset. | string |
equilibrium | Whether the dataset contains only equilibrium structures. | boolean |
colabfit_publication_date | Date when the dataset was published to ColabFit. | timestamp |
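
Because total_elements_ratios is ordered to match elements, the two columns can be paired into a single lookup. The sketch below is a minimal illustration in plain Python, not part of any ColabFit tooling. It assumes one row of ds.parquet has already been loaded into a dict (for example with `pandas.read_parquet(...).iloc[0].to_dict()`). The helper names `element_ratios` and `energy_std` and the values in `example_row` are hypothetical.

```python
import math


def element_ratios(row):
    """Pair each element symbol with its dataset-wide ratio.

    Relies on the documented guarantee that `total_elements_ratios`
    is in the same order as `elements`.
    """
    elements = list(row["elements"])
    ratios = list(row["total_elements_ratios"])
    if len(elements) != len(ratios):
        raise ValueError("elements and total_elements_ratios differ in length")
    return dict(zip(elements, ratios))


def energy_std(row):
    """Standard deviation of energy, derived from the energy_variance column."""
    return math.sqrt(row["energy_variance"])


# Hypothetical row; the values are made up for illustration only.
example_row = {
    "elements": ["H", "O"],
    "total_elements_ratios": [0.6667, 0.3333],
    "nelements": 2,
    "energy_variance": 0.25,
}

ratios_by_element = element_ratios(example_row)
std = energy_std(example_row)
```

Here `ratios_by_element["O"]` gives the ratio for oxygen without tracking list positions by hand, and the length check catches rows where the two arrays have fallen out of step.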