Parquet File Schema

Configuration Set


A configuration set is a subset of a dataset's configurations that share some characteristic.
For example, a set may group configurations that contain certain elements, were created by a particular method, or represent a particular crystal type.

The following table provides a description of the columns for the file <dataset-id>/ds.parquet.

| Key | Explanation | Column Type |
|---|---|---|
| id | Unique identifier for the configuration set. | string |
| hash | Hash over configuration set values. | string |
| last_modified | Date when this configuration set was last modified in the database. | timestamp |
| nconfigurations | Count of configurations in the set. | integer |
| nperiodic_dimensions | (no description provided) | array<integer> |
| dimension_types | (no description provided) | array<array<integer>> |
| nsites | Sum of total atomic sites across all configurations in the set. | long |
| nelements | Count of distinct elements in the set. | integer |
| elements | List of elemental symbols present in the set. | array<string> |
| total_elements_ratios | Ratios of each element across the entire set. Order matches elements. | array<double> |
| description | Description of the configuration set. | string |
| name | Name of the configuration set. | string |
| dataset_id | ID of the parent dataset. | string |
| ordered | Whether the configuration set is ordered. | boolean |
| extended_id | Extended identifier for the configuration set. | string |
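
The columns above can be read with any Parquet reader. The following is a minimal Python sketch, not an official client: it assumes the third-party pyarrow package is installed, and the function names `load_configuration_sets` and `element_ratios`, as well as the sample row, are hypothetical and used only for illustration. It shows one way to load ds.parquet into plain dictionaries and to pair each symbol in elements with its entry in total_elements_ratios, relying on the documented fact that the two arrays share the same order.

```python
from typing import Any


def load_configuration_sets(path: str) -> list[dict[str, Any]]:
    """Read a ds.parquet file and return one dict per configuration set.

    Assumption: the third-party pyarrow package is available.
    """
    # Imported lazily so the helper below works without pyarrow installed.
    import pyarrow.parquet as pq

    return pq.read_table(path).to_pylist()


def element_ratios(row: dict[str, Any]) -> dict[str, float]:
    """Map each elemental symbol to its ratio for one configuration set row.

    Relies on total_elements_ratios being listed in the same order
    as elements, as stated in the column table.
    """
    elements = row["elements"]
    ratios = row["total_elements_ratios"]
    if len(elements) != len(ratios):
        raise ValueError("elements and total_elements_ratios differ in length")
    return dict(zip(elements, ratios))


# Hypothetical row for illustration only; every value here is invented.
example_row = {
    "id": "CS_example",
    "nelements": 2,
    "elements": ["H", "O"],
    "total_elements_ratios": [0.75, 0.25],
}

print(element_ratios(example_row))
```

With a real file, the same helper could be applied to each row returned by `load_configuration_sets("<dataset-id>/ds.parquet")`, with `<dataset-id>` replaced by the actual dataset directory.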