Description

A machine learning dataset of CO-tip atomic force microscopy (AFM) images of ice clusters on the Cu(111) surface, simulated with the probe particle model, together with the corresponding atomic structure files. The dataset consists of a total of 1837 different structures, which are divided into training, validation, and test sets as 1469/110/258. For each structure the simulation is performed 10 times with different randomized simulation parameters, yielding a total of 18370 simulations. Each simulation consists of 15 images at different tip-sample distances in steps of 0.1 Å.

The dataset is saved in a compressed .tar.gz archive. The decompressed archive has samples stored in the webdataset shard format. Each shard is a tar file containing a number of samples. The tar files are named in the format `Ice-K-{param}_{set}_{shard}.tar`, where {param} is the number identifying the set of randomized simulation parameters, {set} is either the train, validation, or test set, and {shard} is the shard number.

Each sample consists of a number of AFM simulation images and an xyz structure file. The image files are named in the format `{x}.{y}.png`, where {x} is the sample number and {y} is the index of the height slice in the simulation. The height slices are numbered from 0 to 14, such that 0 is the farthest distance and 14 is the closest distance. Similarly, the corresponding xyz structure files are named `{x}.xyz`. The comment line (second line) in each xyz file has information about the parameters used in the simulation for that sample.

The ice structures were obtained from a neural-network potential optimization, and subsequently the Hartree potentials for the structures were obtained through a density functional theory calculation using the optB86b-vdW density functional in the Vienna Ab initio Simulation Package (VASP).
The AFM simulations utilize the Lennard-Jones force field for the Pauli repulsion and van der Waals interactions, and tip convolution with the Hartree potential for the electrostatic interaction.
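
The naming scheme described above can be sketched in code. The following is a minimal, unofficial example that uses only the Python standard library to parse shard names and to group the members of one decompressed shard by sample. The function names and regular expressions are hypothetical and not part of the dataset; the sketch assumes that {param} and {shard} are plain integers, that {set} contains no underscore, and that the xyz files are UTF-8 text.

```python
import io
import re
import tarfile
from collections import defaultdict

# Assumptions (not confirmed by the dataset description): {param} and {shard}
# are digit strings, and {set} is a label without underscores.
SHARD_RE = re.compile(r"^Ice-K-(?P<param>\d+)_(?P<set>[^_]+)_(?P<shard>\d+)\.tar$")
IMAGE_RE = re.compile(r"^(?P<sample>[^.]+)\.(?P<height>\d+)\.png$")
XYZ_RE = re.compile(r"^(?P<sample>[^.]+)\.xyz$")


def parse_shard_name(name):
    """Split 'Ice-K-{param}_{set}_{shard}.tar' into (param, set, shard)."""
    m = SHARD_RE.match(name)
    if m is None:
        raise ValueError(f"not a recognized shard name: {name!r}")
    return int(m["param"]), m["set"], int(m["shard"])


def read_shard(fileobj):
    """Group the files of one shard by sample number {x}.

    Returns {sample: {"images": {height_index: png_bytes}, "xyz": text}}.
    """
    samples = defaultdict(lambda: {"images": {}, "xyz": None})
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            name = member.name.rsplit("/", 1)[-1]  # drop any directory prefix
            data = tar.extractfile(member).read()
            if (m := IMAGE_RE.match(name)):
                samples[m["sample"]]["images"][int(m["height"])] = data
            elif (m := XYZ_RE.match(name)):
                samples[m["sample"]]["xyz"] = data.decode("utf-8")
    return dict(samples)


def xyz_comment(xyz_text):
    """Return the comment line (second line) holding the simulation parameters."""
    return xyz_text.splitlines()[1]


def height_above_closest(index, step=0.1, n_slices=15):
    """Distance in angstrom of slice `index` above the closest slice (index 14)."""
    return (n_slices - 1 - index) * step
```

For a shard on disk, `read_shard(open(path, "rb"))` returns the raw PNG bytes per height index, which can then be decoded with any image library. The layout of the parameters inside the comment line is not described here, so the sketch returns that line unparsed.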
Date made available: 27 Oct 2023
Publisher: Zenodo

Dataset Licences

  • CC-BY-4.0
