Machine learning dataset of probe particle model CO-tip atomic force microscopy (AFM) simulation images of bilayer ice clusters on the Au(111) surface, together with the corresponding atomic structure files. The dataset contains 1198 different structures, divided into training, validation, and test sets of 958, 72, and 168 structures, respectively. For each structure the simulation is performed 10 times with different randomized simulation parameters, yielding 11980 simulations in total. Each simulation consists of 15 images at different tip-sample distances, spaced in steps of 0.1 Å.

The dataset is saved as a compressed .tar.gz archive. The decompressed archive stores the samples in the webdataset shard format, where each shard is a tar file containing a number of samples. The tar files are named in the format `Ice-K-{param}_{set}_{shard}.tar`, where {param} is a number identifying the set of randomized simulation parameters, {set} is the train, validation, or test set, and {shard} is the shard number.

Each sample consists of a number of AFM simulation images and an xyz structure file. The image files are named in the format `{x}.{y}.png`, where {x} is the sample number and {y} is the index of the height slice in the simulation. The height slices are numbered from 0 to 14, where 0 is the farthest tip-sample distance and 14 is the closest. The corresponding xyz structure files are named `{x}.xyz`. The comment line (second line) of each xyz file contains the parameters used in the simulation for that sample.

The ice structures were obtained from a neural-network potential optimization, and the Hartree potentials for the structures were subsequently obtained from a density functional theory calculation using the optB86b-vdW density functional in the Vienna Ab initio Simulation Package (VASP). The AFM simulations use a Lennard-Jones force field for the Pauli repulsion and van der Waals interactions, and a tip convolution with the Hartree potential for the electrostatic interaction.
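The naming scheme described above can be unpacked with a few lines of Python. The following is a minimal sketch using only the standard library, not code distributed with the dataset. The helper names (`parse_shard_name`, `group_members`, `relative_height`, `list_shard_samples`) are hypothetical, and the regular expressions assume that the {param}, {set}, and {shard} fields contain no underscores and that {x} contains no dots; the exact strings used for {set} are not assumed.

```python
import re
import tarfile
from collections import defaultdict

# Assumed patterns derived from the naming scheme in the description.
SHARD_RE = re.compile(r"^Ice-K-(?P<param>[^_]+)_(?P<set>[^_]+)_(?P<shard>[^_.]+)\.tar$")
IMAGE_RE = re.compile(r"^(?P<sample>[^.]+)\.(?P<slice>\d+)\.png$")
XYZ_RE = re.compile(r"^(?P<sample>[^.]+)\.xyz$")

N_SLICES = 15   # height slices per simulation, indexed 0..14
STEP = 0.1      # spacing between slices in angstrom


def parse_shard_name(name):
    """Split a shard file name into its param, set, and shard fields."""
    match = SHARD_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised shard name: {name!r}")
    return match.groupdict()


def group_members(names):
    """Group file names by sample number: images by slice index, plus the xyz file."""
    samples = defaultdict(lambda: {"images": {}, "xyz": None})
    for name in names:
        base = name.rsplit("/", 1)[-1]  # ignore any directory prefix
        match = IMAGE_RE.match(base)
        if match:
            samples[match["sample"]]["images"][int(match["slice"])] = name
            continue
        match = XYZ_RE.match(base)
        if match:
            samples[match["sample"]]["xyz"] = name
    return dict(samples)


def relative_height(slice_index):
    """Height in angstrom above the closest slice (index 14 is closest, 0 farthest)."""
    if not 0 <= slice_index < N_SLICES:
        raise ValueError("slice index must be between 0 and 14")
    return (N_SLICES - 1 - slice_index) * STEP


def list_shard_samples(path):
    """Open one shard tar file and return its members grouped by sample."""
    with tarfile.open(path) as tar:
        return group_members(tar.getnames())
```

For example, `parse_shard_name("Ice-K-3_train_0.tar")` separates the three fields of a (made-up) shard name, and `relative_height(0)` gives the offset of the farthest slice from the closest one, about 1.4 Å. Only relative heights can be derived this way, since the absolute tip-sample distance is not stated in the description.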
Date made available: 28 Oct 2023

Dataset Licences

  • CC-BY-4.0
