# Structure Similarity

The similarity between two structures $i$ and $j$ is assessed on the basis of local coordination information from all sites in the two structures. [1] [2]

## Site Fingerprints

The similarity calculation begins with computing a crystal site fingerprint, $\mathbf{v}^\mathrm{site}$, for each site in the two structures. The fingerprint is a 12-dimensional vector whose element at position $k$ gives the percentage to which the given site should be considered $k$-fold coordinated; for example, the element at position 4 is $w(\mathrm{CN}=4)$:

$\mathbf{v}^\mathrm{site} = [w(\mathrm{CN}=1), w(\mathrm{CN}=2), w(\mathrm{CN}=3), \dots, w(\mathrm{CN}=12)]^\mathrm{T}$

The fingerprint therefore covers coordination percentages from 1-fold up to 12-fold coordination.
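To make the layout of the vector concrete, here is a minimal sketch that builds a site fingerprint from a mapping of coordination numbers to weights. The function name `site_fingerprint` and the input format are hypothetical, chosen only for illustration; how the weights themselves are obtained from a crystal structure is not covered here.

```python
def site_fingerprint(cn_weights, max_cn=12):
    """Build a site fingerprint [w(CN=1), ..., w(CN=max_cn)].

    `cn_weights` is a hypothetical {coordination number: weight} mapping;
    any coordination number that is absent gets a weight of 0.
    """
    for cn in cn_weights:
        if not 1 <= cn <= max_cn:
            raise ValueError(f"coordination number {cn} outside 1..{max_cn}")
    # Position k-1 in the list holds w(CN=k).
    return [float(cn_weights.get(cn, 0.0)) for cn in range(1, max_cn + 1)]


# Illustrative values: a site considered 80% 4-fold and 20% 6-fold coordinated.
v_site = site_fingerprint({4: 0.8, 6: 0.2})
```

In this example the fourth element of `v_site` holds $w(\mathrm{CN}=4)$ and the sixth holds $w(\mathrm{CN}=6)$; all other elements are zero.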

## Structure Fingerprints

The fingerprints from all sites in a given structure are then reduced to four statistics per coordination number: the minimum, maximum, mean, and standard deviation of each coordination percentage. The resulting ordered vector, with 4 × 12 = 48 elements, defines a structure fingerprint, $\mathbf{v}^\mathrm{struct}$:

$\mathbf{v}^\mathrm{struct} = [\min(w(\mathrm{CN}=1)), \max(w(\mathrm{CN}=1)), \mathrm{mean}(w(\mathrm{CN}=1)), \mathrm{std}(w(\mathrm{CN}=1)), \min(w(\mathrm{CN}=2)), \dots, \min(w(\mathrm{CN}=12)), \max(w(\mathrm{CN}=12)), \mathrm{mean}(w(\mathrm{CN}=12)), \mathrm{std}(w(\mathrm{CN}=12))]^\mathrm{T}$
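The aggregation step can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the function name `structure_fingerprint` is hypothetical, and the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say whether the population or sample form is used.

```python
from statistics import fmean, pstdev


def structure_fingerprint(site_fingerprints):
    """Reduce a list of site fingerprints to one structure fingerprint.

    For each coordination number the output holds, in order:
    min, max, mean, std of that weight over all sites.
    """
    if not site_fingerprints:
        raise ValueError("at least one site fingerprint is required")
    n_dims = len(site_fingerprints[0])
    fingerprint = []
    for k in range(n_dims):
        # Collect w(CN=k+1) across all sites of the structure.
        column = [v[k] for v in site_fingerprints]
        # Assumption: population standard deviation.
        fingerprint.extend([min(column), max(column), fmean(column), pstdev(column)])
    return fingerprint


# Illustrative structure with two sites: one purely 4-fold, one purely 6-fold.
site_a = [1.0 if cn == 4 else 0.0 for cn in range(1, 13)]
site_b = [1.0 if cn == 6 else 0.0 for cn in range(1, 13)]
v_struct = structure_fingerprint([site_a, site_b])
```

With 12-dimensional site fingerprints the result has 48 elements, and the four statistics for coordination number $k$ occupy positions $4(k-1)$ to $4(k-1)+3$ (zero-based).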

## Structure Distance

Finally, structure similarity is determined by the distance, $d$, between two structure fingerprints $\mathbf{v}_{i}^\mathrm{struct}$ and $\mathbf{v}_{j}^\mathrm{struct}$:

$d = || \mathbf{v}_{i}^\mathrm{struct} - \mathbf{v}_{j}^\mathrm{struct} ||$

The structure fingerprint vectors are normalized before the distance is calculated. A small distance (close to 0) indicates high similarity between the two structures, whereas a large distance (around 1) suggests that the structures are very dissimilar.
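A minimal sketch of the distance calculation is given below. The helper names `normalize` and `structure_distance` are hypothetical, and the use of the Euclidean norm for both the normalization and the distance is an assumption, because the text does not state which norm is meant.

```python
import math


def normalize(v):
    """Scale a vector to unit length (assumption: Euclidean norm)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def structure_distance(v_i, v_j):
    """Distance between two structure fingerprints after normalization."""
    if len(v_i) != len(v_j):
        raise ValueError("fingerprints must have the same length")
    u_i, u_j = normalize(v_i), normalize(v_j)
    # Assumption: Euclidean distance between the normalized vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u_i, u_j)))


# Fingerprints that differ only by a scale factor become identical
# after normalization, so their distance is (numerically) zero.
d_same = structure_distance([0.0, 1.0, 0.5, 0.5], [0.0, 2.0, 1.0, 1.0])
```

Because both vectors are normalized first, the distance depends only on the relative pattern of the fingerprint entries and not on their overall magnitude.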

