# Bond Length Change Error Manual

An unusual difference in bond length between the DFT computation and the original ICSD entry can indicate either an error in the initial ICSD structure or a problem with the DFT approximation. We use the change in bond length as an indicator of large relaxation and of possible errors in our data set. This analysis is similar to the one we performed on volumes.

## Bond Length Change Distribution in GGA

### The bond length computing procedure

To monitor the difference in bond length between the experimental and computed data, we need to determine automatically which atoms are bonded in a given crystal structure. We take a geometric approach and define the nearest neighbors of a given atom using the Voronoi construction.[1] After defining the nearest neighbors of each atom in the ICSD unit cell, we monitor how the distance between each atom and its neighbors (i.e., the bond length) changes after DFT relaxation. Each atom/nearest-neighbor pair therefore has a change in distance, which we express as a relative change in bond length (in %).
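The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to build the database: the neighbor list `bonds` is assumed to have been produced already (the Voronoi step is not reproduced here), the function names are hypothetical, the change is taken relative to the ICSD bond length, and the "maximum" change per entry is assumed to be the signed change of largest magnitude.

```python
import math

def bond_length(lattice, frac_coords, i, j, image):
    """Distance between atom i and the periodic image of atom j.

    lattice: three lattice vectors (rows), in angstroms.
    frac_coords: list of fractional coordinates, one triple per atom.
    image: integer lattice translation applied to atom j.
    """
    delta = [frac_coords[j][k] + image[k] - frac_coords[i][k] for k in range(3)]
    # Convert the fractional difference vector to Cartesian coordinates.
    cart = [sum(delta[k] * lattice[k][c] for k in range(3)) for c in range(3)]
    return math.sqrt(sum(x * x for x in cart))

def bond_length_changes(icsd, relaxed, bonds):
    """Percent change of each bond, relative to the ICSD length (assumed convention).

    icsd, relaxed: (lattice, frac_coords) tuples with the same atom ordering.
    bonds: list of (i, j, image) neighbor pairs found in the ICSD structure.
    """
    changes = []
    for i, j, image in bonds:
        d_icsd = bond_length(icsd[0], icsd[1], i, j, image)
        d_dft = bond_length(relaxed[0], relaxed[1], i, j, image)
        changes.append(100.0 * (d_dft - d_icsd) / d_icsd)
    return changes

def max_bond_length_change(changes):
    """Signed change with the largest magnitude (assumed definition of 'maximum')."""
    return max(changes, key=abs)

# Toy example: a 4 angstrom cubic cell with two atoms and one bond.
cubic = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
icsd = (cubic, [[0.0, 0.0, 0.0], [0.5, 0.0, 0.0]])
relaxed = (cubic, [[0.0, 0.0, 0.0], [0.55, 0.0, 0.0]])
changes = bond_length_changes(icsd, relaxed, [(0, 1, (0, 0, 0))])
```

Keeping the same periodic image for a pair before and after relaxation ensures that the same bond is compared, even when an atom moves far enough that a different image would become its closest copy.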

### The fitting of the distribution

Figure 1: Histogram of the maximum bond length change observed for each entry computed with GGA

We assigned a maximum bond length change to each computed GGA entry in the Materials Project database. Figure 1 shows how those bond length changes (in %) are distributed. The distribution is centered on 0. We obtained the best fit to the data with a t location-scale distribution; a normal distribution would not model the heavy tails of the data. Using a maximum likelihood approach, we fitted the three parameters (location, scale, and degrees of freedom) that define the t location-scale distribution.
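As an illustration of this kind of fit, the sketch below uses SciPy, whose `scipy.stats.t` distribution with `loc` and `scale` parameters corresponds to a t location-scale distribution. The data are synthetic stand-ins, not the Materials Project values, and the function names are hypothetical. The log-likelihood comparison shows why a normal distribution is a poor model for heavy-tailed data.

```python
import numpy as np
from scipy import stats

def fit_t_location_scale(changes):
    """Maximum-likelihood fit; returns (degrees of freedom, location, scale)."""
    nu, mu, sigma = stats.t.fit(changes)
    return nu, mu, sigma

def log_likelihoods(changes):
    """Total log-likelihood of the data under the fitted t and normal models."""
    nu, mu, sigma = fit_t_location_scale(changes)
    ll_t = float(np.sum(stats.t.logpdf(changes, nu, loc=mu, scale=sigma)))
    mean, std = stats.norm.fit(changes)
    ll_norm = float(np.sum(stats.norm.logpdf(changes, loc=mean, scale=std)))
    return ll_t, ll_norm

# Synthetic heavy-tailed sample standing in for the bond length changes (in %).
rng = np.random.default_rng(0)
sample = rng.standard_t(3.0, size=2000)

nu, mu, sigma = fit_t_location_scale(sample)
ll_t, ll_norm = log_likelihoods(sample)
```

On heavy-tailed data such as this sample, the t model should reach a higher log-likelihood than the normal model, because the normal fit has to inflate its width to accommodate the extreme values.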

### The tagging of the entries

Using the previously fitted t location-scale distribution, we defined as outliers all data points lying below the 5th percentile or above the 95th percentile. The 5th percentile is situated at -2.1% and the 95th percentile at 2.1%. Figure 2 shows the cumulative histogram of the data (in blue) and the fitted distribution (in red). The 5th and 95th percentiles are marked by black dashed lines.

Figure 2: Cumulative histogram (in blue) and fitted t location-scale distribution (in red) for the maximum bond length change observed for each entry computed with GGA. The dashed black lines indicate the 5th and 95th percentiles.
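The tagging step can be sketched as follows, again with SciPy. The distribution parameters below are placeholders chosen for illustration and are not the fitted values behind the ±2.1% thresholds; the function names are hypothetical.

```python
from scipy import stats

def percentile_bounds(nu, mu, sigma, lower=0.05, upper=0.95):
    """Values of the fitted distribution at the lower and upper percentiles."""
    dist = stats.t(df=nu, loc=mu, scale=sigma)
    return float(dist.ppf(lower)), float(dist.ppf(upper))

def tag_outliers(changes, bounds):
    """True for every bond length change outside the [low, high] interval."""
    low, high = bounds
    return [c < low or c > high for c in changes]

# Placeholder parameters (assumption): 3 degrees of freedom, centered on 0, unit scale.
bounds = percentile_bounds(nu=3.0, mu=0.0, sigma=1.0)

# Example maximum bond length changes (in %), invented for illustration.
flags = tag_outliers([-35.0, -0.5, 0.3, 5.0], bounds)
```

Taking the thresholds from the fitted distribution rather than from the empirical percentiles makes them less sensitive to the exact sample, at the cost of depending on how well the model describes the tails.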

### Some examples of outliers detected by the bond length change

Of course, large volume changes will often induce large changes in bond lengths. However, there are also cases in which bond lengths change significantly after DFT relaxation without a major volume change. In this section, we present a few examples of compounds that show large bond length changes when relaxed with GGA but no unusual volume change.

#### Ba2CoO4

task_id: 16887, icsd_id: 92321

The change in Co-O bond length is around 35%. Looking at the structure more carefully, the ICSD entry shows a very long Co-O bond, while the DFT relaxation brought the O much closer and formed the more common regular tetrahedral local environment for Co4+ (see Figure 3). It is likely that the ICSD entry has an error in the atomic positions.

Figure 3: Large change in bond length between the ICSD and computed entry for Ba2CoO4

#### SnF2

task_id: 7456, icsd_id: 14194

Figure 4: Large change in bond length between the ICSD and computed entry for SnF2

Very large relaxations are observed. For instance, one F atom moved dramatically after DFT relaxation. It is difficult to say whether the error comes from the measurement or from DFT, but this might be a DFT problem, as the measurement is tagged as high-quality data in the ICSD and comes from single-crystal diffraction.

#### RbInMo2O8

task_id: 7402, icsd_id: 10186

Here, the bond length changes are smaller but still lie outside the 5th-95th percentile range of the distribution. The Mo-O bond is around 1.9 Å in experiments and 1.8 Å computationally. This is a quite unusual difference of about 5%. The sum of the ionic radii is closer to the computed value for Mo6+ (1.35 Å + 0.41 Å = 1.76 Å).

## Citation

To cite the Materials Project, please reference the following work:

• A. Jain, G. Hautier, C. J. Moore, S. P. Ong, C. C. Fischer, T. Mueller, K. A. Persson, and G. Ceder, A high-throughput infrastructure for density functional theory calculations, Computational Materials Science, vol. 50, 2011, pp. 2295-2310.

## References

1. M. O'Keeffe, Acta Crystallographica Section A 35, 772-775 (1979).

## Authors

1. Geoffroy Hautier