Robocrystallographer: automated crystal structure text descriptions and analysis

Published online by Cambridge University Press: 15 July 2019

Alex M. Ganose
Affiliation:
Lawrence Berkeley National Laboratory, Energy Technologies Area, 1 Cyclotron Road, Berkeley, CA 94720, USA
Anubhav Jain*
Affiliation:
Lawrence Berkeley National Laboratory, Energy Technologies Area, 1 Cyclotron Road, Berkeley, CA 94720, USA
*Address all correspondence to Anubhav Jain at ajain@lbl.gov

Abstract

Our ability to describe crystal structure features is of crucial importance when attempting to understand structure–property relationships in the solid state. In this paper, the authors introduce robocrystallographer, an open-source toolkit for analyzing crystal structures. This package combines new and existing open-source analysis tools to provide structural information, including the local coordination and polyhedral type, polyhedral connectivity, octahedral tilt angles, component-dimensionality, and molecule-within-crystal and fuzzy prototype identification. Using this information, robocrystallographer can generate text-based descriptions of crystal structures that resemble descriptions written by human crystallographers. The authors use robocrystallographer to investigate the dimensionalities of all compounds in the Materials Project database and highlight its potential in machine learning studies.

Type
Artificial Intelligence Research Letters
Copyright
Copyright © Materials Research Society 2019 

Supplementary material

Ganose and Jain supplementary material (File, 2.2 MB)