Discovering novel solid-state lithium-ion conductors through unsupervised learning
Predictions of new solid-state Li-ion conductors are challenging due to the diverse chemistries and compositions involved. Here we tackle this challenge by discovering new Li-ion compounds with room-temperature conductivity higher than 0.1 mS·cm⁻¹ through unsupervised learning.
It is a common expectation that ions in crystalline solids are confined to lattice sites and migrate poorly. In the 1830s, Faraday found the first examples that defy this expectation, Ag2S and PbF2. These two compounds conduct electricity through the fast migration of ions at temperatures below their melting points. Nearly two centuries later, dozens of materials have been found to exhibit the same capability of ionic conduction. In a few materials, the ionic conductivities even reach levels comparable to those of liquids, prompting much interest in using these solids, which are in many cases less flammable, as electrolytes for electrochemical devices.
The success of artificial intelligence in the fields of game playing and computer vision has inspired many researchers to search for better solid electrolytes under the guidance of “big data” that has accumulated over the years. This started our journey to discover new solid-state Li-ion conductors. Ironically, in our case the data is not really “big”: decades of studies have identified only about ten families of good Li-ion conductors and a few more bad examples.
Because both positive and negative examples are in short supply, we could not predict the conductivity by supervised training on a sufficiently large data set. Instead, we used unsupervised learning to distinguish good and bad examples for the next phase of screening. To explain our idea, we must first clarify the exact meaning of the data scarcity. In fact, knowing “little” about Li-ion conductors does not mean we have little knowledge of these materials. We do have “big data” such as the compositions, structures, and densities of these Li-containing compounds. What prevents us from linking such information to the conduction property is the “small”, relatively limited set of measured conductivities available for training.
Figure 1. Illustration of unsupervised materials discovery. Assume our goal is to find a red object. Unsupervised learning uses the abundant feature information (shape). The property information (color) is used to prioritize promising groups for the next phase of screening. The whole process consists of clustering the materials by shape without supervision, identifying that the known red objects congregate in the group of circles, and screening candidates from the circle group.
Unsupervised materials discovery takes several steps (see Figure 1): feature representation and unsupervised clustering; validation of the statistical difference between clusters when projected onto the property space; and verification of the discovery by screening materials in the groups where good examples congregate. Because the clustering relies on abundant feature information, unsupervised learning directly circumvents the challenge posed by the scarcity of property data. In addition, the unsupervised model does not require the inclusion of all features that affect the target property. In fact, covering all possible physical factors affecting the conductivity across different length scales is not feasible, since many of them, such as grain size and processing conditions, may not be well characterized or even reported in the literature.
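The steps above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the procedure used in the paper: the two-dimensional features, the conductivity values, the choice of average-linkage agglomerative clustering, and the helper names `agglomerative` and `rank_clusters` are all hypothetical. Ranking clusters by the mean of the few known property values stands in for a proper statistical validation.

```python
from math import dist
from statistics import mean


def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering; returns one cluster label per point."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest average pairwise distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = mean(dist(points[i], points[j])
                         for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # merge cluster b into cluster a
    labels = [0] * len(points)
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels


def rank_clusters(labels, known):
    """Sort cluster ids by the mean of the sparse known property values, best first."""
    by_cluster = {}
    for i, value in known.items():
        by_cluster.setdefault(labels[i], []).append(value)
    return sorted(by_cluster, key=lambda k: mean(by_cluster[k]), reverse=True)


# Hypothetical toy data: abundant features for every material ...
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
            (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
# ... but measured conductivities (mS/cm, made up) for only a few of them.
known = {0: 1.2, 3: 1e-4, 4: 1e-3}

labels = agglomerative(features, n_clusters=2)   # step 1: cluster on features only
ranking = rank_clusters(labels, known)           # step 2: project onto property space
candidates = [i for i, c in enumerate(labels)    # step 3: screen unlabeled members
              if c == ranking[0] and i not in known]
print(labels, ranking, candidates)
```

Note that the property values never enter the clustering itself; they are used only afterwards to decide which cluster to screen first.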
In our recent work, we applied unsupervised materials discovery to find solid-state Li-ion conductors. The clustering was based on representations of Li-containing compounds as X-ray diffraction patterns of their anionic lattices. In the clustering step, all known Li-ion conductors, despite having distinct crystalline lattices and chemical compositions, were clustered into two groups. The candidates from these two groups were thoroughly screened using ab initio molecular dynamics (AIMD) simulations to give explicit predictions of their conductivities. While AIMD itself is computationally demanding, the much-narrowed list made the cost of screening affordable. In less than one year, the screening predicted sixteen promising new Li-ion conductors, comprising new structures, chemistries, and compositions significantly different from known systems, which now await experimental characterization.
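To make the idea of a diffraction-pattern representation concrete, here is a minimal sketch of one way to turn a list of diffraction peaks into a fixed-length vector that a clustering algorithm can compare. The text above does not specify how the patterns were vectorized or compared, so everything here is an assumption for illustration: the Gaussian broadening, the 2θ grid, the cosine similarity, the peak lists, and the helper names `pattern_vector` and `cosine_similarity`.

```python
from math import exp, sqrt


def pattern_vector(peaks, two_theta_min=10.0, two_theta_max=80.0,
                   step=0.5, width=0.5):
    """Broaden (two_theta, intensity) peaks with Gaussians on a regular grid
    and scale the profile so that its maximum is 1."""
    n = int(round((two_theta_max - two_theta_min) / step)) + 1
    grid = [two_theta_min + k * step for k in range(n)]
    profile = [sum(intensity * exp(-0.5 * ((x - t) / width) ** 2)
                   for t, intensity in peaks)
               for x in grid]
    top = max(profile)
    return [v / top for v in profile] if top > 0 else profile


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


# Hypothetical peak lists (2-theta in degrees, relative intensity).
peaks_a = [(20.0, 100.0), (35.0, 60.0)]
peaks_b = [(20.5, 100.0), (35.5, 55.0)]   # similar lattice, slightly shifted peaks
peaks_c = [(50.0, 100.0), (65.0, 40.0)]   # unrelated lattice

vec_a = pattern_vector(peaks_a)
vec_b = pattern_vector(peaks_b)
vec_c = pattern_vector(peaks_c)

sim_ab = cosine_similarity(vec_a, vec_b)
sim_ac = cosine_similarity(vec_a, vec_c)
print(f"similar lattices: {sim_ab:.2f}, unrelated lattices: {sim_ac:.2f}")
```

The broadening width controls how tolerant the comparison is to small peak shifts: with sharper peaks, two closely related lattices would look less alike.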
With this success in discovering new solid-state Li-ion conductors, we hope our unsupervised learning scheme demonstrates a powerful alternative to the widely adopted supervised approach for the discovery of other functional materials, especially when materials data are scarce.
These results were recently published in Nature Communications: “Unsupervised Discovery of Solid-State Lithium Ion Conductors”, Ying Zhang, Xingfeng He, Zhiqian Chen, Qiang Bai, Adelaide M. Nolan, Charles A. Roberts, Debasish Banerjee, Tomoya Matsunaga, Yifei Mo & Chen Ling, Nature Communications (2019), doi: 10.1038/s41467-019-13214-1. See the paper for more details.