New Catalysis Descriptor Discovered by Interpretable Machine Learning

Symbolic regression, an interpretable machine learning approach, is used to derive a catalysis descriptor that predicts new oxide perovskites with improved oxygen evolution activity, as corroborated by experimental validation.

There has been long-standing interest in the field of catalysis in identifying descriptors. Conventional descriptors derived from human knowledge, which typically show volcano scaling, have been highly influential in the field (Figure 1). For example, the seminal work on eg occupancy as a descriptor for oxide perovskite catalysts [Science 334, 1383 (2011)] stimulated subsequent work such as the O p band level descriptor [Nat. Comm. 4, 2439 (2013)] and the combined descriptors of t2g and eg occupancies and pd interaction [Nat. Comm. 11, 652 (2020)].

Figure 1. A short summary of the development of catalysis descriptors.

Descriptors are concise relationships between structure (or composition) and properties. Conventional descriptors (t2g and eg occupancies, O p band level and pd interaction) are based on human knowledge of physics and chemistry: the interaction between the adsorbate and the catalyst should be neither too strong nor too weak (the Sabatier principle), which leads to volcano scaling. Both human-knowledge-based descriptors and volcano scaling are challenged by the machine learning approach in our paper in Nature Communications, "Simple descriptor derived from symbolic regression accelerating the discovery of new perovskite catalysts".

Figure 2. a. Schematic plot and b. flowchart of symbolic regression, a glass-box machine learning approach.

In this work, we synthesized and characterized over thirty oxides (23 perovskites and 11 non-perovskites), which were then studied by symbolic regression, an interpretable, glass-box machine learning approach (Figure 2). We derived an unprecedentedly simple descriptor, μ/t, where μ and t are the octahedral and tolerance factors, respectively. The performance of μ/t is comparable to that of the conventional eg occupancy descriptor (Figure 3). Since both μ and t are functions of ionic radii only, the descriptor removes the need for DFT calculations in catalyst design, making screening much faster and easier. The descriptor was then used to screen 3,545 candidates and identify four oxide perovskites with high oxygen evolution reaction (OER) activity. Among them, the Cs-containing oxide perovskites had not previously been reported in the literature but were successfully synthesized under the guidance of the new descriptor.
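To make concrete how little input the descriptor needs, here is a minimal sketch of evaluating μ/t from ionic radii alone. It assumes the standard textbook definitions for an ABO3 perovskite, μ = r_B / r_O and t = (r_A + r_O) / (√2 (r_B + r_O)); the function names and the example radii (Shannon-type values in Å for a generic Sr/Ti/O combination) are illustrative assumptions, not values or code from the paper.

```python
import math


def octahedral_factor(r_b: float, r_o: float) -> float:
    """Octahedral factor mu = r_B / r_O (standard definition, assumed here)."""
    if r_b <= 0 or r_o <= 0:
        raise ValueError("ionic radii must be positive")
    return r_b / r_o


def tolerance_factor(r_a: float, r_b: float, r_o: float) -> float:
    """Goldschmidt tolerance factor t = (r_A + r_O) / (sqrt(2) * (r_B + r_O))."""
    if r_a <= 0 or r_b <= 0 or r_o <= 0:
        raise ValueError("ionic radii must be positive")
    return (r_a + r_o) / (math.sqrt(2) * (r_b + r_o))


def mu_over_t(r_a: float, r_b: float, r_o: float) -> float:
    """Descriptor mu/t computed purely from the three ionic radii."""
    return octahedral_factor(r_b, r_o) / tolerance_factor(r_a, r_b, r_o)


# Illustrative radii in angstroms (assumed values; check them against a
# tabulated source for the correct oxidation state and coordination number).
R_A = 1.44   # A-site cation, e.g. Sr2+ in 12-fold coordination
R_B = 0.605  # B-site cation, e.g. Ti4+ in 6-fold coordination
R_O = 1.40   # oxide anion

print(f"mu   = {octahedral_factor(R_B, R_O):.3f}")
print(f"t    = {tolerance_factor(R_A, R_B, R_O):.3f}")
print(f"mu/t = {mu_over_t(R_A, R_B, R_O):.3f}")
```

Because each evaluation is a handful of arithmetic operations, looping such a function over thousands of candidate A/B combinations is trivial compared with running a DFT calculation for each one.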

Figure 3. Comparison of the eg and μ/t descriptors based on independent experimental data. a, Fig. 2 from [Science 334, 1383 (2011)], reproduced with permission from the American Association for the Advancement of Science. b, The same data replotted against the descriptor μ/t. The MAE (Pearson correlation coefficient) values for a and b were 20.6 meV (0.923) and 21.0 meV (0.928), respectively.

The work, titled “Simple Descriptor Derived from Symbolic Regression Accelerating the Discovery of New Perovskite Catalysts”, appeared on July 14, 2020 in the journal Nature Communications.
