Enhancing Materials Property Prediction by Leveraging Computational and Experimental Data using Deep Transfer Learning
Kamal Choudhary, Dipendra Jha, Ankit Agrawal, Alok Choudhary, Wei-keng Liao, Francesca M. Tavazza, Carelyn E. Campbell
The availability of large collections of data from density functional theory (DFT) computations has spurred the interest of materials scientists in applying machine learning techniques to build models for fast prediction of materials properties. Such modeling has helped accelerate materials discovery by providing significantly faster methods to screen candidate materials, thereby reducing the search space for DFT computations and experiments. However, since experimental datasets are limited in size, if available at all, such predictive models are generally built and evaluated using DFT-computed datasets. Consequently, in addition to their prediction error against DFT computations, they automatically inherit the error of DFT computations with respect to experiments. Here, we demonstrate that existing large DFT datasets can be leveraged, together with the available experimental data, using deep learning to build robust prediction models that are closer to experimental observations. We focus on learning the formation energy of materials from their compositions, using a deep neural network that enables transfer learning from an existing, large DFT dataset (OQMD) to other, smaller DFT datasets (JARVIS and the Materials Project) and to experimental observations. On an experimental dataset of 1,963 observations, the proposed approach yields a mean absolute error (MAE) of 0.063 eV/atom, which is significantly better than existing ML prediction models trained on DFT computations and comparable to the accuracy of DFT computation itself.
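The pretrain-then-fine-tune idea in the abstract can be sketched in miniature. The following is only an illustrative toy, not the paper's actual network or data: the tiny NumPy MLP, the synthetic "source" (DFT-like) and "target" (experiment-like) datasets, and the choice to freeze the hidden layer during fine-tuning are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic stand-ins for composition features and formation
# energies; the real work uses OQMD, JARVIS, Materials Project, and
# experimental data instead.
def make_data(n, shift=0.0):
    X = rng.normal(size=(n, 8))
    w = np.linspace(0.5, -0.5, 8)
    y = X @ w + shift + 0.05 * rng.normal(size=n)
    return X, y

X_src, y_src = make_data(2000)            # large "DFT-like" source dataset
X_tgt, y_tgt = make_data(100, shift=0.3)  # small "experiment-like" target set

class MLP:
    """One-hidden-layer regressor trained by full-batch gradient descent."""
    def __init__(self, d_in, d_hid):
        self.W1 = rng.normal(scale=0.1, size=(d_in, d_hid))
        self.b1 = np.zeros(d_hid)
        self.W2 = rng.normal(scale=0.1, size=d_hid)
        self.b2 = 0.0

    def forward(self, X):
        self.H = np.maximum(X @ self.W1 + self.b1, 0.0)  # ReLU hidden layer
        return self.H @ self.W2 + self.b2

    def step(self, X, y, lr, freeze_hidden=False):
        pred = self.forward(X)
        g = 2.0 * (pred - y) / len(y)        # d(MSE)/d(pred)
        if not freeze_hidden:                # transfer learning: optionally
            gH = np.outer(g, self.W2) * (self.H > 0)  # keep hidden weights fixed
            self.W1 -= lr * (X.T @ gH)
            self.b1 -= lr * gH.sum(axis=0)
        self.W2 -= lr * (self.H.T @ g)
        self.b2 -= lr * g.sum()

def mae(model, X, y):
    return float(np.abs(model.forward(X) - y).mean())

model = MLP(8, 32)
for _ in range(500):                         # pretrain on the large source set
    model.step(X_src, y_src, lr=0.05)
mae_before = mae(model, X_tgt, y_tgt)

for _ in range(200):                         # fine-tune only the output layer
    model.step(X_tgt, y_tgt, lr=0.05, freeze_hidden=True)
mae_after = mae(model, X_tgt, y_tgt)

print(mae_before, mae_after)
```

Fine-tuning only the last layer on the small target set lets the model absorb the systematic offset between the two data distributions while reusing the representation learned from the abundant source data, which is the essence of the transfer-learning setup the abstract describes.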