Abstract
We present a complete set of chemo-structural descriptors to significantly extend the applicability of machine learning (ML) in materials screening and in mapping the energy landscape of multicomponent systems. These descriptors differentiate between structural prototypes, which is not possible with the commonly used chemical-only descriptors. Specifically, we demonstrate that the combination of pairwise radial, nearest-neighbor, bond-angle, dihedral-angle, and core-charge distributions plays an important role in predicting formation energies, band gaps, static refractive indices, magnetic properties, and moduli of elasticity for three-dimensional materials, as well as exfoliation energies of two-dimensional (2D) layered materials. The training data consist of 24 549 bulk and 616 monolayer materials taken from the JARVIS-DFT database. We obtain very accurate ML models using a gradient-boosting algorithm. We then use the trained models to discover exfoliable 2D-layered materials satisfying specific property requirements. Additionally, we integrate our formation-energy ML model with a genetic algorithm for structure search to verify whether the ML model reproduces the density-functional-theory convex hull. This verification establishes a more stringent evaluation metric for the ML model than those commonly used in data science. Our learned model is publicly available on the JARVIS-ML website (https://www.ctcms.nist.gov/jarvisml) for property predictions of generalized materials.
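
To illustrate why a structural descriptor can separate prototypes that a chemical-only descriptor cannot, the following minimal sketch computes a normalized histogram of pairwise distances for two toy atom clusters with identical composition. It is not the descriptor implementation used in this work: the function name `radial_distribution`, the cutoff `r_max`, the bin count, and the two clusters are hypothetical, and periodic images are ignored for brevity.

```python
import math
from itertools import combinations


def radial_distribution(positions, r_max=5.0, n_bins=10):
    """Normalized histogram of pairwise distances below r_max.

    Simplifying assumption: an isolated cluster, so no periodic images.
    """
    width = r_max / n_bins
    counts = [0] * n_bins
    for a, b in combinations(positions, 2):
        d = math.dist(a, b)
        if d < r_max:
            counts[min(int(d / width), n_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


# Two hypothetical four-atom clusters of a single species:
# same composition, different geometry.
square = [(0, 0, 0), (2, 0, 0), (0, 2, 0), (2, 2, 0)]
chain = [(0, 0, 0), (2, 0, 0), (4, 0, 0), (6, 0, 0)]

rdf_square = radial_distribution(square)
rdf_chain = radial_distribution(chain)

# A composition-only descriptor would be identical for both clusters,
# whereas the radial histograms differ.
print(rdf_square != rdf_chain)
```

In a full workflow, vectors of this kind (together with the angle-based and charge-based distributions named above) would be concatenated into one feature vector per material and passed to a gradient-boosting regressor.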