Cryogenic electron microscopy (Cryo-EM) has become a critical measurement instrument for understanding molecular structures in biomanufacturing applications, for instance during the design and mass production of high-quality vaccines against the coronavirus SARS-CoV-2. However, the utility of Cryo-EM depends on sample preparation, the quality of the Cryo-EM images, and our ability to process and interpret the terabyte-sized image collections acquired per biological sample. Quality metrics are therefore needed for assessing 2D and 3D image-based measurements of samples. In our work, we focus on the image-based measurement quality of coronavirus vaccines that package messenger ribonucleic acid (mRNA) in lipid nanoparticles (LNPs).
Manufacturing a vaccine that packages mRNA in lipid nanoparticles requires a full characterization of the concentrations and distributions of mRNA and LNPs, as well as of all manufacturing conditions. Such experiments generate hundreds of terabytes (TBs) of images and pose challenges for trusted automated measurements and scalable computational algorithms. Our goal is to design quality metrics that allow researchers to select high-quality images and to eliminate any derived measurements that are affected by:
sample contamination (crystals of ice, partial surface freeze, presence of carbon edges),
distribution of LNPs (empty field of view, leopard or splotchy artifact patterns),
imaging (out-of-focus images, small signal-to-noise ratio), and
algorithmic model and sub-optimal parameter selection (contaminated and insufficient training data in supervised models, wrong assumptions in unsupervised models, hidden hard-coded parameters).
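As one illustration of what an image-level quality metric for the imaging category (out-of-focus images) might look like, the sketch below scores each image by the variance of a discrete Laplacian and ranks images from sharpest to blurriest. This is a generic image-processing heuristic chosen here for illustration, not the metric developed in this work; the function names `laplacian_variance` and `rank_by_focus` are hypothetical. The score is also sensitive to noise, so on low signal-to-noise Cryo-EM micrographs it would likely need to be combined with other measures.

```python
import numpy as np


def laplacian_variance(img):
    """Focus proxy: variance of a 4-neighbour discrete Laplacian.

    Higher values indicate more high-frequency content (sharper image).
    Assumption: img is a 2D grayscale array of at least 3x3 pixels.
    """
    img = np.asarray(img, dtype=np.float64)
    # Laplacian on interior pixels: sum of the 4 neighbours minus 4x the centre.
    lap = (
        img[:-2, 1:-1] + img[2:, 1:-1]
        + img[1:-1, :-2] + img[1:-1, 2:]
        - 4.0 * img[1:-1, 1:-1]
    )
    return float(lap.var())


def rank_by_focus(images):
    """Return image indices ordered from highest to lowest focus score."""
    scores = [laplacian_variance(im) for im in images]
    return sorted(range(len(images)), key=lambda i: scores[i], reverse=True)
```

A threshold on such a score could flag candidate out-of-focus images for review, although any threshold would have to be calibrated on annotated data.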
Our approach leverages supervised methods that rely on training data. The training data vary in representation and content (images annotated with LNPs, image classes defined by quality). The current challenges lie in designing supervised artificial intelligence (AI) models that can learn from small training sets, assign quality-based image ranks, identify LNPs (i.e., particle picking), classify LNPs as containing or lacking mRNA, and ultimately reconstruct 3D molecular structures to understand interactions with other molecules.