Beyond Sequence Similarity: The Case for Function-Based Screening of Nucleic Acid Synthesis

Author(s)

Gary Able, Tessa Alexanian, Jacob Beal, James Diggans, Kevin Flyangolts, Gene D. Godbold, Eric Horvitz, Bin Hu, Caitlin Jagla, Rassin Lababidi, Brittany Rife Magalis, Sebastian Rivera, Bruce Wittmann, Samuel Forry, David Ross, Sheng Lin-Gibson, Samuel Curtis

Abstract

Synthetic nucleic acids are a key input to modern biotechnology, yet they represent dual-use materials that require robust screening to mitigate biosecurity risks. The prevailing screening paradigm identifies sequences of concern (SoCs) through sequence similarity to a database of controlled pathogens and toxins. This paradigm is being challenged by artificial intelligence (AI)-enabled biodesign tools that can decouple biomolecular function from reliance on known sequences, generating genes and proteins that may evade sequence-based detection. We highlight the critical need for function-based screening approaches that can detect sequences capable of hazardous biological functions, regardless of similarity to known SoCs. We examine the challenges and feasibility of function-based screening with an initial focus on proteins, arguing that while protein sequence space is combinatorially vast, the subset of biologically functional proteins is constrained by biophysical, biochemical, and information-theoretic requirements that can be learned and modeled. We then propose a concrete implementation framework organized along a continuum of complexity. We advocate a hybrid architecture in which function-specific models complement existing sequence-based screening, starting with toxins as the most tractable targets before expanding to more complex pathogenic functions. We describe a prioritized development strategy in which secure research institutions generate training data and develop models, while screening tool developers and synthesis providers integrate and deploy these models for production screening. Over time, screening can incorporate increasingly generalized function prediction as models and ontologies mature. We conclude by identifying open challenges, including standardizing definitions of functions of concern, building evaluation infrastructure, and expanding experimental characterization of hazardous functions, and by proposing research priorities to address them.
Citation
Frontiers in Bioengineering and Biotechnology

Keywords

sequence screening, sequence of concern

Citation

Able, G., Alexanian, T., Beal, J., Diggans, J., Flyangolts, K., Godbold, G., Horvitz, E., Hu, B., Jagla, C., Lababidi, R., Magalis, B., Rivera, S., Wittmann, B., Forry, S., Ross, D., Lin-Gibson, S. and Curtis, S. (2026), Beyond Sequence Similarity: The Case for Function-Based Screening of Nucleic Acid Synthesis, Frontiers in Bioengineering and Biotechnology, [online], https://doi.org/10.3389/fbioe.2026.1832724, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=961809 (Accessed May 21, 2026)

Created May 13, 2026, Updated May 20, 2026