NIST Genome Editing Lexicon

Summary

Genome editing technologies are transforming the biosciences and biotechnology and are actively used to advance product development, including in medicine. Standardized terms and definitions are needed in this field to support accurate communication of concepts, data, and results.

Standards in the field of genome editing will harmonize and accelerate effective communication, technology development, qualification, and evaluation of genome editing products. This lexicon was developed to provide a unified reference set of terms and technical definitions that standardizes their use and meaning to serve the needs of the biotechnology community. It is expected to improve confidence in and clarify scientific communication, data reporting, and data interpretation in the genome editing field.

It is recognized that in rare instances exceptions may exist for some definitions within specific applications.

This Lexicon has been developed into an ISO Standard for Genome Editing Vocabulary: ISO 5058-1:2021 Biotechnology — Genome editing — Part 1: Vocabulary. This lexicon is also available as an ontology through BioPortal.

Description

The definitions are worded with the intention that additional context may be added with supplementary language when they are used. It is also recognized that genome editing is a rapidly evolving biotechnology and additional terms and definitions will be needed as genome editing technologies mature.

ORGANIZATION OF THE LEXICON

This document provides a vocabulary that standardizes the use and meaning of terms associated with genome editing. This document is organized into categories and sub-categories as follows:

1. Genome editing concepts

2. Genome editing tools

2.1 General

2.2 CRISPR specific

2.3 Meganuclease specific

2.4 megaTAL specific

2.5 TALEN specific

2.6 ZFN specific

3. Genome editing outcomes

Terms within categories are listed alphabetically. In the Genome editing tools section, the sub-category “General” contains terms that apply to all types of genome editing tools. Additional sub-categories contain terms specific to the sub-category of genome editing technology: “CRISPR specific”, “meganuclease specific”, “megaTAL specific”, “TALEN specific” and “ZFN specific”. A glossary listing all terms alphabetically precedes the Terms and definitions.

Glossary of terms in alphabetical order

Term	Term number
Cas nuclease	2.2.1
Cas nuclease target site	2.2.2
CRISPR associated nuclease	2.2.1
CRISPR RNA	2.2.3
CRISPR target strand	2.2.8
crRNA	2.2.3
Cys2His2 zinc finger	2.6.1
DNA edit	3.1
DNA, RNA, or epigenome edit	3.1
edit	3.1
epigenome edit	3.1
gene editing	1.1
genome editing	1.2
genome editing off-target	1.4
genome editing target	1.6
genome editing target specificity	1.5
genome engineering	1.3
gRNA	2.2.4
guide RNA	2.2.4
HDR	3.2
homology-directed repair	3.2
indel	3.3
InDel mutation	3.3
intended edit	3.4
meganuclease	2.3.1
meganuclease linker	2.3.2
meganuclease single chain	2.3.3
meganuclease target site	2.3.4
megaTAL	2.4.1
megaTAL linker	2.4.2
megaTAL target site	2.4.3
microhomology-mediated end joining repair	3.5
MMEJ	3.5
NHEJ	3.6
non-homologous end joining	3.6
off-target	1.4
PAM	2.2.5
protospacer adjacent motif	2.2.5
repair template	2.1.1
repeat variable diresidue	2.5.1
ribonucleoprotein	2.2.6
RNA edit	3.1
RNP	2.2.6
RVDs	2.5.1
sequence-specific nuclease	2.1.3
sgRNA	2.2.7
single-guide RNA	2.2.7
site-directed DNA modification enzyme	2.1.2
site-directed nuclease	2.1.3
specificity	1.5
TALEN	2.5.2
TALEN linker	2.5.3
TALEN target site	2.5.4
target	1.6
target strand	2.2.8
tracrRNA	2.2.9
trans-activating CRISPR RNA	2.2.9
transcription activator-like effector nuclease	2.5.2
unintended edit	3.7
ZFN	2.6.2
ZFN linker	2.6.3
ZFN recognition helix	2.6.4
ZFN target site	2.6.5
ZFP	2.6.6
zinc finger	2.6.1
zinc finger nuclease	2.6.2
zinc finger protein	2.6.6
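
Because several glossary entries are synonyms or abbreviations that share one term number, the table can be treated as a simple lookup structure. The following is a minimal Python sketch over a small excerpt of the rows above. The names `GLOSSARY_EXCERPT` and `synonyms_by_number` are illustrative and are not part of the lexicon.

```python
from collections import defaultdict

# Excerpt of the glossary table above (term -> term number).
GLOSSARY_EXCERPT = {
    "Cas nuclease": "2.2.1",
    "CRISPR associated nuclease": "2.2.1",
    "gRNA": "2.2.4",
    "guide RNA": "2.2.4",
    "HDR": "3.2",
    "homology-directed repair": "3.2",
    "off-target": "1.4",
    "genome editing off-target": "1.4",
    "PAM": "2.2.5",
    "protospacer adjacent motif": "2.2.5",
}


def synonyms_by_number(glossary):
    """Group terms that share a term number (synonyms and abbreviations)."""
    grouped = defaultdict(list)
    for term, number in glossary.items():
        grouped[number].append(term)
    # Sort case-insensitively so abbreviations and full names order predictably.
    return {number: sorted(terms, key=str.lower) for number, terms in grouped.items()}


groups = synonyms_by_number(GLOSSARY_EXCERPT)
print(groups["2.2.4"])  # ['gRNA', 'guide RNA']
```

Looking up a term number this way returns every accepted form of the term, which can help when normalizing terminology across reports.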

Terms and definitions

1. Genome editing concepts

1.1

gene editing

techniques for genome engineering (1.3) that involve nucleic acid damage, repair mechanisms, replication and/or recombination for incorporating site-specific modification(s) into a gene or genes

Note 1 to entry: Gene editing is a subclass of genome editing (1.2).

Note 2 to entry: There are various genome editing tools (see 2. Genome editing tools).

1.2

genome editing

techniques for genome engineering (1.3) that involve nucleic acid damage, repair mechanisms, replication and/or recombination for incorporating site-specific modification(s) into genomic DNA

Note 1 to entry: Gene editing (1.1) is a subclass of genome editing.

Note 2 to entry: There are various genome editing tools (see 2. Genome editing tools).

1.3

genome engineering

process of introducing intentional changes to genomic nucleic acid

Note 1 to entry: Gene editing (1.1) and genome editing (1.2) are techniques used in genome engineering.

1.4

off-target
genome editing off-target

genomic position and/or nucleic acid sequence distinct from the target (1.6)

EXAMPLE:

Off-target binding, off-target cleavage, off-target edit, off-target sequence change.

Note 1 to entry: An off-target edit is an example of an unintended edit (3.7).

1.5

specificity
genome editing target specificity

extent to which an editing agent or procedure acts only on its intended target (1.6)

Note 1 to entry: When using this term, the procedure is defined, the intended target is defined, the action or outcome is measured and reported, and limits of detection are reported.

1.6

target
genome editing target

nucleic acid sequence subject to intentional binding, modification and/or cleavage during a genome editing (1.2) process

Note 1 to entry: See also off-target (1.4), Cas nuclease target site (2.2.2), meganuclease target site (2.3.4), megaTAL target site (2.4.3), TALEN target site (2.5.4) and ZFN target site (2.6.5).

2. Genome editing tools

2.1 General

2.1.1

repair template

nucleic acid sequence used to direct cellular DNA repair pathways to incorporate specific DNA sequence changes at or near a target (1.6)

2.1.2

site-directed DNA modification enzyme

enzyme capable of modifying DNA at a specific sequence

EXAMPLE: Site-directed nuclease (2.1.3), site-directed adenosine deaminase.

2.1.3

site-directed nuclease
sequence-specific nuclease

enzyme capable of cleaving the phosphodiester bond between adjacent nucleotides in a nucleic acid polymer at a specific sequence

2.2 CRISPR specific

2.2.1

Cas nuclease
CRISPR associated nuclease

enzyme that is a component of CRISPR systems that is capable of breaking the phosphodiester bonds between nucleotides

EXAMPLE:

Cas3, Cas9, Cas12a, Cas13, CasX.

Note 1 to entry: Some but not all Cas nucleases interact with a gRNA (2.2.4). See also crRNA (2.2.3), sgRNA (2.2.7) and tracrRNA (2.2.9).

2.2.2

Cas nuclease target site

nucleotide sequence comprising the PAM (2.2.5), in most cases, and a region that hybridizes to the target-sequence-specific guide of a Cas RNP (2.2.6)

2.2.3

crRNA
CRISPR RNA

polyribonucleotide that includes sequence complementarity to the target (1.6) and a sequence that interacts with a Cas protein and optionally tracrRNA (2.2.9)

Note 1 to entry: crRNA is a component of gRNA (2.2.4) or a complete gRNA, depending on the CRISPR system.

Note 2 to entry: In some CRISPR systems, a portion of the crRNA will base-pair with the tracrRNA (e.g. Cas9). Other CRISPR systems lack tracrRNA (e.g. Cas12a/Cpf1). In systems that do not require tracrRNA, the gRNA is called a “CRISPR RNA” or simply “crRNA”.

2.2.4

gRNA
guide RNA

polyribonucleotide containing regions sufficient for productive interaction with a Cas nuclease (2.2.1) or variant to direct interaction with the specific target (1.6)

Note 1 to entry: See crRNA (2.2.3), tracrRNA (2.2.9) and sgRNA (2.2.7).

Note 2 to entry: For Cas9-type proteins, the natural gRNA comprises a crRNA that imparts sequence specificity and the tracrRNA that interacts with and activates the protein. This is sometimes referred to as a “dual guide”. Other Cas proteins can have different gRNA structures.

Note 3 to entry: sgRNA for Cas9 proteins are non-naturally occurring polyribonucleotides where the crRNA and tracrRNA are fused with an artificial linker.

Note 4 to entry: In some cases, chemical modifications of the polyribonucleotide are used, such as modifications to the phosphodiester linkages, bases or sugar moieties. These can include substitution of DNA (2′-deoxy) or 2′-methoxy nucleotides for RNA nucleotides, etc.

2.2.5

PAM
protospacer adjacent motif

short nucleotide motif in the targeted region of nucleic acid required for guided Cas nuclease (2.2.1) or variant binding

Note 1 to entry: PAMs are distinct from, but in close proximity to, nucleic acid sequence targeted by gRNA (2.2.4).
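
The relationship between a PAM and the adjacent gRNA-targeted sequence can be illustrated with a short scan for candidate sites. This is a minimal Python sketch under assumptions that are not stated in the lexicon: a 5′-NGG-3′ PAM located immediately 3′ of a 20-nucleotide targeted sequence, as is commonly described for Streptococcus pyogenes Cas9. The function name, parameters and example sequence are hypothetical, and other Cas nucleases use different PAMs and geometries.

```python
import re


def find_candidate_target_sites(sequence, protospacer_len=20, pam_regex="[ACGT]GG"):
    """Return (start, protospacer, pam) for each PAM-adjacent site on the given strand.

    Assumes the PAM lies immediately 3' of the targeted sequence (SpCas9-like).
    Only the supplied strand is scanned; the reverse complement is not searched.
    """
    sequence = sequence.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs are all reported.
    for match in re.finditer(f"(?=({pam_regex}))", sequence):
        pam_start = match.start()
        start = pam_start - protospacer_len
        if start < 0:
            continue  # not enough sequence 5' of the PAM for a full-length site
        sites.append((start, sequence[start:pam_start], match.group(1)))
    return sites


# Placeholder sequence for illustration only.
example_sites = find_candidate_target_sites("ACGTACGTACGTACGTACGTTGGA")
print(example_sites)  # [(0, 'ACGTACGTACGTACGTACGT', 'TGG')]
```

The PAM is returned separately from the 20-nucleotide sequence, reflecting that it is distinct from, but adjacent to, the sequence targeted by the gRNA.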

2.2.6

RNP
ribonucleoprotein

complex comprising protein bound to RNA

Note 1 to entry: In the context of CRISPR-based genome editing (1.2), RNP refers to the complex of Cas protein(s) and gRNA (2.2.4).

2.2.7

sgRNA
single-guide RNA

fusion of crRNA (2.2.3) and tracrRNA (2.2.9)

Note 1 to entry: See gRNA (2.2.4).

2.2.8

target strand
CRISPR target strand

single-stranded nucleic acid sequence that is complementary to the gRNA (2.2.4) of a Cas protein or variant

2.2.9

tracrRNA
trans-activating CRISPR RNA

polyribonucleotide that base-pairs with the crRNA (2.2.3) and interacts with a Cas nuclease (2.2.1) to enable sequence-specific interaction with the target (1.6)

Note 1 to entry: tracrRNA is an optional component of gRNA (2.2.4).

2.3 Meganuclease specific

2.3.1

meganuclease

variant of the LAGLIDADG subtype of homing endonucleases engineered to recognize a 15 to 40 base pair DNA target (1.6) different from the site recognized by the parent endonuclease

Note 1 to entry: The LAGLIDADG consensus sequence represents an alpha helix that serves as a dimerization interface and key component in the DNA cleavage site in this family of meganucleases.

2.3.2

meganuclease linker

natural or artificially derived polypeptide sequence that links two LAGLIDADG domains to one another to form a single polypeptide chain

Note 1 to entry: The LAGLIDADG consensus sequence represents an alpha helix that serves as a dimerization interface and key component in the DNA cleavage site in this family of meganucleases (2.3.1).

2.3.3

meganuclease single chain

meganuclease (2.3.1) composed of two LAGLIDADG domains joined by either a natural or artificially derived polypeptide linker in order to be expressed as a single polypeptide chain

Note 1 to entry: The LAGLIDADG consensus sequence represents an alpha helix that serves as a dimerization interface and key component in the DNA cleavage site in this family of meganucleases.

2.3.4

meganuclease target site

DNA sequence recognized by meganucleases (2.3.1)

Note 1 to entry: Meganuclease target sites are 15 to 40 base pair DNA sequences consisting of two equal length half sites separated by a 4 base pair middle sequence (also known as “central 4”). Cleavage occurs at the junction of the half sites and the middle site on each DNA strand, leaving a 4 nucleotide 3′ overhang.
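
The layout described in Note 1 (two equal length half sites around a 4 base pair middle sequence) can be expressed as a simple split. The following Python sketch is illustrative only: the function name is hypothetical and the example string is a placeholder, not a real recognition sequence.

```python
def split_meganuclease_target_site(site):
    """Split a target site into (left half site, central 4, right half site)."""
    length = len(site)
    if not 15 <= length <= 40:
        raise ValueError("expected a 15 to 40 base pair site")
    if (length - 4) % 2 != 0:
        raise ValueError("half sites must be of equal length")
    half = (length - 4) // 2
    return site[:half], site[half:half + 4], site[half + 4:]


# Placeholder 22 bp site: 9 bp half site, 4 bp middle sequence, 9 bp half site.
parts = split_meganuclease_target_site("AAAAAAAAACCCCTTTTTTTTT")
print(parts)  # ('AAAAAAAAA', 'CCCC', 'TTTTTTTTT')
```

Because the half sites are of equal length, only even site lengths within the 15 to 40 base pair range pass the check in this sketch.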

2.4 megaTAL specific

2.4.1

megaTAL

artificial chimeric nucleases composed of an array of transcription activator-like (TAL) effector (TALE) [1] DNA binding domains, a megaTAL linker (2.4.2) and a meganuclease (2.3.1)

2.4.2

megaTAL linker

amino acid sequence that links an array of TAL DNA binding domains and a meganuclease (2.3.1)

2.4.3

megaTAL target site

intended DNA binding site of a megaTAL (2.4.1), encompassing the DNA sequence targeted by both the TAL array and the meganuclease (2.3.1)

2.5 TALEN specific

2.5.1

RVDs
repeat variable diresidue

two amino acid sequence in TAL repeats that imparts DNA binding specificity

2.5.2

TALEN
transcription activator-like effector nuclease

artificial nuclease composed of an endodeoxyribonuclease fused to DNA-binding domains of TALEs [1] that cleaves DNA at a defined distance from TALE recognition sequences

Note 1 to entry: A TALEN can refer to a pair of TALE-FokI fusion proteins that dimerize on opposite strands of DNA adjacent to a target (1.6) for cleavage.

2.5.3

TALEN linker

polypeptide sequence that links an array of TAL DNA binding domains and an endodeoxyribonuclease, typically FokI

2.5.4

TALEN target site

DNA sequence recognized by TALENs (2.5.2)

Note 1 to entry: Typical TALEN target sites are recognized by a pair of TALENs and contain a central spacer region flanked by upstream and downstream sequences that are each recognized by one TALEN. This pair is designed in such a way that two TALEN nuclease domains dimerize to cleave DNA within the spacer region.
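
The paired layout in Note 1 (two recognition sequences on opposite strands flanking a spacer) can be sketched as a search on one strand. This Python example is a rough illustration, not a design tool: the function names are hypothetical, the sequences are placeholders, and the default spacer bounds of 12 to 20 base pairs are assumed values chosen for the example rather than figures from the lexicon. The same layout applies to the ZFN target site (2.6.5).

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_paired_site(top_strand, left_site, right_site, min_spacer=12, max_spacer=20):
    """Locate a left/right recognition pair flanking a spacer on the top strand.

    left_site is read 5'->3' on the top strand. right_site is read 5'->3' on the
    bottom strand, so its reverse complement is searched for on the top strand.
    Returns (left_start, spacer, right_end) or None if no pair fits.
    """
    top = top_strand.upper()
    left = left_site.upper()
    right_on_top = reverse_complement(right_site)
    left_start = top.find(left)
    while left_start != -1:
        spacer_start = left_start + len(left)
        for spacer_len in range(min_spacer, max_spacer + 1):
            right_start = spacer_start + spacer_len
            if top.startswith(right_on_top, right_start):
                return left_start, top[spacer_start:right_start], right_start + len(right_on_top)
        left_start = top.find(left, left_start + 1)
    return None


# Placeholder sequences for illustration only.
top = "GG" + "TCCATGAC" + "A" * 14 + "GTCAGGTA" + "CC"
paired = find_paired_site(top, "TCCATGAC", "TACCTGAC")
print(paired)  # (2, 'AAAAAAAAAAAAAA', 32)
```

The returned spacer is the region within which the dimerized nuclease domains would be expected to cleave.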

2.6 ZFN specific

2.6.1

zinc finger
Cys2His2 zinc finger

DNA binding domain that folds via coordination of zinc into a compact structure consisting of two beta strands and one alpha-helix (β β α)

Note 1 to entry: Zinc finger DNA binding domains typically contain 28 amino acids.

2.6.2

ZFN
zinc finger nuclease

chimeric protein consisting of an array of zinc fingers (2.6.1) linked to a DNA cleavage domain

Note 1 to entry: FokI is prevalently used as the DNA cleavage domain bound to a zinc finger.

Note 2 to entry: Binding of two ZFNs to a pair of appropriately spaced DNA target sites enables nuclease domain dimerization and DNA cleavage between the targets.

2.6.3

ZFN linker

polypeptide sequence that links an array of zinc finger (2.6.1) binding domains and a DNA cleavage domain

Note 1 to entry: FokI is prevalently used as the DNA cleavage domain bound to a zinc finger.

2.6.4

ZFN recognition helix

seven residue positions within a zinc finger (2.6.1) that are most directly responsible for its DNA binding preference

Note 1 to entry: The seven residues comprise the first six residues of the alpha helix, along with the residue immediately preceding the N-terminus of the helix. They are typically referred to as positions +1 to +6 (within the alpha helix) and position (−1) (immediately preceding the helix).
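
The numbering convention in Note 1 can be made concrete with a small mapping from positions to residues. In this Python sketch, the function name, the `helix_start` parameter (the 0-based index of the first helix residue) and the example amino acid string are all hypothetical. Where the helix begins in a real zinc finger has to come from structural or alignment data.

```python
def recognition_helix_positions(finger_sequence, helix_start):
    """Map positions -1 and +1 to +6 to residues of a zinc finger.

    helix_start is the 0-based index of the first alpha-helix residue (+1).
    """
    if helix_start < 1 or helix_start + 6 > len(finger_sequence):
        raise ValueError("helix_start leaves no room for positions -1 to +6")
    # Position -1 is the residue immediately preceding the helix.
    positions = {-1: finger_sequence[helix_start - 1]}
    for offset in range(6):
        positions[offset + 1] = finger_sequence[helix_start + offset]
    return positions


# Placeholder amino acid string for illustration only.
helix = recognition_helix_positions("GGRSDELTRHGG", helix_start=3)
print(helix)  # {-1: 'R', 1: 'S', 2: 'D', 3: 'E', 4: 'L', 5: 'T', 6: 'R'}
```

There is no position 0 in this convention, which is why the mapping jumps from −1 to +1.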

2.6.5

ZFN target site

DNA sequence recognized by a pair of ZFNs (2.6.2)

Note 1 to entry: Typical ZFN target sites contain a central spacer region flanked by DNA sequences that are each recognized by an array of zinc fingers (2.6.1) oriented such that the ZFN nuclease domains dimerize and cleave within the spacer.

2.6.6

ZFP
zinc finger protein

DNA binding protein consisting of a tandem array of multiple zinc fingers (2.6.1)

3. Genome editing outcomes

3.1

edit
DNA edit
RNA edit
epigenome edit
DNA, RNA or epigenome edit

modification to nucleic acid sequence resulting from the application of genome editing (1.2) components

EXAMPLE:

Insertion, deletion, substitution, deamination, methylation, demethylation.

Note 1 to entry: Genome editing components can include a nuclease and repair template (2.1.1).

3.2

HDR
homology-directed repair

mechanism of recombinational DNA repair [2] where repair is templated by a polynucleotide with regions corresponding to sequences flanking the target (1.6)

EXAMPLE:

Single-stranded DNA oligonucleotide templated HDR.

Note 1 to entry: Repair templates (2.1.1) can be exogenously introduced to achieve sequence changes in genome editing (1.2) approaches.
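
The idea of a template with regions corresponding to sequences flanking the target can be sketched as two homology arms placed around the desired sequence change. The following Python example is a simplified illustration: the function name, parameters, default arm length and sequences are hypothetical, and practical template design involves considerations that are not covered here.

```python
def build_repair_template(genomic_seq, target_start, target_end, new_seq, arm_len=40):
    """Return left arm + new sequence + right arm for a simple repair template.

    genomic_seq[target_start:target_end] is the region to be replaced by new_seq.
    arm_len is an illustrative default, not a recommended value.
    """
    if not 0 <= target_start <= target_end <= len(genomic_seq):
        raise ValueError("target coordinates fall outside the sequence")
    if target_start < arm_len or target_end + arm_len > len(genomic_seq):
        raise ValueError("not enough flanking sequence for the requested arms")
    left_arm = genomic_seq[target_start - arm_len:target_start]
    right_arm = genomic_seq[target_end:target_end + arm_len]
    return left_arm + new_seq + right_arm


# Placeholder sequence for illustration only.
template = build_repair_template("AAAACCCCGGGGTTTT", 6, 10, "TA", arm_len=4)
print(template)  # AACCTAGGTT
```

The arms match the genomic sequence on either side of the replaced region, which is what allows repair to be templated by the introduced polynucleotide.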

3.3

indel
InDel mutation

sequence change caused by the insertion and/or deletion of nucleotides
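
One simple way to describe such a sequence change is to trim the bases shared by the reference and edited sequences and report what remains. This Python sketch is a simplified illustration with a hypothetical function name and placeholder sequences. It compares two short strings directly, does no alignment, and would report a substitution as equal-length deleted and inserted segments.

```python
def describe_indel(reference, edited):
    """Trim the shared prefix and suffix; what remains was deleted or inserted."""
    shortest = min(len(reference), len(edited))
    prefix = 0
    while prefix < shortest and reference[prefix] == edited[prefix]:
        prefix += 1
    suffix = 0
    # The suffix may not overlap the prefix already consumed.
    while suffix < shortest - prefix and reference[-1 - suffix] == edited[-1 - suffix]:
        suffix += 1
    return {
        "position": prefix,
        "deleted": reference[prefix:len(reference) - suffix],
        "inserted": edited[prefix:len(edited) - suffix],
    }


deletion = describe_indel("ACGTTGCA", "ACGGCA")
insertion = describe_indel("ACGT", "ACAAGT")
print(deletion)   # {'position': 3, 'deleted': 'TT', 'inserted': ''}
print(insertion)  # {'position': 2, 'deleted': '', 'inserted': 'AA'}
```

When the change falls in a repeated sequence, the reported position depends on the trimming order, so this sketch places the change at the leftmost position that cannot be explained by the shared prefix.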

3.4

intended edit

designed modification to a target (1.6) resulting from the application of genome editing (1.2) components

Note 1 to entry: See edit (3.1).

Note 2 to entry: Genome editing components can include a nuclease and repair template (2.1.1).

3.5

MMEJ
microhomology-mediated end joining repair

mechanism of DNA end-joining repair [3] where the DNA ends are rejoined to each other using short regions of homology flanking the initiating double-stranded break to align the ends for repair

Note 1 to entry: MMEJ repair of DNA breaks in genome editing (1.2) approaches can result in deletion between pairs of microhomology regions.

Note 2 to entry: Short regions of homology for MMEJ are typically 2 to 25 base pairs.
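
The deletion outcome in Note 1 can be illustrated by searching for matching short sequences on either side of a break and joining the ends at the match. This Python sketch is a simplified model under stated assumptions: the function name, the `window` parameter and the example sequence are hypothetical, only exact matches are considered, and no attempt is made to estimate how likely each outcome is. The length bounds follow the 2 to 25 base pair range in Note 2.

```python
def predict_mmej_deletions(sequence, break_pos, min_len=2, max_len=25, window=30):
    """List candidate MMEJ deletion products around a double-strand break.

    break_pos is the 0-based index of the first base 3' of the break.
    Each product keeps one copy of the microhomology and loses the other copy
    together with the sequence between the two copies.
    """
    seq = sequence.upper()
    left_lo = max(0, break_pos - window)
    right_hi = min(len(seq), break_pos + window)
    found = {}
    # Longer microhomologies are visited first so they label each product.
    for k in range(max_len, min_len - 1, -1):
        for i in range(left_lo, break_pos - k + 1):        # left copy ends at or before the break
            left_copy = seq[i:i + k]
            for j in range(break_pos, right_hi - k + 1):   # right copy starts at or after the break
                if seq[j:j + k] == left_copy:
                    product = seq[:i] + seq[j:]
                    found.setdefault(product, {
                        "microhomology": left_copy,
                        "deleted_bp": j - i,
                        "product": product,
                    })
    return list(found.values())


# Placeholder sequence with "ACGT" on both sides of a break at index 8.
outcomes = predict_mmej_deletions("TTACGTGGGGACGTCC", break_pos=8)
print(outcomes[0])  # {'microhomology': 'ACGT', 'deleted_bp': 8, 'product': 'TTACGTCC'}
```

In the example, the product retains a single copy of ACGT and loses the 8 base pairs spanning the other copy and the intervening sequence, which is the deletion between a pair of microhomology regions described in Note 1.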

3.6

NHEJ
non-homologous end joining

mechanism of DNA end-joining repair [3] in which DNA ends are joined in a homology-independent manner

Note 1 to entry: NHEJ repair of DNA breaks in genome editing (1.2) workflows can result in indel (3.3) formation.

3.7

unintended edit

modification to nucleic acid, either at the target (1.6) that is not the designed change or at an off-target (1.4), resulting from the application of genome editing (1.2) components

Note 1 to entry: See edit (3.1).

Note 2 to entry: Genome editing components can include a nuclease and repair template (2.1.1).

Symbols and abbreviated terms

bp – base pairs

DNA – deoxyribonucleic acid

CRISPR – clustered regularly interspaced short palindromic repeats

RNA – ribonucleic acid

TAL – transcription activator-like

TALE – transcription activator-like effector

REFERENCES

[1] U.S. National Library of Medicine. MeSH Descriptor Data 2020: Transcription Activator-Like Effectors. Available from: https://meshb.nlm.nih.gov/record/ui?name=TRANSCRIPTION%20ACTIVATOR-LIKE%20EFFECTORS

[2] U.S. National Library of Medicine. MeSH Descriptor Data 2020: Recombinational DNA Repair. Available from: https://meshb.nlm.nih.gov/record/ui?ui=D059767

[3] U.S. National Library of Medicine. MeSH Descriptor Data 2020: DNA End-Joining Repair. Available from: https://meshb.nlm.nih.gov/record/ui?ui=D059766

Created November 18, 2020, Updated March 26, 2025