A multi-dimensional evaluation of the 'NIST 1032' sample set across four forensic Y-STR multiplexes



Becky Steffen, Tuende Huszar, Lisa Borsuk, Peter Vallone, Katherine Gettings


This manuscript reports Y-chromosomal short tandem repeat (Y-STR) haplotypes for 1032 male U.S. population samples across 30 Y-STR loci characterized by three capillary electrophoresis (CE) length-based kits (PowerPlex Y23 System, Yfiler Plus PCR Amplification Kit, and Investigator Argus Y-28 QS Kit) and one sequence-based kit (ForenSeq DNA Signature Prep Kit): DYF387S1, DYS19, DYS385 a/b, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS449, DYS456, DYS458, DYS460, DYS481, DYS505, DYS518, DYS522, DYS533, DYS549, DYS570, DYS576, DYS612, DYS627, DYS635, DYS643, and Y-GATA-H4. The length-based Y-STR haplotypes include six loci that are not reported in the sequence-based kit (DYS393, DYS449, DYS456, DYS458, DYS518, and DYS627), whereas three loci included in the sequence-based kit are not present in length-based kits (DYS505, DYS522, and DYS612). For the latter, a custom multiplex was used to generate CE length-based data, allowing 1032 samples to be evaluated for concordance across the 30 Y-STR loci included in these four commercial Y-STR typing kits. Discordances between typing methods were analyzed further to assess underlying causes such as primer binding site mutations and flanking region insertions/deletions. Allele-level frequency and statistical information is provided for sequenced loci, excluding the multi-copy loci DYF387S1 and DYS385 a/b, for which locus-specific haplotype-level frequencies are provided instead. The resulting data reveals the degree of information gained through sequencing: 88 % of sequenced Y-STR loci contain additional sequence-based alleles compared to length-based data, with the DYS389II locus containing the most additional alleles (51) observed by sequencing. Despite these allelic increases, only minimal improvement was observed in haplotype resolution by sequence, with all four commercial kits providing a similar ability to differentiate length-based haplotypes in this sample set. 
Finally, a subset of 369 male samples was compared with the corresponding father samples, which were also sequenced. The comparison revealed the sequence basis for the 50 length-based changes observed and no additional sequence-based mutations. GenBank accession numbers are reported for each unique sequence, and associated records are available in the STRSeq Y-Chromosomal STR Loci NCBI BioProject, accession PRJNA380347. Haplotype data for the 'NIST 1032' data set is updated in the Y-STR Haplotype Reference Database (YHRD) and now reaches the YHRD maximal haplotype level. All supplementary files, including revisions to previously published Y-STR data, are available in the NIST Public Data Repository: U.S. population data for human identification markers, DOI 10.18434/t4/1500024.
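The distinction drawn above between allele gain and haplotype resolution can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the four samples, the sequence labels "A" and "B", and the helper names `allele_key`, `distinct_alleles`, and `discrimination_capacity` are hypothetical and are not taken from the NIST 1032 data or the supplementary files. Each call is modeled as a (length, sequence label) pair; length-based typing sees only the length, while sequence-based typing sees both. Resolution is summarized here as the fraction of samples with a distinct haplotype, which may differ from the statistics used in the manuscript.

```python
# Hypothetical toy data, invented for illustration only.
# Each sample maps locus -> (length-based allele, sequence label).
samples = {
    "S1": {"DYS389II": (29, "A"), "DYS391": (10, "A")},
    "S2": {"DYS389II": (29, "B"), "DYS391": (10, "A")},
    "S3": {"DYS389II": (30, "A"), "DYS391": (11, "A")},
    "S4": {"DYS389II": (30, "A"), "DYS391": (11, "A")},
}


def allele_key(call, by_sequence):
    """Return the allele as seen by sequencing (length + sequence) or by CE (length only)."""
    length, sequence = call
    return (length, sequence) if by_sequence else length


def distinct_alleles(samples, locus, by_sequence):
    """Set of distinct alleles observed at one locus under the chosen typing method."""
    return {allele_key(hap[locus], by_sequence) for hap in samples.values()}


def discrimination_capacity(samples, by_sequence):
    """Number of distinct haplotypes divided by the number of samples."""
    loci = sorted(next(iter(samples.values())))
    haplotypes = {
        tuple(allele_key(hap[locus], by_sequence) for locus in loci)
        for hap in samples.values()
    }
    return len(haplotypes) / len(samples)


for locus in ("DYS389II", "DYS391"):
    by_length = len(distinct_alleles(samples, locus, by_sequence=False))
    by_seq = len(distinct_alleles(samples, locus, by_sequence=True))
    print(locus, "additional alleles by sequence:", by_seq - by_length)

print("length-based resolution:", discrimination_capacity(samples, by_sequence=False))
print("sequence-based resolution:", discrimination_capacity(samples, by_sequence=True))
```

In this toy set, sequencing splits one length-based allele at one locus and separates one pair of otherwise identical haplotypes. In a larger set where most length-based haplotypes are already unique, the same allele gain would change the resolution figure far less, which is consistent with the pattern the abstract describes.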
Forensic Science International: Genetics


short tandem repeat, Y-STR typing, DNA sequencing, NGS, MPS, haplotype frequency


Steffen, B., Huszar, T., Borsuk, L., Vallone, P. and Gettings, K. (2021), A multi-dimensional evaluation of the 'NIST 1032' sample set across four forensic Y-STR multiplexes, Forensic Science International: Genetics, [online] (Accessed July 22, 2024)



Created December 28, 2021, Updated November 29, 2022