Quantifying Pairwise Similarity for Complex Polymers

Jiale Shi; Nathan Rebello; Dylan Walsh; Michael Deagen; Bruno Salomao Leao; Debra Audus; Bradley Olsen

Published

September 6, 2023

Abstract

Defining the similarity between chemical entities is an essential task in polymer informatics, enabling ranking, clustering, and classification. Despite its importance, pairwise chemical similarity for polymers remains an open problem. Here, a similarity function for polymers with well-defined backbones is designed based on polymers' stochastic graph representations generated from canonical BigSMILES, a structurally-based line notation for describing macromolecules. The stochastic graph representations are separated into three parts: repeat units, end groups, and polymer topology. The earth mover's distance is utilized to calculate the similarity of the repeat units and end groups, while the graph edit distance is used to calculate the similarity of the topology. These three values can be linearly or nonlinearly combined to yield an overall pairwise chemical similarity score for polymers that is largely consistent with the chemical intuition of expert users and is adjustable based on the relative importance of different chemical features for a given similarity problem. This method gives a reliable solution to quantitatively calculate the pairwise chemical similarity score for polymers and represents a vital step toward building search engines and quantitative design tools for polymer data.
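The final step described in the abstract, combining the three component scores into one overall score, can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes each component similarity (repeat units, end groups, topology) has already been computed and scaled to [0, 1]. The function names, the example scores, and the weights are hypothetical. A weighted arithmetic mean stands in for a linear combination and a weighted geometric mean for one possible nonlinear combination.

```python
from math import prod


def combine_linear(scores, weights):
    """Weighted arithmetic mean of component similarity scores in [0, 1]."""
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total


def combine_geometric(scores, weights):
    """Weighted geometric mean of component similarity scores in [0, 1].

    Unlike the arithmetic mean, a score of zero in any component with a
    positive weight forces the overall score to zero.
    """
    total = sum(weights)
    return prod(s ** (w / total) for w, s in zip(weights, scores))


# Hypothetical component scores, in the order:
# repeat units, end groups, topology.
scores = (0.9, 0.5, 1.0)

# Hypothetical weights that de-emphasize end groups for this comparison.
weights = (1.0, 0.2, 1.0)

overall_linear = combine_linear(scores, weights)
overall_nonlinear = combine_geometric(scores, weights)
```

Changing the weights is how a user would express the relative importance of each chemical feature for a particular similarity problem. The choice between the two combinations depends on whether a complete mismatch in one component should dominate the overall score.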

Citation

Macromolecules

Pub Type

Journals

DOI

https://doi.org/10.1021/acs.macromol.3c00761

Polymers, Materials and Data and informatics

Citation

Shi, J., Rebello, N., Walsh, D., Deagen, M., Salomao Leao, B., Audus, D. and Olsen, B. (2023), Quantifying Pairwise Similarity for Complex Polymers, Macromolecules, [online], https://doi.org/10.1021/acs.macromol.3c00761, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=936441 (Accessed July 17, 2026)

Created September 6, 2023, Updated September 15, 2023
