Extracting Mathematical Concepts from Text



Jacob Collard, Valeria de Paiva, Brendan Fong, Eswaran Subrahmanian


We investigate several systems for extracting mathematical entities from texts in category theory, a field of mathematics, as a first step toward constructing a mathematical knowledge graph. We consider four term extractors and compare their results. This small experiment highlights some of the issues that arise in constructing and evaluating domain-specific knowledge graphs. We also make available two open corpora of research mathematics, both in category theory: a small corpus of 755 abstracts from the journal TAC (3,188 sentences) and a larger corpus from the nLab community wiki (15,000 sentences).
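
As an illustration of what comparing the results of several term extractors can look like, the following is a minimal sketch that measures pairwise overlap between the term sets produced by different extractors. The Jaccard measure, the normalization step, the function names, and the sample terms are all assumptions made for this example; the abstract does not say which comparison method or extractors the authors used.

```python
from itertools import combinations


def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so surface variants compare equal."""
    return " ".join(term.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap of two term sets; taken to be 1.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def compare_extractors(outputs: dict[str, list[str]]) -> dict[tuple[str, str], float]:
    """Return the Jaccard overlap for every pair of extractors."""
    normalized = {
        name: {normalize(t) for t in terms} for name, terms in outputs.items()
    }
    return {
        (x, y): jaccard(normalized[x], normalized[y])
        for x, y in combinations(sorted(normalized), 2)
    }


# Hypothetical extractor names and outputs, for illustration only.
outputs = {
    "extractor_a": ["Functor", "natural transformation", "monoidal category"],
    "extractor_b": ["functor", "monoidal  category", "adjunction"],
}

scores = compare_extractors(outputs)
for pair, score in scores.items():
    print(pair, round(score, 2))
```

A set-overlap score of this kind only shows how much two extractors agree with each other. It says nothing about which extracted terms are correct, which is one reason evaluation is difficult without a gold-standard term list.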
Proceedings Title
Proceedings of the 29th International Conference on Computational Linguistics
Conference Dates
October 12-17, 2022
Conference Location
Gyeongju, KR
Conference Title
29th International Conference on Computational Linguistics


category theory, natural language processing, evaluation, mathematics, technical language processing, key phrase extraction


Collard, J., de Paiva, V., Fong, B. and Subrahmanian, E. (2022), Extracting Mathematical Concepts from Text, Proceedings of the 29th International Conference on Computational Linguistics, Gyeongju, KR, [online], (Accessed April 22, 2024)
Created October 12, 2022, Updated May 10, 2023