Discovering Mathematical Objects of Interest - A Study of Mathematical Notations



Howard S. Cohl, Andre Greiner Petter, Moritz Schubotz, Corinna Breitinger, Fabian Muller, Akiko Aizawa, Bela Gipp


Mathematical notation, i.e., the writing system used to communicate concepts in mathematics, encodes valuable information for a variety of information search and retrieval systems. Yet, mathematical notations remain largely unused by today's systems. In this paper, we present the first in-depth study on the distributions of mathematical notation in two large scientific corpora: the open-access arXiv (2.5B mathematical objects) and the mathematical reviewing service zbMATH (61M mathematical objects). Our study lays a foundation for future research projects on mathematical information retrieval for large scientific corpora. Further, we demonstrate the relevance of our results to a variety of use cases, for example, assisting semantic extraction systems, improving scientific search engines, and facilitating specialized math recommendation systems. Our contributions are as follows: (1) we present the first distributional analysis of mathematical formulae on arXiv and zbMATH; (2) we retrieve relevant mathematical objects for given textual search queries (e.g., linking $P_{n}^{(\alpha, \beta)}\!\left(x\right)$ with 'Jacobi polynomial'); (3) we extend zbMATH's search engine by providing relevant mathematical formulae; and (4) we exemplify the applicability of the results by presenting auto-completion for math inputs as the first contribution to math recommendation systems. To expedite future research projects, we make our source code and the data available.
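As an illustration of contribution (2), the following is a minimal sketch of how a TF-IDF-style score could link a textual query to a mathematical object. It is not the authors' implementation: the toy corpus, the function name `rank_math_objects`, and the exact scoring formula (count in matching documents times log of inverse document frequency) are assumptions made for this example only.

```python
import math
from collections import Counter

# Hypothetical toy corpus: each entry is (document text, math objects in it).
CORPUS = [
    ("the jacobi polynomial satisfies a recurrence", [r"P_n^{(\alpha,\beta)}(x)", "x"]),
    ("orthogonality of the jacobi polynomial", [r"P_n^{(\alpha,\beta)}(x)", "x", "n"]),
    ("the riemann zeta function", [r"\zeta(s)", "x", "n"]),
    ("a note on prime numbers", ["n", "x"]),
]


def rank_math_objects(query, corpus):
    """Rank math objects for a text query with an assumed TF-IDF-style score."""
    n_docs = len(corpus)

    # Document frequency: in how many documents each object appears.
    df = Counter()
    for _, objects in corpus:
        df.update(set(objects))

    # Term frequency: occurrences in documents whose text contains the query.
    tf = Counter()
    for text, objects in corpus:
        if query.lower() in text.lower():
            tf.update(objects)

    # Objects that appear everywhere (such as "x") get an idf of zero.
    scores = {obj: count * math.log(n_docs / df[obj]) for obj, count in tf.items()}

    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for obj, score in rank_math_objects("jacobi polynomial", CORPUS):
        print(f"{score:.3f}  {obj}")
```

In this toy setting the Jacobi polynomial symbol ranks first for the query "jacobi polynomial", while ubiquitous identifiers such as `x` are pushed to the bottom because they occur in every document.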
Proceedings Title
Proceedings of the Web Conference 2020
Conference Dates
April 20-24, 2020
Conference Location
Taipei
Conference Title
The Web Conference 2020


Mathematical Objects of Interest, Mathematical Information Retrieval, Distributions of Mathematical Objects, Term Frequency-Inverse Document Frequency, Mathematical Search Engine


Cohl, H., Greiner Petter, A., Schubotz, M., Breitinger, C., Muller, F., Aizawa, A. and Gipp, B. (2020), Discovering Mathematical Objects of Interest - A Study of Mathematical Notations, Proceedings of the Web Conference 2020, Taipei, [online] (Accessed April 19, 2024)
Created March 31, 2020, Updated May 4, 2021