Abdou Youssef
Department of Computer Science, George Washington University and NIST
Thursday, August 10, 15:00 - 16:00
Building 101, Lecture Room D
Gaithersburg
Thursday, August 10, 13:00 - 14:00
Room 4549
Boulder
Abstract: Nearly all mathematical literature is now online and mostly in natural-language form. Therefore, math content processing presents some of the same challenges faced in natural language processing (NLP), such as math disambiguation and math semantics determination. These challenges must be surmounted to enable more effective math knowledge discovery and management, automated presentation-to-computation (P2C) conversion, automated math reasoning, and more. To meet this goal, considerable math language processing (MLP) technology is needed. This talk will present a project being undertaken at NIST and GW to develop a sophisticated part-of-math (POM) tagger. This tagger, upon reading a mathematical expression, determines the nature, role, and other semantics of each mathematical symbol and sub-expression in the input expression. In the process, considerable and challenging ambiguities are encountered and resolved using mathematical-linguistic and machine learning techniques, including classification, topic modeling, and context modeling. The project outcomes will enable researchers to develop new advanced applications such as (1) techniques for computer-aided semantic enrichment of digital math libraries; (2) automated conversion of math expressions from natural form to (i) a machine-computable form and (ii) a formal form suitable for automated reasoning; (3) math question-answering capabilities at the manuscript level and collection level; (4) richer math GUIs; (5) more accurate math OCR; (6) more effective math search; and other applications.
Bio: Abdou Youssef is a professor and former chair of the Department of Computer Science at the George Washington University, where he has been on the faculty for 30 years. He also holds a faculty appointment in the ACM Division of ITL at NIST, where he has played a leading role in the research and development of math search for the DLMF. A holder of a Bachelor’s degree in Math from the Lebanese University and Master’s and PhD degrees in Computer Science from Princeton University, Dr. Youssef has published over 125 papers in a number of research areas, including math search, math language processing, image/video/audio processing, and high-performance computing. His current research interests include math language processing and Big Data applications of Machine Learning and Natural Language Processing.
Note: Visitors from outside NIST must contact Cathy Graham, (301) 975-3800, at least 24 hours in advance.