Multilingual Summarization: Dimensionality Reduction and a Step Towards Optimal Term Coverage
Published
Author(s)
Yi-Kai Liu, John M. Conroy, Sashka T. Davis, Jeff Kubina, Dianne P. O'Leary, Judith D. Schlesinger
Abstract
In this paper we present three term-weighting approaches for multi-lingual document summarization and give results on the DUC 2002 data as well as on the 2013 Multilingual Wikipedia feature articles data set. We introduce a new interval-bounded nonnegative matrix factorization. We use this new method, latent semantic analysis (LSA), and latent Dirichlet allocation (LDA) to give three term-weighting methods for multi-document multi-lingual summarization. Results on DUC and TAC data, as well as on the MultiLing 2013 data, demonstrate that these methods are very promising, since they achieve oracle coverage scores in the range of humans for 6 of the 10 test languages.
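As a rough illustration of what LSA-based term weighting can look like, the sketch below weights each term by the norm of its row in a rank-k SVD approximation of a term-by-sentence matrix. This is a generic construction and not necessarily the weighting used in the paper; the function name `lsa_term_weights`, the normalization to unit sum, and the toy matrix are all assumptions made for the example.

```python
import numpy as np


def lsa_term_weights(A, k):
    """Return one nonnegative weight per term (row of A), summing to 1.

    Assumption for illustration: a term's weight is the Euclidean norm of
    its coordinates in the k-dimensional latent space of the truncated SVD.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    # Row i of U[:, :k] * s[:k] holds term i's coordinates in the latent space.
    term_coords = U[:, :k] * s[:k]
    weights = np.linalg.norm(term_coords, axis=1)
    total = weights.sum()
    return weights / total if total > 0 else weights


# Hypothetical term-by-sentence count matrix (rows: terms, columns: sentences).
A = np.array([
    [2, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
], dtype=float)

weights = lsa_term_weights(A, k=2)
```

With `k` equal to the full rank of the matrix, the weights reduce to the normalized row norms of `A` itself; smaller `k` discards the weakest latent directions before the norms are taken.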
Liu, Y., Conroy, J., Davis, S., Kubina, J., O'Leary, D. and Schlesinger, J. (2013), Multilingual Summarization: Dimensionality Reduction and a Step Towards Optimal Term Coverage, MultiLing 2013, Sofia, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=914103 (Accessed October 27, 2025)