Marilyn Yazmin Vazquez Landrove
George Mason University
Tuesday, July 24, 15:00 - 16:00
Building 227 Room A202
Gaithersburg
Tuesday, July 24, 13:00 - 14:00
Building 1 Room 4072
Boulder
Host: Gunay Dogan
Abstract: Data clustering is a fundamental task for discovering patterns in data, and is central to machine learning. Often a given high-dimensional data set lives on a lower-dimensional manifold, and is sampled according to a probability measure. Then the clusters can be defined as peaks in the sampled probability density, and a clustering algorithm would need to identify the peaks in the density to compute the clusters. The challenges in this approach are non-uniform sampling of the density and bridges between peaks of the density. To solve these problems, we propose a new clustering algorithm that divides the clustering problem into three steps: picking the right threshold on the sample density to separate the peaks, clustering the points that passed the threshold, and classifying the remaining points. We explain the key details of these steps, and provide theoretical assurances on the performance. As an important application, we show how to apply this method to segment microstructure images by considering the images as a point-cloud of image patches. We present results on 2D microscopic images of various materials.
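The three steps named in the abstract (threshold the sample density, cluster the points above the threshold, classify the rest) can be illustrated with a rough sketch. This is not the speaker's algorithm. It is a generic stand-in that assumes a k-nearest-neighbor density proxy, a quantile threshold, connected components of a fixed-radius graph, and nearest-neighbor classification. The function name and the parameters k, keep_fraction, and radius are hypothetical and chosen only for illustration.

```python
import numpy as np

def three_step_clustering(points, k=5, keep_fraction=0.5, radius=1.0):
    """Illustrative threshold / cluster / classify pipeline (all parameters hypothetical)."""
    n = len(points)
    # Pairwise Euclidean distances (adequate for small point clouds; requires n > k).
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    # Density proxy: inverse distance to the k-th nearest neighbor
    # (column 0 of each sorted row is the point itself).
    kth = np.sort(dists, axis=1)[:, k]
    density = 1.0 / (kth + 1e-12)

    # Step 1: threshold the density and keep roughly the densest fraction of points.
    threshold = np.quantile(density, 1.0 - keep_fraction)
    core = np.flatnonzero(density >= threshold)

    # Step 2: cluster the kept points as connected components of a radius graph.
    labels = np.full(n, -1)
    next_label = 0
    for seed in core:
        if labels[seed] != -1:
            continue
        labels[seed] = next_label
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in core[dists[i, core] <= radius]:
                if labels[j] == -1:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1

    # Step 3: classify each remaining point by the label of its nearest kept point.
    rest = np.flatnonzero(labels == -1)
    if rest.size:
        nearest = core[np.argmin(dists[np.ix_(rest, core)], axis=1)]
        labels[rest] = labels[nearest]
    return labels
```

For image segmentation as described in the abstract, points would hold one row per image patch. How patches are extracted and compared is not specified here, so that part is left out of the sketch.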
Bio: Marilyn Yazmin Vazquez Landrove received a B.A. and a B.S. from California State University, Long Beach in 2013. She received her M.S. in Mathematics from George Mason University in 2015 and is finishing her Ph.D. at the same institution.
Note: Visitors from outside NIST must contact Cathy Graham, (301) 975-3800, at least 24 hours in advance.