The problem of optimally removing a set of vertices from a graph to minimize the size of the largest resultant component is known to be NP-complete. Prior work has provided near-optimal heuristics with high time complexity that function on up to hundreds of nodes, and less optimal but faster techniques that function on up to thousands of nodes. In this work, we analyze how to perform vertex partitioning on massive graphs of tens of millions of nodes. We use a previously known and very simple heuristic: iteratively removing the node of largest degree along with all of its edges. This approach has an apparent quadratic complexity because, after a node and its adjoining edges are removed, the node degrees must be updated before the next node is chosen. However, we describe a linear-time solution using an array whose indices map to node degrees and whose values are hash tables indicating the presence or absence of a node at each degree value. We empirically demonstrate linear scalability on random graphs of up to 15 000 nodes and evaluate the tradeoffs between memory usage and runtime. We then demonstrate tractability on massive graphs by executing the approach on a graph with 34 million nodes representing Internet-wide router connectivity.
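The degree-indexed array described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `remove_highest_degree_nodes`, the edge-list input, and the fixed removal budget `k` are assumptions made for illustration, and Python sets stand in for the hash tables. Because removing a node can only lower its neighbors' degrees, the pointer to the highest non-empty bucket never moves upward, which is what keeps the total work proportional to the number of nodes plus edges (with expected constant-time set operations).

```python
def remove_highest_degree_nodes(edges, k):
    """Remove up to k nodes, always taking one of current maximum degree.

    Hypothetical helper for illustration. Returns the removed nodes in
    removal order. Ties are broken arbitrarily.
    """
    # Build an undirected adjacency map, ignoring self-loops.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if not adj:
        return []

    # buckets[d] is the set (hash table) of nodes whose current degree is d.
    max_deg = max(len(nbrs) for nbrs in adj.values())
    buckets = [set() for _ in range(max_deg + 1)]
    for node, nbrs in adj.items():
        buckets[len(nbrs)].add(node)

    removed = []
    cur = max_deg  # highest bucket that may still be non-empty
    while len(removed) < k:
        # Degrees only decrease, so this pointer only ever moves down.
        while cur >= 0 and not buckets[cur]:
            cur -= 1
        if cur < 0:
            break  # no nodes left
        node = buckets[cur].pop()
        removed.append(node)
        # Move each neighbor down one bucket and drop the removed edge.
        for nbr in adj[node]:
            d = len(adj[nbr])
            buckets[d].remove(nbr)
            buckets[d - 1].add(nbr)
            adj[nbr].discard(node)
        del adj[node]
    return removed


if __name__ == "__main__":
    # Node 0 has degree 4 and is removed first; the second pick is a tie.
    sample = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)]
    print(remove_highest_degree_nodes(sample, 2))
```

The source does not specify a tie-breaking rule or a stopping condition here, so the sketch picks arbitrarily among equal-degree nodes and stops after `k` removals or when the graph is empty.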
International Journal of Computer Science: Theory and Application
, Harang, R.
and Gueye, A.
Linear Time Vertex Partitioning on Massive Graphs, International Journal of Computer Science: Theory and Application, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=919446
(Accessed December 2, 2023)