|
|
|
|
density-based clustering in mapreduce with guarantees on parallel time, space, and solution quality
|
|
|
|
|
|
|
|
نویسنده
|
aghamolaei sepideh ,ghodsi mohammad
|
|
منبع
|
transactions on combinatorics - 2025 - دوره : 14 - شماره : 3 - صفحه:135 -156
|
|
چکیده
|
A well-known clustering problem called density-based spatial clustering of applications with noise (dbscan) involves computing the solutions of at least one disk range query per input point, computing the connected components of a graph, and bichromatic fixed-radius nearest neighbor. mapreduce class is a model where a sublinear number of machines, each with sublinear memory, run for a polylogarithmic number of parallel rounds. most of these problems either require quadratic time in the sequential model or are hard to compute in a constant number of rounds in mapreduce. in the euclidean plane, dbscan algorithms with near-linear time and a randomized parallel algorithm with a polylogarithmic number of rounds exist. we solve dbscan in the euclidean plane in a constant number of rounds in mapreduce, assuming the minimum number of points in range queries is constant and each connected component fits inside the memory of a single machine and has a constant diameter.
|
|
کلیدواژه
|
massively parallel algorithms ,range searching ,unit disk graph ,near neighbors ,clustering
|
|
آدرس
|
sharif university of technology, department of computer engineering, iran, sharif university of technology, department of computer engineering, iran. institute for research in fundamental sciences (ipm), school of computer science, iran
|
|
پست الکترونیکی
|
ghodsi@sharif.edu
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Authors
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|