As 2018 Winter quarter begins, our team published a new research article in Transactions in GIS:
Gengchen Mai, Krzysztof Janowicz, Yingjie Hu, Song Gao. ADCN: An Anisotropic Density-Based Clustering Algorithm for Discovering Spatial Point Patterns with Noise. Transactions in GIS. DOI:10.1111/tgis.12313
The illustration of ADCN algorithm, an opensource implementation, a test environment as well as the test data are available here. This research article is a major component of the master thesis of Gengchen Mai who just got his M.A. degree last year.
Here is the abstract of the paper:
Density-based clustering algorithms such as DBSCAN have been widely used for spatial knowledge discovery as they offer several key advantages compared with other clustering algorithms. They can discover clusters with arbitrary shapes, are robust to noise, and do not require prior knowledge (or estimation) of the number of clusters. The idea of using a scan circle centered at each point with a search radius Eps to find at least MinPts points as a criterion for deriving local density is easily understandable and sufficient for exploring isotropic spatial point patterns. However, there are many cases that cannot be adequately captured this way, particularly if they involve linear features or shapes with a continuously changing density, such as a spiral.In such cases, DBSCAN tends to either create an increasing number of small clusters or add noise points into large clusters. Therefore, in this article, we propose a novel anisotropic density-based clustering algorithm (ADCN). To motivate our work, we introduce synthetic and real-world cases that cannot be handled sufficiently by DBSCAN(or OPTICS). We then present our clustering algorithm and test it with a wide range of cases. We demonstrate that our algorithm can perform equally as well as DBSCAN in cases that do not benefit explicitly from an anisotropic perspective, and that it outperforms DBSCAN in cases that do. Finally, we show that our approach has the same time complexity as DBSCAN and OPTICS, namely O(n log n) when using a spatial index and O(n2) otherwise. We provide an implementation and test the runtime over multiple cases.