Clustering

Scenarios

Clustering is widely used. In business, for example, it helps market analysts identify distinct consumer groups in a consumer database and summarize the consumption patterns or habits of each group. Clustering can be computationally expensive, however. When the K-means algorithm measures the distance between two sample vectors, a high dimension means that a large amount of data is involved in each computation, which consumes significant computing resources. Building on the hardware advantages of the Kunpeng architecture, Kunpeng BoostKit exploits the characteristics of Kunpeng cache blocks and keeps memory access and computation contiguous, which improves the cache hit ratio and reduces latency.
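To make the cost argument concrete, the following minimal sketch shows a Euclidean distance computation and a single K-means assignment step in plain Python. It is an illustration only and does not reproduce the Kunpeng BoostKit implementation or its cache optimizations; the function names `euclidean_distance`, `assign_to_nearest`, and `coordinate_operations` are hypothetical and chosen for this example.

```python
import math


def euclidean_distance(a, b):
    """Distance between two equal-length vectors.

    The loop touches every coordinate once, so the cost grows
    linearly with the dimension of the vectors.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_to_nearest(points, centroids):
    """One K-means assignment step (illustrative, unoptimized).

    Returns, for each point, the index of the closest centroid.
    """
    return [
        min(range(len(centroids)),
            key=lambda k: euclidean_distance(p, centroids[k]))
        for p in points
    ]


def coordinate_operations(n_points, n_centroids, n_dims):
    """Rough count of per-coordinate operations in one assignment step."""
    return n_points * n_centroids * n_dims


points = [[0.0, 0.0], [10.0, 10.0], [1.0, 1.0]]
centroids = [[0.0, 0.0], [10.0, 10.0]]
assignments = assign_to_nearest(points, centroids)
```

Under this rough count, one assignment step over 1,000 points, 10 centroids, and 128 dimensions already involves 1,280,000 per-coordinate operations, which is why the memory layout and cache behavior of the distance loop matter.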

Principles

Density-based spatial clustering of applications with noise (DBSCAN) is a density-based clustering algorithm. For a region of the clustering space to form part of a cluster, the number of objects it contains must be greater than or equal to a given threshold. DBSCAN handles noise effectively and can discover clusters of any shape.
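The density rule described above can be sketched in a few lines of plain Python. This is a textbook-style illustration under stated assumptions, not the Kunpeng BoostKit implementation: the parameter names `eps` (neighborhood radius) and `min_pts` (density threshold) are conventional but not taken from this document, the neighborhood count includes the point itself, and the brute-force neighbor search is quadratic in the number of points.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (includes i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep expanding
    return labels


sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(sample, eps=1.5, min_pts=3)
```

In this sample, the two groups of three nearby points each meet the threshold and form clusters, while the isolated point at (50, 50) has too few neighbors and is labeled as noise.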