Improved visual clustering of large multi-dimensional data sets.
Abstract
Lowering the computational cost of data analysis techniques is an essential step towards including the user in the process and making algorithms scale to large-scale visualization. In this paper we present an improved algorithm for visual clustering of large multi-dimensional data sets. The algorithm is a lower-cost version of IPCLUS, an approach that deals efficiently with multi-dimensionality by using various projections of the data to perform multi-space clustering, pruning outliers through direct user interaction. The algorithm presented here, named HC-Enhanced, adds a level of scalability to the approach without reducing clustering quality. Additionally, an algorithm for improving the resulting clusters is added to the approach. A number of test cases are presented, with good results.