HARTIGAN
Clustering Algorithm Datasets


HARTIGAN is a dataset directory which contains test data for clustering algorithms.

The data files are all text files and share a common, simple format.

Licensing:

The computer code and data files described and made available on this web page are distributed under the GNU LGPL license.

Related Data and Programs:

MDS, a dataset directory which contains datasets for M-dimensional scaling;

PCL, a dataset directory which contains datasets from a gene expression experiment on Arabidopsis, which are candidates for data cluster analysis;

SAMMON, a dataset directory which contains six sets of M-dimensional data for cluster analysis;

SOKAL_ROHLF, a dataset directory which contains biological datasets considered by Sokal and Rohlf;

SPAETH, a dataset directory which contains datasets for cluster analysis;

SPAETH2, a dataset directory which contains datasets for cluster analysis;

STATS, a dataset directory which contains datasets for computational statistics;

TRIOLA, a dataset directory which contains datasets used for statistical analysis.

Reference:

  1. John Hartigan,
    Clustering Algorithms,
    Wiley, 1975,
    LC: QA278.H36,
    ISBN: 0-471-35645-X.

Datasets:



Last revised on 06 March 2012.