SAMMON_DATA
Sample M-Dimensional Datasets for Cluster Analysis


SAMMON_DATA is a MATLAB program which generates 6 files of test data for multivariate data clustering.

Usage:

sammon_data

Licensing:

The computer code and data files described and made available on this web page are distributed under the GNU LGPL license.

Languages:

SAMMON_DATA is available in a MATLAB version.

Related Data and Programs:

ASA113, a MATLAB library which implements the Banfield and Bassill clustering algorithm using transfers and swaps.

ASA136, a MATLAB library which implements the Hartigan and Wong clustering algorithm.

CLUSTER_ENERGY, a FORTRAN90 program which groups data into a given number of clusters to minimize the energy.

KMEANS, a MATLAB library which contains several different algorithms for the K-Means problem.

MARTINEZ, a dataset directory which contains datasets for computational statistics;

MDS, a dataset directory which contains datasets for multidimensional scaling;

PCL, a dataset directory which contains datasets from a gene expression experiment on Arabidopsis, which are candidates for data cluster analysis;

SPAETH, a dataset directory which contains datasets for cluster analysis;

SPAETH, a FORTRAN90 library which can cluster data according to various principles.

SPAETH2, a dataset directory which contains datasets for cluster analysis;

SPAETH2, a FORTRAN90 library which can cluster data according to various principles.

Reference:

  1. Ronald Fisher,
    The use of multiple measurements in taxonomic problems,
    Annals of Eugenics,
    Volume 7, part II, 1936, pages 179-188.
  2. John Sammon,
    A nonlinear mapping for data structure analysis,
    IEEE Transactions on Computers,
    Volume C-18, Number 5, May 1969, pages 401-409.


Last revised on 20 September 2010.