SPAETH2
Cluster Analysis Tools
SPAETH2 is a FORTRAN90 library which analyzes data by grouping it into clusters.
The current implementation of the code is "under development": some things work, and some don't.
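As a rough illustration of what "grouping data into clusters" means, here is a minimal Python sketch of a K-Means style iteration. It is not the library's FORTRAN90 code or interface: the function name kmeans, its arguments, the 0-based cluster labels, and the choice to seed the centers with the first k points are all assumptions made for this example.

```python
def kmeans(points, k, max_iter=100):
    """Plain Lloyd-style K-Means on a list of equal-length tuples.

    Hypothetical helper for illustration; not the SPAETH2 KMEANS interface.
    """
    # Assumption: seed the centers with the first k points (deterministic).
    centers = [tuple(p) for p in points[:k]]
    assign = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center
        # (squared Euclidean distance).
        new_assign = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_assign == assign:
            break  # no point changed cluster, so the iteration has converged
        assign = new_assign
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, c in zip(points, assign) if c == j]
            if members:  # an empty cluster keeps its old center
                centers[j] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return assign, centers
```

Calling kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], 2) assigns the first two points to cluster 0 and the last two to cluster 1. The result of this kind of iteration depends on the starting centers, so a different seeding rule can give a different clustering.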
Licensing:
The computer code and data files made available on this web page
are distributed under
the GNU LGPL license.
Languages:
SPAETH2 is available in
a FORTRAN77 version and
a FORTRAN90 version.
Related Data and Programs:
ASA058,
a FORTRAN90 library which
implements the K-means algorithm of Sparks.
ASA136,
a FORTRAN90 library which
implements the Hartigan and Wong clustering algorithm.
CITIES,
a dataset directory which
contains sets of information about cities and the distances
between them;
CITIES,
a FORTRAN90 library which
handles various problems associated with a set of "cities" on a map.
KMEANS,
a FORTRAN90 library which
contains several different algorithms for the K-Means problem.
LAU_NP,
a FORTRAN90 library which
implements heuristic algorithms for various NP-hard combinatorial problems.
POINT_MERGE,
a FORTRAN90 library which
considers N points in M-dimensional space, and counts or indexes
the unique or "tolerably unique" items.
SPAETH,
a FORTRAN90 library which
can cluster data according to various principles.
SPAETH,
a dataset directory which
contains datasets for cluster analysis;
SPAETH2,
a dataset directory which
contains datasets for cluster analysis;
Reference:

Helmuth Spaeth,
Cluster Analysis Algorithms
for Data Reduction and Classification of Objects,
Ellis Horwood, 1980,
QA278 S6813.

Helmuth Spaeth,
Cluster Dissection and Analysis:
Theory, FORTRAN Programs, Examples,
Ellis Horwood, 1985,
QA278 S68213.
List of Routines:

CH_CAP capitalizes a single character.

CH_EQI is a case-insensitive comparison of two characters for equality.

CH_TO_DIGIT returns the integer value of a base 10 digit.

CLUDIA clusters data for which a distance matrix has been supplied.

CLUSTA solves the multiple location problem in N dimensions.

CLUSTER_CENTROIDS determines the centroids of a clustering.

CLUSTER_MEDIANS determines the medians of a clustering.

CLUSTER_MEDIAN_DISTANCE finds the cluster median distance.

CLUSTER_POPULATION sets the cluster populations from the assignment array.

CLUSTER_VARIANCE determines the variances associated with a clustering.

COLPER seeks a column permutation which maximizes the "bond energy".

DATA_D_READ reads a real data set stored in a file.

DATA_D_PRINT prints a real data set.

DATA_D_SHOW makes a typewriter plot of a real data set.

DATA_SIZE counts the size of a data set stored in a file.

DIF_INVERSE returns the inverse of the second difference matrix.

DISMEA constructs a set of hierarchical clusters.

DIVGOW constructs a set of hierarchical clusters by doubling.

EMEANS clusters data using a variant of the K-Means algorithm for L1 norms.

GET_UNIT returns a free FORTRAN unit number.

HIERCL implements seven agglomerative hierarchical clustering methods.

HMEANS clusters data using the H-Means algorithm.

I4_FACTORIAL computes the factorial N!

I4_SWAP swaps two integer values.

I4VEC_INDICATOR sets an integer vector to the indicator vector A(I)=I.

I4VEC_PERML generates permutations of a vector in lexicographic order.

I4VEC_PERMS generates permutations of a vector in lexicographic order.

JOINER uses a very simple cluster assignment algorithm.

KMEANS clusters data using the K-Means algorithm.

LEADER uses a very simple cluster assignment algorithm.

LINKER constructs a minimal tree for a symmetric distance matrix.

ORDERED clusters one-dimensional ordered data into NC clusters.

PROFILE seeks an optimal variable ordering for a set of data.

R8_SWAP swaps two R8's.

R8MAT_DET computes the determinant of an R8MAT.

R8MAT_PRINT prints an R8MAT.

R8VEC_ASCENDS determines if a double precision vector is (weakly) ascending.

R8VEC_SORT_BUBBLE_A ascending bubble sorts an R8VEC.

RANDP randomly partitions a set of M items into N clusters.

S_TO_R8 reads an R8 from a string.

S_WORD_COUNT counts the number of "words" in a string.

STANDN solves the single location problem in N dimensions.

TIMESTAMP prints the current YMDHMS date as a time stamp.

TRANSF transforms a data set to have zero mean and unit variance.

URAND returns a pseudorandom number uniformly distributed in [0,1].

WMEANS clusters data using the determinant criterion.

ZWEIGO organizes a set of data into two clusters.
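Several of the routines listed above describe simple bookkeeping on a data set and its cluster assignment array: scaling to zero mean and unit variance (as TRANSF is described), and deriving populations, centroids and variances from the assignments (as CLUSTER_POPULATION, CLUSTER_CENTROIDS and CLUSTER_VARIANCE are described). The Python sketch below shows one way such quantities could be computed. The function names, the 0-based labels, the use of the population variance, and the reading of "variance" as the within-cluster sum of squared distances to the centroid are all assumptions for this example; the library's own definitions may differ.

```python
import math

def standardize(data):
    """Shift and scale each column to zero mean and unit variance.

    Assumption: population variance (divide by n). A constant column is
    only centered, to avoid dividing by zero.
    """
    n = len(data)
    cols = list(zip(*data))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / n)
            for c, m in zip(cols, means)]
    return [
        tuple((x - m) / s if s > 0 else x - m
              for x, m, s in zip(row, means, stds))
        for row in data
    ]

def cluster_summary(data, assign, k):
    """Return (populations, centroids, energies) for labels 0..k-1.

    Hypothetical helper. 'energies' is the within-cluster sum of squared
    distances to the centroid, which is an assumed reading of "variance".
    """
    dim = len(data[0])
    pops = [0] * k
    sums = [[0.0] * dim for _ in range(k)]
    # Count the members of each cluster and accumulate coordinate sums.
    for row, c in zip(data, assign):
        pops[c] += 1
        for d, x in enumerate(row):
            sums[c][d] += x
    # A centroid is the coordinate-wise mean; empty clusters get None.
    centroids = [
        tuple(s / pops[j] for s in sums[j]) if pops[j] else None
        for j in range(k)
    ]
    energies = [0.0] * k
    for row, c in zip(data, assign):
        energies[c] += sum((x - m) ** 2 for x, m in zip(row, centroids[c]))
    return pops, centroids, energies
```

For the data [(0, 0), (0, 2), (10, 10), (10, 12)] with assignments [0, 0, 1, 1], cluster_summary returns populations [2, 2], centroids (0.0, 1.0) and (10.0, 11.0), and an energy of 2.0 for each cluster.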
Last revised on 13 November 2006.