math5001_2023 programs
math5001_2023 programs,
programs, data, and plots for a lecture on
Clustering Data with the K-Means Algorithm.
-
faithful_wait.py,
make a histogram of the waiting times.
-
faithful_both.py,
plot eruption and wait times together,
rescale the data, and write to new files.
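The plot names faithful_normalized.png and faithful_standardized.png suggest two rescalings. The sketch below assumes the conventional definitions: normalization maps values onto [0, 1], and standardization shifts to mean 0 and scales to standard deviation 1. The function names and sample values are hypothetical and are not taken from faithful_both.py.

```python
from statistics import mean, pstdev

def normalize(values):
    """Map values linearly onto [0, 1] (min-max scaling)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift to mean 0 and scale to population standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Made-up waiting times in minutes, for illustration only (not the real data).
wait = [79.0, 54.0, 74.0, 62.0, 85.0, 55.0]
wait_n = normalize(wait)
wait_s = standardize(wait)
```

Rescaling matters for clustering because eruption and wait times are measured on very different scales, and Euclidean distance would otherwise be dominated by the larger one.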
-
faithful_guess.py,
guess two centers, then cluster the data.
-
faithful_lloyd.py,
perform four iterations of Lloyd's clustering algorithm.
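As a rough illustration of what one Lloyd iteration does, the sketch below assigns each point to its nearest center and then moves each center to the mean of its assigned points. The data, the starting centers, and the function names are assumptions made for this example; they are not taken from faithful_lloyd.py.

```python
import math

def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))

def lloyd_step(points, centers):
    """One Lloyd iteration: assign points, then move centers to cluster means."""
    labels = [nearest(p, centers) for p in points]
    new_centers = []
    for j, c in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
        else:
            new_centers.append(c)  # keep the center of an empty cluster unchanged
    return labels, new_centers

# Made-up points on a rescaled [0, 1] x [0, 1] domain, for illustration only.
points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.8, 0.9), (0.9, 0.8), (0.85, 0.75)]
centers = [(0.0, 1.0), (1.0, 0.0)]  # deliberately poor initial guesses
for iteration in range(4):          # four iterations, the count named above
    labels, centers = lloyd_step(points, centers)
```

With these made-up values the assignment stops changing after the second iteration, so the later iterations leave the centers where they are.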
-
faithful_kmeans.py,
cluster the data with the scikit-learn kmeans() function.
-
faithful_kmeans2.py,
cluster the data with the scipy kmeans2() function.
-
faithful_energy.py,
plot the change in cluster energy as we increase the number of clusters.
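A minimal sketch of the energy computation, assuming "cluster energy" means the sum of squared distances from each point to its nearest cluster center. The data, the function name kmeans_energy, and the choice of the first k points as starting centers are assumptions for this example only; faithful_energy.py may do this differently.

```python
import math

def kmeans_energy(points, k, iterations=20):
    """Run Lloyd's algorithm from the first k points; return the final energy."""
    centers = [points[j] for j in range(k)]  # simplistic start, assumed for brevity
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[j].append(p)
        # Move each center to its cluster mean; keep it if the cluster is empty.
        centers = [
            tuple(sum(x) / len(c) for x in zip(*c)) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
    # Energy: sum over points of the squared distance to the nearest center.
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)

# Made-up data in three tight groups, ordered so the first k points are spread out.
points = [(0, 0), (5, 5), (10, 0), (0, 1), (5, 6), (10, 1), (1, 0), (6, 5), (11, 0)]
energies = [kmeans_energy(points, k) for k in (1, 2, 3)]
```

Plotting energies against k would show the drop flattening once k reaches the number of natural groups, which is the usual basis for choosing k by eye.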
-
faithful_classify.py,
use our model to classify new data.
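One way to read "classify new data" is to assign each new point to the nearest cluster center found earlier. The sketch below assumes that reading. The centers and new points are hypothetical placeholders, not values computed from the actual data.

```python
import math

# Hypothetical centers as (eruption minutes, wait minutes); placeholders only.
centers = [(2.0, 55.0), (4.3, 80.0)]

def classify(point, centers):
    """Return the index of the nearest center (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))

# Made-up new observations to be labeled.
new_data = [(1.8, 50.0), (4.5, 85.0), (3.0, 70.0)]
labels = [classify(p, centers) for p in new_data]
```

If the centers were computed from rescaled data, new points would need the same rescaling applied before the distances are compared.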
-
faithful_simulate.py,
use our model to simulate new data.
-
blobs_energy.py,
plot the change in cluster energy as we increase the number of clusters
for a set of "blobs" data.
-
cluster_demo.py,
run some illustrations for the cluster lecture.
-
faithful_wait.png,
a histogram of the waiting times.
-
faithful_raw.png,
a plot of the raw data.
-
faithful_normalized.png,
a plot of the normalized data.
-
faithful_standardized.png,
a plot of the standardized data.
-
faithful_guess_centers.png,
our guesses for cluster centers.
-
faithful_guess_clusters.png,
the clustering induced by our guessed cluster centers.
-
faithful_lloyd_iteration0.png,
Lloyd's iteration 0.
-
faithful_lloyd_iteration1.png,
Lloyd's iteration 1.
-
faithful_lloyd_iteration2.png,
Lloyd's iteration 2.
-
faithful_lloyd_iteration3.png,
Lloyd's iteration 3.
-
faithful_kmeans.png,
clustering with kmeans().
-
faithful_kmeans2.png,
clustering with kmeans2().
-
faithful_energy.png,
clustering energy with increasing k.
-
faithful_classify.png,
classifying new data.
-
faithful_simulate.png,
a new set of simulated data.
-
blobs_cluster_one.png,
applying one cluster to two blobs of data.
-
blobs_data.png,
a set of blob data.
-
blobs_energy.png,
the decrease in cluster energy with increasing k, for blob data.
-
data_blobs.png,
100 points in "blobs".
-
data_grid.png,
100 grid points in the unit square.
-
data_normal.png,
100 normal random values with mean 0 and standard deviation 1.
-
data_uniform.png,
100 random points in the unit square.
-
Last revised on 30 September 2023.