sklearn_src

sklearn, software that uses scikit-learn, which is a Python-based library for machine learning computations.

blob_classify_kernelized_svm, a scikit-learn code which uses a kernelized support vector machine to classify an artificial dataset of groups of "blobs".
blob_classify_logistic_multi, a scikit-learn code which uses multiple applications of logistic regression to classify an artificial dataset of three groups of "blobs".
blob_cluster_kmeans, a scikit-learn code which uses the k-means algorithm to cluster blob data.
cancer_classify_decision, a scikit-learn code which uses a decision tree algorithm to classify the breast cancer dataset, comparing the training and testing accuracy as the depth of the tree is varied.
cancer_classify_forest, a scikit-learn code which uses the random forest algorithm to classify the breast cancer dataset.
cancer_classify_gradboost, a scikit-learn code which uses the gradient boosting algorithm to classify the breast cancer dataset.
cancer_classify_knn, a scikit-learn code which uses the k-nearest neighbor algorithm to classify the breast cancer dataset, comparing the training and testing accuracy as the number of neighbors is increased.
cancer_classify_logistic, a scikit-learn code which uses logistic regression to classify the breast cancer dataset, investigating the influence of the C parameter.
cancer_classify_mlp, a scikit-learn code which uses a multilayer perceptron to classify the breast cancer dataset.
cancer_classify_svm_rbf, a scikit-learn code which uses the support vector algorithm with RBF kernel on the cancer dataset, showing that the data should be rescaled to avoid overfitting.
cancer_scale_minmax, a scikit-learn code which uses the min-max scaling to preprocess the cancer dataset.
cancer_visualize_histogram, a scikit-learn code which displays all 30 features of the cancer dataset as histograms of feature frequency for malignant versus benign cases.
cancer_visualize_pca, a scikit-learn code which uses principal component analysis (PCA) of the cancer dataset to visualize the difference between malignant and benign cases.
circle_classify_gradboost, a scikit-learn code which uses the gradient boosting algorithm to classify the artificial circle dataset, and then determines the prediction uncertainties.
digits_visualize_pca, a scikit-learn code which uses principal component analysis (PCA) of the digits dataset to visualize the grouping of data.
digits_visualize_tsne, a scikit-learn code which uses t-distributed stochastic neighbor embedding (t-SNE) of the digits dataset to visualize the grouping of data.
faces_classify_knn, a scikit-learn code which uses the k-nearest neighbor algorithm to match new faces with images in the faces dataset.
faces_classify_nmf, a scikit-learn code which uses the non-negative matrix factorization (NMF) algorithm to match new faces with images in the faces dataset.
faces_classify_pca, a scikit-learn code which uses principal component analysis (PCA) to match new faces with images in the faces dataset.
forge_classify_knn, a scikit-learn code which uses the k-nearest neighbor algorithm to choose one of two classes for each of 26 items in the forge dataset, involving two features.
forge_classify_svm, a scikit-learn code which uses the support vector machine (SVM) classifier to choose one of two classes for each of 26 items in the forge dataset, involving two features.
handcrafted_classify_svm_rbf, a scikit-learn code which uses the support vector algorithm with RBF kernel on the handcrafted dataset.
housing_data_fetch, a scikit-learn code which fetches a housing dataset from GitHub and stores it locally.
iris_classify_gradboost, a scikit-learn code which uses the gradient boosting algorithm to classify the iris dataset, and then determines the prediction uncertainties.
iris_classify_knn, a scikit-learn code which uses the k-nearest neighbor algorithm to classify the species of iris specimens based on 150 sets of four measurements: sepal and petal width and length.
logistic_regression, a scikit-learn code which uses logistic regression to classify data.
moon_classify_forest, a scikit-learn code which uses the random forest algorithm to classify samples of the artificial moon dataset.
moon_classify_mlp, a scikit-learn code which uses a multilayer perceptron method to classify samples of the artificial moon dataset.
ram_regression_decision, a scikit-learn code which uses a decision tree algorithm to perform regression on the RAM price dataset.
ram_regression_linear, a scikit-learn code which uses linear regression to perform regression on the RAM price dataset.
signal_classify_nmf, a scikit-learn code which uses non-negative matrix factorization (NMF) to match new signals to items in the signal dataset.
study_classify_logistic, a scikit-learn code which uses the logistic regression algorithm to classify the outcome of students based on study time.
tester, a BASH script which runs the tests.
wave_regression_knn, a scikit-learn code which uses the k-nearest neighbor algorithm to form a regression predictor for the wave dataset.
wave_regression_ols, a scikit-learn code which uses the ordinary least squares algorithm to form a regression predictor for the wave dataset.
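
As an illustration of the kind of experiment described in the entries above, the following is a minimal sketch of the comparison named for cancer_classify_knn: training and testing accuracy on the breast cancer dataset as the number of neighbors is increased. It is not the code from this directory. The function name knn_accuracy_sweep, the range of neighbor counts, the stratified split, and the random seed are all assumptions made for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def knn_accuracy_sweep(neighbor_counts=range(1, 11), seed=0):
    """Return a list of (k, training accuracy, testing accuracy) tuples.

    Hypothetical helper: the name, default range, and seed are
    illustrative choices, not taken from the original code.
    """
    data = load_breast_cancer()

    # Hold out part of the data for testing; stratify so that both
    # subsets keep the same malignant/benign proportions.
    X_train, X_test, y_train, y_test = train_test_split(
        data.data, data.target, stratify=data.target, random_state=seed
    )

    results = []
    for k in neighbor_counts:
        model = KNeighborsClassifier(n_neighbors=k)
        model.fit(X_train, y_train)
        # score() reports the mean accuracy on the given samples.
        train_acc = model.score(X_train, y_train)
        test_acc = model.score(X_test, y_test)
        results.append((k, train_acc, test_acc))
    return results


if __name__ == "__main__":
    print(" k  train   test")
    for k, train_acc, test_acc in knn_accuracy_sweep():
        print(f"{k:2d}  {train_acc:.3f}  {test_acc:.3f}")
```

With a single neighbor, the model essentially memorizes the training set, so training accuracy is typically at or near its maximum there; comparing the two columns as k grows shows the usual trade-off between model complexity and generalization.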

Last revised on 28 March 2024.