Datasets


  1. adjacency, a dataset directory which contains adjacency matrices associated with an undirected graph.
  2. alphabet_lowercase, a dataset directory which contains large images of the 26 lowercase alphabetic characters.
  3. alphabet_uppercase, a dataset directory which contains large images of the 26 uppercase alphabetic characters.
  4. bam, a dataset directory which ???
  5. beale_cipher, a dataset directory which contains the text of the three Beale cipher documents, which are supposed to indicate the location of a hoard of gold and silver.
  6. bin_packing, a dataset directory which contains examples of the bin packing problem, in which a number of objects are to be packed in the minimum possible number of uniform bins;
  7. birthdays, a dataset directory which contains data related to birthdays, such as the birthdays of members of hockey teams, and the number of babies born in the US on each calendar day over an interval of several years.
  8. boston_housing, a dataset directory which stores training and test data about housing prices in Boston. This dataset is also available as a built-in dataset in Keras.
  9. burgers, a dataset directory which contains 40 solutions of the Burgers equation at equally spaced times from 0 to 1, with values at 41 equally spaced nodes in [0,1];
  10. case1_flow, a dataset directory which lists 401 solutions of a flow problem in a channel;
  11. cats, a dataset directory which contains jpg images of cats.
  12. cavity_flow, a dataset directory which contains 500 time steps of Navier-Stokes flow in a driven cavity;
  13. ccs, a dataset directory which contains examples of sparse matrices stored as Compressed Column Storage (CCS) files, a three-file format;
  14. census, a dataset directory which contains US census data;
  15. chain_letters, a dataset directory which contains examples of a chain letter;
  16. change_making, a dataset directory which contains test data for the change making problem;
  17. cities, a dataset directory which contains sets of information about cities and the distances between them;
  18. clustering, a dataset directory which can be used with clustering algorithms;
  19. color, a dataset directory which contains information about colors in terms of RGB values.
  20. crs, a dataset directory which contains examples of sparse matrices stored in Compressed Row Storage (CRS) format, a three-file format;
  21. csv, a dataset directory which contains examples of CSV files, a flat file format of Comma Separated Values.
  22. cvt, a dataset directory which contains examples of Centroidal Voronoi Tessellations;
  23. cvtp, a dataset directory which contains examples of CVTPs, Centroidal Voronoi Tessellations defined on a periodic domain, which is usually a rectangle or hyperrectangle.
  24. dates, a dataset directory which contains lists of dates in certain calendars.
  25. dogs, a dataset directory which contains images of dogs.
  26. draft_lottery, a dataset directory which contains the numbers assigned to each birthday, for the Selective Service System lotteries for 1970 through 1976.
  27. faces, a dataset directory which contains 10 photographs of each of 40 people, for use in facial recognition experiments.
  28. faces_angela_merkel, a dataset directory which contains images of Angela Merkel for facial recognition applications.
  29. faces_arnold_schwarzenegger, a dataset directory which contains images of Arnold Schwarzenegger for facial recognition applications.
  30. faces_emma_stone, a dataset directory which contains images of Emma Stone for facial recognition applications.
  31. faces_matt_damon, a dataset directory which contains images of Matt Damon for facial recognition applications.
  32. faces_michael_caine, a dataset directory which contains images of Michael Caine for facial recognition applications.
  33. faces_sylvester_stallone, a dataset directory which contains images of Sylvester Stallone for facial recognition applications.
  34. faces_taylor_swift, a dataset directory which contains images of Taylor Swift for facial recognition applications.
  35. fasta, a dataset directory which contains examples of FASTA sequence data;
  36. fastq, a dataset directory which contains examples of FASTQ sequence data;
  37. faure, a dataset directory which contains examples of the Faure quasirandom sequence;
  38. fingerprints, a dataset directory which contains a few images of fingerprints.
  39. fna, a dataset directory which ???.
  40. ge, a dataset directory which contains matrices stored in General (GE) format;
  41. generalized_assignment, a dataset directory which contains test data for the generalized assignment problem;
  42. german, a dataset directory which contains some short texts in German;
  43. gfd2, a dataset directory which ???;
  44. graffiti, a dataset directory which ???;
  45. graphics_examples, a dataset directory which contains examples of data used to illustrate or test various graphics procedures for presenting and analyzing data.
  46. grid, a dataset directory which ???;
  47. halton, a dataset directory which contains examples of the Halton quasirandom sequence;
  48. hammersley, a dataset directory which contains examples of the Hammersley quasirandom sequence;
  49. hartigan, a dataset directory which contains datasets for testing clustering algorithms;
  50. hbsmc, a dataset directory which contains the Harwell Boeing Sparse Matrix Collection (HBSMC);
  51. hex_grid, a dataset directory which ???;
  52. ihs, a dataset directory which contains examples of the Improved Distributed Hypercube Sampling quasirandom sequence;
  53. imagej, a dataset directory which contains image data suitable for use with the ImageJ program.
  54. incidence, a dataset directory which contains incidence matrices associated with a directed graph.
  55. inout_flow, a dataset directory which contains 500 time steps of Navier-Stokes flow in a region with specified inflow and outflow;
  56. inout_flow2, a dataset directory which contains more time steps of Navier-Stokes flow in a region with specified inflow and outflow;
  57. interpolation, a dataset directory which contains datasets to be interpolated.
  58. iswr, a dataset directory which contains example datasets used for statistical analysis.
  59. knapsack_01, a dataset directory which contains test data for the 0/1 knapsack problem;
  60. knapsack_multiple, a dataset directory which contains test data for the multiple knapsack problem;
  61. latin_center, a dataset directory which contains examples of the Latin Center Square quasirandom sequence;
  62. latin_edge, a dataset directory which contains examples of the Latin Edge Square quasirandom sequence;
  63. latin_random, a dataset directory which contains examples of the Latin Random Square quasirandom sequence;
  64. lcvt, a dataset directory which contains examples of Latinized Centroidal Voronoi Tessellations;
  65. lcvtp, a dataset directory which contains examples of LCVTPs, that is, "Latinized" Centroidal Voronoi Tessellations on a periodic domain;
  66. lhs, a dataset directory which contains datasets related to Latin Hypercube Sampling;
  67. lp, a dataset directory which contains datasets for linear programming, used for programs such as CPLEX and GUROBI;
  68. martinez, a dataset directory which contains datasets for computational statistics;
  69. mds, a dataset directory which contains datasets for multidimensional scaling;
  70. mhd_control, a dataset directory which contains datasets for a magnetohydrodynamics control problem.
  71. mps, a dataset directory which contains datasets for linear programming;
  72. mpsc, a dataset directory which contains compressed datasets for linear programming;
  73. ngrams, a dataset directory which contains information about the observed frequency of "ngrams" (particular sequences of n letters) in English text.
  74. niederreiter2, a dataset directory which contains examples of the Niederreiter quasirandom sequence using a base of 2;
  75. oa, a dataset directory which contains datasets for orthogonal arrays;
  76. partition_problem, a dataset directory which contains examples of the partition problem, in which a set of numbers is given, and it is desired to break the set into two subsets with equal sum.
  77. pcl, a dataset directory which contains datasets from a gene expression experiment on Arabidopsis;
  78. polygon, a dataset directory which contains examples of polygons;
  79. population, a dataset directory which contains listings of populations.
  80. presidents, a dataset directory which lists various facts about US presidents.
  81. product_rule_gl, a dataset directory which contains M-dimensional quadrature rules formed as products of 1D Gauss-Legendre rules.
  82. product_rule_tanh_sinh, a dataset directory which ???;
  83. propack, a dataset directory which contains matrices in Harwell-Boeing format, used for testing the SVD package propack();
  84. quad_mesh, a dataset directory which contains examples of quad meshes.
  85. quadrature_rules, a dataset directory which contains quadrature rules for 1D intervals, 2D rectangles or M-dimensional rectangular regions, stored as a file of abscissas, a file of weights, and a file of region limits.
  86. quadrature_rules_ccn, a dataset directory which contains quadrature rules for integration on [-1,+1], using a nested Clenshaw-Curtis rule.
  87. quadrature_rules_chebyshev1, a dataset directory which contains quadrature rules for integration on [-1,+1], using a Gauss-Chebyshev type 1 rule.
  88. quadrature_rules_chebyshev2, a dataset directory which contains quadrature rules for integration on [-1,+1], using a Gauss-Chebyshev type 2 rule.
  89. quadrature_rules_clenshaw_curtis, a dataset directory which contains quadrature rules for integration on [-1,+1], using a Clenshaw-Curtis rule.
  90. quadrature_rules_gegenbauer, a dataset directory which contains quadrature rules for integration on [-1,+1], using a Gauss-Gegenbauer rule.
  91. quadrature_rules_gen_hermite, a dataset directory which contains quadrature rules for integration on an infinite interval, using a generalized Gauss-Hermite rule.
  92. quadrature_rules_gen_laguerre, a dataset directory which contains quadrature rules for integration on a semi-infinite interval, using a generalized Gauss-Laguerre rule.
  93. quadrature_rules_halton, a dataset directory which contains quadrature rules for M-dimensional unit cubes, based on a Halton quasirandom sequence, stored as a file of abscissas, a file of weights, and a file of region limits.
  94. quadrature_rules_hermite_physicist, a dataset directory which contains Gauss-Hermite quadrature rules, for integration on the interval (-oo,+oo), with the "physicist" weight function exp(-x*x).
  95. quadrature_rules_hermite_probabilist, a dataset directory which contains Gauss-Hermite quadrature rules, for integration on the interval (-oo,+oo), with the "probabilist" weight function exp(-x*x/2).
  96. quadrature_rules_hermite_unweighted, a dataset directory which contains Gauss-Hermite quadrature rules, for integration on the interval (-oo,+oo), with no weight function.
  97. quadrature_rules_jacobi, a dataset directory which contains Gauss-Jacobi quadrature rules for the interval [-1,+1] with weight function (1-x)^ALPHA * (1+x)^BETA.
  98. quadrature_rules_laguerre, a dataset directory which contains Gauss-Laguerre quadrature rules for integration on the interval [A,+oo), with weight function exp(-x).
  99. quadrature_rules_latin_center, a dataset directory which contains quadrature rules for M-dimensional unit cubes, based on centered Latin hypercubes, stored as a file of abscissas, a file of weights, and a file of region limits.
  100. quadrature_rules_legendre, a dataset directory which contains Gauss-Legendre quadrature rules for the interval [-1,+1].
  101. quadrature_rules_patterson, a dataset directory which contains Gauss-Patterson quadrature rules for the interval [-1,+1].
  102. quadrature_rules_pyramid, a dataset directory which contains quadrature rules for a pyramid with a square base.
  103. quadrature_rules_tet, a dataset directory which contains quadrature rules for tetrahedrons, stored as a file of abscissas, a file of weights, and a file of vertices.
  104. quadrature_rules_tri, a dataset directory which contains quadrature rules for triangles, stored as a file of abscissas, a file of weights, and a file of vertices.
  105. quadrature_rules_uniform, a dataset directory which contains quadrature rules for M-dimensional unit cubes, based on a uniform pseudorandom sequence, stored as a file of abscissas, a file of weights, and a file of region limits.
  106. quadrature_rules_wedge, a dataset directory which contains quadrature rules for a wedge (triangle x line).
  107. regression, a dataset directory which contains datasets for testing linear regression;
  108. romero, a dataset directory which collects 12 sets of 2D Latin Square points that were used as initial generators for a CVT computation.
  109. sam, a dataset directory which ???;
  110. sammon, a dataset directory which contains examples of six kinds of M-dimensional datasets for cluster analysis.
  111. sample_2d, a dataset directory which collects examples of sample point sets in the unit square.
  112. sgb, a dataset directory which contains files used as input data for demonstrations and tests of Donald Knuth's Stanford Graph Base.
  113. sgmg, a dataset directory which contains M-dimensional Smolyak sparse grids based on a mixture of 1D rules, and with a choice of exponential and linear growth rates for the 1D rules.
  114. sgmga, a dataset directory which contains SGMGA files (Sparse Grid Mixed Growth Anisotropic), that is, M-dimensional Smolyak sparse grids based on a mixture of 1D rules, and with a choice of exponential and linear growth rates for the 1D rules and anisotropic weights for the dimensions.
  115. sobol, a dataset directory which contains samples of the Sobol quasirandom sequence;
  116. sokal_rohlf, a dataset directory which contains biological datasets considered by Sokal and Rohlf.
  117. spaeth, a dataset directory which contains datasets for cluster analysis;
  118. spaeth2, a dataset directory which contains datasets for cluster analysis;
  119. sparse_grid_cce, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Clenshaw Curtis Exponential growth rule;
  120. sparse_grid_ccl, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Clenshaw Curtis Linear growth rule;
  121. sparse_grid_ccs, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Clenshaw Curtis Slow growth rule;
  122. sparse_grid_composite, a dataset directory which contains M-dimensional Smolyak sparse grids based on the composite midpoint rule;
  123. sparse_grid_f2, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Fejer 2 Exponential growth rule;
  124. sparse_grid_f2s, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Fejer 2 Slow growth rule;
  125. sparse_grid_gle, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Gauss-Legendre Exponential growth rule;
  126. sparse_grid_gll, a dataset directory which contains M-dimensional Smolyak sparse grids based on the 1D Gauss-Legendre Linear growth rule;
  127. sparse_grid_glo, a dataset directory which contains M-dimensional Smolyak sparse grids based on the 1D Gauss-Legendre Linear (Odd) growth rule;
  128. sparse_grid_gpe, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Gauss-Patterson Exponential growth rule;
  129. sparse_grid_gps, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Gauss-Patterson Slow growth rule;
  130. sparse_grid_hermite, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Gauss-Hermite rule;
  131. sparse_grid_laguerre, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Gauss-Laguerre rule;
  132. sparse_grid_mixed, a dataset directory which contains M-dimensional Smolyak sparse grids based on a mixture of 1D rules.
  133. sparse_grid_ncc, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Newton Cotes Closed rule;
  134. sparse_grid_nco, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Newton Cotes Open rule;
  135. sphere_design_rule, a dataset directory which contains files defining point sets on the surface of the unit sphere, known as "designs", which can be useful for estimating integrals on the surface, among other uses.
  136. sphere_grid, a dataset directory which contains grids of points, lines, triangles or quadrilaterals on a sphere;
  137. sphere_lebedev_rule, a dataset directory which contains sets of Lebedev points on a sphere which can be used for quadrature rules of a known precision;
  138. sphere_maximum_determinant, a dataset directory which contains files defining maximum determinant rules on the unit sphere, which can be used for interpolation and quadrature;
  139. square_hex_grid, a dataset directory which contains files defining hexagonal arrays of grid points over the interior of a square in 2D.
  140. st, a dataset directory of examples of Sparse Triplet (ST) files, a sparse matrix file format, storing just (I,J,A(I,J)), and using zero-based indexing.
  141. st1, a dataset directory of examples of Sparse Triplet (ST1) files, a sparse matrix file format, storing just (I,J,A(I,J)), and using one-based indexing.
  142. states, a dataset directory which contains some information about the individual American states.
  143. stats, a dataset directory which contains some examples of statistical datasets.
  144. subset_sum, a dataset directory which contains examples of the subset sum problem, in which a set of numbers is given, and it is desired to find at least one subset that sums to a given target value.
  145. svdpack, a dataset directory which contains matrices in Harwell-Boeing format, used for testing the singular value decomposition library svdpack();
  146. symbols, a dataset directory which contains large images of numbers and symbols.
  147. tcell_flow, a dataset directory which contains 500 time steps of Navier-Stokes flow in a T-cell;
  148. test_approx, a dataset directory which contains sets of data (x,y) for which an approximating formula is desired.
  149. test_con, a dataset directory which contains sequences of points that lie on M-dimensional curves defined by sets of nonlinear equations;
  150. tet_mesh_order4, a dataset directory of examples of order 4 tetrahedral meshes.
  151. tet_mesh_order10, a dataset directory of examples of order 10 tetrahedral meshes.
  152. tet_mesh_order20, a dataset directory of examples of order 20 tetrahedral meshes.
  153. tetrahedrons, a dataset directory which contains examples of tetrahedrons.
  154. tetrahedron_samples, a dataset directory which contains examples of sets of sample points from tetrahedrons.
  155. text, a dataset directory which contains some short texts in English, such as the Gettysburg Address;
  156. time_series, a dataset directory of examples of time series, which are simply records of the values of some quantity at a sequence of times.
  157. timelines, a dataset directory of examples of timelines, that is, dates or durations or lifetimes meant to be displayed in chronological order.
  158. triangle_samples, a dataset directory which contains sets of sample points from triangles.
  159. triangles, a dataset directory which contains examples of triangles.
  160. triangulation_order3, a dataset directory which contains examples of order 3 triangulations, a linear triangulation of a set of 2D points, using a pair of files to list the node coordinates and the 3 nodes that make up each triangle;
  161. triangulation_order4, a dataset directory which contains examples of order 4 triangulations, a triangulation of a set of 2D points, using a pair of files to list the node coordinates and the 4 nodes that define each triangle (3 vertices and the centroid);
  162. triangulation_order6, a dataset directory which contains examples of order 6 triangulations, a quadratic triangulation of a set of 2D points, using a pair of files to list the node coordinates and the 6 nodes that make up each triangle. Six-node triangles are used when a higher degree approximation is desired; they may also be used as isoparametric elements that model curved boundaries;
  163. triola, a dataset directory which contains datasets used for statistical analysis.
  164. tsp, a dataset directory which contains examples of the traveling salesperson problem.
  165. uniform, a dataset directory which contains examples of a uniform pseudorandom sequence;
  166. van_der_corput, a dataset directory which contains examples of one-dimensional van der Corput sequences, for various bases;
  167. words, a dataset directory which contains lists of words;
  168. xls, a dataset directory which contains examples of XLS files, used by the Microsoft Excel spreadsheet program.
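
The st and st1 entries above differ only in index base: both store triples (I,J,A(I,J)), with zero-based indices in ST and one-based indices in ST1. The sketch below shows one way the two might be read into a common form. It assumes each data line holds three whitespace-separated fields and that lines starting with "#" are comments; neither detail is confirmed by this list, and the function name parse_sparse_triplet is hypothetical.

```python
def parse_sparse_triplet(text, one_based=False):
    """Parse 'I J A(I,J)' lines into a {(row, col): value} dict.

    Keys are always zero-based. Set one_based=True for ST1-style input.
    ASSUMPTION: one triple per line, whitespace-separated, '#' starts a comment.
    """
    offset = 1 if one_based else 0
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and assumed comment lines
        i_str, j_str, a_str = line.split()
        entries[(int(i_str) - offset, int(j_str) - offset)] = float(a_str)
    return entries


# The same 2x2 upper triangular matrix written with both index conventions.
st_text = "0 0 2.0\n0 1 -1.0\n1 1 2.0\n"
st1_text = "1 1 2.0\n1 2 -1.0\n2 2 2.0\n"

zero_based = parse_sparse_triplet(st_text)
from_one_based = parse_sparse_triplet(st1_text, one_based=True)
```

Normalizing both conventions to zero-based keys means downstream code does not need to know which kind of file the matrix came from.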

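The partition_problem and subset_sum entries describe closely related tasks: splitting a set into two subsets of equal sum is the same as finding a subset whose sum is half the total. A minimal brute-force sketch follows, assuming integer values in a plain Python list. The function names are hypothetical, the file formats of the datasets are not assumed, and exhaustive search is only practical for small sets.

```python
from itertools import combinations


def find_subset_with_sum(values, target):
    """Return indices of one subset of values summing to target, or None.

    Exhaustive search over all subsets, smallest subsets first.
    """
    for size in range(len(values) + 1):
        for indices in combinations(range(len(values)), size):
            if sum(values[i] for i in indices) == target:
                return list(indices)
    return None


def partition_into_equal_halves(values):
    """Return indices of one half of an equal-sum split, or None if impossible.

    ASSUMPTION: values are integers, so an odd total rules out any split.
    """
    total = sum(values)
    if total % 2 != 0:
        return None
    return find_subset_with_sum(values, total // 2)


weights = [3, 1, 1, 2, 2, 1]
half = partition_into_equal_halves(weights)
```

The indices not returned form the other half of the partition, so a single subset sum search answers the partition question.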

Last revised on 02 May 2019.