Datasets


This is a collection of datasets for regression analysis, CVT basis calculations, K-means analysis, quadrature, sparse grids, and other computational tasks.

The directories include:

  1. alphabet_lowercase, a dataset directory which contains large images of the 26 lowercase alphabetic characters.
  2. alphabet_uppercase, a dataset directory which contains large images of the 26 uppercase alphabetic characters.
  3. beale_cipher, a dataset directory which contains the text of the three Beale cipher documents, which are supposed to indicate the location of a hoard of gold and silver.
  4. bin_packing, a dataset directory which contains examples of the bin packing problem, in which a number of objects are to be packed in the minimum possible number of uniform bins;
  5. burgers, a dataset directory which contains 40 solutions of the Burgers equation at equally spaced times from 0 to 1, with values at 41 equally spaced nodes in [0,1];
  6. case1_flow, a dataset directory which contains 401 solutions of a flow problem in a channel;
  7. cavity_flow, a dataset directory which contains 500 time steps of Navier-Stokes flow in a driven cavity;
  8. census, a dataset directory which contains US census data;
  9. chain_letters, a dataset directory which contains examples of a chain letter;
  10. change_making, a dataset directory which contains test data for the change making problem;
  11. cities, a dataset directory which contains sets of information about cities and the distances between them;
  12. cvt, a dataset directory which contains examples of Centroidal Voronoi Tessellations;
  13. cvt_mod, a dataset directory which contains examples of Centroidal Voronoi Tessellations (CVTs) on a "logical torus" or "wrap around" unit hypercube;
  14. dates, a dataset directory which contains lists of dates in certain calendars.
  15. draft_lottery, a dataset directory which contains the numbers assigned to each birthday, for the Selective Service System lotteries for 1970 through 1976.
  16. faure, a dataset directory which contains examples of the Faure quasirandom sequence;
  17. fingerprints, a dataset directory which contains a few images of fingerprints.
  18. generalized_assignment, a dataset directory which contains test data for the generalized assignment problem;
  19. graphics_examples, a dataset directory which contains examples of data used to illustrate or test various graphics procedures for presenting and analyzing data.
  20. halton, a dataset directory which contains examples of the Halton quasirandom sequence;
  21. hammersley, a dataset directory which contains examples of the Hammersley quasirandom sequence;
  22. hartigan, a dataset directory which contains datasets for testing clustering algorithms;
  23. hbsmc, a dataset directory which contains the Harwell Boeing Sparse Matrix Collection (HBSMC);
  24. ihs, a dataset directory which contains examples of the Improved Distributed Hypercube Sampling quasirandom sequence;
  25. imagej, a dataset directory which contains image data suitable for use with the ImageJ program.
  26. inout_flow, a dataset directory which contains 500 time steps of Navier-Stokes flow in a region with specified inflow and outflow;
  27. interpolation, a dataset directory which contains datasets to be interpolated.
  28. knapsack_01, a dataset directory which contains test data for the 0/1 knapsack problem;
  29. knapsack_multiple, a dataset directory which contains test data for the multiple knapsack problem;
  30. latin_center, a dataset directory which contains examples of the Latin Center Square quasirandom sequence;
  31. latin_edge, a dataset directory which contains examples of the Latin Edge Square quasirandom sequence;
  32. latin_random, a dataset directory which contains examples of the Latin Random Square quasirandom sequence;
  33. lcvt, a dataset directory which contains examples of Latinized Centroidal Voronoi Tessellations;
  34. lcvt_mod, a dataset directory which contains examples of "Latinized" Centroidal Voronoi Tessellations on a logical torus;
  35. lhs, a dataset directory which contains datasets related to Latin Hypercube Sampling;
  36. martinez, a dataset directory which contains datasets for computational statistics;
  37. mds, a dataset directory which contains datasets for multidimensional scaling;
  38. niederreiter2, a dataset directory which contains examples of the Niederreiter quasirandom sequence using a base of 2;
  39. partition_problem, a dataset directory which contains examples of the partition problem, in which a set of numbers is given, and it is desired to break the set into two subsets with equal sum.
  40. pcl, a dataset directory which contains datasets from a gene expression experiment on Arabidopsis;
  41. polygon, a dataset directory which contains examples of polygons;
  42. product_rule_gl, a dataset directory which contains M-dimensional quadrature rules formed as products of 1D Gauss-Legendre rules.
  43. quad_mesh, a dataset directory which contains examples of quad meshes.
  44. quadrature_rules, a dataset directory which contains quadrature rules for 1D intervals, 2D rectangles or M-dimensional rectangular regions, stored as a file of abscissas, a file of weights, and a file of region limits.
  45. quadrature_rules_ccn, a dataset directory which contains quadrature rules for integration on [-1,+1], using a nested Clenshaw-Curtis rule.
  46. quadrature_rules_chebyshev1, a dataset directory which contains quadrature rules for integration on [-1,+1], using a Gauss-Chebyshev type 1 rule.
  47. quadrature_rules_chebyshev2, a dataset directory which contains quadrature rules for integration on [-1,+1], using a Gauss-Chebyshev type 2 rule.
  48. quadrature_rules_clenshaw_curtis, a dataset directory which contains quadrature rules for integration on [-1,+1], using a Clenshaw Curtis rule.
  49. quadrature_rules_gegenbauer, a dataset directory which contains quadrature rules for integration on [-1,+1], using a Gauss-Gegenbauer rule.
  50. quadrature_rules_gen_hermite, a dataset directory which contains quadrature rules for integration on an infinite interval, using a generalized Gauss-Hermite rule.
  51. quadrature_rules_gen_laguerre, a dataset directory which contains quadrature rules for integration on a semi-infinite interval, using a generalized Gauss-Laguerre rule.
  52. quadrature_rules_halton, a dataset directory which contains quadrature rules for M-dimensional unit cubes, based on a Halton quasirandom sequence, stored as a file of abscissas, a file of weights, and a file of region limits.
  53. quadrature_rules_hermite_physicist, a dataset directory which contains Gauss-Hermite quadrature rules, for integration on the interval (-oo,+oo), with the "physicist" weight function exp(-x*x).
  54. quadrature_rules_hermite_probabilist, a dataset directory which contains Gauss-Hermite quadrature rules, for integration on the interval (-oo,+oo), with the "probabilist" weight function exp(-x*x/2).
  55. quadrature_rules_hermite_unweighted, a dataset directory which contains Gauss-Hermite quadrature rules, for integration on the interval (-oo,+oo), with no weight function.
  56. quadrature_rules_jacobi, a dataset directory which contains Gauss-Jacobi quadrature rules for the interval [-1,+1] with weight function (1-x)^ALPHA * (1+x)^BETA.
  57. quadrature_rules_laguerre, a dataset directory which contains Gauss-Laguerre quadrature rules for integration on the interval [A,+oo), with weight function exp(-x).
  58. quadrature_rules_latin_center, a dataset directory which contains quadrature rules for M-dimensional unit cubes, based on centered Latin hypercubes, stored as a file of abscissas, a file of weights, and a file of region limits.
  59. quadrature_rules_legendre, a dataset directory which contains Gauss-Legendre quadrature rules for the interval [-1,+1].
  60. quadrature_rules_patterson, a dataset directory which contains Gauss-Patterson quadrature rules for the interval [-1,+1].
  61. quadrature_rules_pyramid, a dataset directory which contains quadrature rules for a pyramid with a square base.
  62. quadrature_rules_tet, a dataset directory which contains quadrature rules for tetrahedrons, stored as a file of abscissas, a file of weights, and a file of vertices.
  63. quadrature_rules_tri, a dataset directory which contains quadrature rules for triangles, stored as a file of abscissas, a file of weights, and a file of vertices.
  64. quadrature_rules_uniform, a dataset directory which contains quadrature rules for M-dimensional unit cubes, based on a uniform pseudorandom sequence, stored as a file of abscissas, a file of weights, and a file of region limits.
  65. quadrature_rules_wedge, a dataset directory which contains quadrature rules for a wedge (triangle x line).
  66. regression, a dataset directory which contains datasets for testing linear regression;
  67. sammon, a dataset directory which contains examples of six kinds of M-dimensional datasets for cluster analysis.
  68. sgb, a dataset directory which contains files used as input data for demonstrations and tests of Donald Knuth's Stanford Graph Base.
  69. sgmg, a dataset directory which contains M-dimensional Smolyak sparse grids based on a mixture of 1D rules, and with a choice of exponential and linear growth rates for the 1D rules.
  70. sgmga, a dataset directory which contains SGMGA files (Sparse Grid Mixed Growth Anisotropic), that is, M-dimensional Smolyak sparse grids based on a mixture of 1D rules, and with a choice of exponential and linear growth rates for the 1D rules and anisotropic weights for the dimensions.
  71. sobol, a dataset directory which contains samples of the Sobol quasirandom sequence;
  72. sokal_rohlf, a dataset directory which contains biological datasets considered by Sokal and Rohlf.
  73. spaeth, a dataset directory which contains datasets for cluster analysis;
  74. spaeth2, a dataset directory which contains datasets for cluster analysis;
  75. sparse_grid_cce, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Clenshaw Curtis Exponential growth rule;
  76. sparse_grid_ccl, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Clenshaw Curtis Linear growth rule;
  77. sparse_grid_ccs, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Clenshaw Curtis Slow growth rule;
  78. sparse_grid_composite, a dataset directory which contains M-dimensional Smolyak sparse grids based on the composite midpoint rule;
  79. sparse_grid_f2, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Fejer 2 Exponential growth rule;
  80. sparse_grid_f2s, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Fejer 2 Slow growth rule;
  81. sparse_grid_gle, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Gauss-Legendre Exponential growth rule;
  82. sparse_grid_gll, a dataset directory which contains M-dimensional Smolyak sparse grids based on the 1D Gauss-Legendre Linear growth rule;
  83. sparse_grid_glo, a dataset directory which contains M-dimensional Smolyak sparse grids based on the 1D Gauss-Legendre Linear (Odd) growth rule;
  84. sparse_grid_gpe, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Gauss-Patterson Exponential growth rule;
  85. sparse_grid_gps, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Gauss-Patterson Slow growth rule;
  86. sparse_grid_hermite, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Gauss-Hermite rule;
  87. sparse_grid_laguerre, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Gauss-Laguerre rule;
  88. sparse_grid_mixed, a dataset directory which contains M-dimensional Smolyak sparse grids based on a mixture of 1D rules.
  89. sparse_grid_ncc, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Newton Cotes Closed rule;
  90. sparse_grid_nco, a dataset directory which contains M-dimensional Smolyak sparse grids based on the Newton Cotes Open rule;
  91. sphere_grid, a dataset directory which contains grids of points, lines, triangles or quadrilaterals on a sphere;
  92. sphere_design_rule, a dataset directory which contains files defining point sets on the surface of the unit sphere, known as "designs", which can be useful for estimating integrals on the surface, among other uses.
  93. sphere_lebedev_rule, a dataset directory which contains sets of Lebedev points on a sphere which can be used for quadrature rules of a known precision;
  94. sphere_maximum_determinant, a dataset directory which contains files defining maximum determinant rules on the unit sphere, which can be used for interpolation and quadrature;
  95. states, a dataset directory which contains some information about the individual American states.
  96. stats, a dataset directory which contains some examples of statistical datasets.
  97. subset_sum, a dataset directory which contains examples of the subset sum problem, in which a set of numbers is given, and it is desired to find at least one subset that sums to a given target value.
  98. symbols, a dataset directory which contains large images of numbers and symbols.
  99. tcell_flow, a dataset directory which contains 500 time steps of Navier-Stokes flow in a T-cell;
  100. test_approx, a dataset directory which contains sets of data (x,y) for which an approximating formula is desired.
  101. test_con, a dataset directory which contains sequences of points that lie on M-dimensional curves defined by sets of nonlinear equations;
  102. tet_mesh_order4, a dataset directory of examples of order 4 tetrahedral meshes.
  103. tet_mesh_order10, a dataset directory of examples of order 10 tetrahedral meshes.
  104. tetrahedrons, a dataset directory which contains examples of tetrahedrons.
  105. text, a dataset directory which contains actual "texts", such as the Gettysburg Address;
  106. time_series, a dataset directory which contains examples of time series, which are simply records of the values of some quantity at a sequence of times.
  107. triangles, a dataset directory which contains examples of triangles.
  108. triangulation_order3, a dataset directory which contains examples of order 3 triangulations, a linear triangulation of a set of 2D points, using a pair of files to list the node coordinates and the 3 nodes that make up each triangle;
  109. triangulation_order4, a dataset directory which contains examples of order 4 triangulations, a triangulation of a set of 2D points, using a pair of files to list the node coordinates and the 4 nodes that define each triangle (3 vertices and the centroid);
  110. triangulation_order6, a dataset directory which contains examples of order 6 triangulations, a quadratic triangulation of a set of 2D points, using a pair of files to list the node coordinates and the 6 nodes that make up each triangle. Six-node triangles are used when a higher degree approximation is desired; they may also be used as isoparametric elements that model curved boundaries.
  111. triola, a dataset directory which contains datasets used for statistical analysis.
  112. tsp, a dataset directory which contains examples of the traveling salesperson problem.
  113. uniform, a dataset directory which contains examples of a uniform pseudorandom sequence;
  114. van_der_corput, a dataset directory which contains examples of one-dimensional van der Corput sequences, for various bases;
  115. words, a dataset directory which contains lists of words.
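
Several of the quadrature rule directories above store each rule as a file of abscissas, a file of weights, and a file of region limits. The following is a minimal Python sketch of how such a rule might be read and applied. It assumes plain text files with one whitespace-separated row of numbers per line; the file-name suffixes in load_rule and the two-row layout of the region file are hypothetical and should be checked against the actual files before use.

```python
from pathlib import Path


def parse_table(text):
    """Parse whitespace-separated numbers, one row per line.

    Blank lines and lines starting with '#' are skipped (an assumed convention).
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rows.append([float(tok) for tok in line.split()])
    return rows


def apply_rule(f, abscissas, weights):
    """Estimate an integral as the weighted sum of f over the abscissas."""
    return sum(w[0] * f(x) for x, w in zip(abscissas, weights))


def load_rule(prefix):
    """Read the three files of one rule. The suffixes are hypothetical."""
    x = parse_table(Path(prefix + "_x.txt").read_text())  # abscissas
    w = parse_table(Path(prefix + "_w.txt").read_text())  # weights
    r = parse_table(Path(prefix + "_r.txt").read_text())  # region limits
    return x, w, r


# A 2-point Gauss-Legendre rule on [-1,+1], written inline so that the
# sketch runs without any files on disk.
x = parse_table("-0.5773502691896257\n 0.5773502691896257\n")
w = parse_table("1.0\n1.0\n")
r = parse_table("-1.0\n 1.0\n")

# Integrate f(x) = x^2 over the region; the exact value is 2/3.
estimate = apply_rule(lambda p: p[0] ** 2, x, w)
print(estimate)
```

Each abscissa is kept as a list of coordinates, so the same apply_rule function works unchanged for M-dimensional rules, where each row of the abscissa file would hold M values.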


Last revised on 24 April 2014.