lcvt, a C++ code which creates Latin Centroidal Voronoi Tessellation (CVT) datasets.
A Latin Square dataset is typically a two dimensional dataset of N points in the unit square, with the property that, if both the x and y axes are divided up into N equal subintervals, each x subinterval contains the x coordinate of exactly one dataset point, and each y subinterval contains the y coordinate of exactly one dataset point. Latin squares can easily be extended to the case of M dimensions, and may be pedantically called Latin Hypersquares or Latin Hypercubes in such a case. Statisticians like Latin Squares, as do experiment designers, and people who need to approximate scalar functions of many variables.
The fact that the projection of a Latin Square dataset onto any coordinate axis is either exactly evenly spaced, or approximately so (depending on the algorithm), turns out to be an attractive feature for many uses.
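The Latin property described above can be checked mechanically. The following is a minimal sketch, not taken from the lcvt source: the function name `is_latin` and the storage of the points as a vector of N coordinate vectors are assumptions made only for illustration. It treats each subinterval as half-open, [k/N, (k+1)/N).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of lcvt): returns true if, along every
// coordinate axis, each of the N subintervals [k/N, (k+1)/N) contains
// the coordinate of exactly one of the N points.
bool is_latin(const std::vector<std::vector<double>>& points) {
    const std::size_t n = points.size();
    if (n == 0) return true;              // an empty dataset is trivially Latin
    const std::size_t m = points[0].size();
    for (std::size_t dim = 0; dim < m; ++dim) {
        std::vector<bool> used(n, false); // which subintervals are occupied
        for (const auto& p : points) {
            assert(p.size() == m);        // every point must have M coordinates
            const double x = p[dim];
            if (x < 0.0 || x >= 1.0) return false;  // outside the unit hypercube
            const std::size_t bin = static_cast<std::size_t>(x * n);
            if (bin >= n || used[bin]) return false; // subinterval hit twice
            used[bin] = true;
        }
    }
    // N points, N subintervals, no repeats: every subinterval is hit once.
    return true;
}
```

For example, the four points (0.125, 0.625), (0.375, 0.125), (0.625, 0.875), (0.875, 0.375) pass the check, while the two points (0.1, 0.1) and (0.2, 0.2) fail it, because both x coordinates fall in the subinterval [0, 0.5).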
However, in a CVT dataset on a regular domain, such as the unit hypercube, the projections of the points onto any coordinate axis tend to cluster together. This program is mainly an attempt to explore whether a dataset can be computed using techniques similar to those of a CVT, but with the constraint (whether imposed or expected) that the point projections do not clump up.
The approach used here is quite simple. First we compute a CVT in M dimensions, comprising N points. We assume that the bounding region is the unit hypercube. We are now going to adjust the coordinates of the points to achieve the Latin Hypercube property. For each coordinate direction, we simply sort the points by that coordinate, and then overwrite the original values by the values we'd expect to get for a centered Latin Hypercube, namely, 1/(2*N), 3/(2*N), ..., (2*N-1)/(2*N).
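The coordinate adjustment step can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the lcvt implementation: the name `latinize_centered` and the vector-of-vectors storage are hypothetical, and the CVT computation itself is assumed to have already produced the input points. Ties between equal coordinates are broken by original point order, which is a choice made here and not something the text specifies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of lcvt): for each coordinate direction,
// sort the points by that coordinate and overwrite the values with the
// centered Latin Hypercube values 1/(2N), 3/(2N), ..., (2N-1)/(2N).
void latinize_centered(std::vector<std::vector<double>>& points) {
    const std::size_t n = points.size();
    if (n == 0) return;
    const std::size_t m = points[0].size();
    for (const auto& p : points) {
        assert(p.size() == m);  // every point must have M coordinates
    }
    std::vector<std::size_t> order(n);
    for (std::size_t dim = 0; dim < m; ++dim) {
        // order[rank] = index of the point with the rank-th smallest coordinate
        std::iota(order.begin(), order.end(), std::size_t{0});
        std::stable_sort(order.begin(), order.end(),
                         [&](std::size_t a, std::size_t b) {
                             return points[a][dim] < points[b][dim];
                         });
        // Replace each coordinate by the center of its rank's subinterval.
        for (std::size_t rank = 0; rank < n; ++rank) {
            points[order[rank]][dim] = (2.0 * rank + 1.0) / (2.0 * n);
        }
    }
}
```

With N = 3, the points (0.9, 0.2), (0.1, 0.8), (0.5, 0.5) become (5/6, 1/6), (1/6, 5/6), (1/2, 1/2): the ordering of the points along each axis is preserved, but the spacing of the projections is now exactly even.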
Now this process guarantees that we get a Latin Hypercube. Our hope is that the process of adjusting the point coordinates does not too severely damage the nice dispersion properties inherent in the CVT point placement.
The computer code and data files described and made available on this web page are distributed under the MIT license.
lcvt is available in a C++ version, a FORTRAN90 version, and a MATLAB version.
BOX_BEHNKEN, a C++ code which computes a Box-Behnken design, that is, a set of arguments to sample the behavior of a function of multiple parameters.
CVT, a C++ code which computes a Centroidal Voronoi Tessellation.
FAURE, a C++ code which computes elements of a Faure quasirandom sequence.
HALTON, a C++ code which computes elements of a Halton quasirandom sequence.
HAMMERSLEY, a C++ code which computes elements of a Hammersley quasirandom sequence.
IHS, a C++ code which computes elements of an improved distributed Latin hypercube dataset.
LATIN_CENTER, a C++ code which computes elements of a Latin Hypercube dataset, choosing center points.
LATIN_EDGE, a C++ code which computes elements of a Latin Hypercube dataset, choosing edge points.
LATIN_RANDOM, a C++ code which computes elements of a Latin Hypercube dataset, choosing points at random.
LATINIZE, a C++ code which "latinizes" a dataset.
LCVT, a dataset directory which contains a collection of sample LCVT datasets.
NIEDERREITER2, a C++ code which computes elements of a Niederreiter quasirandom sequence with base 2.
NORMAL, a C++ code which computes elements of a normal pseudorandom sequence.
SOBOL, a C++ code which computes elements of a Sobol quasirandom sequence.
UNIFORM, a C++ code which computes elements of a uniform pseudorandom sequence.
VAN_DER_CORPUT, a C++ code which computes elements of a van der Corput quasirandom sequence.