HAMMERSLEY
Hammersley Datasets

HAMMERSLEY is a dataset directory which contains points generated by the M-dimensional Hammersley sequence.

Routines in this library select elements of a "leaped" subsequence of the Hammersley sequence. The subsequence elements are indexed by a quantity called STEP, which starts at 0. The STEP-th subsequence element is simply the Hammersley sequence element with index

        SEED(1:NDIM) + STEP * LEAP(1:NDIM).

The arguments that the user may set include:

NDIM, the spatial dimension,
default: NDIM = 1,
required: 1 <= NDIM;
STEP, the subsequence index.
default: STEP = 0,
required: 0 <= STEP.
SEED(1:NDIM), the Hammersley sequence index corresponding to STEP = 0.
default: SEED(1:NDIM) = (0, 0, ... 0),
required: 0 <= SEED(1:NDIM);
LEAP(1:NDIM), the succesive jumps in the Hammersley sequence.
default: LEAP(1:NDIM) = (1, 1, ..., 1).
required: 1 <= LEAP(1:NDIM).
BASE(1:NDIM), the Hammersley bases.
default: BASE(1:NDIM) = (2, 3, 5, 7, 11... ),
or (-N, 2, 3, 5, 7, 11,...) if N is known;
required: 1 < BASE(I) for any van der Corput dimension I, or BASE(I) < 0 to generate the fractional sequence J/|BASE(I)|.

In the standard NDIM-dimensional Hammersley sequence, it is assumed that N, the number of values to be generated, is known beforehand. The first dimension of entries in the sequence will have the form J/N for J from 1 to N. The remaining dimensions are computed using the 1-dimensional van der Corput sequence, using successive primes as bases.

In a generalized Hammersley sequence, each coordinate is allowed to be a fractional or van der Corput sequence. For any fractional sequence, the denominator is arbitrary. However, it is extremely desirable that the values in all coordinates stay between 0 and 1. This happens automatically for any van der Corput sequence, but for fractional sequences, this criterion is enforced using an appropriate modulus function. The consequence is that if you specify a small "base" for a fractional sequence, your sequence will soon wrap around and you will get repeated values.

If you drop the first dimension of the standard NDIM-dimensional Hammersley sequence, you get the standard Halton sequence of dimension NDIM-1.

The standard Hammersley sequence has slightly better dispersion properties than the standard Halton sequence. However, it suffers from the problem that you must know, beforehand, the number of points you are going to generate. Thus, if you have computed a Hammersley sequence of length N = 100, and you want to compute a Hammersley sequence of length 200, you must discard your current values and start over. By contrast, you can compute 100 points of a Halton sequence, and then 100 more, and this will be the same as computing the first 200 points of the Halton sequence in one calculation.

In low dimensions, the multidimensional Hammersley sequence quickly "fills up" the space in a well-distributed pattern. However, for higher dimensions (such as NDIM = 40) for instance, the initial elements of the Hammersley sequence can be very poorly distributed; it is only when N, the number of sequence elements, is large enough relative to the spatial dimension, that the sequence is properly behaved. Remedies for this problem were suggested by Kocis and Whiten.

In some cases, it is recommended that the initial portion of the sequence be skipped over. A general suggestion is to let STEP be the first power of 2 that is equal to or greater than N, the number of points to generate.

Licensing:

The computer code and data files described and made available on this web page are distributed under the GNU LGPL license.

Related Data and Programs:

HAMMERSLEY_DATASET, a C++ program which creates a Hammersley sequence and writes it to a file.

PLOT_POINTS, a FORTRAN90 program which can create Encapsulated PostScript images (EPS) of some of the two dimensional datasets.

TABLE, a file format which is used to store the data.

TABLE_TOP, a FORTRAN90 program which can display pairwise coordinate plots of higher dimensional datasets.

Example dataset:

A typical (but small) dataset looks like this:

#  hammersley_2_10.txt
#  created by routine HAMMERSLEY_WRITE.CC
#  at 21 July 2004 08:34:27 AM
#
#  NDIM =            2
#  N =              10
#  STEP =            0
#  SEED =            1           0
#  LEAP =            1           1
#  BASE =          -10           2
#  EPSILON (unit roundoff) = 2.22045e-16
#
       0.1           0  
       0.2         0.5  
       0.3        0.25  
       0.4        0.75  
       0.5       0.125  
       0.6       0.625  
       0.7       0.375  
       0.8       0.875  
       0.9      0.0625  
         0      0.5625

Reference:

J M Hammersley,
Monte Carlo methods for solving multivariable problems,
Proceedings of the New York Academy of Science,
Volume 86, pages 844-874, 1960.
Ladislav Kocis and William Whiten,
Computational Investigations of Low-Discrepancy Sequences,
ACM Transactions on Mathematical Software,
Volume 23, Number 2, June 1997, pages 266-294.

Datasets:

Datasets in M = 2 dimensions, with 0 skipping, include:

hammersley_02_00010.txt, M = 2, N = 10, STEP = 0, SEED = (1,0), BASE = default;
hammersley_02_00010.png, a PNG image of the dataset;
hammersley_02_00100.txt, M = 2, N = 100, STEP = 0, SEED = (1,0), BASE = default;
hammersley_02_00100.png, a PNG image of the dataset;
hammersley_02_01000.txt, M = 2, N = 1000, STEP = 0, SEED = (1,0), BASE = default;
hammersley_02_01000.png, a PNG image of the dataset;
hammersley_02_10000.txt, M = 2, N = 10000, STEP = 0, SEED = (1,0), BASE = default;

Datasets in M = 2 dimensions, with power of 2 skipping, include:

hammersley_02_00010_a.txt, M = 2, N = 10, STEP = 0, SEED = (1,16), BASE = default;
hammersley_02_00100_a.txt, M = 2, N = 100, STEP = 0, SEED = (1,128), BASE = default;
hammersley_02_01000_a.txt, M = 2, N = 1000, STEP = 0, SEED = (1,1024), BASE = default;
hammersley_02_10000_a.txt, M = 2, N = 10000, STEP = 0, SEED = (1,16384), BASE = default;

Datasets in M = 7 dimensions, with 0 skipping, include:

hammersley_07_00010.txt, M = 7, N = 10, STEP = 0, SEED = (1,0,...,0), BASE = default;
hammersley_07_00100.txt, M = 7, N = 100, STEP = 0, SEED = (1,0,...,0), BASE = default;
hammersley_07_01000.txt, M = 7, N = 1000, STEP = 0, SEED = (1,0,...,0), BASE = default;
hammersley_07_10000.txt, M = 7, N = 10000, STEP = 0, SEED = (1,0,...,0), BASE = default;

Datasets in M = 7 dimensions, with power of 2 skipping, include:

hammersley_07_00010_a.txt, M = 7, N = 10, STEP = 0, SEED = (1,16,...,16), BASE = default;
hammersley_07_00100_a.txt, M = 7, N = 100, STEP = 0, SEED = (1,128,...,128), BASE = default;
hammersley_07_01000_a.txt, M = 7, N = 1000, STEP = 0, SEED = (1,1024,...,1024), BASE = default;
hammersley_07_10000_a.txt, M = 7, N = 10000, STEP = 0, SEED = (1,16384,...,16384), BASE = default;

Datasets in M = 16 dimensions, with 0 skipping, include:

hammersley_16_00010.txt, M = 16, N = 10, STEP = 0, SEED = (1,0,...,0), BASE = default;
hammersley_16_00100.txt, M = 16, N = 100, STEP = 0, SEED = (1,0,...,0), BASE = default;
hammersley_16_01000.txt, M = 16, N = 1000, STEP = 0, SEED = (1,0,...,0), BASE = default;
hammersley_16_10000.txt, M = 16, N = 10000, STEP = 0, SEED = (1,0,...,0), BASE = default;

Datasets in M = 16 dimensions, with power of 2 skipping, include:

hammersley_16_00010_a.txt, M = 16, N = 10, STEP = 0, SEED = (1,16,...,16), BASE = default;
hammersley_16_00100_a.txt, M = 16, N = 100, STEP = 0, SEED = (1,128,...,128), BASE = default;
hammersley_16_01000_a.txt, M = 16, N = 1000, STEP = 0, SEED = (1,1024,...,1024), BASE = default;
hammersley_16_10000_a.txt, M = 16, N = 10000, STEP = 0, SEED = (1,16384,...,16384), BASE = default;

You can go up one level to the DATASETS directory.