PARTIAL_DIGEST
Recursive Solutions of the Partial Digest Problem


PARTIAL_DIGEST is a FORTRAN90 program which seeks solutions of the partial digest problem.

In the partial digest problem, we assume that there are N objects arranged along a line. We denote the position of object I by X(I). The positions of the objects are unknown. Instead, we have a list of the distances between every distinct pair of objects. Note that the distances are not "tagged"; that is, if there is a 175 on the list of distances, we don't know which two objects are separated by that distance. In the partial digest problem, we start with the (N*(N-1))/2 distances, and must come up with at least one linear arrangement of N objects that corresponds to the distances.

In the algorithm used here, we begin by arbitrarily setting X(1) to zero.

For our second step, we find the largest entry in the distance table, and set X(2) to that value.

On each recursive step thereafter, we find the largest unused distance, D, and note that this must represent the distance of the next object from either X(1) or X(2).

Starting with the first possibility, we consider placing this next object at X(K)=D. We now must search the distance table, and ensure that the distances |X(1)-X(K)| through |X(K-1)-X(K)| all show up. If so, then our tentative placement of the object is "plausible", and we can proceed to the next step of recursion, seeking the location of X(K+1).

The second possibility to check on this recursive step is that we should set X(K)=X(2)-D, since this would also explain the occurrence of the distance D. The analysis of this case is otherwise the same as for the first one.

This recursion is guaranteed to "encounter" every solution. (Of course, there might be no solutions whatever.)

This approach has the advantage that recursion is relatively clean and neat to program. Disadvantages include the fact that the amount of memory required to store partial results will grow explosively as the size of the problem increases. Also, it is difficult to intervene or interrupt the recursive process. For instance, the calling program never receives the computed solutions directly. Instead, the recursive routine "realizes" that it has computed a solution, and can print it out.

For these reasons, it would be worth developing an equivalent version of the routines that uses backtracking instead.

Note that this program used integers for the distances. While this is somewhat unnatural, it is convenient when programming, since we are searching the list of distances for values that we arrive at by subtraction, and the slightest roundoff would mean that the algorithm would fail. An alternative would be to allow floating point distances, but to allow a very slight margin of error when looking for a distance in the table that is equal to a difference calculated between two positions.

Licensing:

The computer code and data files made available on this web page are distributed under the GNU LGPL license.

Languages:

PARTIAL_DIGEST is available in a FORTRAN90 version.

Related Data and Programs:

CITIES, a FORTRAN90 library which carries out various computations involving the locations or relative distances of cities on a map.

CITIES, a dataset directory which defines the locations or relative distances of cities on a map.

COMBO, a FORTRAN90 library which carries out various combinatorial computations.

DISTANCE_TO_POSITION, a FORTRAN90 program which estimates the positions of cities based on a city-to-city distance table.

GEOMETRY, a FORTRAN90 library which computes various geometric quantities.

GRAFPACK, a FORTRAN90 library which manipulates abstract graphs.

KNAPSACK, a FORTRAN77 library which solves a variety of knapsack problems.

LAMP, a FORTRAN77 library which solves linear assignment and matching problems.

SUBSET, a FORTRAN90 library which carries out various combinatorial computations.

SUBSET_SUM, a FORTRAN90 library which seeks solutions of the subset sum problem.

Reference:

  1. Pavel Pevzner,
    Computational Molecular Biology,
    MIT Press, 2000,
    ISBN: 0-262-16197-4,
    LC: QH506.P47.

Source Code:

Examples and Tests:

List of Routines:

You can go up one level to the FORTRAN90 source codes.


Last revised on 28 November 2006.