KNAPSACK_REMOTE Solve a Knapsack Problem in Distributed Fashion

KNAPSACK_REMOTE is a directory which illustrates how the Parallel Computing Toolbox can be used to solve a problem using distributed computing.

In distributed computing, a problem is broken down into subproblems, each of which can be solved separately, simultaneously or sequentially.

MATLAB implements distributed computing by having the user define a "job" which in turn is composed of several "tasks". Each task is carried out by calling a MATLAB function with specific arguments, and the tasks are carried out in some order, over some set of processors, when the job is submitted for execution.

In the "subset sum" version of the knapsack problem, we are given a collection of (21) weights and a target value (24639098). We are to decide which combination(s) of weights yield a sum equal to the target value. The values of the weights, along with the target value are encoded in the file knapweights.mat. The code is an example from the MathWorks course (DC01) on Parallel Computing (Sept. 2009).

The nonparallel program knap.m loops through a range of possible weight combinations and reports the (last) weight labels and values that achieve the required sum. The distknap.m code divides the computation into four tasks, each corresponding to a specific subrange of the total set of possibilities.

Related Data and Programs:

BIRTHDAY_REMOTE, a MATLAB program which runs a Monte Carlo simulation of the birthday paradox, and includes instructions on how to run the job, via MATLAB's BATCH facility, on a remote system such as Virginia Tech's ITHACA cluster.

COLLATZ_PARALLEL is a MATLAB program which seeks the maximum Collatz sequence between 1 and N; it runs in parallel using MATLAB's "parfor" facility.

COLOR_REMOTE, a MATLAB program which carries out the color segmentation of an image in parallel, via SPMD commands; this includes instructions on how to run the job, via MATLAB's BATCH facility, on a remote system such as Virginia Tech's ITHACA cluster.

FDI_OPT, a MATLAB program which demonstrates the use of MATLAB's FMINCON constrained minimization function, taking advantage of MATLAB's Parallel Computing Toolbox for faster execution.

JTB_CODIST, a MATLAB program which demonstrates how the linear system associated with a finite element problem can be treated as a codistributed array whose entries are assigned to different MATLAB workers, so that the matrix is assembled in parallel.

LINEAR_SOLVE_DISTRIBUTED is a MATLAB program which solves a linear system A*x=b using MATLAB's spmd facility, so that the matrix A is "distributed" across multiple MATLAB workers.

LYRICS_REMOTE, a MATLAB program which runs in parallel, using three workers which cooperate "systolically", that is, as through they were on an assembly line. The output from worker 1 is passed to worker 2 for further processing, and so on. This includes instructions on how to run the job, via MATLAB's BATCH facility, on a remote system such as Virginia Tech's ITHACA cluster.

MATLAB_COMMANDLINE, MATLAB programs which illustrate how MATLAB can be run from the UNIX command line, that is, not with the usual MATLAB command window.

MATLAB_DISTCOMP, a MATLAB program which remotely runs a set of 5 jobs on the Ithaca cluster. These jobs are equivalent to the BIRTHDAY_REMOTE, COLOR_REMOTE, KNAPSACK_REMOTE, LYRICS_REMOTE and MD_REMOTE jobs.

MATLAB_PARALLEL, examples which illustrate "local" parallel programming on a single computer with MATLAB's Parallel Computing Toolbox.

MATLAB_REMOTE, examples which illustrate the use of remote job execution, in which a desktop copy of MATLAB sends programs and data to a remote machine for execution. Included is information needed to properly configure the local machine.

MD_PARALLEL is a MATLAB program which carries out a molecular dynamics simulation, running in parallel using MATLAB's "PARFOR" feature.

PRIME_NUMBER_PARALLEL is a MATLAB program which counts the number of primes between 1 and N; it runs in parallel using MATLAB's "parfor" facility.

PRIME_NUMBER_SPMD is a MATLAB program which counts the number of primes between 1 and N; running in parallel using MATLAB's "SPMD" feature.

QUAD_SPMD is a MATLAB program which estimates an integral using quadrature; running in parallel using MATLAB's "SPMD" feature.

SATISFIABILITY_PARALLEL is a MATLAB program which demonstrates, for a particular circuit, an exhaustive search for solutions of the circuit satisfiability problem, running in parallel using MATLAB's "PARFOR" feature.

SUBSET_SUM is a MATLAB library which seeks solutions of the subset sum problem.

TIMING_PARALLEL is a directory of MATLAB programs which illustrates how to time a parallel MATLAB program.

Reference:

MathWorks documentation for the Parallel Computing Toolbox is available at http://www.mathworks.com/products/parallel-computing/.

• Silvano Martello, Paolo Toth,
Knapsack Problems: Algorithms and Computer Implementations,
Wiley, 1990,
ISBN: 0-471-92420-2,
LC: QA267.7.M37.
• The MathWorks,
Parallel Computing Toolbox 4,
User's Guide.
• The Mathworks,
Parallel Computing Toolbox Release Notes,
The Mathworks, 2009.
• Virginia Tech Advanced Research Computing,
Notes on Enabling Remote Submission of MATLAB Jobs,
matlab_remote_submission.pdf.

Examples and Tests:

• knap.m is a MATLAB function which solves the knapsack problem for a given range of combinations.
• distknap.m is a MATLAB script which solves the knapsack problem over the full range of combinations. It defines a job of 4 tasks. Each task is assigned part of the range.
• distknap_output.txt the output from invoking the script, which sends the job to the Ithaca cluster.
• knapweights.mat is an MAT file which stores the values of the available weights, and the desired sum.

You can go up one level to the MATLAB source codes.

Last revised on 04 February 2010.