svd_basis, a FORTRAN90 code which applies the singular value decomposition to a set of data vectors, to extract the leading "modes" of the data.
This procedure, originally devised by Karl Pearson, has arisen repeatedly in a variety of fields, and hence is known under various names, including:
This program is intended as an intermediate application, in the following situation:
Thus, the program might read in 500 files, and write out 5 or 10 files of the corresponding size and "shape", representing the dominant solution modes.
The optional normalization step involves computing the average of all the solution vectors and subtracting that average from each solution. In this case, the average vector is treated as a special "mode 0", and also written out to a file.
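The normalization step can be sketched in a few lines. The Python/NumPy fragment below is only an illustration, not the program's FORTRAN90 source; the function name `subtract_mean` and the array name `A` (one solution vector per column) are assumptions made for this example.

```python
import numpy as np

def subtract_mean(A):
    """Return (mode0, A_centered) for an M-by-N array with one solution vector per column."""
    # Average of the N solution vectors: the special "mode 0".
    mode0 = A.mean(axis=1)
    # Subtract that average from every column.
    A_centered = A - mode0[:, np.newaxis]
    return mode0, A_centered

# Tiny example: three solution vectors of length 2.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 6.0, 8.0]])
mode0, A_centered = subtract_mean(A)
```

After this step the columns of `A_centered` sum to the zero vector, so the modes extracted afterwards describe variation about the average rather than the average itself.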
To compute the singular value decomposition, we first construct the M by N matrix A using individual solution vectors as columns:
A = [ X1 | X2 | ... | XN ]
The singular value decomposition has the form:
A = U * S * V'
and is determined using the DGESVD routine from the linear algebra package LAPACK. The leading L columns of the orthogonal M by M matrix U, associated with the largest singular values S, are chosen to form the basis.
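As a rough sketch of this step, the Python fragment below uses `numpy.linalg.svd` as a stand-in for a direct call to DGESVD. The function name `leading_modes` and the example matrix are hypothetical, chosen only for illustration.

```python
import numpy as np

def leading_modes(A, L):
    """Return the L leading left singular vectors of A and all singular values."""
    # full_matrices=False gives the "economy" factorization A = U * diag(s) * Vt;
    # the columns of U are orthonormal and s is sorted in descending order.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :L], s

# Example: M = 4 components, N = 3 solution vectors, keep L = 2 modes.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [2.0, 0.0, 1.0],
              [1.0, 1.0, 3.0]])
basis, s = leading_modes(A, 2)
```

The economy form returns only min(M, N) columns of U instead of the full M by M matrix. That is enough here, because the basis uses only the leading L columns, and L cannot usefully exceed min(M, N).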
In most PDEs, the solution vector has some structure; perhaps there are 100 nodes, and at each node the solution has perhaps 4 components (horizontal and vertical velocity, pressure, and temperature, say). While the solution is therefore a vector of length 400, it's more natural to think of it as a sort of table of 100 items, each with 4 components. You can use that idea to organize your solution data files; in other words, your data files can each have 100 lines, containing 4 values on each line. As long as every line has the same number of values, and every data file has the same form, the program can figure out what's going on.
The program assumes that each solution vector is stored in a separate data file and that the files are numbered consecutively, such as data01.txt, data02.txt, ... In a data file, comments (beginning with '#') and blank lines are allowed. Except for comment lines, each line of the file is assumed to represent all the component values of the solution at a particular node.
Here, for instance, is a tiny data file for a problem with just 3 nodes, and 4 solution components at each node:
#  This is solution file number 1
#
  1  2  3  4
  5  6  7  8
  9 10 11 12
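To make the file conventions concrete, here is a minimal Python sketch of a reader for this format. It is not taken from the program; the function name `read_solution` is hypothetical, and flattening the table node by node into a single vector is an assumption about the ordering.

```python
def read_solution(lines):
    """Parse data-file lines into a list of rows, one row of floats per node."""
    rows = []
    for line in lines:
        text = line.strip()
        # Skip blank lines and comment lines beginning with '#'.
        if not text or text.startswith("#"):
            continue
        rows.append([float(value) for value in text.split()])
    # Every node must carry the same number of components.
    if len({len(row) for row in rows}) > 1:
        raise ValueError("lines have differing numbers of values")
    return rows

sample = """\
#  This is solution file number 1
#
  1  2  3  4
  5  6  7  8
  9 10 11 12
"""
table = read_solution(sample.splitlines())
# Flatten node by node (an assumed ordering) into one solution vector of length 12.
vector = [value for row in table for value in row]
```

Here `table` has 3 rows (nodes) of 4 values (components), and `vector` is the length-12 solution vector that would become one column of the matrix A.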
The program is interactive, but requires only a very small amount of input:
The program computes L basis vectors, and writes each one to a separate file, starting with svd_001.txt, svd_002.txt and so on. The basis vectors are written with the same component and node structure that was encountered on the solution files. Each vector will have unit Euclidean norm.
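The output step might look roughly like the following Python sketch. Only the file names svd_001.txt, svd_002.txt, ... come from the description above; the function name `write_modes`, its parameters, the node-by-node reshape, and the number format written by `numpy.savetxt` are all assumptions made for illustration.

```python
import os
import tempfile
import numpy as np

def write_modes(basis, n_components, directory):
    """Write each column of `basis` to svd_001.txt, svd_002.txt, ... in `directory`."""
    paths = []
    for k in range(basis.shape[1]):
        # Restore the table layout: one line per node, n_components values per line.
        table = basis[:, k].reshape(-1, n_components)
        path = os.path.join(directory, f"svd_{k + 1:03d}.txt")
        np.savetxt(path, table)
        paths.append(path)
    return paths

# Example: two orthonormal vectors of length 4 (2 nodes, 2 components each).
basis = np.array([[0.5, 0.5],
                  [0.5, -0.5],
                  [0.5, 0.5],
                  [0.5, -0.5]])
out_dir = tempfile.mkdtemp()
paths = write_modes(basis, 2, out_dir)
first = np.loadtxt(paths[0])
```

Reading a file back gives a table with the same node and component layout as the input files, and its entries still have unit Euclidean norm when taken as one vector.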
The computer code and data files described and made available on this web page are distributed under the MIT license.
svd_basis is available in a C++ version, a FORTRAN90 version, and a MATLAB version.
SVD_BASIS_WEIGHT, a FORTRAN90 code which is similar to SVD_BASIS, but which allows the user to assign weights to each data vector.
svd_test, a FORTRAN90 code which demonstrates the singular value decomposition for a simple example.
SVD_SNOWFALL, a FORTRAN90 code which reads a file containing historical snowfall data and analyzes the data with the Singular Value Decomposition (SVD), with plots created by GNUPLOT.
SVD_TRUNCATED, a FORTRAN90 code which demonstrates the computation of the reduced or truncated Singular Value Decomposition (SVD) that is useful for cases when one dimension of the matrix is much smaller than the other.