set_theory, a C code which implements some of the operations of set theory.
We assume that a set is represented by a strictly ascending sequence of positive integers. We might think of a universal set U = 1 : N in cases where all our subsets will have elements between 1 and N.
Set theoretic operations include:
Although sets are traditionally allowed to contain arbitrary elements, it is computationally convenient to assume that our sets are simply subsets of the integers from 1 to N.
If N is no greater than 32, we can represent a set using a 32 bit integer. We term this the B4SET representation. It is compact, but as it stands, is limited to a universal set of no more than 32 elements.
Assuming we can regard the integer as an unsigned quantity, each bit of the binary representation of the integer represents the presence (1) or absence (0) of the corresponding integer in the set. Thus, assuming N is 5, the set { 1, 2, 5} corresponds to the binary representation 10011 and hence to the integer 19. In order to read or write the individual bits of an integer, the functions BTEST, IBCLR and IBSET are useful in this case.
A more flexible, but less efficient, representation of sets uses a logical vector, and is called the LSET representation. Assuming we have a universal set of N elements, any set is represented by a logical vector of N elements, the I-th element of which is TRUE if I is an element of the set.
A representation that can be more efficient for small subsets of a large universal set is the I4SET. In this representation, we simply list, in ascending order, the elements of the set. The representation is simple, but manipulation is more involved. For instance, to create the union of two sets, we must determine the number of unique elements in the two component sets, allocate the necessary space, then interleave the elements of the two components appropriately.
The computer code and data files described and made available on this web page are distributed under the MIT license
set_theory is available in a C version and a C++ version and a FORTRAN90 version and a MATLAB version.
COMBO, a C code which handles combinatorial problems, by Kreher and Stinson;
SUBSET, a C code which ranks, unranks, and generates random subsets, combinations, permutations, and so on;