[UP Home] [CS Home] [Vreda Home] [DADS]

Aptī Algoritmi

Vreda Pieterse

Aptī Algoritmi is the Latin for (fit / apt / suitable / fitting / prepared / useful) Algorithms.

This site hosts implementations of algorithms that are ready for use in software construction.

I am in the process of creating a topic map of algorithms that solve the transitive closure problem. The topic map is a classification of these algorithms and should ideally point to implementations of algorithms. As part of this endeavor, I implement the algorithms that I include in the topic map and publish them here to be accessible for the users of my topic map and others that may want to use these implementations. The aim of the topic map is ultimately to aid developers in finding suitable algorithms when constructing software.

Binary Relation

A binary relation R on a set S is a collection of ordered pairs of elements of S. It can be represented in set notation as a set of pairs. It can be visualised as a directed graph where the vertices correspond with the elements of S and the edges correspond with the elements of R. It can also be represented using an adjacency matrix — i.e. a square matrix where the entry at (i,j) is 1 if (i,j) is in R, with other words, there is an edge from vertex i to vertex j in the graph representation of R; otherwise the entry is 0. The following figure shows these three representations of an example of a binary relation on the set S ={A, B, C, D}.
Image of Set, Graph and Matrix
					 representations of a given binary
					 relation

Transitive Closure

The transitive closure of a binary relation R is an extension or superset of R such that whenever (i,j) and (j,k) are in the extension, (i,k) is also in the extension. The following figure shows the three representations of the transitive closure of the binary relation shown in the previous figure.
Image of Set, Graph and Matrix
					   representations of the
					   transitive closure of the given binary
					   relation

Test Run

Each of the the programs offered here is a command-line program to calculate the transitive closure of the relation that is entered using standard input. The input required is an ASCII stream of 0's and 1's representing the adjacency matrix of a binary relation preceded by an integer value specifying the width of this matrix. All the programs use the same input format and produce the same output. The following is a test run of a compiled version of one of these programs. This test run calculates the transitive closure of the relation shown in the above mentioned example.

Type the dimension, a comma, and then a continous stream representing the adjacency matrix of the relation: 4,0100001010010000
Input Relation:
0100
0010
1001
0000
Transitive Closure of the Relation:
1111
1111
1111
0000

C++ Implementations

Each of the following downloadable archives contains C++ source code of a command-line program consisting of the following three functions as well as a main function calling these functions:
getRelation
Read standard input and return a pointer to a variable of the appropriate data stucture for the algorithm.
transitiveClosure
Takes a pointer to a variable of the appropriate data structure containing the input relation as a parameter and transforms the value of this variable to the transitive closure of the input relation.
showRelation
Takes a pointer to a variable of the appropriate data structure containing a relation as a parameter and print this relation to standard output.

These implementations use square boolean matrices to represent relations. To do this, a vector of elements is used where each of these elements represent a row in the matrix. Each algorithm has a standard implementation named *Std that uses a vector< char* > to represent the relations. These implementations only need the standard template library. Each algorithm also has a second implementation named *Boost that uses a vector< dynamic_bitset<> > to represent the relations. The dynamic_bitset<> is defined in the boost c++ libraries.

Baker's Algorithm3
Iterates through the entries in the marix in row order and uses a change monitor to know when to stop.
BakerStd.zip (1025 B 2013-12-08 16:07)
BakerBoost.zip (993 B 2013-09-01 18:08)
The Blocked Column Algorithm1, 2
A variation of Warshall's algorithm: Columns are processed in blocks while applying loop tiling to minimise cache thrashing. The elements within the block and within the square section around the diagonal are processed in column order. After processing the elements in the square section, the rest of the elements within a block are processed in row order
BlockColStd.zip (1303 B 2014-04-21 17:14)
BlockColBoost.zip (1458 B 2014-04-21 17:05)
The Blocked Row Algorithm1, 2
A variation of Warren's algorithm: Rows are processed in blocks while applying loop tiling to minimise cache thrashing. The elements within a block are processed in column order
BlockRowStd.zip (1256 B 2014-01-26 15:59)
BlockRowBoost.zip (1331 B 2014-01-26 15:54)
The Fused Coat Algorithm
An optimasation of Prosser's algorithm: A loop that multiplies two matrices and a loop that adds two matrices are fused.
FusedCoatStd.zip (1303 B 2013-12-19 20:26)
FusedCoatBoost.zip (1331 B 2013-12-19 20:11)
Martynyuk's Algorithm4
Iterates through the entries in the marix in row order and stop after log2n iterations.
MartynyukStd.zip (1028 B 2013-12-08 16:07)
MartynyukBoost.zip (1004 B 2013-09-02 20:10)
The Monitored Coat Algorithm
Apply repeated marix multiplication and uses a change monitor to know when to stop.
MonCoatStd.zip (1377 B 2013-12-19 20:27)
MonCoatBoost.zip (1433 B 2013-12-19 20:12)
The Neat Coat Algorithm
An optimasation of the monitored coat algorithm: A loop that multiplies two matrices and a loop that adds two matrices are fused.
NeatCoatStd.zip (1379 B 2013-12-31 12:11)
NeatCoatBoost.zip (1420 B 2013-12-31 12:28)
The Short Circuit Warshall Algorithm
A variation of Warshall's algorithm: A row monitor is applied to short circuit or speed up row additions when a row used in a calculation is empty or full.
ShortWarshallStd.zip (1265 B 2014-10-12 17:56)
ShortWarshallBoost.zip (1203 B 2014-10-12 18:26)
Prosser's Algorithm5
Apply repeated marix multiplication and stop after n - 1 iterations.
ProsserStd.zip (1308 B 2013-12-08 16:08)
ProsserBoost.zip (1299 B 2013-09-02 02:42)
Warren's Algorithm6
Iterates through the entries in the marix in row order. Process entries below the main diagonal in the first pass and those above in the second pass. It calculates the transitive closure in two passes.
WarrenStd.zip (2014 B 2014-01-08 19:02)
WarrenBoost.zip (970 B 2014-01-08 19:22)
Warshall's Algorithm7
Iterates through the entries in the marix in column order. It calculates the transitive closure in one pass.
WarshallStd.zip (988 B 2013-12-08 16:08)
WarshallBoost.zip (931 B 2013-09-02 17:41)

Usage

Compile and run the code using any C++ compiler for example the C++ compiler that can be found in the GNU Compiler Collection on Linux, the C++ compiler that is part of Minimalist GNU for Windows, or the C++ compiler that is included in the Command Line Tools for Xcode on a Mac.

When using the code that relies on boost c++ libraries, the dynamic bitset library needs to be installed on your machine. When issuing the compile command, the boost directory has to be in the search path of your C++ compiler. Assume, for example, that you installed boost_1_54_0.bz2 in the /usr/local/boost directory, you can compile BakerBoost.cpp by issuing the following command in the directory where BakerBoost.cpp is saved:

g++ -I /usr/local/boost/boost_1_54_0/boost BakerBoost.cpp

If BakerBoost.cpp is compiled using the above mentioned command, a compiled version of this program is saved in the current directory. On Windows it is called a.exe and on Linux and Unix (Mac), it is called a.out. It can now be executed by typing a in Windows or ./a.out in Linux/Unix.

It should not be too difficult to adapt the code and reuse it in any project that needs it.

If you experience any problems when using the code, feel free to drop me an e-mail. I cannot guarentee that I will be able to assist, but might be able to give guidance in solving your problems. Your queries will help me to compile a troubleshooting guide.

Test data

InputData.zip is an archive containing some input files. Each file contains input to the provided programs to specify the adjacency matrix of a randomly generated relation. It's filename encodes facts about this relation. The first character is an alphabetical character which is varied to have more than one relation having the same characteristics. The next four characters is a four digit number specifying the dimension of the relation. The last two digits is the percentage filled. Thus, C0600_05.txt is input containing the adjacency matrix of a 600 × 600 relation that has 95% zeros and 5% ones.

These are text files that can be piped as input to a compiled version of each of the downloadable programs. The following is the command to execute an executable with C0600_05.txt as its input on Windows, when assuming the executable is called a.exe:

a < C0600_05.txt

Similarly on Linux/Unix, assuming that the executable is called a.out, the command is:

./a.out < C0600_05.txt

This archive also contains the source code of a program called generate.cpp which you can compile and run to generate more data.

Other Implementations

A really cool implementation of a transitive closure algorithm implemented by Vladimir Prus is provided in the graph library that is included in the boost c++ libraries. Information about this algorithm can be found at transitve closure documentation at boost.

References

  1. Agrawal R and Jagadish HV (1987) Direct Algorithms for Computing the Transitive Closure of Database Relations, Proceedings of the 13th International Conference on Very Large Data Bases (VLDB'87), 255 - 266.
  2. Agrawal R, Dar S and Jagadish HV (1990) Direct transitive closure algorithms: design and performance evaluation, ACM Transactions on Database Systems (TODS), 15(3):427 - 459.
  3. Baker JJ (1962) A note on multiplying Boolean matrices, Communications of the ACM, 5(2):102.
  4. Martynyuk V (1963) The economical construction of a transitive closure of a binary relation, USSR Computational Mathematics and Mathematical Physics, 2(4):817-821.
  5. Prosser RT (1959) Applications of Boolean matrices to the analysis of flow diagrams, Proceedings of the Eastern Joint Computer Conference No.16: 133-138.
  6. Warren HS Jr (1975) A Modification of Warshall's algorithm for transitive closure of binary relations, Communications of the ACM, 18(4):218-220.
  7. Warshall S (1962) A Theorem of Boolean Matrices, Journal of the ACM, 9(1):11-12.

License

All downloadable files on this site may be used in any commercial or non-commercial projects, free of any charge. You are encouraged to let me know about your projects, and applications of this code. I will appreciate if you acknowledge this site when you use some of the code.
Updated Sun Oct 12 18:31:58 2014 by Vreda Pieterse  (vreda@fastar.org) This page's URL is http://www.cs.up.ac.za/cs/vpieterse/AptiAlgo/AptiAlgoritmi.html