Answering Reachability Queries On Very Large Directed Graphs
Stc
Date: 2010-09-29
Time: 15:00
Room:
BBL 083
Speaker: Bas van Schaik
Title: Answering reachability queries on very large directed graphs -- introducing a new data structure using bit vector compression (thesis defense)
Abstract
Answering reachability queries on graphs has been subject of extensive
research for the last couple of decades. Reachability data structures
and algorithms provide an answer to probably one of the easiest
sounding questions in graph theory: can some vertex i reach another
vertex j (possibly using intermediate vertices), or not? Such queries
can be answered in many different ways, for example by using data
structures based on a representation of the transitive closure of a
graph.
Starting in the 1950's, computer scientists and mathematicians have
presented multiple ways to process a graph, extract reachability
information and represent the transitive closure. Data sources -- and
the graphs representing them -- are vastly growing, forcing
researchers to look for new ways to efficiently represent a transitive
closure, which grows quadratically in the number of vertices of a
graph. Additionally, the amount of time required to process a graph
and build a transitive closure data structure should be limited, as
well as the amount of time required to answer reachability queries
using the data structure.
This thesis provides an introduction to transitive closure computation
and introduces the concept of bit vector compression to reduce the
amount of memory required to represent a transitive closure. A new
data structure (based on bit vector compression) is presented,
together with both a theoretical and experimental analysis to compare
its performance -- in terms of memory usage, construction time and
query response time -- to data structures presented in publications at
major conferences.