[Networkit] BFS: lambda function, timestamps

Marcel Radermacher marcel.radermacher at student.kit.edu
Mon Jun 9 14:27:50 CEST 2014

Hash: SHA1

Dear NetworKit-Developers,

I would like to suggest two rather simple enhancements to the BFS
implementation. I am currently working an image segmentation algorithm
which relies on an efficient merging of two clusters. The partition
datastructure implemented in NetworKit is not sufficient as it needs
linear time in the size of the graph to merge two clusters. To avoid
redundancy it is not a favoured option to implement a secondary
partition datastructure based on the union-find datastructure (does
not support removing elements of a cluster).

A heuristic that can be used is to run a breadth-first search (BFS)
starting from a cluster vertex and set the cluster id accordingly. To
achieve linear time in the cluster size I would like to propose the
following changes to the current BFS implementation.

First of all, I would like to add a second run function to the BFS
class with the following header:
	template<typename N, typename E, typename T>
	void BFS::run(Graph &g
		, node source
		, N &&handle_node
		, E &&handle_edge
		, T &&traverse_edge);

- - The parameter handle_node is a void function which takes the current
node (I.e. the current head node of the queue) as parameter.
- - The void function handle_edge is called with the parameter (current,
neighbor) iff neighbor is pushed into the queue.
- - traverse_edge takes (current, neighbor) and returns bool. The node
neighbor is only pushed into the queue iff this function returns true
and the node neighbor has not already been pushed.

With this I would be able to limit the search space of the BFS to one
cluster or more general to nodes with a specific property. The current
run function can be easily reimplemented using this interface.

Further, in each BFS run a new distance array is allocated. Therefore,
the BFS would still use linear time in the the size of the graph. A
simple way to deal with this is to use timestamps to decide whether a
node was visited before or not. A complete scan of the array is only
all INT_MAX times necessary. This can be implemented within the BFS
class or more general in a new class within the auxiliary directory(?)
to enable other algorithms to profit from this datastructure (e.g.
Dijkstra). If requested, I could post a more thorough specification of
the interface.

I thing other algorithms can profit from this changes as well. Either
in runtime or to improve readability. For example, the BFS implemented
within centrality/Betweens.cpp could be eliminated.

As a last remark, why are there two BFS implementations? One within
the graph datastructure (only used by the RegionGrowingOverlapper
class) and one as a standalone class? I think the breadthFirst*
functions adds unnecessary complexity to the graph datastructure.

I am looking forward to hear about your ideas.

Kind regards

Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/


More information about the NetworKit mailing list