[Networkit] request for comments: new API for edge attributes

Florian Weber uagws at student.kit.edu
Thu Jul 31 16:51:39 CEST 2014


> Think about a billion edge attributes, either stored in an
> std::vector<…> indexed by edge id, or stored in an
> std::hash_map<std::pair<node, node>, …>. Which will have faster
> read/write access? Memory footprint?

I guess it somewhat depends: Yes, with the hashmap and the bigger ID we
would likely need more memory to safe the attributes. OTOH we would safe
memory in other places since members like inEdgeIds and outEdgeIds would
become obsolete. In case of the iteration over Edges we would safe
ourselves the cost for the lookups.

Note that I'm not saying that my idea is better than the current
implementation, I just prefer to have considered all possibilities
before putting this into stone.

> What about parallel write access
> (not a good idea for std::hash_map, I believe)?

That really wouldn't be more of a problem than it is with the vector:

* parallel inserts/removes are unsafe with both
* parallel writes on the same attribute are unsafe with both
* parallel writes on different attributes are safe with both

(In Order to be absolutely sure, I even tried the last one with clangs
thread-sanitizer and it found no problems.)


Then, for another thing: Currently there is a *huge* amount of code
duplication in the switch-statements. I am aware that this is because
you (rightfully) don't want to check things like whether the graph is
directed inside a tight loop. I do however believe, that there is a
better and most likely as fast approach:

template<typename L>
void Graph::forEdges(L handle) const {
	switch (weighted) { // simplified as it is an example
		case true:
			forEdgesImpl<true>(L); break;
		case false:
			forEdgesImpl<false>(L); break;
	}
}

template<bool Weighted, typename L>
void Graph::forEdgesImpl(L handle) const {
	for (node u = 0; u < z; ++u) {
	for (index i = 0; i < outEdges[u].size(); ++i) {
		node v = outEdges[u][i];
		if (u >= v) {
			edgeweight ew = getEdgeWeight<Weighted>(u, i);
			edgeLambda(handle, u, v, ew, 0);
		}
	}
}

template<bool Weighted>
edgeweight Graph::getEdgeWeight(node u, node i) const {
	return outEdgeWeights[u][i];
}

template<>
constexpr edgeweight Graph::getEdgeWeight<false>(node, node) const {
	return defaultEdgeWeight;
}


This may look like overkill for the simplified version, but I am certain
that it will greatly cut down the code for the situation that we have in
the code with three independent variables and four of those switches.
(Note also that the const getEdgeWeight and so on can be used in all
four templates (const-parrallel, mutable-parallel, const-serial,
mutable-serial).

And since I am already at it: I think it might be a great thing to be
able to write something like this:

void some_fun(const Graph& G) {
	for(auto&& edge: G.edges()) {
		use(edge);
	}
}

The basic idea behind that is to provide an iterator-interface over the
edges/notes/whatever and then put those into a very simple class that
only has to provide a begin()- and end()-method. It definitely isn't
something that can be reasonably done until Monday, but all things
considered it is certainly doable. Other advantages would be:

* We could suddenly use all the great things from the algorithm-header
that work on iterators.
* We could implement the current loops in terms of those iterators
making the net-increase in code-size relatively small.

But that's just an idea for your consideration.


-------------- n?chster Teil --------------
Ein Dateianhang mit Bin?rdaten wurde abgetrennt...
Dateiname   : signature.asc
Dateityp    : application/pgp-signature
Dateigr??e  : 884 bytes
Beschreibung: OpenPGP digital signature
URL         : <https://lists.ira.uni-karlsruhe.de/mailman/private/networkit/attachments/20140731/5d85b17c/attachment.sig>


More information about the NetworKit mailing list