[Networkit] Request for comments: How to read attributes in graph files?
michael.hamann at kit.edu
Mon Sep 21 18:48:32 CEST 2015
I've just pushed a commit that adds experimental support for writing (not
reading) attributes in GraphML files. You can simply pass attributes to the
write method. I chose this approach because, in my opinion, it best fits how
graph readers and writers are currently implemented. However, I do not think
that this is the structure that we should keep for graph readers and writers.
On Wednesday 02 September 2015 13:13:50, Christian Staudt wrote:
> On 02 Sep 2015, at 12:12, Maximilian Vogel
> <maximilian.vogel at student.kit.edu> wrote:
> > Regarding the GraphReader class, various approaches should be considered:
> > - One could argue, that you always want to have a graph, but not
> > necessarily its attributes, so the read()-function reads the file and
> > returns the graph object. For attributes, the getAttribute()-function can
> > be used. [corresponds to your suggestion and also the current state]
> That’s a viable approach.
I'm against this approach as it introduces yet another way of returning
results in NetworKit in addition to the two options we currently have:
returning all results directly, or having a void run method and getter methods
for returning results.
> > - The GraphReader class gets a method process() which reads the file and
> > creates all the objects like the Graph, attributes and mappings. Then,
> > various getter-functions can be used to retrieve the desired objects.
> That would be even cleaner and more consistent with our current design
> pattern for classes, but leads to much refactoring.
I think this is the way to go. Graph readers and writers should be changed to
follow the structure that is used in our algorithm classes. This means: The
constructor should take all arguments and the "run"-method (I'm not sure if
this is the right term for graph readers and writers) should just read or
write the data. The result should then be obtained using methods like
"getGraph()" or "getAttribute(key)" or "getAttributeKeys()". The latter two
methods should probably also take a boolean that determines whether we are
interested in node or edge attributes. In order not to duplicate the graph
(and attribute data) that was read, we should consider moving the Graph object
out of the reader, as is already implemented in the MST class.
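
A minimal sketch of this structure, under stated assumptions: Graph here is a
small stand-in struct and not NetworKit's Graph class, SketchGraphReader is a
hypothetical name, attributes are restricted to doubles, and run() fills in
fixed data instead of parsing a file. Only the constructor/run/getter shape
and the move in getGraph() are taken from the proposal above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for NetworKit's Graph; only what the sketch needs.
struct Graph {
    std::size_t n = 0;
    std::vector<std::pair<std::size_t, std::size_t>> edges;
};

// Hypothetical reader that follows the algorithm-class pattern:
// all arguments go to the constructor, run() does the work,
// getters return the results.
class SketchGraphReader {
public:
    explicit SketchGraphReader(std::string filePath) : path(std::move(filePath)) {}

    // A real reader would parse the file at `path` here.
    // The sketch stores fixed data so that it stays self-contained.
    void run() {
        G.n = 3;
        G.edges = {{0, 1}, {1, 2}};
        nodeAttributes["weight"] = {1.0, 2.0, 3.0};
        hasRun = true;
    }

    // Moves the graph out of the reader instead of copying it.
    // Afterwards the reader no longer holds a usable graph.
    Graph getGraph() {
        assert(hasRun);
        return std::move(G);
    }

    // The boolean selects edge attributes (true) or node attributes (false).
    std::vector<std::string> getAttributeKeys(bool edgeAttributes) const {
        std::vector<std::string> keys;
        for (const auto& entry : select(edgeAttributes)) {
            keys.push_back(entry.first);
        }
        return keys;
    }

    // Throws std::out_of_range if the key is unknown.
    const std::vector<double>& getAttribute(const std::string& key,
                                            bool edgeAttributes) const {
        return select(edgeAttributes).at(key);
    }

private:
    using AttributeMap = std::map<std::string, std::vector<double>>;

    const AttributeMap& select(bool edgeAttributes) const {
        return edgeAttributes ? edgeAttributes_ : nodeAttributes;
    }

    std::string path;
    Graph G;
    AttributeMap nodeAttributes;
    AttributeMap edgeAttributes_;
    bool hasRun = false;
};
```

With this shape, a caller would construct the reader, call run() once, and
then take the graph with getGraph(). Because the graph is moved out, calling
getGraph() a second time would not return the graph again, which is the price
for avoiding the copy.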
A problem for the implementation could be that graph formats support different
types of attributes (like boolean, integer, double or string) that should be
mapped to appropriate C++ data types. This will probably lead to quite some
code duplication and means that, at least in the C++ layer, we cannot easily
pass all attributes to the constructor for the writer classes unless we use
variadic templates, which cannot be mapped to Cython easily. I therefore
suggest that attributes should be set using a setAttribute(key, data,
edge/node-flag)-method which is overloaded for all supported attribute types.
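
The overloaded setter could look roughly like the following sketch. The class
name SketchGraphWriter, the attributeType() helper and the choice of
std::vector as the data container are assumptions made for illustration; the
type names are the GraphML attr.type values boolean, long, double and string.
The public interface consists of plain overloads only, so that nothing
templated would have to be exposed to Cython, while a private template helper
keeps the duplication small.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical writer; only the attribute bookkeeping is sketched.
class SketchGraphWriter {
public:
    // One non-template overload per supported attribute type.
    // edgeAttribute is true for edge attributes, false for node attributes.
    void setAttribute(const std::string& key, std::vector<bool> data,
                      bool edgeAttribute) {
        store(boolAttributes, "boolean", key, std::move(data), edgeAttribute);
    }
    void setAttribute(const std::string& key, std::vector<std::int64_t> data,
                      bool edgeAttribute) {
        store(intAttributes, "long", key, std::move(data), edgeAttribute);
    }
    void setAttribute(const std::string& key, std::vector<double> data,
                      bool edgeAttribute) {
        store(doubleAttributes, "double", key, std::move(data), edgeAttribute);
    }
    void setAttribute(const std::string& key, std::vector<std::string> data,
                      bool edgeAttribute) {
        store(stringAttributes, "string", key, std::move(data), edgeAttribute);
    }

    // Returns the GraphML type name recorded for an attribute.
    // Throws std::out_of_range if the attribute was never set.
    const std::string& attributeType(const std::string& key,
                                     bool edgeAttribute) const {
        return types.at({key, edgeAttribute});
    }

private:
    // Node and edge attributes live in separate namespaces.
    using Id = std::pair<std::string, bool>;

    template <typename T>
    void store(std::map<Id, std::vector<T>>& target, const char* typeName,
               const std::string& key, std::vector<T> data, bool edgeAttribute) {
        Id id{key, edgeAttribute};
        assert(types.count(id) == 0); // the sketch allows each key only once
        target[id] = std::move(data);
        types[id] = typeName;
    }

    std::map<Id, std::vector<bool>> boolAttributes;
    std::map<Id, std::vector<std::int64_t>> intAttributes;
    std::map<Id, std::vector<double>> doubleAttributes;
    std::map<Id, std::vector<std::string>> stringAttributes;
    std::map<Id, std::string> types;
};
```

Callers have to pass an explicitly typed container, for example
std::vector<double>{0.5, 1.5}, because a bare braced list would not select a
unique overload.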
I wonder if it would make sense to support Partition objects as natural
containers of categorical attributes. My GraphMLWriter-implementation actually
supports this already as Partition objects support all needed operations.