[Networkit] Request for comments: How to read attributes in graph files?

Michael Hamann michael.hamann at kit.edu
Mon Sep 21 18:48:32 CEST 2015


Hi,

I've just pushed a commit [0] that adds experimental support for writing (not 
reading) attributes in GraphML files. You can simply pass attributes to the 
write method. I chose this approach because, in my opinion, it best fits how 
graph readers and writers are currently implemented. However, I do not think 
that this is the structure we should keep for graph readers and writers.

On Wednesday 02 September 2015 13:13:50, Christian Staudt wrote:
> On 02 Sep 2015, at 12:12, Maximilian Vogel <maximilian.vogel at student.kit.edu> wrote:
[...]
> > Regarding the GraphReader class, various approaches should be considered:
> > - One could argue, that you always want to have a graph, but not
> > necessarily its attributes, so the read()-function reads the file and
> > returns the graph object. For attributes, the getAttribute()-function can
> > be used. [corresponds to your suggestion and also the current state]
> That’s a viable approach.

I'm against this approach, as it introduces yet another way of returning 
results in NetworKit in addition to the two options we currently have: 
returning all results directly, or having a void run method and getter methods 
for returning results.

> > - The GraphReader class gets a method process() which reads the file and
> > creates all the objects like the Graph, attributes and mappings. Then,
> > various getter-functions can be used to retrieve the desired objects.
> That would be even cleaner and more consistent with our current design
> pattern for classes, but leads to much refactoring.

I think this is the way to go. Graph readers and writers should be changed to 
follow the structure that is used in our algorithm classes. This means: the 
constructor should take all arguments, and the "run" method (I'm not sure if 
this is the right term for graph readers and writers) should just read or 
write the data. The result should then be obtained using methods like 
"getGraph()", "getAttribute(key)" or "getAttributeKeys()". The latter two 
methods should probably also take a boolean that determines whether we are 
interested in node or edge attributes. In order not to duplicate the graph 
(and attribute data) that was read, we should consider moving the Graph 
object out of the reader, as is already done in the MST class [1].
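
To make the proposed structure concrete, here is a minimal sketch of a reader 
with a constructor that takes all arguments, a run() method and getters. 
Everything in it is an assumption for illustration: SimpleGraph stands in for 
the real Graph class, SketchReader and its toy line format ("n <count>", 
"e <u> <v>", "a <key> <node> <value>") are made up, and only double-valued 
node attributes are handled, so the node/edge boolean is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real Graph class; only what the sketch needs.
struct SimpleGraph {
    std::size_t nodeCount = 0;
    std::vector<std::pair<std::size_t, std::size_t>> edges;
};

// Hypothetical reader that follows the algorithm-class structure:
// constructor takes all arguments, run() does the work, getters return results.
class SketchReader {
public:
    explicit SketchReader(std::string input) : input_(std::move(input)) {}

    // Parses the toy format; the "n" line is expected before any "a" line.
    void run() {
        std::istringstream in(input_);
        std::string tag;
        while (in >> tag) {
            if (tag == "n") {
                in >> graph_.nodeCount;
            } else if (tag == "e") {
                std::size_t u = 0, v = 0;
                in >> u >> v;
                graph_.edges.emplace_back(u, v);
            } else if (tag == "a") {
                std::string key;
                std::size_t node = 0;
                double value = 0.0;
                in >> key >> node >> value;
                auto& column = nodeAttributes_[key];
                if (column.size() < graph_.nodeCount) {
                    column.resize(graph_.nodeCount, 0.0);  // default for unset nodes
                }
                column.at(node) = value;  // throws if node is out of range
            }
        }
        hasRun_ = true;
    }

    // Moves the graph out instead of copying it. After this call the reader
    // no longer holds the edge data, so it should only be called once.
    SimpleGraph getGraph() {
        assert(hasRun_);
        return std::move(graph_);
    }

    std::vector<std::string> getAttributeKeys() const {
        std::vector<std::string> keys;
        for (const auto& entry : nodeAttributes_) keys.push_back(entry.first);
        return keys;
    }

    // Throws std::out_of_range if the key is unknown.
    const std::vector<double>& getAttribute(const std::string& key) const {
        return nodeAttributes_.at(key);
    }

private:
    std::string input_;
    SimpleGraph graph_;
    std::map<std::string, std::vector<double>> nodeAttributes_;
    bool hasRun_ = false;
};
```

A caller would construct the reader, call run() once, and then take the graph 
with getGraph() and the attribute columns with getAttribute(key). Moving the 
graph out avoids the copy, at the price that a second getGraph() call returns 
a graph without edges.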

A problem for the implementation could be that graph formats support different 
types of attributes (like boolean, integer, double or string) that should be 
mapped to appropriate C++ data types. This will probably lead to a fair amount 
of code duplication. It also means that, at least in the C++ layer, we cannot 
easily pass all attributes to the constructor of the writer classes unless we 
use variadic templates, which cannot be mapped to Cython easily. I therefore 
suggest that attributes should be set using a setAttribute(key, data, 
edge/node-flag)-method which is overloaded for all supported attribute types.
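
As a rough sketch of what such an overloaded setter could look like: the 
class name SketchWriter, the Column struct and getColumn() are hypothetical 
and not part of the pushed commit. Values are stored as strings together with 
a GraphML-style type name ("boolean", "long", "double", "string"), which is 
just one possible way to keep the different types in a single container.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical storage for one attribute: a type name plus serialized values.
struct Column {
    std::string type;
    std::vector<std::string> values;
};

// Hypothetical writer sketch with one setAttribute overload per supported type.
// The last parameter selects edge attributes (true) or node attributes (false).
class SketchWriter {
public:
    void setAttribute(const std::string& key, const std::vector<bool>& data, bool isEdge) {
        Column c{"boolean", {}};
        for (bool b : data) c.values.push_back(b ? "true" : "false");
        columns(isEdge)[key] = std::move(c);
    }

    void setAttribute(const std::string& key, const std::vector<long>& data, bool isEdge) {
        Column c{"long", {}};
        for (long x : data) c.values.push_back(std::to_string(x));
        columns(isEdge)[key] = std::move(c);
    }

    void setAttribute(const std::string& key, const std::vector<double>& data, bool isEdge) {
        Column c{"double", {}};
        for (double x : data) c.values.push_back(std::to_string(x));
        columns(isEdge)[key] = std::move(c);
    }

    void setAttribute(const std::string& key, const std::vector<std::string>& data, bool isEdge) {
        columns(isEdge)[key] = Column{"string", data};
    }

    // Throws std::out_of_range if the key was never set.
    const Column& getColumn(const std::string& key, bool isEdge) const {
        return (isEdge ? edgeColumns_ : nodeColumns_).at(key);
    }

private:
    std::map<std::string, Column>& columns(bool isEdge) {
        return isEdge ? edgeColumns_ : nodeColumns_;
    }

    std::map<std::string, Column> nodeColumns_;
    std::map<std::string, Column> edgeColumns_;
};
```

The four nearly identical bodies illustrate the duplication mentioned above. 
Plain overloads like these have fixed signatures, so each one can be declared 
separately for the Python layer, unlike a variadic constructor.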

I wonder if it would make sense to support Partition objects as natural 
containers of categorical attributes. My GraphMLWriter implementation actually 
supports this already, as Partition objects support all needed operations.

Best regards,
Michael

[0]: https://algohub.iti.kit.edu/parco/NetworKit/NetworKit/changeset/a8999f083d6b87f09a291c3031bfae579fc1ad5c
[1]: https://algohub.iti.kit.edu/parco/NetworKit/NetworKit/files/a8999f083d6b87f09a291c3031bfae579fc1ad5c/networkit/cpp/graph/MST.cpp#L102




