[Networkit] addEdge seg faults when invoked with nodes not present in the graph

Albert Cardona sapristi at gmail.com
Wed Jun 28 22:23:13 CEST 2017


I've moved the ball a bit further: the Graph and DiGraph classes of
the nxnk package can now act as drop-in replacements for the
homonimous classes in NetworkX, but they are built on top of
NetworKit.

It's quite amusing to be able to use NetworkX's library of algorithms
on a NetworKit backend. Of course, the goal was to do in a way the
opposite: have the ease and familiarity of NetworkX while being able
to access directly the algorithms in NetworKit. This is also possible,
using the "to_networkit_nodes" and "to_user_nodes" mapping functions.

https://github.com/acardona/nxnk

Regarding NetworKit: it would be nice if the python layer provided
nodes() and edges() as generators rather than lists, or at least
provide alternative methods to get them as iterables. Otherwise, I am
loading large lists that are consumed right away, incurring in an
unnecessary performance penalty.

Best,

Albert

2017-06-24 17:05 GMT-04:00 Albert Cardona <sapristi at gmail.com>:
> A first attempt:
>
> https://github.com/acardona/nxnk
>
> Clearly alpha stage and rather incomplete.
>
> I've learned a lot about NetworKit by writing it. In particular, the
> key feature that I can't define node IDs, having to instead keep two
> dictionaries relating my node IDs and NetworKit's node IDs. If this is
> not the case, I have misunderstood NetworKit's node management
> altogether.
>
> What's beautiful about it is that:
> 1. I would need a way to manage my graph node IDs vs NetworKit's
> assigned node IDs, so this provides it.
> 2. Performance is about 5 to 10% better than NetworkX for e.g.
> "addPath". Could be better if "add_edge" and "add_node" were inlined.
> Function calls are expensive in Python.
>
> Let me know your thoughts.
>
> Albert
>
>
> 2017-06-24 9:18 GMT-04:00 Christian Staudt <christian.staudt at kit.edu>:
>>
>> On 23. Jun 2017, at 20:00, Albert Cardona <sapristi at gmail.com> wrote:
>>
>> Hi Henning and Matteo,
>>
>> While I am not partial to any naming scheme, I have two comments to make:
>>
>> 1. Invoking a method "incorrectly" should not seg fault a python
>> program. At most, throw an exception. If the latter is not possible
>> for performance reasons, then these methods need to be flagged in some
>> way as "unsafe", as in, the user be warned that strict usage is to be
>> expected. Given that a method like "addEdge" is bound to be used a
>> lot, this issue will bite many beginner users of the library such as
>> myself.
>>
>>
>> I agree with your point, Albert. A segfault is unpythonic. It will kill the
>> interpreter and the entire session, that’s too harsh a punishment for a user
>> mistake. As mentioned, you get bounds checking by C++ asserts when compiled
>> in debug mode, but that may not solve the issue for beginners.
>>
>> The “it’s necessary for performance” argument might not be valid when we
>> talk about calling the addEdge method from Python. Each Python method call
>> has some overhead, right? I would not be surprised if the time for bounds
>> checking didn’t matter much in comparison. Of course measurement is needed
>> to prove that. Safe to say that if you need to add a lot of edges and really
>> can’t afford to lose any time, write that in C++, wrap it with Cython and
>> call it from your Python code.
>>
>>
>>
>>
>> 2. A possible solution would be to build a "networkx" subpackage, so
>> to speak: a namespace implementing basic functions in the same way
>> that networkx does, but with networkit as the backend. Among these
>> there would be the constructors Graph and Digraph; their functions
>> add_node, add_edge; node and edge and weight lookups a-la networkx;
>> and some other basics that could be added incrementally. That there is
>> a performance cost is clear, and that should be stated in the
>> documentation of one such package. This package would both facilitate
>> the adoption of networkit and provide, for many use cases, a drop-in
>> replacement for networkx. Once users are in, and their networkx-styled
>> programs are working, upon learning that performance could increase if
>> they used networkit functions directly, then there's every reason and
>> motivation for them to do so. Note that, given the abysmal performance
>> of networkx, this new drop-in replacement package would outperform
>> networkx in any case by a large factor.
>>
>>
>> I’m not sure that networkit could become a real drop-in replacement for
>> networkx, and maybe it shouldn’t try to. Networkx is slower because it has a
>> different design with much fewer constraints, e.g. any (hashable) object can
>> be a node. And is there really a strong need for a drop-in replacement? Can
>> you point to an example where library B gives you better performance than
>> library A and you can just drop in B instead of A? Numpy arrays aren’t a
>> drop-in replacement for lists, for example, a small amount of rewriting is
>> needed and accepted.
>>
>> Interesting idea though. Have you tried talking to the networkx people and
>> convince them to use networkit as a backend? ;)
>>
>>
>> Best
>> Chris
>>
>>
>>
>> Dr.rer.nat. Christian Staudt
>> http://clstaudt.me/
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> NetworKit mailing list
>> NetworKit at ira.uni-karlsruhe.de
>> https://lists.ira.uni-karlsruhe.de/mailman/listinfo/networkit
>>
>
>
>
> --
> http://albert.rierol.net
> http://www.janelia.org/lab/cardona-lab
> http://www.ini.uzh.ch/~acardona/



-- 
http://albert.rierol.net
http://www.janelia.org/lab/cardona-lab
http://www.ini.uzh.ch/~acardona/



More information about the NetworKit mailing list