[Networkit] addEdge seg faults when invoked with nodes not present in the graph
christian.staudt at kit.edu
Sat Jun 24 15:18:40 CEST 2017
> On 23. Jun 2017, at 20:00, Albert Cardona <sapristi at gmail.com> wrote:
> Hi Henning and Matteo,
> While I am not partial to any naming scheme, I have two comments to make:
> 1. Invoking a method "incorrectly" should not seg fault a python
> program. At most, throw an exception. If the latter is not possible
> for performance reasons, then these methods need to be flagged in some
> way as "unsafe", as in, the user be warned that strict usage is to be
> expected. Given that a method like "addEdge" is bound to be used a
> lot, this issue will bite many beginner users of the library such as
I agree with your point, Albert. A segfault is unpythonic. It will kill the interpreter and the entire session, that’s too harsh a punishment for a user mistake. As mentioned, you get bounds checking by C++ asserts when compiled in debug mode, but that may not solve the issue for beginners.
The “it’s necessary for performance” argument might not be valid when we talk about calling the addEdge method from Python. Each Python method call has some overhead, right? I would not be surprised if the time for bounds checking didn’t matter much in comparison. Of course measurement is needed to prove that. Safe to say that if you need to add a lot of edges and really can’t afford to lose any time, write that in C++, wrap it with Cython and call it from your Python code.
> 2. A possible solution would be to build a "networkx" subpackage, so
> to speak: a namespace implementing basic functions in the same way
> that networkx does, but with networkit as the backend. Among these
> there would be the constructors Graph and Digraph; their functions
> add_node, add_edge; node and edge and weight lookups a-la networkx;
> and some other basics that could be added incrementally. That there is
> a performance cost is clear, and that should be stated in the
> documentation of one such package. This package would both facilitate
> the adoption of networkit and provide, for many use cases, a drop-in
> replacement for networkx. Once users are in, and their networkx-styled
> programs are working, upon learning that performance could increase if
> they used networkit functions directly, then there's every reason and
> motivation for them to do so. Note that, given the abysmal performance
> of networkx, this new drop-in replacement package would outperform
> networkx in any case by a large factor.
I’m not sure that networkit could become a real drop-in replacement for networkx, and maybe it shouldn’t try to. Networkx is slower because it has a different design with much fewer constraints, e.g. any (hashable) object can be a node. And is there really a strong need for a drop-in replacement? Can you point to an example where library B gives you better performance than library A and you can just drop in B instead of A? Numpy arrays aren’t a drop-in replacement for lists, for example, a small amount of rewriting is needed and accepted.
Interesting idea though. Have you tried talking to the networkx people and convince them to use networkit as a backend? ;)
Dr.rer.nat. Christian Staudt
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the NetworKit