[Networkit] Linux resource limits and call-stack-based recursion in NetworKit algorithms

Christian Staudt christian.staudt at kit.edu
Fri Jul 15 15:21:19 CEST 2016


Good work on the problem! Something as basic as strongly connected components should just work out of the box and require no tweaking of arcane things like the call stack limit, so what NetworKit currently has seems like the wrong implementation. A possible 10% performance loss is a small price to pay for having a robust algorithm for a basic analysis task.

Just my two cents, it’s up to the current dev team and maintainer to incorporate your contribution.

Chris


> On 15 Jul 2016, at 13:59, Obada Mahdi <omahdi at gmail.com> wrote:
> 
> Hi everyone,
> 
> following up on my brief comment during the GALA project presentations yesterday, I would like to share my insight here for reference.
> 
> In short, most Linux-based distributions (including the one used on our computeNN.iti.kit.edu <http://computenn.iti.kit.edu/> servers) set per-process resource limits in a way that restricts stack size to a maximum of 8 MB of memory by default [1], which is reached quickly when running recursive algorithms like NetworKit's StronglyConnectedComponents on instances like e.g. DIMACS road graphs [2]. Exceeding this limit causes programs to crash with a "segmentation fault".
> 
> Since NetworKit is targeted at large-scale network analysis, I assume that the default limit of 8 MB stack size on most Linux-based machines is too restrictive in general and may require user interaction to lift it. Given the rather unspecific error message ("Segmentation fault", program crash), I suggest to indicate the use of stack-based recursion in the NetworKit documentation (e.g. for StronglyConnectedComponents::run()) and pointing out possible workarounds for affected platforms.
> 
> For completeness, here are two ways a user can adjust the stack size limit:
> 
> (1) From the command line ("bash" shell), the stack resource limit can be lifted via
> 
>   ulimit -s unlimited
> 
> if no hard limit is set. If a hard limit exists, the maximum possible value can be obtained by
> 
>   ulimit -Hs
> 
> It seems that Mac OS X may not be as generous and enforces a hard limit of 16 MB [3].
> 
> Note that resource limits are set on a per-process basis and are inherited by child processes. Changing it this way will apply, for example, to a Python shell started subsequently within the same shell session, but changes are lost once the session is terminated.
> 
> (2) It is also possible to change resource limits directly from within Python:
> 
>   import resource
>   resource.setrlimit(resource.RLIMIT_STACK, (-1, -1))
> 
> will have the same effect as the first shell command above, i.e. lifting the stack size limit.
> 
> 
> Background: I have experienced program crashes while trying to confirm that some road graphs from the 9th DIMACS challenge [1] are indeed strongly connected by transforming them into an EdgeList-compatible format and using NetworKit's Python interface to compute SCCs.
> 
> MWE using a DIMACS graph (reproducible on e.g. compute3.iti.kit.edu <http://compute3.iti.kit.edu/>):
> 
> - shell command for fetching CAL instance and converting it to an edge list:
> 
> curl http://www.dis.uniroma1.it/challenge9/data/USA-road-t/USA-road-t.CAL.gr.gz <http://www.dis.uniroma1.it/challenge9/data/USA-road-t/USA-road-t.CAL.gr.gz> | \
>   zcat | awk '/^a / { print $2 " " $3 " " $4 }' >USA-road-t.CAL.wel
> 
> - interactive Python3 session:
> 
>   >>> from networkit import *
>   >>> gr = readGraph("USA-road-t.CAL.wel", fileformat=Format.EdgeList, separator=" ", \
>   ... firstNode=1, commentPrefix="c", continuous=True, directed=True)
>   >>> s = components.StronglyConnectedComponents(gr)
>   >>> s.run()
>   Segmentation fault (core dumped)
> 
> With a web graph from the SNAP dataset repository:
> 
> - fetching dataset:
>   curl http://snap.stanford.edu/data/web-Google.txt.gz <http://snap.stanford.edu/data/web-Google.txt.gz> | zcat >web-Google.txt
> 
> - interactive Python3 session:
> 
>   >>> from networkit import *
>   >>> gr = readGraph("web-Google.txt", fileformat=Format.EdgeListTabZero, \
>   ... commentPrefix="#", continuous=True, directed=True)
>   >>> s = components.StronglyConnectedComponents(gr)
>   >>> s.run()
>   Segmentation fault (core dumped)
> 
> Both examples run fine with the stack limit removed as shown above.
> 
> 
> The attached patch (scc-iter.patch, against current Dev [5750:e0903081deb1]) is an experimental attempt to work around limited stack size by employing an iterative adaption of the existing implementation in StronglyConnectedComponents::run(). It provides the alternative method runIteratively(), which uses an additional std::vector-based stack instead of the program call stack. I am sharing it here for the curious and interested -- it has not been tested thoroughly, and it currently relies on an additional interface function hacked into the Graph class (namely Graph::getIthNeighbor) for obtaining the i-th outgoing neighbor of a node in constant time.
> 
> It seems to work fine on the two instances mentioned above and is slightly faster (about 10%) for the strongly connected road graph (one component) than the recursive baseline and slightly slower (again about 10%) for the web graph, for which roughly half of its nodes are in one SCC, while most others build components of size 1.  Again, no extensive evaluation has been performed, but these preliminary findings line up with my intuition that call-stack recursion performs very well because it is based on a dedicated hardware register/memory layout and does not require explicit boundary checks or initialization, not to mention compiler optimizations that may take advantage of (partial) function inlining. The overhead of storing additional return addresses and padding stack frames seems to show only for a larger recursion depth.
> 
> 
> Regards
> 
> Obada
> 
> [1] http://stackoverflow.com/questions/2656722/ <http://stackoverflow.com/questions/2656722/>
> [2] http://www.dis.uniroma1.it/challenge9/download.shtml <http://www.dis.uniroma1.it/challenge9/download.shtml>
> [3] http://stackoverflow.com/questions/13245019/ <http://stackoverflow.com/questions/13245019/>
> [4] https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm <https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm>
> <scc-iter.patch>_______________________________________________
> NetworKit mailing list
> NetworKit at ira.uni-karlsruhe.de
> https://lists.ira.uni-karlsruhe.de/mailman/listinfo/networkit

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ira.uni-karlsruhe.de/pipermail/networkit/attachments/20160715/620e5ee1/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 495 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: <http://lists.ira.uni-karlsruhe.de/pipermail/networkit/attachments/20160715/620e5ee1/attachment-0001.sig>


More information about the NetworKit mailing list