[Networkit] Problem computing ApproxBetweeness

Lars Beckmann lars.beckmann at gmail.com
Wed May 21 11:21:41 CEST 2014


Hi Christian,

I don't have access to that machine now, but I am pretty sure this issue
was related to the optimization flags. I will check with Kevin.

Thanks for getting back to us proactively!

Lars


On Fri, May 16, 2014 at 1:47 PM, Christian Staudt
<christian.staudt at kit.edu>wrote:

> Hi,
> I believe the problem is still unsolved, and we cannot reproduce it on our
> machines. I have two more suggestions:
>
> - Could it be that it is a subtle parallel bug? What happens if you reduce
> the number of threads, possibly running it in single-threaded mode? This
> can be controlled with the NetworKit.setNumberOfThreads(1) function in
> Python or with the —threads=1 flag for the unit tests.
> - Would it be acceptable to reduce the accuracy of the approximation so
> that you get a solution in time?
>
> Christian
>
> Am 07.05.2014 um 16:07 schrieb Lars Beckmann <lars.beckmann at gmail.com>:
>
> Hi Henning,
>
> I did what you suggested (see output attached). I let it run for a couple
> of hours on my admittedly relatively weak machine. I have a VM running on
> my laptop (which has a 3 years old i7-620M).
>
> To me the output looks ok and if I understand that correctly it chewed
> through about 20k out of the 60k samples when I aborted the process.
>
> Does that give you any info that I don't see? Is there anything else we
> can do?
>
> Lars
>
>
>
> On Wed, May 7, 2014 at 11:03 AM, Henning Meyerhenke <meyerhenke at kit.edu>wrote:
>
>> Hi Kevin,
>>
>> I have created a unit test tryApproxBetweennessOnRealGraph which
>> operates on your graph.
>>
>> On my MacBook with 4 cores the operation took about 800 seconds, which is
>> in the expected range and far from the behavior you describe. This timing
>> is even with --loglevel=DEBUG, fewer output would result in a somewhat
>> faster run.
>>
>> Could you please update to the latest Dev branch revision and run the
>> test yourself? You need to put your graph into the input/ directory for
>> that. Please produce DEBUG output as well and have it written to a file if
>> possible.
>>
>> Thanks,
>> Henning
>>
>>
>>
>>
>> Am 07.05.14 10:46, schrieb Kevin Tierney:
>>
>>  Hi,
>>>
>>> We used the defaults.
>>>
>>> Kevin
>>>
>>> On 07.05.2014 10:34, Henning Meyerhenke wrote:
>>>
>>>> Hi Kevin,
>>>>
>>>> What were the approximation and confidence parameters you used? Or did
>>>> you go along with the default epsilon=0.01 and delta=0.1?
>>>>
>>>> Best,
>>>> Henning
>>>>
>>>>
>>>> Am 07.05.14 09:34, schrieb Kevin Tierney:
>>>>
>>>>> Hi,
>>>>>
>>>>> We have come across another graph where there seems to be an issue with
>>>>> ApproxBetweenness. This graph is not trivially small, nor is it
>>>>> completely connected, meaning we'd actually be rather interested in
>>>>> finding out the ApproxBetweenness value for it. I let it run overnight
>>>>> (~14 hours with 8 cores) and did not get an answer, so I assume it was
>>>>> never going to finish.
>>>>>
>>>>> The graph is a bit large, so I put it here:
>>>>> https://dl.dropboxusercontent.com/u/16348505/ns894786.mps.
>>>>> gz.variable.graph.bz2
>>>>>
>>>>>
>>>>> Any help you can provide would be greatly appreciated.
>>>>>
>>>>> By the way, we tried to compute the number of samples ourselves and it
>>>>> seemed like it would only be around ~63000, so perhaps there is another
>>>>> issue?
>>>>>
>>>>> Thanks for your help.
>>>>>
>>>>> Kevin
>>>>>
>>>>> On 06.05.2014 16:32, Christian Staudt wrote:
>>>>>
>>>>>> The important thing is that you get sensible results on the real
>>>>>> networks that are of interest to you. If that’s the case, I give a
>>>>>> low priority to this bug. I agree that betweenness is uninteresting
>>>>>> on near-complete graphs.
>>>>>>
>>>>>> Am 06.05.2014 um 16:21 schrieb Kevin Tierney <kevin.tierney at upb.de>:
>>>>>>
>>>>>>  Hi Christian,
>>>>>>>
>>>>>>> Thanks for looking into the issue and for your update.
>>>>>>>
>>>>>>> We've noticed similar behavior on complete or nearly-complete
>>>>>>> graphs. I
>>>>>>> suppose computing a betweenness measure on such graphs is pointless
>>>>>>> anyway (at least for what we're doing), but at least that seems like
>>>>>>> a
>>>>>>> possible cause of the weird behavior.
>>>>>>>
>>>>>>> Kevin
>>>>>>>
>>>>>>>
>>>>>>> On 06.05.2014 15:30, Christian Staudt wrote:
>>>>>>>
>>>>>>>> Hi Kevin,
>>>>>>>> running the program with the log level set to INFO reveals the
>>>>>>>> problem:
>>>>>>>>
>>>>>>>> [INFO ][“src/cpp/centrality/ApproxBetweenness.cpp”, 38: virtual
>>>>>>>> void NetworKit::ApproxBetweenness::run()]: estimating vertex
>>>>>>>> diameter
>>>>>>>> [INFO ][“src/cpp/centrality/ApproxBetweenness.cpp”, 41: virtual
>>>>>>>> void NetworKit::ApproxBetweenness::run()]: estimated diameter: 2
>>>>>>>> [INFO ][“src/cpp/centrality/ApproxBetweenness.cpp”, 47: virtual
>>>>>>>> void NetworKit::ApproxBetweenness::run()]: taking
>>>>>>>> 9223372036854775808 path samples
>>>>>>>>
>>>>>>>> A ridiculously high number of samples is taken. Where does it come
>>>>>>>> from? The number of samples is calculated as:
>>>>>>>>
>>>>>>>>      count r = ceil((c / (epsilon * epsilon)) * (floor(log(vd - 2))
>>>>>>>> + 1 + log(1 / delta)));
>>>>>>>>
>>>>>>>> So if the estimated diameter (vd) is 2, then the formula contains
>>>>>>>> the logarithm of 0, which looks like trouble. I am currently not
>>>>>>>> sure how to fix this, because this is how the formula was
>>>>>>>> published. We will investigate.
>>>>>>>>
>>>>>>>> At the same time I discovered and fixed an error in the formula,
>>>>>>>> but I believe this was unrelated to the problem you describe.
>>>>>>>> Please also update to the latest Dev-Revision for your experiments.
>>>>>>>>
>>>>>>>> Kind regards
>>>>>>>> Christian
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Am 06.05.2014 um 14:57 schrieb Kevin Tierney
>>>>>>>> <kevin.tierney at uni-paderborn.de>:
>>>>>>>>
>>>>>>>>  Dear NetworKit Gurus,
>>>>>>>>>
>>>>>>>>> We are trying to use the approximate betweenness function and
>>>>>>>>> notice an
>>>>>>>>> odd thing happening on small graphs. I have attached a graph where
>>>>>>>>> calling:
>>>>>>>>>
>>>>>>>>> NetworKit.centrality.ApproxBetweenness(G)
>>>>>>>>>
>>>>>>>>> seems to never finish. At least the CPU time required seems to be
>>>>>>>>> way
>>>>>>>>> more than should be for such a small graph.
>>>>>>>>>
>>>>>>>>> You might say "Why would you compute approxbetweenness for such a
>>>>>>>>> small
>>>>>>>>> graph?" -- the reason is this was just mixed in with a bunch of
>>>>>>>>> larger
>>>>>>>>> graphs where betweenness would take too long.
>>>>>>>>>
>>>>>>>>> Do you have any ideas on what is happening?
>>>>>>>>>
>>>>>>>>> By the way, could this perhaps be related to the sample size of 42
>>>>>>>>> in
>>>>>>>>> the .cpp file for approxbetweenness? We aren't getting any issues
>>>>>>>>> on
>>>>>>>>> large graphs; in fact they are being solved relatively quickly.
>>>>>>>>>
>>>>>>>>> Thanks for your help!
>>>>>>>>>
>>>>>>>>> Sincerely,
>>>>>>>>>
>>>>>>>>> Kevin
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Jun.-Prof Dr. Kevin Tierney
>>>>>>>>> University of Paderborn
>>>>>>>>> Faculty of Business Administration and Economics
>>>>>>>>> Department 3: Business Information Systems
>>>>>>>>> Room: Q2.472
>>>>>>>>> Tel.: +49 (0)5251 60-5244
>>>>>>>>> <markshare_5_0.mps.gz.constraint.graph><ATT00001.c>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> --
>>>>>>> Jun.-Prof Dr. Kevin Tierney
>>>>>>> University of Paderborn
>>>>>>> Faculty of Business Administration and Economics
>>>>>>> Department 3: Business Information Systems
>>>>>>> Room: Q2.472
>>>>>>> Tel.: +49 (0)5251 60-5244
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>> --
>>
>> =======================================================
>> Karlsruhe Institute of Technology (KIT)
>> Institute of Theoretical Informatics (ITI)
>>
>> Juniorprof. Dr. Henning Meyerhenke
>> Theoret. Informatics / Parallel Computing
>>
>> Phone: +49-721-608-41876
>> Web: http://parco.iti.kit.edu/henningm/
>>
>> KIT - University of the State of Baden-Wuerttemberg and
>> National Research Center of the Helmholtz Association
>> =======================================================
>>
>>
>> _______________________________________________
>> NetworKit mailing list
>> NetworKit at ira.uni-karlsruhe.de
>> https://lists.ira.uni-karlsruhe.de/mailman/listinfo/networkit
>>
>>
>
>
> --
> Lars Beckmann
>
> Uhlenstraße 10
> 33098 Paderborn
> Germany
>
> Cell (D2): +49 (0) 178 59 69 150
> Phone: +49 (0) 89 380 123 90
> Fax: +49 (0) 180 102 113 4628
>  <outputRealGraphTest.zip>
>
>
>


-- 
Lars Beckmann

Uhlenstraße 10
33098 Paderborn
Germany

Cell (D2): +49 (0) 178 59 69 150
Phone: +49 (0) 89 380 123 90
Fax: +49 (0) 180 102 113 4628
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.ira.uni-karlsruhe.de/mailman/private/networkit/attachments/20140521/2d5c83e6/attachment.html>


More information about the NetworKit mailing list