[Networkit] Problem computing ApproxBetweeness

Kevin Tierney kevin.tierney at upb.de
Wed May 7 09:34:55 CEST 2014


Hi,

We have come across another graph where there seems to be an issue with 
ApproxBetweenness. This graph is not trivially small, nor is it 
completely connected, meaning we'd actually be rather interested in 
finding out the ApproxBetweenness value for it. I let it run overnight 
(~14 hours with 8 cores) and did not get an answer, so I assume it was 
never going to finish.

The graph is a bit large, so I put it here:
https://dl.dropboxusercontent.com/u/16348505/ns894786.mps.gz.variable.graph.bz2

Any help you can provide would be greatly appreciated.

By the way, we tried to compute the number of samples ourselves and it 
seemed like it would only be around ~63000, so perhaps there is another 
issue?

Thanks for your help.

Kevin

On 06.05.2014 16:32, Christian Staudt wrote:
> The important thing is that you get sensible results on the real networks that are of interest to you. If that’s the case, I give a low priority to this bug. I agree that betweenness is uninteresting on near-complete graphs.
>
> Am 06.05.2014 um 16:21 schrieb Kevin Tierney <kevin.tierney at upb.de>:
>
>> Hi Christian,
>>
>> Thanks for looking into the issue and for your update.
>>
>> We've noticed similar behavior on complete or nearly-complete graphs. I
>> suppose computing a betweenness measure on such graphs is pointless
>> anyway (at least for what we're doing), but at least that seems like a
>> possible cause of the weird behavior.
>>
>> Kevin
>>
>>
>> On 06.05.2014 15:30, Christian Staudt wrote:
>>> Hi Kevin,
>>> running the program with the log level set to INFO reveals the problem:
>>>
>>> [INFO ][“src/cpp/centrality/ApproxBetweenness.cpp”, 38: virtual void NetworKit::ApproxBetweenness::run()]: estimating vertex diameter
>>> [INFO ][“src/cpp/centrality/ApproxBetweenness.cpp”, 41: virtual void NetworKit::ApproxBetweenness::run()]: estimated diameter: 2
>>> [INFO ][“src/cpp/centrality/ApproxBetweenness.cpp”, 47: virtual void NetworKit::ApproxBetweenness::run()]: taking 9223372036854775808 path samples
>>>
>>> A ridiculously high number of samples is taken. Where does it come from? The number of samples is calculated as:
>>>
>>> 	count r = ceil((c / (epsilon * epsilon)) * (floor(log(vd - 2)) + 1 + log(1 / delta)));
>>>
>>> So if the estimated diameter (vd) is 2, then the formula contains the logarithm of 0, which looks like trouble. I am currently not sure how to fix this, because this is how the formula was published. We will investigate.
>>>
>>> At the same time I discovered and fixed an error in the formula, but I believe this was unrelated to the problem you describe. Please also update to the latest Dev-Revision for your experiments.
>>>
>>> Kind regards
>>> Christian
>>>
>>>
>>>
>>> Am 06.05.2014 um 14:57 schrieb Kevin Tierney <kevin.tierney at uni-paderborn.de>:
>>>
>>>> Dear NetworKit Gurus,
>>>>
>>>> We are trying to use the approximate betweenness function and notice an
>>>> odd thing happening on small graphs. I have attached a graph where calling:
>>>>
>>>> NetworKit.centrality.ApproxBetweenness(G)
>>>>
>>>> seems to never finish. At least the CPU time required seems to be way
>>>> more than should be for such a small graph.
>>>>
>>>> You might say "Why would you compute approxbetweenness for such a small
>>>> graph?" -- the reason is this was just mixed in with a bunch of larger
>>>> graphs where betweenness would take too long.
>>>>
>>>> Do you have any ideas on what is happening?
>>>>
>>>> By the way, could this perhaps be related to the sample size of 42 in
>>>> the .cpp file for approxbetweenness? We aren't getting any issues on
>>>> large graphs; in fact they are being solved relatively quickly.
>>>>
>>>> Thanks for your help!
>>>>
>>>> Sincerely,
>>>>
>>>> Kevin
>>>>
>>>>
>>>> --
>>>> Jun.-Prof Dr. Kevin Tierney
>>>> University of Paderborn
>>>> Faculty of Business Administration and Economics
>>>> Department 3: Business Information Systems
>>>> Room: Q2.472
>>>> Tel.: +49 (0)5251 60-5244
>>>> <markshare_5_0.mps.gz.constraint.graph><ATT00001.c>
>>>
>>
>> --
>> Jun.-Prof Dr. Kevin Tierney
>> University of Paderborn
>> Faculty of Business Administration and Economics
>> Department 3: Business Information Systems
>> Room: Q2.472
>> Tel.: +49 (0)5251 60-5244
>

-- 
Jun.-Prof Dr. Kevin Tierney
University of Paderborn
Faculty of Business Administration and Economics
Department 3: Business Information Systems
Room: Q2.472
Tel.: +49 (0)5251 60-5244



More information about the NetworKit mailing list