[Networkit] Problem computing ApproxBetweeness

Kevin Tierney kevin.tierney at upb.de
Wed May 7 10:46:18 CEST 2014


Hi,

We used the defaults.

Kevin

On 07.05.2014 10:34, Henning Meyerhenke wrote:
> Hi Kevin,
>
> What were the approximation and confidence parameters you used? Or did
> you go along with the default epsilon=0.01 and delta=0.1?
>
> Best,
> Henning
>
>
> Am 07.05.14 09:34, schrieb Kevin Tierney:
>> Hi,
>>
>> We have come across another graph where there seems to be an issue with
>> ApproxBetweenness. This graph is not trivially small, nor is it
>> completely connected, meaning we'd actually be rather interested in
>> finding out the ApproxBetweenness value for it. I let it run overnight
>> (~14 hours with 8 cores) and did not get an answer, so I assume it was
>> never going to finish.
>>
>> The graph is a bit large, so I put it here:
>> https://dl.dropboxusercontent.com/u/16348505/ns894786.mps.gz.variable.graph.bz2
>>
>>
>> Any help you can provide would be greatly appreciated.
>>
>> By the way, we tried to compute the number of samples ourselves and it
>> seemed like it would only be around ~63000, so perhaps there is another
>> issue?
>>
>> Thanks for your help.
>>
>> Kevin
>>
>> On 06.05.2014 16:32, Christian Staudt wrote:
>>> The important thing is that you get sensible results on the real
>>> networks that are of interest to you. If that’s the case, I give a
>>> low priority to this bug. I agree that betweenness is uninteresting
>>> on near-complete graphs.
>>>
>>> Am 06.05.2014 um 16:21 schrieb Kevin Tierney <kevin.tierney at upb.de>:
>>>
>>>> Hi Christian,
>>>>
>>>> Thanks for looking into the issue and for your update.
>>>>
>>>> We've noticed similar behavior on complete or nearly-complete graphs. I
>>>> suppose computing a betweenness measure on such graphs is pointless
>>>> anyway (at least for what we're doing), but at least that seems like a
>>>> possible cause of the weird behavior.
>>>>
>>>> Kevin
>>>>
>>>>
>>>> On 06.05.2014 15:30, Christian Staudt wrote:
>>>>> Hi Kevin,
>>>>> running the program with the log level set to INFO reveals the
>>>>> problem:
>>>>>
>>>>> [INFO ][“src/cpp/centrality/ApproxBetweenness.cpp”, 38: virtual
>>>>> void NetworKit::ApproxBetweenness::run()]: estimating vertex diameter
>>>>> [INFO ][“src/cpp/centrality/ApproxBetweenness.cpp”, 41: virtual
>>>>> void NetworKit::ApproxBetweenness::run()]: estimated diameter: 2
>>>>> [INFO ][“src/cpp/centrality/ApproxBetweenness.cpp”, 47: virtual
>>>>> void NetworKit::ApproxBetweenness::run()]: taking
>>>>> 9223372036854775808 path samples
>>>>>
>>>>> A ridiculously high number of samples is taken. Where does it come
>>>>> from? The number of samples is calculated as:
>>>>>
>>>>>     count r = ceil((c / (epsilon * epsilon)) * (floor(log(vd - 2))
>>>>> + 1 + log(1 / delta)));
>>>>>
>>>>> So if the estimated diameter (vd) is 2, then the formula contains
>>>>> the logarithm of 0, which looks like trouble. I am currently not
>>>>> sure how to fix this, because this is how the formula was
>>>>> published. We will investigate.
>>>>>
>>>>> At the same time I discovered and fixed an error in the formula,
>>>>> but I believe this was unrelated to the problem you describe.
>>>>> Please also update to the latest Dev-Revision for your experiments.
>>>>>
>>>>> Kind regards
>>>>> Christian
>>>>>
>>>>>
>>>>>
>>>>> Am 06.05.2014 um 14:57 schrieb Kevin Tierney
>>>>> <kevin.tierney at uni-paderborn.de>:
>>>>>
>>>>>> Dear NetworKit Gurus,
>>>>>>
>>>>>> We are trying to use the approximate betweenness function and
>>>>>> notice an
>>>>>> odd thing happening on small graphs. I have attached a graph where
>>>>>> calling:
>>>>>>
>>>>>> NetworKit.centrality.ApproxBetweenness(G)
>>>>>>
>>>>>> seems to never finish. At least the CPU time required seems to be way
>>>>>> more than should be for such a small graph.
>>>>>>
>>>>>> You might say "Why would you compute approxbetweenness for such a
>>>>>> small
>>>>>> graph?" -- the reason is this was just mixed in with a bunch of
>>>>>> larger
>>>>>> graphs where betweenness would take too long.
>>>>>>
>>>>>> Do you have any ideas on what is happening?
>>>>>>
>>>>>> By the way, could this perhaps be related to the sample size of 42 in
>>>>>> the .cpp file for approxbetweenness? We aren't getting any issues on
>>>>>> large graphs; in fact they are being solved relatively quickly.
>>>>>>
>>>>>> Thanks for your help!
>>>>>>
>>>>>> Sincerely,
>>>>>>
>>>>>> Kevin
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Jun.-Prof Dr. Kevin Tierney
>>>>>> University of Paderborn
>>>>>> Faculty of Business Administration and Economics
>>>>>> Department 3: Business Information Systems
>>>>>> Room: Q2.472
>>>>>> Tel.: +49 (0)5251 60-5244
>>>>>> <markshare_5_0.mps.gz.constraint.graph><ATT00001.c>
>>>>>
>>>>
>>>> --
>>>> Jun.-Prof Dr. Kevin Tierney
>>>> University of Paderborn
>>>> Faculty of Business Administration and Economics
>>>> Department 3: Business Information Systems
>>>> Room: Q2.472
>>>> Tel.: +49 (0)5251 60-5244
>>>
>>
>

-- 
Jun.-Prof Dr. Kevin Tierney
University of Paderborn
Faculty of Business Administration and Economics
Department 3: Business Information Systems
Room: Q2.472
Tel.: +49 (0)5251 60-5244



More information about the NetworKit mailing list