[Networkit] Problem computing ApproxBetweeness

Henning Meyerhenke meyerhenke at kit.edu
Wed May 7 11:03:06 CEST 2014


Hi Kevin,

I have created a unit test tryApproxBetweennessOnRealGraph which 
operates on your graph.

On my MacBook with 4 cores the operation took about 800 seconds, which 
is in the expected range and far from the behavior you describe. This 
timing is even with --loglevel=DEBUG, fewer output would result in a 
somewhat faster run.

Could you please update to the latest Dev branch revision and run the 
test yourself? You need to put your graph into the input/ directory for 
that. Please produce DEBUG output as well and have it written to a file 
if possible.

Thanks,
Henning




Am 07.05.14 10:46, schrieb Kevin Tierney:
> Hi,
>
> We used the defaults.
>
> Kevin
>
> On 07.05.2014 10:34, Henning Meyerhenke wrote:
>> Hi Kevin,
>>
>> What were the approximation and confidence parameters you used? Or did
>> you go along with the default epsilon=0.01 and delta=0.1?
>>
>> Best,
>> Henning
>>
>>
>> Am 07.05.14 09:34, schrieb Kevin Tierney:
>>> Hi,
>>>
>>> We have come across another graph where there seems to be an issue with
>>> ApproxBetweenness. This graph is not trivially small, nor is it
>>> completely connected, meaning we'd actually be rather interested in
>>> finding out the ApproxBetweenness value for it. I let it run overnight
>>> (~14 hours with 8 cores) and did not get an answer, so I assume it was
>>> never going to finish.
>>>
>>> The graph is a bit large, so I put it here:
>>> https://dl.dropboxusercontent.com/u/16348505/ns894786.mps.gz.variable.graph.bz2
>>>
>>>
>>> Any help you can provide would be greatly appreciated.
>>>
>>> By the way, we tried to compute the number of samples ourselves and it
>>> seemed like it would only be around ~63000, so perhaps there is another
>>> issue?
>>>
>>> Thanks for your help.
>>>
>>> Kevin
>>>
>>> On 06.05.2014 16:32, Christian Staudt wrote:
>>>> The important thing is that you get sensible results on the real
>>>> networks that are of interest to you. If that’s the case, I give a
>>>> low priority to this bug. I agree that betweenness is uninteresting
>>>> on near-complete graphs.
>>>>
>>>> Am 06.05.2014 um 16:21 schrieb Kevin Tierney <kevin.tierney at upb.de>:
>>>>
>>>>> Hi Christian,
>>>>>
>>>>> Thanks for looking into the issue and for your update.
>>>>>
>>>>> We've noticed similar behavior on complete or nearly-complete graphs. I
>>>>> suppose computing a betweenness measure on such graphs is pointless
>>>>> anyway (at least for what we're doing), but at least that seems like a
>>>>> possible cause of the weird behavior.
>>>>>
>>>>> Kevin
>>>>>
>>>>>
>>>>> On 06.05.2014 15:30, Christian Staudt wrote:
>>>>>> Hi Kevin,
>>>>>> running the program with the log level set to INFO reveals the
>>>>>> problem:
>>>>>>
>>>>>> [INFO ][“src/cpp/centrality/ApproxBetweenness.cpp”, 38: virtual
>>>>>> void NetworKit::ApproxBetweenness::run()]: estimating vertex diameter
>>>>>> [INFO ][“src/cpp/centrality/ApproxBetweenness.cpp”, 41: virtual
>>>>>> void NetworKit::ApproxBetweenness::run()]: estimated diameter: 2
>>>>>> [INFO ][“src/cpp/centrality/ApproxBetweenness.cpp”, 47: virtual
>>>>>> void NetworKit::ApproxBetweenness::run()]: taking
>>>>>> 9223372036854775808 path samples
>>>>>>
>>>>>> A ridiculously high number of samples is taken. Where does it come
>>>>>> from? The number of samples is calculated as:
>>>>>>
>>>>>>      count r = ceil((c / (epsilon * epsilon)) * (floor(log(vd - 2))
>>>>>> + 1 + log(1 / delta)));
>>>>>>
>>>>>> So if the estimated diameter (vd) is 2, then the formula contains
>>>>>> the logarithm of 0, which looks like trouble. I am currently not
>>>>>> sure how to fix this, because this is how the formula was
>>>>>> published. We will investigate.
>>>>>>
>>>>>> At the same time I discovered and fixed an error in the formula,
>>>>>> but I believe this was unrelated to the problem you describe.
>>>>>> Please also update to the latest Dev-Revision for your experiments.
>>>>>>
>>>>>> Kind regards
>>>>>> Christian
>>>>>>
>>>>>>
>>>>>>
>>>>>> Am 06.05.2014 um 14:57 schrieb Kevin Tierney
>>>>>> <kevin.tierney at uni-paderborn.de>:
>>>>>>
>>>>>>> Dear NetworKit Gurus,
>>>>>>>
>>>>>>> We are trying to use the approximate betweenness function and
>>>>>>> notice an
>>>>>>> odd thing happening on small graphs. I have attached a graph where
>>>>>>> calling:
>>>>>>>
>>>>>>> NetworKit.centrality.ApproxBetweenness(G)
>>>>>>>
>>>>>>> seems to never finish. At least the CPU time required seems to be way
>>>>>>> more than should be for such a small graph.
>>>>>>>
>>>>>>> You might say "Why would you compute approxbetweenness for such a
>>>>>>> small
>>>>>>> graph?" -- the reason is this was just mixed in with a bunch of
>>>>>>> larger
>>>>>>> graphs where betweenness would take too long.
>>>>>>>
>>>>>>> Do you have any ideas on what is happening?
>>>>>>>
>>>>>>> By the way, could this perhaps be related to the sample size of 42 in
>>>>>>> the .cpp file for approxbetweenness? We aren't getting any issues on
>>>>>>> large graphs; in fact they are being solved relatively quickly.
>>>>>>>
>>>>>>> Thanks for your help!
>>>>>>>
>>>>>>> Sincerely,
>>>>>>>
>>>>>>> Kevin
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Jun.-Prof Dr. Kevin Tierney
>>>>>>> University of Paderborn
>>>>>>> Faculty of Business Administration and Economics
>>>>>>> Department 3: Business Information Systems
>>>>>>> Room: Q2.472
>>>>>>> Tel.: +49 (0)5251 60-5244
>>>>>>> <markshare_5_0.mps.gz.constraint.graph><ATT00001.c>
>>>>>>
>>>>>
>>>>> --
>>>>> Jun.-Prof Dr. Kevin Tierney
>>>>> University of Paderborn
>>>>> Faculty of Business Administration and Economics
>>>>> Department 3: Business Information Systems
>>>>> Room: Q2.472
>>>>> Tel.: +49 (0)5251 60-5244
>>>>
>>>
>>
>

-- 

=======================================================
Karlsruhe Institute of Technology (KIT)
Institute of Theoretical Informatics (ITI)

Juniorprof. Dr. Henning Meyerhenke
Theoret. Informatics / Parallel Computing

Phone: +49-721-608-41876
Web: http://parco.iti.kit.edu/henningm/

KIT - University of the State of Baden-Wuerttemberg and
National Research Center of the Helmholtz Association
=======================================================

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 5141 bytes
Desc: S/MIME Cryptographic Signature
URL: <https://lists.ira.uni-karlsruhe.de/mailman/private/networkit/attachments/20140507/ab8e2c95/attachment-0001.p7s>


More information about the NetworKit mailing list