[Networkit] Problem computing ApproxBetweeness

Lars Beckmann lars.beckmann at gmail.com
Thu May 8 10:42:15 CEST 2014


Hi Henning,

thanks! Again, you found the issue. I didn't know we could call the tests
with "-O". It still took 4x the time you reported on my VM, but that is
just because it really is slow I think (3507350 ms).

Now we need to figure out why it didn't finish in 14h on our bigger
machine. Maybe we didn't compile with optimize flags there either. We will
check and get back to you if we can't find the issue.

Thanks again!

Lars


On Wed, May 7, 2014 at 5:48 PM, Henning Meyerhenke <meyerhenke at kit.edu>wrote:

> Hi Lars,
>
> This is odd, so let's look into possible causes.
>
> - Have you experienced similar slowdowns in your VW with other programs?
>
> - How did you compile NetworKit and later run the unit test: Did you use
> optimize=D(ebug) or optimize=O(ptimized) and ./NetworKit-Tests-D/O? It
> happened to me before that I confused both because D and O look similar in
> the shell. The implications on performance are dramatic...
>
> Best,
> Henning
>
>
>
> Am 07.05.14 16:07, schrieb Lars Beckmann:
>
>> Hi Henning,
>>
>> I did what you suggested (see output attached). I let it run for a
>> couple of hours on my admittedly relatively weak machine. I have a VM
>> running on my laptop (which has a 3 years old i7-620M).
>>
>> To me the output looks ok and if I understand that correctly it chewed
>> through about 20k out of the 60k samples when I aborted the process.
>>
>> Does that give you any info that I don't see? Is there anything else we
>> can do?
>>
>> Lars
>>
>>
>>
>> On Wed, May 7, 2014 at 11:03 AM, Henning Meyerhenke <meyerhenke at kit.edu
>> <mailto:meyerhenke at kit.edu>> wrote:
>>
>>     Hi Kevin,
>>
>>     I have created a unit test tryApproxBetweennessOnRealGrap__h which
>>
>>     operates on your graph.
>>
>>     On my MacBook with 4 cores the operation took about 800 seconds,
>>     which is in the expected range and far from the behavior you
>>     describe. This timing is even with --loglevel=DEBUG, fewer output
>>     would result in a somewhat faster run.
>>
>>     Could you please update to the latest Dev branch revision and run
>>     the test yourself? You need to put your graph into the input/
>>     directory for that. Please produce DEBUG output as well and have it
>>     written to a file if possible.
>>
>>     Thanks,
>>     Henning
>>
>>
>>
>>
>>     Am 07.05.14 10 <tel:07.05.14%2010>:46, schrieb Kevin Tierney:
>>
>>
>>         Hi,
>>
>>         We used the defaults.
>>
>>         Kevin
>>
>>         On 07.05.2014 10 <tel:07.05.2014%2010>:34, Henning Meyerhenke
>> wrote:
>>
>>             Hi Kevin,
>>
>>             What were the approximation and confidence parameters you
>>             used? Or did
>>             you go along with the default epsilon=0.01 and delta=0.1?
>>
>>             Best,
>>             Henning
>>
>>
>>             Am 07.05.14 09 <tel:07.05.14%2009>:34, schrieb Kevin Tierney:
>>
>>
>>                 Hi,
>>
>>                 We have come across another graph where there seems to
>>                 be an issue with
>>                 ApproxBetweenness. This graph is not trivially small,
>>                 nor is it
>>                 completely connected, meaning we'd actually be rather
>>                 interested in
>>                 finding out the ApproxBetweenness value for it. I let it
>>                 run overnight
>>                 (~14 hours with 8 cores) and did not get an answer, so I
>>                 assume it was
>>                 never going to finish.
>>
>>                 The graph is a bit large, so I put it here:
>>                 https://dl.dropboxusercontent.
>> __com/u/16348505/ns894786.mps.__gz.variable.graph.bz2
>>
>>                 <https://dl.dropboxusercontent.com/u/
>> 16348505/ns894786.mps.gz.variable.graph.bz2>
>>
>>
>>                 Any help you can provide would be greatly appreciated.
>>
>>                 By the way, we tried to compute the number of samples
>>                 ourselves and it
>>                 seemed like it would only be around ~63000, so perhaps
>>                 there is another
>>                 issue?
>>
>>                 Thanks for your help.
>>
>>                 Kevin
>>
>>                 On 06.05.2014 16 <tel:06.05.2014%2016>:32, Christian
>>
>>                 Staudt wrote:
>>
>>                     The important thing is that you get sensible results
>>                     on the real
>>                     networks that are of interest to you. If that’s the
>>                     case, I give a
>>                     low priority to this bug. I agree that betweenness
>>                     is uninteresting
>>                     on near-complete graphs.
>>
>>                     Am 06.05.2014 <tel:06.05.2014> um 16:21 schrieb
>>                     Kevin Tierney <kevin.tierney at upb.de
>>                     <mailto:kevin.tierney at upb.de>>:
>>
>>
>>                         Hi Christian,
>>
>>                         Thanks for looking into the issue and for your
>>                         update.
>>
>>                         We've noticed similar behavior on complete or
>>                         nearly-complete graphs. I
>>                         suppose computing a betweenness measure on such
>>                         graphs is pointless
>>                         anyway (at least for what we're doing), but at
>>                         least that seems like a
>>                         possible cause of the weird behavior.
>>
>>                         Kevin
>>
>>
>>                         On 06.05.2014 15 <tel:06.05.2014%2015>:30,
>>
>>                         Christian Staudt wrote:
>>
>>                             Hi Kevin,
>>                             running the program with the log level set
>>                             to INFO reveals the
>>                             problem:
>>
>>                             [INFO
>>                             ][“src/cpp/centrality/__
>> ApproxBetweenness.cpp”,
>>                             38: virtual
>>                             void NetworKit::ApproxBetweenness::__run()]:
>>                             estimating vertex diameter
>>                             [INFO
>>                             ][“src/cpp/centrality/__
>> ApproxBetweenness.cpp”,
>>                             41: virtual
>>                             void NetworKit::ApproxBetweenness::__run()]:
>>                             estimated diameter: 2
>>                             [INFO
>>                             ][“src/cpp/centrality/__
>> ApproxBetweenness.cpp”,
>>                             47: virtual
>>                             void NetworKit::ApproxBetweenness::__run()]:
>>
>>                             taking
>>                             9223372036854775808 path samples
>>
>>                             A ridiculously high number of samples is
>>                             taken. Where does it come
>>                             from? The number of samples is calculated as:
>>
>>                                   count r = ceil((c / (epsilon *
>>                             epsilon)) * (floor(log(vd - 2))
>>                             + 1 + log(1 / delta)));
>>
>>                             So if the estimated diameter (vd) is 2, then
>>                             the formula contains
>>                             the logarithm of 0, which looks like
>>                             trouble. I am currently not
>>                             sure how to fix this, because this is how
>>                             the formula was
>>                             published. We will investigate.
>>
>>                             At the same time I discovered and fixed an
>>                             error in the formula,
>>                             but I believe this was unrelated to the
>>                             problem you describe.
>>                             Please also update to the latest
>>                             Dev-Revision for your experiments.
>>
>>                             Kind regards
>>                             Christian
>>
>>
>>
>>                             Am 06.05.2014 <tel:06.05.2014> um 14:57
>>                             schrieb Kevin Tierney
>>                             <kevin.tierney at uni-paderborn.__de
>>                             <mailto:kevin.tierney at uni-paderborn.de>>:
>>
>>
>>                                 Dear NetworKit Gurus,
>>
>>                                 We are trying to use the approximate
>>                                 betweenness function and
>>                                 notice an
>>                                 odd thing happening on small graphs. I
>>                                 have attached a graph where
>>                                 calling:
>>
>>                                 NetworKit.centrality.__
>> ApproxBetweenness(G)
>>
>>
>>                                 seems to never finish. At least the CPU
>>                                 time required seems to be way
>>                                 more than should be for such a small
>> graph.
>>
>>                                 You might say "Why would you compute
>>                                 approxbetweenness for such a
>>                                 small
>>                                 graph?" -- the reason is this was just
>>                                 mixed in with a bunch of
>>                                 larger
>>                                 graphs where betweenness would take too
>>                                 long.
>>
>>                                 Do you have any ideas on what is
>> happening?
>>
>>                                 By the way, could this perhaps be
>>                                 related to the sample size of 42 in
>>                                 the .cpp file for approxbetweenness? We
>>                                 aren't getting any issues on
>>                                 large graphs; in fact they are being
>>                                 solved relatively quickly.
>>
>>                                 Thanks for your help!
>>
>>                                 Sincerely,
>>
>>                                 Kevin
>>
>>
>>                                 --
>>                                 Jun.-Prof Dr. Kevin Tierney
>>                                 University of Paderborn
>>                                 Faculty of Business Administration and
>>                                 Economics
>>                                 Department 3: Business Information Systems
>>                                 Room: Q2.472
>>                                 Tel.: +49 (0)5251 60-5244
>>                                 <tel:%2B49%20%280%295251%2060-5244>
>>                                 <markshare_5_0.mps.gz.__
>> constraint.graph><ATT00001.c>
>>
>>
>>
>>
>>                         --
>>                         Jun.-Prof Dr. Kevin Tierney
>>                         University of Paderborn
>>                         Faculty of Business Administration and Economics
>>                         Department 3: Business Information Systems
>>                         Room: Q2.472
>>                         Tel.: +49 (0)5251 60-5244
>>                         <tel:%2B49%20%280%295251%2060-5244>
>>
>>
>>
>>
>>
>>
>>     --
>>
>>     ==============================__=========================
>>
>>     Karlsruhe Institute of Technology (KIT)
>>     Institute of Theoretical Informatics (ITI)
>>
>>     Juniorprof. Dr. Henning Meyerhenke
>>     Theoret. Informatics / Parallel Computing
>>
>>     Phone: +49-721-608-41876 <tel:%2B49-721-608-41876>
>>     Web: http://parco.iti.kit.edu/__henningm/
>>
>>     <http://parco.iti.kit.edu/henningm/>
>>
>>     KIT - University of the State of Baden-Wuerttemberg and
>>     National Research Center of the Helmholtz Association
>>     ==============================__=========================
>>
>>
>>     _______________________________________________
>>     NetworKit mailing list
>>     NetworKit at ira.uni-karlsruhe.de <mailto:NetworKit at ira.uni-karlsruhe.de
>> >
>>
>>     https://lists.ira.uni-karlsruhe.de/mailman/listinfo/networkit
>>
>>
>>
>>
>> --
>> Lars Beckmann
>>
>> Uhlenstraße 10
>> 33098 Paderborn
>> Germany
>>
>> Cell (D2): +49 (0) 178 59 69 150
>> Phone: +49 (0) 89 380 123 90
>> Fax: +49 (0) 180 102 113 4628
>>
>
> --
>
> =======================================================
> Karlsruhe Institute of Technology (KIT)
> Institute of Theoretical Informatics (ITI)
>
> Juniorprof. Dr. Henning Meyerhenke
> Theoret. Informatics / Parallel Computing
>
> Phone: +49-721-608-41876
> Web: http://parco.iti.kit.edu/henningm/
>
> KIT - University of the State of Baden-Wuerttemberg and
> National Research Center of the Helmholtz Association
> =======================================================
>
>


-- 
Lars Beckmann

Uhlenstraße 10
33098 Paderborn
Germany

Cell (D2): +49 (0) 178 59 69 150
Phone: +49 (0) 89 380 123 90
Fax: +49 (0) 180 102 113 4628
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.ira.uni-karlsruhe.de/mailman/private/networkit/attachments/20140508/0679118a/attachment-0001.html>


More information about the NetworKit mailing list