[Networkit] Performance optimization of PLM implementation

Michael Hamann michael.hamann at kit.edu
Wed Sep 3 17:31:17 CEST 2014


Hi,

as I was wondering why the PLM implementation in NetworKit is so slow (see 
below for some numbers) I looked at the differences of the two implementations 
and found some differences:

- PLM uses a std::map for storing the weight of the edges to other clusters 
while Louvain uses a std::vector
- PLM iterates over the clusters of all neighboring nodes instead of just all 
different neighboring clusters in order to determine the modularity gain

While changing the loop to only include each neighboring cluster is easy (just 
iterate over the elements in the map), the main performance gain in my 
experiments came from using the vector instead of a map.

Now the problem is that PLM is parallel, so we actually need a vector per 
thread. And also a second vector for storing the neighboring clusters. All in 
all this means that we get 2 * numberOfThreads * numberOfNodes * 8 bytes 
memory overhead.

What do you think? Is the difference in running time worth the memory overhead? 
I've attached both changes as patches so you can apply them if you want to 
test the variants.

Michael

Performance measurements (all with 4 threads on my Intel(R) Core(TM) i7-4600U 
CPU @ 2.10GHz):

// without patch
Reading graph file ../graphs/coPapersDBLP.graph ...
Reading graph took 1.7 s
Parallel Label Propagation on ../graphs/coPapersDBLP.graph: 1.5 s
        # communities: 8208
        modularity: 0.785288
Parallel Label Propagation on ../graphs/coPapersDBLP.graph: 1.5 s
        # communities: 7928
        modularity: 0.788512
Parallel Louvain on ../graphs/coPapersDBLP.graph: 7.8 s
        # communities: 184
        modularity: 0.850204
Parallel Louvain on ../graphs/coPapersDBLP.graph: 9.7 s
        # communities: 184
        modularity: 0.850031

// with PLM_neighbors.patch
Reading graph file ../graphs/coPapersDBLP.graph ...
Reading graph took 1.7 s
Parallel Label Propagation on ../graphs/coPapersDBLP.graph: 1.5 s
        # communities: 8008
        modularity: 0.787435
Parallel Label Propagation on ../graphs/coPapersDBLP.graph: 1.5 s
        # communities: 8012
        modularity: 0.786336
Parallel Louvain on ../graphs/coPapersDBLP.graph: 5.9 s
        # communities: 183
        modularity: 0.849225
Parallel Louvain on ../graphs/coPapersDBLP.graph: 5.6 s
        # communities: 187
        modularity: 0.850499

// with PLM.patch
Reading graph file ../graphs/coPapersDBLP.graph ...
Reading graph took 1.8 s
Parallel Label Propagation on ../graphs/coPapersDBLP.graph: 1.5 s
        # communities: 7662
        modularity: 0.788591
Parallel Label Propagation on ../graphs/coPapersDBLP.graph: 1.5 s
        # communities: 7757
        modularity: 0.788746
Parallel Louvain on ../graphs/coPapersDBLP.graph: 2.5 s
        # communities: 175
        modularity: 0.848935
Parallel Louvain on ../graphs/coPapersDBLP.graph: 2.3 s
        # communities: 171
        modularity: 0.851161
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PLM.patch
Type: text/x-patch
Size: 3759 bytes
Desc: not available
URL: <https://lists.ira.uni-karlsruhe.de/mailman/private/networkit/attachments/20140903/37f64aaa/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PLM_neighbors.patch
Type: text/x-patch
Size: 2114 bytes
Desc: not available
URL: <https://lists.ira.uni-karlsruhe.de/mailman/private/networkit/attachments/20140903/37f64aaa/attachment-0001.bin>


More information about the NetworKit mailing list