[Networkit] Dissimilarity measures for clusterings with overlapping clusters (covers)

Hamann, Michael (ITI) michael.hamann at kit.edu
Tue Jul 15 11:03:34 CEST 2014

Hi everybody,

As some of you might have noticed, I've started working on support 
for clusterings with overlapping clusters (covers). So far I've added 
two reader classes and a writer class for two common file formats. What 
I would like to add next are (dis)similarity measures for covers. For 
dissimilarity measures we have a base class, DissimilarityMeasure, and a 
few subclasses such as NMIDistance. All of them only take partitions as 
parameters. For NMI, for example, there are generalizations to 
overlapping clusters that I would like to implement in NetworKit. My 
question is: where do they fit in NetworKit?

I see two options:
1. Extend DissimilarityMeasure with an additional 
getDissimilarity() function that accepts two cover instances instead of 
two partitions. As not every dissimilarity measure has a variant for 
covers, I would suggest adding a default implementation that throws an 
exception.
2. Add a new base class, CoverDissimilarityMeasure, and add new classes 
for each dissimilarity measure for covers.
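
A minimal C++ sketch of how the two options could look. Partition and 
Cover here are simplified stand-ins, not NetworKit's actual classes, and 
the toy "node mismatch" measure is only a placeholder for a real measure 
such as NMI. All names other than DissimilarityMeasure, 
CoverDissimilarityMeasure and getDissimilarity() are assumptions made 
for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins, NOT NetworKit's real data structures.
struct Partition { std::vector<int> clusterOf; };            // one cluster id per node
struct Cover     { std::vector<std::set<int>> clustersOf; }; // a set of cluster ids per node

// Option 1: a single base class; the cover overload throws by default.
class DissimilarityMeasure {
public:
    virtual ~DissimilarityMeasure() = default;
    virtual double getDissimilarity(const Partition& a, const Partition& b) = 0;
    virtual double getDissimilarity(const Cover&, const Cover&) {
        throw std::logic_error("this measure has no variant for covers");
    }
};

// Option 2: a separate base class for measures defined on covers.
class CoverDissimilarityMeasure {
public:
    virtual ~CoverDissimilarityMeasure() = default;
    virtual double getDissimilarity(const Cover& a, const Cover& b) = 0;
};

// Toy measure (assumption): fraction of nodes whose assignment differs.
template <typename Assignment>
double mismatchFraction(const std::vector<Assignment>& a,
                        const std::vector<Assignment>& b) {
    if (a.size() != b.size())
        throw std::invalid_argument("clusterings must cover the same nodes");
    if (a.empty()) return 0.0;
    std::size_t differing = 0;
    for (std::size_t v = 0; v < a.size(); ++v)
        if (a[v] != b[v]) ++differing;
    return static_cast<double>(differing) / static_cast<double>(a.size());
}

// Partition-only measure under option 1; it inherits the throwing cover overload.
class NodeMismatch : public DissimilarityMeasure {
public:
    // Overriding one overload hides the other, so re-expose it explicitly.
    using DissimilarityMeasure::getDissimilarity;
    double getDissimilarity(const Partition& a, const Partition& b) override {
        return mismatchFraction(a.clusterOf, b.clusterOf);
    }
};

// Cover variant under option 2, living in its own class hierarchy.
class CoverNodeMismatch : public CoverDissimilarityMeasure {
public:
    double getDissimilarity(const Cover& a, const Cover& b) override {
        return mismatchFraction(a.clustersOf, b.clustersOf);
    }
};
```

In this sketch, passing two covers to NodeMismatch compiles and only 
fails at run time with std::logic_error, whereas under option 2 passing 
partitions to a CoverDissimilarityMeasure is already rejected by the 
compiler. The using-declaration is needed in every option-1 subclass that 
overrides only one overload, because C++ would otherwise hide the other.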

The problem with the first option, in my opinion, is that it is not 
obvious which measures accept covers and which do not, and that there 
might be more than one generalization of a measure to covers. That's why 
I would actually prefer the second option, even though I do not really 
like that the variants for covers contain "Cover" in their name while 
the variants for partitions do not contain "Partition" in theirs.

Do you have any other suggestions or opinions on these options?


PS: I am working on the topic of "Skeleton-based Clustering in Big and 
Streaming Social Networks" (see [0] for an abstract) and I will probably 
implement most of the stuff for the project in NetworKit so you can 
expect to see more work on clustering and skeletons in NetworKit.

