[Networkit] Dissimilarity measures for clusterings with overlapping clusters (covers)
Hamann, Michael (ITI)
michael.hamann at kit.edu
Tue Jul 15 11:03:34 CEST 2014
As some of you might have noticed, I've started working on the support
for clusterings with overlapping clusters (covers). So far I've added
two reader classes and a writer class for two common file formats. What I would
like to add next are (dis)similarity measures for covers. For
dissimilarity measures we have a base class, DissimilarityMeasure, and a
few subclasses like NMIDistance. All of them only take partitions as
parameters. In the case of NMI, for example, there are also
generalizations for overlapping clusters that I would like to
implement in NetworKit. My question now is: where do they fit in NetworKit?
I see two options:
1. Extend DissimilarityMeasure with an additional
getDissimilarity()-function that accepts two cover instances instead of
two partitions. As not every dissimilarity measure has a variant for
covers, I would suggest adding a default implementation that throws an
exception.
2. Add a new base class, CoverDissimilarityMeasure, and add new classes
for each dissimilarity measure for covers.
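
To make the two options concrete, here is a rough sketch of both. It is
only an illustration under assumptions: Graph, Partition and Cover are
minimal stand-ins rather than the real NetworKit classes, getDissimilarity()
is assumed to take a graph and two clusterings, and ToyPartitionDistance,
ToyCoverDistance and coverCallThrows are hypothetical names. The toy
measures just count nodes whose assignment differs; they are not NMI.

```cpp
#include <cstddef>
#include <set>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for illustration only, NOT the real NetworKit types.
struct Graph { std::size_t n; };                                  // number of nodes
struct Partition { std::vector<std::size_t> clusterOf; };         // exactly one cluster per node
struct Cover { std::vector<std::set<std::size_t>> clustersOf; };  // any number of clusters per node

// Option 1: a single base class; the cover overload throws unless overridden.
class DissimilarityMeasure {
public:
    virtual ~DissimilarityMeasure() = default;
    virtual double getDissimilarity(const Graph& G, const Partition& a, const Partition& b) = 0;
    virtual double getDissimilarity(const Graph&, const Cover&, const Cover&) {
        throw std::logic_error("this measure has no variant for covers");
    }
};

// Toy measure (not NMI): fraction of nodes whose cluster id differs.
class ToyPartitionDistance final : public DissimilarityMeasure {
public:
    // Without this using-declaration, the override below would hide the
    // inherited cover overload when called on the derived type.
    using DissimilarityMeasure::getDissimilarity;
    double getDissimilarity(const Graph& G, const Partition& a, const Partition& b) override {
        if (G.n == 0) return 0.0;
        std::size_t differing = 0;
        for (std::size_t v = 0; v < G.n; ++v)
            if (a.clusterOf[v] != b.clusterOf[v]) ++differing;
        return static_cast<double>(differing) / static_cast<double>(G.n);
    }
};

// Option 2: a separate base class that only knows about covers.
class CoverDissimilarityMeasure {
public:
    virtual ~CoverDissimilarityMeasure() = default;
    virtual double getDissimilarity(const Graph& G, const Cover& a, const Cover& b) = 0;
};

// Toy measure (not a generalized NMI): fraction of nodes whose cluster sets differ.
class ToyCoverDistance final : public CoverDissimilarityMeasure {
public:
    double getDissimilarity(const Graph& G, const Cover& a, const Cover& b) override {
        if (G.n == 0) return 0.0;
        std::size_t differing = 0;
        for (std::size_t v = 0; v < G.n; ++v)
            if (a.clustersOf[v] != b.clustersOf[v]) ++differing;
        return static_cast<double>(differing) / static_cast<double>(G.n);
    }
};

// Shows the drawback of option 1: the mismatch only surfaces at run time.
bool coverCallThrows(DissimilarityMeasure& m, const Graph& G, const Cover& a, const Cover& b) {
    try {
        m.getDissimilarity(G, a, b);
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

With option 1, passing covers to ToyPartitionDistance compiles and only
fails when called. With option 2, the same mistake is a compile error
because ToyPartitionDistance has no cover overload at all.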
The problem with the first option is, in my opinion, that it is not obvious
which measures accept covers and which do not, and that there might be more
than one generalization for covers. That's why I would actually prefer
the second option, even though I do not really like that the variants for
covers contain "Cover" in their name while the variants for partitions
do not contain "Partition" in theirs.
Do you have any other suggestions or opinions on these options?
PS: I am working on the topic of "Skeleton-based Clustering in Big and
Streaming Social Networks" (see  for an abstract) and I will probably
implement most of the stuff for the project in NetworKit so you can
expect to see more work on clustering and skeletons in NetworKit.