[Networkit] personalized PageRank implementation

Karim Kouki kookstrash at gmail.com
Tue May 9 17:39:38 CEST 2017


ERRATUM:
about the file that needs to be read in case a Networkit::Cover/Partition
is used, this is not an edgeList list but a node ID list.

Kind Regards,
Karim

2017-05-09 17:32 GMT+02:00 Karim Kouki <kookstrash at gmail.com>:

> Hi Henning,
>
> Thank you for taking the time to correct my mistake with PageRank (sorry
> for that).
>
> Well I think I am aware of what you suggest, but I meant that I don't
> understand how to implement it.
>
>
> About the PageRank implementation, I figured that in the loop I pointed
> out in the previous e-mail, there is no problem with the Weight Matrix
> contribution, that  is handled by the current implementation in the "power
> iteration" algorithm.
>
>
> Nevertheless, about the teleport term, I need to specify a subset of nodes
> that will receive the additional teleport term as contribution of special
> nodes (fraudulent transaction nodes).
>
>
>
>
>
>
> My problem is that I don't know what to do for that specifically. I found
> two objects in the documentation: NetworKit::Partition and NetworKit::Cover
> that seem to do the trick but this implies I need to read a supplementary
> text file (in edgelist format) to load a Partion/Cover in memory and check,
> for each node in the loop, whether or not it belongs to a partition/Cover
> and add the teleportation term in the positive alternative.
>
> I see no other way to do that and I am convinced there should be a better
> one. Does anyone have any suggestion for this part specifically?
>
> Kind regards,
> Karim
>
> 2017-05-09 16:42 GMT+02:00 Henning Meyerhenke <henning.meyerhenke at kit.edu>
> :
>
>> Dear Karim,
>>
>> What you attached is PageRank, not personalized PageRank.
>>
>> The current implementation assumes the teleport vector to be a uniform
>> distribution. While this is a common assumption, it may not work for all
>> cases -- as you experience.
>>
>> You could work with a general probability distribution vector instead.
>> The nodes that do not get the teleport term receive probability zero.
>> That needs a few code changes (if the vector is part of the object, the
>> interface can stay the same, I guess) and may be slightly slower. But
>> overall, it sounds like a reasonable solution.
>>
>> Best,
>> Henning
>>
>>
>>
>>
>> Am 04.05.17 um 18:35 schrieb Karim Kouki:
>> > Hello,
>> >
>> > I would like to implement the personalized PageRank with the help of the
>> > Networkit Library SourceCode in C++.
>> >
>> > In practical terms, this means in the followinf While loop:
>> >
>> >
>> >
>> >
>> >
>> >
>> > while (! isConverged) {
>> > handler.assureRunning();
>> > G.balancedParallelForNodes([&](node u) {
>> > pr[u] = 0.0;
>> > G.forInEdgesOf(u, [&](node u, node v, edgeweight w) {
>> > // note: inconsistency in definition in Newman's book (Ch. 7) regarding
>> > directed graphs
>> > // we follow the verbal description, which requires to sum over the
>> > incoming edges
>> > pr[u] += scoreData[v] * w / deg[v];
>> > });
>> > pr[u] *= damp;
>> > if(u.edgeId in listIDs.Id){ //here, if an edge is part of a predefined
>> > list, this if{} test would be valid
>> >  //and the teleport term would be added
>> > pr[u] += teleportProb; //
>> > });
>> >
>> > ...
>> >
>> >
>> >
>> >
>> > (see attachment)
>> >
>> >
>> >
>> >
>> > I would like to add the teleport term only for a given list of nodes.
>> >
>> > I thought the Partition/EdgeListPartition readers would help, but it
>> > needs to read the special nodes from a text file and I potentially have
>> > a huge file to read.
>> >
>> >
>> > I was hoping there would be a node attribute that could help me do that.
>> >
>> >
>> >
>> > Does anyone have an idea how to solve that issue?
>> >
>> > (for the personalized pageRank algorithm, see: attachment 2, I want to
>> > replicate the process described in this document)
>> >
>> >
>> > Kind regards,
>> > Karim
>> >
>> >
>> >
>> > _______________________________________________
>> > NetworKit mailing list
>> > NetworKit at ira.uni-karlsruhe.de
>> > https://lists.ira.uni-karlsruhe.de/mailman/listinfo/networkit
>> >
>>
>> --
>>
>> ==========================================================
>> Karlsruhe Institute of Technology (KIT)
>> Institute of Theoretical Informatics (ITI)
>>
>> Prof. Dr. Henning Meyerhenke
>> Theoret. Informatics / Parallel Computing
>>
>> Phone: +49-721-608-41876
>> Web: http://parco.iti.kit.edu/henningm/
>>
>> KIT - The Research University in the Helmholtz Association
>> ==========================================================
>>
>>
>> _______________________________________________
>> NetworKit mailing list
>> NetworKit at ira.uni-karlsruhe.de
>> https://lists.ira.uni-karlsruhe.de/mailman/listinfo/networkit
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ira.uni-karlsruhe.de/pipermail/networkit/attachments/20170509/0fe37cf4/attachment-0001.html>


More information about the NetworKit mailing list