Evaluate whether the cooccurrence computation on the whole graph can be skipped, or done only on a random sample of node pairs
The main idea is the following:
Since the Curveball Algorithm already needs to compute the common (and uncommon) neighbors of the two vertices involved in a trade, one can probably reuse this to obtain a sample of the cooccurrence over pairs of nodes, even though this may require running more steps of the Curveball Algorithm.
At one step t, by doing a trade between two vertices u and v, the Curveball Algorithm computes coocc(u,v) at time t-1.
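The idea above can be sketched as follows. This is only a minimal illustration, assuming a bipartite setting where `adj` maps each node to the set of its neighbors on the other side and where node labels are sortable; the names `curveball_trade` and `sample_cooccurrence` are hypothetical and not taken from any existing implementation. The trade returns the size of the common neighborhood it had to compute anyway, which is coocc(u,v) before the trade.

```python
import random


def curveball_trade(adj, u, v, rng=random):
    """Perform one Curveball trade between u and v (adj: node -> set of neighbors).

    Returns the number of common neighbors of u and v before the trade,
    i.e. coocc(u, v) at time t-1, obtained as a by-product of the trade.
    """
    common = adj[u] & adj[v]           # shared neighbors, kept by both nodes
    only_u = adj[u] - common           # neighbors exclusive to u
    only_v = adj[v] - common           # neighbors exclusive to v
    pool = sorted(only_u | only_v)     # sorted so a seeded rng is reproducible
    rng.shuffle(pool)
    k = len(only_u)                    # u keeps its degree, v gets the rest
    adj[u] = common | set(pool[:k])
    adj[v] = common | set(pool[k:])
    return len(common)


def sample_cooccurrence(adj, steps, seed=0):
    """Run `steps` trades on random pairs and record the observed coocc values."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    samples = {}                       # frozenset({u, v}) -> list of coocc values
    for _ in range(steps):
        u, v = rng.sample(nodes, 2)
        value = curveball_trade(adj, u, v, rng)
        samples.setdefault(frozenset((u, v)), []).append(value)
    return samples
```

In this sketch only the pairs that happen to be traded are sampled, so covering many pairs may indeed require more steps than randomization alone would need.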
Another idea is to proceed as in Ying Gu's master's thesis: each time a trade is made (a certain number of swaps), the cooccurrence matrix is updated.
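One possible reading of this second idea is sketched below. It is an assumption about how such an update could look, not the scheme actually used in the thesis: after a trade between u and v, only the rows and columns of u and v in the cooccurrence matrix are recomputed. The dict-of-dicts layout and the names `full_cooccurrence` and `update_after_trade` are hypothetical.

```python
from itertools import combinations


def full_cooccurrence(adj):
    """Compute coocc for all pairs from scratch (adj: node -> set of neighbors)."""
    coocc = {a: {} for a in adj}
    for a, b in combinations(sorted(adj), 2):
        c = len(adj[a] & adj[b])
        coocc[a][b] = c
        coocc[b][a] = c
    return coocc


def update_after_trade(coocc, adj, u, v):
    """Refresh only the entries involving u or v after a trade changed adj[u], adj[v]."""
    for x in (u, v):
        for w in adj:
            if w != x:
                c = len(adj[x] & adj[w])
                coocc[x][w] = c
                coocc[w][x] = c
```

With n nodes, this touches about 2n entries per trade instead of all n(n-1)/2 pairs, at the cost of keeping the full matrix in memory.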