On the bursty evolution of Blogspace
by Ravi Kumar, Jasmine Novak, Prabhakar Raghavan, Andrew Tomkins
The authors crawled between 20,000 and 30,000 blogs and extracted the links between blogs along with the times the links were created. The blogs were not uniform in templating, so not all times could be extracted accurately, but the extracted times are suggested to be about 90% correct. The idea is then to extract dense subgraphs from this set of nodes and edges, view each edge creation within a subgraph as an event, and apply Kleinberg's work on bursty streams.
The authors also found that, under their community extraction algorithm, the largest community began to take over the majority of nodes in the graph. This community was not present at the start of data collection in 2001, but by some point in 2002, when the paper was written(?), it comprised 20% of the total graph and was doubling in size every 3 months.
They compare this graph's growth to that of a similarly constructed random graph and determine that the random graph experiences greater growth of its single connected component, but has fewer communities overall.
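To make the random-graph comparison more concrete for myself, here is a minimal sketch of one way it could be set up. It is not the authors' procedure: I am assuming that "similarly constructed" means keeping each link's source and timestamp while redrawing its destination uniformly at random, and that component size is measured on the undirected version of the graph. The function names and the (source, destination, time) tuple layout are my own.

```python
import random

def randomize_edges(edges, nodes, seed=0):
    """Keep each edge's source and timestamp; redraw the destination.

    Assumption: this is one plausible reading of a "similarly
    constructed" random graph, not necessarily the paper's definition.
    Requires at least two nodes so a non-self destination exists.
    """
    rng = random.Random(seed)
    out = []
    for u, _v, t in edges:
        v = rng.choice(nodes)
        while v == u:  # avoid self-links
            v = rng.choice(nodes)
        out.append((u, v, t))
    return out

def largest_component_fraction(edges, nodes, until):
    """Fraction of nodes in the largest (undirected) connected component,
    using only edges created at or before time `until`."""
    parent = {n: n for n in nodes}
    size = {n: 1 for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, t in edges:
        if t > until:
            continue
        ru, rv = find(u), find(v)
        if ru != rv:
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
    return max(size[find(n)] for n in nodes) / len(nodes)

# Toy data: (source, destination, creation time)
nodes = list(range(6))
edges = [(0, 1, 1), (1, 2, 2), (3, 4, 3)]
shuffled = randomize_edges(edges, nodes, seed=42)
```

Calling `largest_component_fraction` at successive time cut-offs on both `edges` and `shuffled` gives two growth curves that can be compared side by side.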
They also find that communities are becoming more bursty in how they link to each other.
They mention using Kleinberg's bursty streams, but also classifying events in the stream as relevant or non-relevant when weighting bursts. This seems highly relevant, and clearly I need to read a bit more of Kleinberg.
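As a note to self, here is a rough sketch of what a two-state burst detector over relevant/non-relevant events might look like. It follows my understanding of the batched form of Kleinberg's model: each time batch has some total number of events and some relevant ones, a low state expects the overall relevant fraction, a high state expects `s` times that, and entering the high state costs a penalty. The parameters `s` and `gamma`, the `gamma * ln(n)` penalty, and the function name are assumptions on my part, not details taken from this paper.

```python
import math

def detect_bursts(relevant, total, s=2.0, gamma=1.0):
    """Return a 0/1 state per batch (1 = burst) via a Viterbi pass.

    relevant[t] <= total[t] is assumed for every batch t.
    """
    n = len(relevant)
    R, D = sum(relevant), sum(total)
    if n == 0 or R == 0 or R == D:
        return [0] * n  # no contrast between states is possible

    p0 = R / D                   # baseline relevant fraction
    p1 = min(s * p0, 0.9999)     # elevated fraction, kept below 1

    def cost(p, r, d):
        # Negative log of the binomial probability of r relevant out of d.
        log_choose = (math.lgamma(d + 1) - math.lgamma(r + 1)
                      - math.lgamma(d - r + 1))
        return -(log_choose + r * math.log(p) + (d - r) * math.log(1 - p))

    up = gamma * math.log(n)     # assumed penalty for entering the burst state
    c = [0.0, float("inf")]      # start in the low state
    back = []
    for r, d in zip(relevant, total):
        from0 = 0 if c[0] <= c[1] else 1        # dropping back down is free
        from1 = 0 if c[0] + up < c[1] else 1
        n0 = min(c[0], c[1]) + cost(p0, r, d)
        n1 = min(c[0] + up, c[1]) + cost(p1, r, d)
        back.append((from0, from1))
        c = [n0, n1]

    # Trace the cheapest state sequence backwards.
    state = 0 if c[0] <= c[1] else 1
    states = []
    for ptrs in reversed(back):
        states.append(state)
        state = ptrs[state]
    states.reverse()
    return states

# Toy example: links into one community (relevant) out of all links per month.
states = detect_bursts([1, 1, 1, 9, 9, 1, 1], [10] * 7)
```

In this toy series the two months where 9 of 10 links are relevant are cheap to explain in the high state, so they are the ones flagged as a burst.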
The things that were done were fine, but they didn't interest me all that much, except for the idea of viewing linkages as bursts. Perhaps then information flow could be viewed as a burst? Hmm. I also enjoyed the comparison of the graph to a similar random graph, even if I don't understand the mechanics of it.
Problem Solved: How are communities in blogspace evolving with respect to community size, linking, and link burstiness?