Blog research

From PublicWiki
Revision as of 04:57, 24 January 2006 by Jack (talk | contribs)

Information flow through Blogspace Research

The current goal of this research is to characterize the information flow through blogs, specifically those hosted on LiveJournal. The site provides a feed of the latest blog posts, which usually arrive at a rate of over 300 posts per minute.

An initial study has been done to track the flow of links. See Link_Results.

Current work attempts to track information flow in terms of conversational topics. Initially, topics will be characterized by words that are momentarily bursty (somewhat related to Kleinberg's "Bursty and Hierarchical Structure in Streams"). This approach should catch the spread of an item that is extremely newsworthy, but it will not catch slowly spreading information very well.
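As a rough illustration of what "momentarily bursty" could mean in practice, the sketch below flags words whose relative frequency in the current time window is several times higher than in the earlier windows. This is a simple frequency-ratio heuristic, not Kleinberg's state-machine model, and the function name bursty_words, the window layout, and the thresholds (min_count, ratio, smoothing) are all assumptions made for the example.

```python
from collections import Counter

def bursty_words(windows, current, min_count=3, ratio=3.0, smoothing=1.0):
    """Return words that are much more frequent in windows[current] than before.

    windows is a list of token lists, one per time window (hypothetical layout).
    A word is flagged when its relative frequency in the current window is at
    least `ratio` times its smoothed relative frequency in all earlier windows.
    """
    now = Counter(windows[current])
    past = Counter()
    for tokens in windows[:current]:
        past.update(tokens)

    now_total = sum(now.values())
    past_total = sum(past.values())
    vocab_size = len(set(now) | set(past))

    scores = {}
    for word, count in now.items():
        if count < min_count:
            continue  # ignore words too rare to judge
        now_rate = count / now_total
        # Additive smoothing keeps unseen words from dividing by zero.
        past_rate = (past[word] + smoothing) / (past_total + smoothing * vocab_size)
        score = now_rate / past_rate
        if score >= ratio:
            scores[word] = score
    # Most bursty words first.
    return sorted(scores, key=scores.get, reverse=True)

# Toy data: three consecutive windows of tokenized posts (made up for illustration).
windows = [
    ["cat", "dog", "work", "cat", "school"],
    ["dog", "work", "school", "cat", "work"],
    ["earthquake", "earthquake", "work", "earthquake", "cat", "earthquake"],
]
burst = bursty_words(windows, current=2)
```

A heuristic like this reacts only to sudden jumps, which matches the limitation noted above: a topic that spreads slowly raises the baseline along with the current count and never crosses the ratio threshold.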

A second approach to tracking the spread of information would be to use Latent Dirichlet Allocation (LDA). LDA models every topic in a corpus of documents as a distribution over words and each post as a distribution over topics. While this method would do less to catch 'information epidemics', it might be able to detect users who consistently influence which topics the readers of their blogs post about. See Statistical text modeling.
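To make the two distributions concrete, here is a minimal sketch of LDA fitted with collapsed Gibbs sampling, using only the Python standard library. Gibbs sampling is one common inference method and is an assumption here, as are the function name lda_gibbs, the hyperparameters alpha and beta, and the toy posts; nothing in these notes says which inference method or settings the research would use.

```python
import random

def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized docs.

    Returns (theta, phi): theta[d] is post d's distribution over topics,
    phi[k] maps each word to its probability under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]          # topic counts per post
    topic_word = [[0] * V for _ in range(num_topics)]     # word counts per topic
    topic_total = [0] * num_topics                        # tokens per topic
    assign = []                                           # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assign[d][i]
                # Remove the token, then resample its topic given all others.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta) / (topic_total[k] + V * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    phi = [
        {vocab[v]: (row[v] + beta) / (topic_total[k] + V * beta) for v in range(V)}
        for k, row in enumerate(topic_word)
    ]
    return theta, phi

# Toy posts, already tokenized (made up for illustration).
posts = [
    ["exam", "school", "homework", "exam"],
    ["school", "homework", "teacher"],
    ["concert", "band", "music", "concert"],
    ["music", "band", "album"],
]
theta, phi = lda_gibbs(posts, num_topics=2)
```

With per-post topic mixtures like theta in hand, the influence question could be approached by comparing a user's topic mixture at one time with the mixtures of that user's readers shortly afterwards.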


Recently read papers

Bursty and Hierarchical Structure in Streams

On the bursty evolution of Blogspace

Temporal Dynamics of On-Line Information Streams

BlogPulse: Automated Trend Discovery for Weblogs (see BlogPulse_Review)

Implicit Structure and the Dynamics of Blogspace (see ImplicitStructure_Review)


Less recently read papers

Information Diffusion Through Blogspace

On the bursty evolution of Blogspace

There are more; I just need to find them all.