Difference between revisions of "Blog research"
Revision as of 04:57, 24 January 2006
Information flow through Blogspace Research
The current goal of this research is to characterize the flow of information through blogs, specifically those hosted on LiveJournal. The site provides a feed of the latest blog posts, which usually arrive at a rate of over 300 posts per minute.
An initial study tracked the flow of links between posts. See Link_Results.
Current work attempts to track information flow in terms of conversational topics. Initially, topics will be characterized by words that are momentarily bursty (somewhat related to Kleinberg's bursty streams). This approach should catch the spread of an extremely newsworthy item, but it will not catch slowly spreading information very well.
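As a rough illustration of what "momentarily bursty" could mean in practice, the sketch below compares a word's smoothed relative frequency in a recent window of posts against a longer baseline. This is a simple frequency-ratio heuristic, not Kleinberg's automaton model; the function name `bursty_words` and the parameters `min_count` and `ratio` are assumptions made for this example, and posts are assumed to be plain strings split on whitespace.

```python
from collections import Counter


def bursty_words(baseline_posts, window_posts, min_count=3, ratio=3.0):
    """Return {word: score} for words that spike in the recent window.

    score is the add-one-smoothed relative frequency of the word in
    window_posts divided by its smoothed relative frequency in
    baseline_posts. Both thresholds are illustrative, not tuned values.
    """
    base = Counter(w for post in baseline_posts for w in post.lower().split())
    cur = Counter(w for post in window_posts for w in post.lower().split())
    base_total = sum(base.values())
    cur_total = sum(cur.values())
    vocab_size = len(set(base) | set(cur))

    result = {}
    for word, count in cur.items():
        if count < min_count:
            continue  # too rare in the window to call a burst
        # Add-one smoothing keeps unseen baseline words from dividing by zero.
        p_cur = (count + 1) / (cur_total + vocab_size)
        p_base = (base[word] + 1) / (base_total + vocab_size)
        score = p_cur / p_base
        if score >= ratio:
            result[word] = score
    return result
```

A heuristic like this favors words that jump suddenly within one window, which matches the limitation noted above: a topic that spreads slowly raises the baseline along with the window and never crosses the threshold.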
A second approach to tracking information flow would be to use Latent Dirichlet Allocation (LDA). LDA models each topic in a corpus of documents as a distribution over words and each post as a distribution over topics. While this method would do less to catch 'information epidemics', it might detect users who consistently influence which topics the readers of their blogs post about. See Statistical text modeling.
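To make the two distributions concrete, here is a minimal collapsed Gibbs sampler for LDA in plain Python. It is a sketch under stated assumptions, not the method this project uses: posts are assumed to be pre-tokenized lists of words, and the function name `lda_gibbs` and the hyperparameter defaults (`alpha`, `beta`, `iters`) are illustrative choices.

```python
import random


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized docs (lists of words).

    Returns (theta, phi, vocab): theta[d][k] is the topic mix of post d,
    phi[k][v] is the probability of word vocab[v] under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    V, K = len(vocab), num_topics

    doc_topic = [[0] * K for _ in docs]        # tokens in post d with topic k
    topic_word = [[0] * V for _ in range(K)]   # times word v has topic k
    topic_total = [0] * K                      # tokens assigned to topic k
    assign = []                                # current topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(K)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][index[w]] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = index[w], assign[d][i]
                # Remove this token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + V * beta)
                    for t in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed point estimates of the two families of distributions.
    theta = [
        [(n + alpha) / (len(doc) + K * alpha) for n in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(n + beta) / (topic_total[k] + V * beta) for n in topic_word[k]]
        for k in range(K)
    ]
    return theta, phi, vocab
```

Each row of `theta` is one post's distribution over topics, and each row of `phi` is one topic's distribution over words. Comparing a blogger's `theta` rows over time with those of their readers is one possible way to look for the topic influence described above, though that analysis is not sketched here.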
Recently read papers
Bursty and Hierarchical Structure in Streams
On the bursty evolution of Blogspace
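Temporal Dynamics of On-Line Information Streams (http://www.cs.cornell.edu/home/kleinber/stream-survey04.pdf)
BlogPulse: Automated Trend Discovery for Weblogs (http://www.blogpulse.com/papers/www2004glance.pdf); review: BlogPulse_Review
Implicit Structure and the Dynamics of Blogspace (http://www.blogpulse.com/papers/Adar_blogworkshop2_ppt.pdf); review: ImplicitStructure_Review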
Less recently read papers
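Information Diffusion Through Blogspace (http://www.www2004.org/proceedings/docs/1p491.pdf)
On the bursty evolution of Blogspace (http://www.hpl.hp.com/research/idl/papers/blogs/blogspace-draft.pdf)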
There are more; I just need to find them all.