Tuesday, 16 October 2007

What Do People Digg More?

Due to too much work I wasn't able to write on this blog as much as I wanted. Although people continue to answer the questionnaire (over 400 so far!), I haven't been able to run any further analysis that would shed some light on the patterns that emerge from living our lives.

However, I feel that I should write something about my new ventures in text mining. The question that came to my mind was simple:

"What stories do people tend to digg more?"

So I collected all stories on Digg and recorded, for each story, the number of diggs and how long the story has been around. Dividing the number of diggs by the total minutes the story has been out gives a "Diggs_per_Minute" score, which essentially tells you which stories are "hot" and which are not.
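The score can be sketched in a few lines of Python. This is only an illustration: the function name `diggs_per_minute` and its parameters are assumptions of mine, not part of any Digg API, and the timestamps are supplied by whoever collects the data.

```python
from datetime import datetime


def diggs_per_minute(diggs, submitted_at, observed_at):
    """Return diggs divided by the minutes a story has been out.

    `submitted_at` and `observed_at` are datetime objects; both names
    are hypothetical and only serve this illustration.
    """
    minutes_out = (observed_at - submitted_at).total_seconds() / 60
    if minutes_out <= 0:
        raise ValueError("observed_at must be later than submitted_at")
    return diggs / minutes_out


# A story with 1440 diggs that has been out for exactly one day (1440 minutes).
score = diggs_per_minute(1440, datetime(2007, 10, 1), datetime(2007, 10, 2))
```

A story that has been out for only a few minutes gets a very small denominator, so its score can swing wildly.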

After the preliminary analysis I immediately found that it is essential to use data from a specific time period and not just everything. If you think about it, a story should have been out for quite a while (say 10 days) before you can get a good estimate of the "Diggs_per_Minute" variable. Stories that have been out for less than 2 days tend to have a much higher Diggs per Minute score than older stories.

So the process is as follows: digg counts are collected for stories that have been out for 10-11 days. I then use text mining techniques to find out what words the stories with many diggs have in common. Don't you think that marketing people would love to know this?
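To make the process concrete, here is a rough sketch in Python. It is not the actual analysis behind this post: the story fields (`title`, `diggs`, `age_days`), the `top_fraction` cutoff, and the scoring rule (share of hot stories containing a word minus the share of the other stories containing it) are all assumptions chosen to keep the example small.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a title and return its set of distinct words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def distinctive_words(stories, top_fraction=0.25, min_days=10, max_days=11):
    """Rank words that appear more often in hot stories than in the rest.

    Each story is a dict with hypothetical keys "title", "diggs" and
    "age_days". Only stories inside the age window are considered.
    """
    eligible = [s for s in stories if min_days <= s["age_days"] <= max_days]
    if not eligible:
        return []

    # Diggs per minute; age_days is positive here because of the window filter.
    ranked = sorted(
        eligible,
        key=lambda s: s["diggs"] / (s["age_days"] * 24 * 60),
        reverse=True,
    )
    n_hot = max(1, int(len(ranked) * top_fraction))
    hot, rest = ranked[:n_hot], ranked[n_hot:]

    # Count in how many stories of each group a word occurs.
    hot_counts = Counter(w for s in hot for w in tokenize(s["title"]))
    rest_counts = Counter(w for s in rest for w in tokenize(s["title"]))

    scores = {}
    for word, count in hot_counts.items():
        hot_share = count / len(hot)
        rest_share = rest_counts[word] / len(rest) if rest else 0.0
        scores[word] = hot_share - rest_share

    # Highest score first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up sample data, only to show the call.
sample = [
    {"title": "Apple unveils new iPod", "diggs": 3000, "age_days": 10.5},
    {"title": "Local council meeting notes", "diggs": 20, "age_days": 10.2},
    {"title": "Garden tips for autumn", "diggs": 15, "age_days": 10.8},
    {"title": "Council budget notes", "diggs": 10, "age_days": 10.1},
    {"title": "Apple rumor", "diggs": 9999, "age_days": 1.0},  # outside the window
]
ranking = distinctive_words(sample)
```

A real analysis would need far more stories and would probably drop common stop words, but the shape is the same: filter by age, split by score, compare word frequencies.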


First results for the most digged stories:


1) Stories that have pictures tend to be digged more

2) Stories containing the phrase "Digg this if you....." tend to be digged more

3) Stories mentioning specific companies or technologies (e.g. Apple and iPod) tend to be digged more


That's all for now, but I will come back with more.


