Friday, 17 June 2011

The Greek Debt Crisis - Some Trends

Several friends and blog readers frequently ask me what I think about Greece and its social and economic problems. Since this is not a blog about politics or the economy, I will try to give my point of view with some analytics added.

It is always interesting to know how people feel and what they think about the economy, their future and the politicians, and what the general sentiment is. Also of great importance is the trend of these opinions and sentiment, as recorded in blog posts and other social media sources.

Here are some examples from data that I collect from Greek blogs several times a day. Hundreds of concepts are annotated within thousands of blog entries and collected for further analysis.


The results that I will show here are for:

- the latest Government Reform

- words that communicate Negative Sentiment

- the "Indignants Movement": citizens who disagree with the practices of the two largest Greek political parties over the past 30 years and with the spending cuts directed by the IMF

- the Debt Crisis

Let us begin with the trend of "Government Reform", which at the time of writing (17/06/11; note that the date format is DD/MM/YY) has just happened. Here is the trend of mentions :





Notice how few mentions were captured during the previous days and how sharply the trend rises up to June 17th, when the reform took place.
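A trend like the one described above boils down to counting, per day, the entries in which a concept was annotated. Here is a minimal sketch of that counting step. The sample entries, the `daily_mentions` name and the data layout are all made up for illustration; this is not the actual pipeline behind the charts.

```python
from collections import Counter
from datetime import date

# Hypothetical sample: (publication date, concepts annotated in the entry).
# The real data comes from annotated Greek blog entries; these rows are invented.
entries = [
    (date(2011, 6, 15), {"Debt Crisis"}),
    (date(2011, 6, 16), {"Government Reform", "Debt Crisis"}),
    (date(2011, 6, 17), {"Government Reform"}),
    (date(2011, 6, 17), {"Government Reform", "Negative Sentiment"}),
]


def daily_mentions(entries, concept):
    """Count, for each day, how many entries mention the given concept."""
    counts = Counter(day for day, concepts in entries if concept in concepts)
    # Sort by date so the result can be plotted directly as a time series.
    return dict(sorted(counts.items()))


trend = daily_mentions(entries, "Government Reform")
for day, mentions in trend.items():
    # Print in the DD/MM/YY format used in this post.
    print(day.strftime("%d/%m/%y"), mentions)
```

Days with no mentions are simply absent from the result, so a plotting step would need to fill them in with zeros.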


Next, let's look at entries that communicate "economic default" and their trend :





Again, notice how mentions of a Greek default start to rise on June 3rd and how the trend then gradually appears to fade out (French and German leaders said on June 17th that they will back Greek debt). It was no surprise that on June 8th and 9th (yet more) Greeks rushed to the banks to withdraw their money.

Here is the Trend of "The Indignants" movement :



Notice the dates May 29th-30th, June 5th and June 12th-13th. All of these dates are Sundays (or close to Sundays), the day on which most people gather in Syntagma Square to express their anger at the IMF and at Government practices. The trend appears to be falling, but this may well change in the coming days. Time will tell.

How about the words that communicate Negative Sentiment? Here is the trend :





Negative sentiment words appear to rise somewhat after 31/05 but are coming back down to previous levels.

FYI, the words that frequently occur with the concept "Politicians" are: "leaders", "cheats", "traitors".
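For readers wondering how such co-occurring words can be found, here is a minimal sketch that counts which words share a sentence with a target concept. The sentences are invented English stand-ins for the Greek text, and the stopword list and the `cooccurring_words` name are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical, already tokenized sentences (English stand-ins for Greek text).
sentences = [
    ["politicians", "are", "cheats"],
    ["our", "leaders", "the", "politicians", "failed"],
    ["politicians", "and", "traitors"],
    ["the", "square", "was", "full"],
]

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"are", "our", "the", "and", "was"}


def cooccurring_words(sentences, target):
    """Count words that appear in the same sentence as the target word."""
    counts = Counter()
    for tokens in sentences:
        if target not in tokens:
            continue
        # Use a set so each word is counted once per sentence.
        counts.update(
            word for word in set(tokens)
            if word != target and word not in STOPWORDS
        )
    return counts


co = cooccurring_words(sentences, "politicians")
print(co.most_common())
```

Raw counts favour words that are common everywhere, so a real analysis would probably weight them (for example against each word's overall frequency) before ranking.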


More in the next post.

Thursday, 16 June 2011

Apple Products on Twitter - A Text Analytics example


My presentation at the 7th Annual Text Analytics Summit was a tutorial on one of the methodologies one could use to analyze unstructured text. The sample consisted of 365,000 tweets containing keywords for Apple products and concepts such as iPad, iPhone, iPod, Apple Store, Mac and Steve Jobs, and the goal was to understand what people were tweeting about each product or concept.

The first step is to use a text analysis toolkit (I used GATE) to annotate the tweets and identify which concepts and keywords occur within them. But this is not always easy. Take the word Mac, for example. Depending on the context, Mac could be a type of computer, a type of burger, the MAC beauty products or Mac Arthur airport. So when a query containing the word Mac is sent to the Twitter API, we end up with lots of erroneous information.

So one of the things that has to be done to ensure good results is word sense disambiguation. We know, for example, that if a tweet contains a word such as fries, lettuce and/or salad, then the word Mac found within that tweet is quite likely about the Big Mac (even though the word Big may not be present). If we find the word Arthur next to the word Mac, then the tweet is about Mac Arthur airport, etc. Here is GATE in action, identifying different keywords and concepts in tweets :







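The disambiguation rules described above can be approximated in a few lines of code. This is only a rough sketch and not how GATE works internally: the burger cues (fries, lettuce, salad) and the Arthur rule come from the text, while the cosmetics cue words, the fallback to the computer sense and the `disambiguate_mac` name are my own assumptions for illustration.

```python
import re

# Context cue words per sense. The burger cues come from the post;
# the cosmetics cues are assumed for illustration.
CONTEXT_CUES = {
    "burger": {"fries", "lettuce", "salad"},
    "cosmetics": {"lipstick", "makeup"},
}

# Assumption: with no other evidence, treat Mac as the computer,
# since the tweets were collected for Apple-related keywords.
DEFAULT_SENSE = "computer"


def disambiguate_mac(tweet):
    """Return the likely sense of 'Mac' in a tweet, or None if it is absent."""
    tokens = re.findall(r"[a-z0-9']+", tweet.lower())
    if "mac" not in tokens:
        return None
    # Rule 1: 'Arthur' directly after 'Mac' points to Mac Arthur airport.
    for current, following in zip(tokens, tokens[1:]):
        if current == "mac" and following == "arthur":
            return "airport"
    # Rule 2: any cue word anywhere in the tweet selects that sense.
    words = set(tokens)
    for sense, cues in CONTEXT_CUES.items():
        if words & cues:
            return sense
    return DEFAULT_SENSE


for text in [
    "Had a Mac with fries for lunch",
    "Flying out of Mac Arthur tomorrow",
    "My new Mac is so fast",
]:
    print(text, "->", disambiguate_mac(text))
```

Hand-written cue lists like these are easy to inspect and extend, but they miss any context they have not been told about, so they need regular tuning against real tweets.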
Now we can see which concepts and keywords appear frequently in retweets ('USER' denotes that an '@' mention was present in the tweet, 'URL' that a URL link was found in the tweet, etc.).





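As a rough sketch of the counting behind such a chart, the code below replaces mentions and links with the 'USER' and 'URL' placeholders and then counts concepts in retweets only. The sample tweets are invented, the concept list is a subset, and detecting a retweet by a leading "RT" is an assumption about how retweets were marked at the time, not a description of the actual setup.

```python
import re
from collections import Counter

# Subset of the concepts from the post, lowercased for matching.
CONCEPTS = {"ipad", "iphone", "ipod", "mac"}
PLACEHOLDERS = {"USER", "URL"}


def normalize(tweet):
    """Replace links with 'URL' and @mentions with 'USER'."""
    text = re.sub(r"https?://\S+", "URL", tweet)
    return re.sub(r"@\w+", "USER", text)


def is_retweet(tweet):
    # Assumption: retweets are marked with a leading 'RT'.
    return re.match(r"\s*RT\b", tweet) is not None


def retweet_counts(tweets):
    """Count concepts and placeholders, once per retweet."""
    counts = Counter()
    for tweet in tweets:
        if not is_retweet(tweet):
            continue
        found = set()
        for token in re.findall(r"\w+", normalize(tweet)):
            if token in PLACEHOLDERS:
                found.add(token)
            elif token.lower() in CONCEPTS:
                found.add(token.lower())
        counts.update(found)
    return counts


# Hypothetical sample tweets.
tweets = [
    "RT @alice: the new iPad is great http://example.com/a",
    "RT @bob: iPad or iPhone?",
    "Just bought an iPod",
]
counts = retweet_counts(tweets)
print(counts.most_common())
```

Counting each item once per tweet keeps a single tweet that repeats a word from inflating the totals.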
We can also see which words frequently occur with iPhone5 :