Some time ago I wanted to explore the idea of analyzing several kinds and sources of information (e.g. TechCrunch, TheNextWeb, news sites and Twitter) to identify promising investment opportunities in technology, and more specifically in startups.
Here is a snapshot of a web page from TechCrunch:
Many posts on this blog have discussed how our reactions to almost any kind of information are now recorded. This was not possible when everyone read newspapers on paper, whereas now any piece of text is associated with a number of views, re-tweets, comments and Facebook "Likes".
The second important kind of information being generated is our emotions about any topic, as expressed in comments, tweets and Facebook posts. The intensity of those emotions is also captured, and this matters because whatever we associate with intense emotions stays in our psyche, fuels our interest and (usually) drives our purchase decisions.
We may then continue with some exploratory work as follows: collect posts from various tech sources together with their associated reactions, annotate the text with sentiment, events and topics, and analyze this information to understand which topics and/or events tend to go together with a high number of reactions or high sentiment intensity for startups or tech topics.
As an example, 10K posts from various tech sources were collected, and each post was marked as generating either HIGH or LOW interest based on the number of reactions (re-tweets, Facebook Likes, comments) it received. Special filtering is applied to the frequencies of the words that appear in each post.
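To make the labeling step concrete, here is a minimal sketch in Python. It is not the actual pipeline used for the 10K posts: the field names, the median threshold for HIGH vs LOW and the "appears in at least two posts" word filter are all assumptions made for illustration.

```python
import re
from collections import Counter
from statistics import median

# Hypothetical sample posts; field names and numbers are made up for illustration.
posts = [
    {"text": "Google launches a new education tool for schools",
     "retweets": 120, "likes": 300, "comments": 45},
    {"text": "Small startup raises seed round",
     "retweets": 5, "likes": 12, "comments": 1},
    {"text": "Apple updates its developer tools",
     "retweets": 80, "likes": 150, "comments": 30},
    {"text": "A new photo app launches quietly",
     "retweets": 3, "likes": 8, "comments": 0},
]

def total_reactions(post):
    """Sum all recorded reactions for one post."""
    return post["retweets"] + post["likes"] + post["comments"]

def label_posts(posts):
    """Label each post HIGH or LOW relative to the median reaction count.

    The median threshold is an assumption; the original post does not say
    how the HIGH/LOW cut-off was chosen.
    """
    threshold = median(total_reactions(p) for p in posts)
    return ["HIGH" if total_reactions(p) > threshold else "LOW" for p in posts]

def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())

def vocabulary(posts, min_posts=2):
    """Keep words that occur in at least min_posts posts (assumed filter)."""
    doc_freq = Counter()
    for p in posts:
        doc_freq.update(set(tokenize(p["text"])))
    return sorted(w for w, c in doc_freq.items() if c >= min_posts)

labels = label_posts(posts)
vocab = vocabulary(posts)
```

In practice one would also drop stop words such as "a"; the sketch leaves them in to keep the filter easy to follow.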
This information is then fed to KNIME for further analysis. The implementation shown here is rather naive and simplistic for many reasons: only keywords are used as input (as opposed to topics and events), and many other parameters are involved that will be discussed later. For our example, though, we will keep things simple.
The workflow uses three algorithms, namely PART (so that some rules are generated), SMO and Random Forests.
This is, again, a very naive approach. It gave a result of 61.9% (F-Measure) in identifying keywords that commonly appear in posts that generate interest versus posts that do not. We keep in mind that this knowledge alone is not enough to base a decision on, but we decide to explore things a little further.
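For readers unfamiliar with the F-Measure, the sketch below shows how it is computed for one class as the harmonic mean of precision and recall. The labels are invented, and the function is a plain illustration rather than the way KNIME reports the figure quoted above (which may well be an average over both classes).

```python
def f_measure(predicted, actual, positive="HIGH"):
    """F-Measure (F1) for one class: harmonic mean of precision and recall."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == positive and a == positive)
    fp = sum(1 for p, a in pairs if p == positive and a != positive)
    fn = sum(1 for p, a in pairs if p != positive and a == positive)
    if tp == 0:
        # No correct positive predictions: precision and recall are both 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up labels for six posts, purely for illustration.
actual = ["HIGH", "HIGH", "HIGH", "LOW", "LOW", "LOW"]
predicted = ["HIGH", "HIGH", "LOW", "HIGH", "LOW", "LOW"]

score = f_measure(predicted, actual)
```

Here two of the three HIGH posts are found and one LOW post is wrongly flagged, so precision and recall are both 2/3, and so is the F-Measure.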
We may find that some words we expected do appear in posts of high interest (such as Google, Apple, Pinterest). There could, however, be other words that deserve more of our attention, such as Education and Schools, which during the analysis appeared more frequently in high-interest posts.
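One simple way to surface such words, sketched here under my own assumptions and not taken from the KNIME workflow, is to compare the share of HIGH posts that contain a word with the share of LOW posts that contain it. The example posts and the name `word_lift` are hypothetical.

```python
from collections import Counter

def word_lift(docs, labels):
    """Share of HIGH posts containing each word minus the share of LOW posts.

    Assumes there is at least one HIGH and one LOW post.
    """
    high = [set(d.lower().split()) for d, l in zip(docs, labels) if l == "HIGH"]
    low = [set(d.lower().split()) for d, l in zip(docs, labels) if l == "LOW"]
    high_df = Counter(w for words in high for w in words)
    low_df = Counter(w for words in low for w in words)
    vocab = set(high_df) | set(low_df)
    return {w: high_df[w] / len(high) - low_df[w] / len(low) for w in vocab}

# Made-up posts for illustration only.
docs = [
    "education startup for schools",
    "education platform launch",
    "photo app launch",
    "photo sharing startup",
]
labels = ["HIGH", "HIGH", "LOW", "LOW"]

lift = word_lift(docs, labels)
# Words ordered from most HIGH-leaning to most LOW-leaning.
ranked = sorted(lift, key=lift.get, reverse=True)
```

A score near 1 means the word shows up almost only in HIGH posts, a score near -1 almost only in LOW posts, and a score near 0 says the word does not separate the two groups.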
So how can this information be used for a potential investment in a startup, and is there really a way to model new ideas and predict their performance? Again, I am not suggesting that if you come across a startup aimed at education you should immediately put your money in, but this observation could be one parameter to consider. There are many other considerations: whether the idea is novel, how many competitors exist, who the people behind the startup are, whether its founders have created a successful startup in the past, who has already invested in it, how much "buzz" it has generated so far, and so on.
Whenever we read about a new startup, some immediate thoughts go through our minds: Does this sound like a good idea? Is it applicable to me, and would it make my life easier? Is this idea truly disruptive or not? What does our "gut feeling" tell us?
We should always keep in mind that there are limitations to what predictive analytics can do, but perhaps we can extract some hints that we may then use to make better decisions.
It was also interesting to read this post on Gigaom on the same subject (hence the word "Revisited" in this post's title). This is a fascinating area that I have started looking at, and there will be similar posts on it in the future.