Thursday, 30 October 2008

Decision Tree Interpretation

In the previous post I went through some basic steps required for predicting the price changes of a specific stock on the Greek stock exchange. As a result of that process, the following decision tree was generated:





To interpret a decision tree, the analyst starts from the root of the tree and reads through it until a leaf node is reached. For example, one rule that can be extracted from the decision tree above is the following:

"IF aseStockExchange > 0.360 AND aseStockExchange > 1.985 THEN price>+2"

The rule above can be found by starting from the root of the tree, moving along the left branch and then continuing down the right sub-branch. In the same way, an analyst can find the rest of the rules identified by the decision tree.
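This root-to-leaf traversal can be automated. Below is a small sketch using scikit-learn (not the tool used in this post); the toy data, the two feature names and the class labels are hypothetical stand-ins, chosen only so that the extracted rules resemble the one above:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy data standing in for the stock features; values and feature names
# are hypothetical, not the actual data behind the tree in the post.
X = np.array([[0.1, 1.0], [0.5, 1.5], [0.5, 2.5], [0.2, 2.0]])
y = np.array(["price<-2", "0<price<=+2", "price>+2", "price<-2"])

clf = DecisionTreeClassifier(random_state=0).fit(X, y)
tree = clf.tree_
feature_names = ("aseStockExchange", "DAX")

def extract_rules(node=0, conditions=()):
    """Each root-to-leaf path of the tree is one IF-THEN rule."""
    if tree.children_left[node] == -1:                 # leaf node reached
        label = clf.classes_[tree.value[node].argmax()]
        return ["IF " + " AND ".join(conditions) + f" THEN {label}"]
    name = feature_names[tree.feature[node]]
    thr = tree.threshold[node]
    left = extract_rules(tree.children_left[node],
                         conditions + (f"{name} <= {thr:.3f}",))
    right = extract_rules(tree.children_right[node],
                          conditions + (f"{name} > {thr:.3f}",))
    return left + right

for rule in extract_rules():
    print(rule)
```

Each printed rule corresponds to exactly one leaf, which is why the number of rules equals the number of leaves in the tree.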

When using decision tree learners or rule extractors, analysts usually also record the precision and recall of each rule; these are not shown in the decision tree above, and for simplicity I will omit them and describe the insights provided by the analysis. Decision trees possess the two following qualities:


1) They provide easy model interpretation

2) They show us the relative importance of the variables

When confronted with many variables, analysts usually start by building a decision tree and then feed the variables that the decision tree algorithm has selected into other methods that suffer in the presence of many variables, such as neural networks. However, decision trees can perform worse when the decision boundary of the problem at hand is not axis-aligned, since a diagonal boundary must be approximated by many small splits. For the purposes of our example, though, a decision tree 'explains' the behavior of the stock nicely.
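As a sketch of that workflow (tree-based selection feeding a neural network), here is a minimal scikit-learn pipeline on synthetic data; scikit-learn and everything in it are my stand-ins, not the tools used in the post:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Synthetic data: 20 candidate features, only two of them informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = (X[:, 0] + X[:, 3] > 0).astype(int)   # class depends on features 0 and 3

pipe = make_pipeline(
    SelectFromModel(DecisionTreeClassifier(random_state=0)),  # tree picks features
    MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
)
pipe.fit(X, y)

# Which features did the tree-based selector keep for the neural network?
kept = pipe.named_steps["selectfrommodel"].get_support().nonzero()[0]
print("features kept by the tree:", kept)
```

The point of the design is that the neural network only ever sees the reduced feature set, so its training is not burdened by the 18 noise columns.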

It should be noted that during the feature selection analysis of our stock example we found that the features 'aseStockExchange' and 'DAX' are important. Other features, such as 'xaaPersonalHouseProducts', were flagged as important by the feature selection algorithm but were not used in the decision tree. Different feature selection methods produce different results (and one might say that this is not very reassuring), but usually most methods agree on a common feature subset of high predictive value.

The importance of the attributes can be seen from the level at which they appear in the decision tree: the closer to the root an attribute appears, the greater its predictive power. So in our example the 'aseStockExchange' feature is the most important (since it is the attribute at the root of the tree), while less important attributes appear to be 'xaaLeisure' and 'xaaBenefit'.
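The "root attribute is the most important" observation can be checked programmatically. A small sketch with scikit-learn on synthetic data (the feature names are just the hypothetical ones mentioned above, and only the first feature carries any signal by construction):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic data in which only the first feature determines the class.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = (X[:, 0] > 0.4).astype(int)

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
names = ["aseStockExchange", "xaaLeisure", "xaaBenefit"]

# The attribute chosen at the root and the computed importance ranking agree.
ranking = sorted(zip(names, clf.feature_importances_), key=lambda p: -p[1])
print("root split attribute:", names[clf.tree_.feature[0]])
print("importance ranking:", ranking)
```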

Wednesday, 15 October 2008

Insights from a Decision Tree

Assuming that an analyst has completed all the necessary pre-processing tasks prior to the data mining phase, we are ready to deploy analytical methods, such as decision tree learners, that can classify unseen cases. For the goal of stock prediction, we assume that we have collected the following data:




The column named XAACLASS is the target column that we wish to classify. Essentially, we have the following classes here:

- price change percentage greater than +2%
- price change percentage less than -2%
- price change percentage greater than 0% and up to +2% (inclusive)
- price change percentage between -2% (inclusive) and 0% (inclusive)

In other words, each row shows us the state of the stock we wish to predict, given the rest of the market indices (such as realTimeFTSE, realTimeDAX, etc.).
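The four classes above amount to a simple labeling function over the percentage change. A sketch (the class label strings are placeholders of mine, except 'price>+2', which appears in the earlier rule):

```python
def xaa_class(pct_change):
    """Map a price change percentage to one of the four XAACLASS classes."""
    if pct_change > 2:
        return "price>+2"          # greater than +2%
    if pct_change < -2:
        return "price<-2"          # less than -2%
    if pct_change > 0:
        return "0<price<=+2"       # (0%, +2%], upper bound inclusive
    return "-2<=price<=0"          # [-2%, 0%], both bounds inclusive

print([xaa_class(p) for p in (3.1, -4.0, 1.5, -0.5)])
```

Note that the boundary values +2% and -2% fall into the two inner classes, matching the "inclusive" wording above.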

So, let us assume that we are ready to build such a model. However, we first have to decide the time window that our predictions will be made for. Do we wish to predict what the stock price change will be 2 hours ahead? How about 1 day ahead?

Before dealing with this issue, I wanted to see how good a predictive model can be at predicting the stock price percentage change right now, based on the current market conditions. Here is a decision tree created from such data:





More to come in the next post, where the model seen above will be explained in detail. Until then, please read the post from this blog about the same problem. If you can, read Fooled By Randomness as well...

Thursday, 09 October 2008

So...What's important??

One step of a Knowledge Discovery process is to perform what is known as Feature Selection, which is essentially the identification of a subset of features with high predictive value.

Feature selection can potentially help increase the accuracy of prediction models. Methods such as Naive Bayes can perform better when presented with a subset of selected features rather than the whole feature set (because of feature redundancy).
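As a concrete sketch of scoring and keeping only the most predictive features, here is a univariate selector in scikit-learn (my stand-in for the WEKA evaluators discussed below) run on synthetic data where, by construction, only two of the six candidate features carry any signal:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in data: 6 candidate features, signal only in two of them.
rng = np.random.default_rng(2)
X = rng.normal(size=(150, 6))
y = (X[:, 1] - X[:, 4] > 0).astype(int)   # class depends on features 1 and 4

# Score every feature with a one-way ANOVA F-test, then keep the best two.
selector = SelectKBest(f_classif, k=2).fit(X, y)
print("per-feature F-scores:", np.round(selector.scores_, 2))
print("selected feature indices:", selector.get_support().nonzero()[0])
```

The per-feature scores are exactly the "predictive power of each feature" information mentioned above, available even if we decide not to drop any feature in the end.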

Even if feature selection does not prove to help much, it is important to know the predictive power of each feature. There are numerous methods for doing this and, as is usually the case, there is no universally best method for performing optimal feature selection. The following is an overview of the Feature Selection methods available in WEKA:




Let us stick to our stock example to make things clearer. Suppose that I would like to know which features seem to be important for predicting the behavior of a stock. For our example, we will try to find out how the NBG stock reacts.

By using a feature selection method we extract the following information:



The feature selection method above shows us how many times each attribute was selected during a 10-fold cross-validation. We can see that some attributes are selected more often than others. For example:

realTimeDax
aseStockExchangeIndex
xaaPersonalHouseProducts
xaaTechnology
bankAgrotiki
bankAlpha
bankPiraeus
bankEuro


are present in all 10 folds of our cross-validation, hence the 10 (100%) entry. The xaaFinancialServices index has been selected fewer times (8 out of 10), hence the 8 (80%) entry. Other features never appear in any of the cross-validation folds.
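This per-fold selection count is straightforward to reproduce: run the selector once per training fold and count how often each feature survives. A sketch with scikit-learn on synthetic data (tool, data, and the choice of selector are my assumptions, not what WEKA does internally):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in: 8 features, signal in features 0 and 2 only.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 2] > 0).astype(int)

counts = np.zeros(8, dtype=int)
for train_idx, _ in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    # Select the top 3 features on this fold's training data only.
    fold_sel = SelectKBest(f_classif, k=3).fit(X[train_idx], y[train_idx])
    counts += fold_sel.get_support()        # booleans add as 0/1

for i, c in enumerate(counts):
    print(f"feature {i}: selected in {c}/10 folds ({c * 10}%)")
```

A feature selected in all 10 folds corresponds to the 10 (100%) entries above; a feature that only survives some folds gets a partial count like 8 (80%).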

Of course, feature selection does not stop here, and there are many ways to enhance the process; data mining is both an art and a science. For our purpose, however, we were able to identify the attributes that seem to be important for predicting the NBG stock. We immediately see, for example, that the DAX index and the Athens Stock Exchange index are two important features, plus the stocks of four specific banks. Other feature selection methods produce weights that essentially rank the importance of each attribute for class prediction.


Monday, 06 October 2008

Always know your data!

Before rushing into analyzing and predicting the financial markets (or anything else, for that matter), it is essential that we get an idea of the data at hand. So after data collection (i.e. getting the values of different market indices), I first wanted to understand what is going on in the markets. A correlation matrix tells us just that. Let's see what happens on the Greek Stock Exchange:







By looking at the matrix we can immediately see some interesting things:

1) There is a high correlation (0.847) between the DAX index and the Greek stock exchange index (marked as aseStockExchangeIndex)

2) The Insurance sector index (xaaInsurance) and the Media sector (xaaMedia) have a low correlation with the aseStockExchangeIndex. Consider the following scatter chart, which shows the poor correlation between Insurance sector stocks and the aseStockExchangeIndex:




These two facts alone can help significantly in trading: for example, if an investor's trading decisions are heavily based on the aseStockExchangeIndex, then the investor should also keep a close eye on the DAX index, as opposed to other European indices (such as the FTSE, CAC 40, etc.).
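Computing such a correlation matrix takes one call to NumPy. A sketch on simulated returns (the three series below are fabricated so that the DAX stand-in co-moves with the ASE stand-in while the insurance stand-in does not; the numbers are not the real 0.847 figure from the matrix above):

```python
import numpy as np

# Hypothetical daily returns: 'dax' is constructed to co-move strongly
# with 'ase', while 'insurance' is independent of both.
rng = np.random.default_rng(4)
n = 250
ase = rng.normal(size=n)
dax = 0.9 * ase + rng.normal(scale=0.4, size=n)
insurance = rng.normal(size=n)

# Each row of the input is one series; the result is the correlation matrix.
corr = np.corrcoef(np.vstack([ase, dax, insurance]))
labels = ["aseStockExchangeIndex", "realTimeDAX", "xaaInsurance"]
for label, row in zip(labels, corr):
    print(f"{label:>22}", np.round(row, 3))
```

Reading off the first row of the printed matrix gives exactly the two observations made above: a high ASE-DAX entry and a near-zero ASE-insurance entry.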

A lot of problems later in the analysis can be prevented if one pays attention to the "Data Understanding" phase. Plus, we also get an insight into what kind of results we should expect from the learning algorithms.