
Machine Learning and Econometrics: Trees, Random Forests, and Boosting

 

This is the third in a series of posts where I document my own process of figuring out how machine learning relates to the classic econometrics one learns in a graduate program in economics. Note that I am a humble practitioner of applied econometrics, so the ideas I am working through are not new, and my first take on them might not even be fully correct. But I think many of us applied economists are starting to learn and dabble in this area, and I thought it might be fruitful to the community if I learn publicly in these blog posts. These posts will certainly solidify my own understanding and serve as helpful notes to refer back to later.

In this post I summarize what I learned from Chapter 8 of Introduction to Statistical Learning (ISL). In my last post I covered cross validation from chapter 5. I am skipping chapters 6 (Linear Model Selection and Regularization) and 7 (Moving Beyond Linearity) partly because I have kind of seen some of that stuff before and partly because I need to move my learning along in order to get something prepared for the SCC-76 meeting April 5-7th (Todd and Todd, I hope you aren’t getting nervous yet!).

I’m really excited to write up this post because it covers something that I truly know zero about, and feels like ‘real’ machine learning. Tree-Based Methods! I’ll briefly describe regression trees, bagging, random forests, and boosting, and then apply boosting to a stock-picking application.

Regression Trees

Regression trees take a fundamentally different approach to prediction than pretty much any kind of regression you will find in an econometrics class. In an econometrics class, you try to find the best function (often linear) that will predict the data. Specifically, you find \(\hat{f}(x)\) to predict values of \(y\) often by minimizing a quadratic loss function that looks something like the following:

\[\sum_{i=1}^n \left(y_i - \hat{f}(x_i)\right)^2\]

So finding a functional form that minimizes the quantity above, usually referred to as the residual sum of squares (RSS), is the name of the game. If you further assume that the functional form is linear, then you just need to find the coefficients (\(\beta\)’s in most notation) and you end up with ordinary least squares estimates.

In a regression tree, you literally just find the cut-points in the prediction variables that split your sample in a way that gives the best possible prediction. This is really best illustrated with an example, so I'll piggyback on the one given in ISL. They use a database of professional baseball players' salaries, which includes variables for years in the league and number of hits. The figure below is from ISL and displays the result of fitting a regression tree to this data to predict the log of players' salaries. The first step is to determine which predictor has the best predictive power. Since years appears in the first split of the tree, splitting years at 4.5 must have given a better prediction than any other split of the years variable, and better than splitting on hits first instead.

In the second step, there are two nodes that may be split further, or the tree may stop splitting a node if it contains fewer than some threshold number of observations. So for observations with years less than 4.5, you get your salary prediction by averaging the salary of all players who have been in the league less than 4.5 years. This average is 5.11 (in log salary).

On the other branch, for observations where years is greater than 4.5, there are enough observations to split the sample again on the next best predicting variable. In this case, it is hits. So the algorithm searches for a split in the hits variable among observations with years greater than 4.5, and it finds that the best prediction comes from splitting hits at 117.5.

Figure 8.1 from ISL

This figure is taken from “An Introduction to Statistical Learning, with applications in R” (Springer, 2013) with permission from the authors: G. James, D. Witten, T. Hastie and R. Tibshirani

That is it. It is pretty simple conceptually. The book discusses techniques to prevent over-fitting, specifically 'pruning' the tree, but I'll leave that for interested readers to explore themselves. However, a simple regression tree as described above can be highly sensitive to slight changes or additions to the dataset, which makes its predictions not very robust.
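
To make this concrete, here is a minimal sketch of fitting a tree like the one in Figure 8.1. I am assuming the Hitters data from the ISLR package and the tree package, which is what the ISL labs use; rpart would work just as well, and the exact splits may differ a bit from the book's figure.

# Minimal sketch: a regression tree for log salary, as in ISL Figure 8.1.
# Assumes the ISLR and tree packages are installed.
library(ISLR)   # Hitters data
library(tree)   # simple regression trees

hitters <- na.omit(Hitters)               # drop players with missing salary
hitters$logSalary <- log(hitters$Salary)

# Fit a tree using only Years and Hits as predictors
tree.fit <- tree(logSalary ~ Years + Hits, data = hitters)
plot(tree.fit)
text(tree.fit, pretty = 0)                # the first split should be on Years

# Pruning (mentioned above): cross-validate to pick a tree size, then prune
cv.fit <- cv.tree(tree.fit)
pruned <- prune.tree(tree.fit, best = 3)  # keep the subtree with 3 terminal nodes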

The methods of bagging and random forests attempt to stabilize the results. I find the approach to be pretty intuitive.

Bagging

It turns out that bagging stands for 'bootstrap aggregation'. Once you know that, and if you have ever used a bootstrap, you can probably guess what bagging is. The idea is that you can create more robust predictions by drawing a bunch of bootstrapped sub-samples, fitting a regression tree to each sub-sample, and then averaging the predictions across the bootstrapped samples. By bootstrapping and then aggregating, you greatly reduce the variance of your predictions as the sample is perturbed or new data arrive. You can also evaluate the fit without a separate hold-out set: for each observation, average the predictions from only those trees whose bootstrap samples did not include it. This is called 'out-of-bag' prediction.

Cool. Very simple idea.
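
As a rough sketch of what this looks like in R, the randomForest package will do bagging if you let every predictor be a candidate at every split (mtry equal to the number of predictors). This reuses the hitters object from the tree sketch above, so treat it as illustration rather than anything from the application later in this post.

# Bagging sketch: a random forest that considers ALL predictors at every split
# is just bagged trees. Reuses 'hitters' from the regression tree sketch above.
library(randomForest)
set.seed(1)
bag.fit <- randomForest(logSalary ~ Years + Hits, data = hitters,
                        mtry  = 2,    # both predictors are candidates at each split
                        ntree = 500)

# Out-of-bag predictions come for free: predict() with no newdata returns, for each
# observation, the average over only the trees that did not see that observation.
head(predict(bag.fit))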

Random Forests

Bagging improves the accuracy and robustness of predictions, but random forests can improve on it further. The room for improvement comes from the fact that if one of your variables is a pretty good predictor, then it will probably be a pretty good predictor in most of your bootstrap samples. That will lead it to appear near the top of most of the trees, and the split points will probably be pretty similar across bootstrapped samples too. That means the predictions across the bagged trees will be highly correlated, and you are not getting as much bang for your buck from the bootstrapped samples as you would hope.

A random forest addresses this problem by randomly choosing a subset of the predictors as candidates at each split; at each split a new subset of candidate predictors is drawn. This restriction de-correlates the trees in the bagging process. The intuition is easiest to follow if you think about the best predictor. The best predictor will end up being split first in most of the bagged regression trees. In a random forest, however, some of the trees are not allowed to use the best predictor in the first split, so those trees end up looking different from the highly correlated trees in the plain bagged ensemble. Counter-intuitively, handicapping the bootstrapped trees this way usually reduces the mean squared error of the predictions.
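
Continuing the same illustrative sketch, the only change needed for a random forest is mtry, the number of predictors considered at each split; for regression, mtry of roughly p/3 is a common default. The particular predictors below are just my arbitrary picks from the Hitters data.

# Random forest sketch: give the model more predictors, but only consider a random
# subset of them (mtry < p) at each split to de-correlate the trees.
set.seed(1)
vars   <- c("logSalary", "Years", "Hits", "Walks", "Runs", "RBI", "HmRun")
rf.fit <- randomForest(logSalary ~ ., data = hitters[, vars],
                       mtry  = 2,    # 2 of the 6 predictors tried at each split
                       ntree = 500, importance = TRUE)
importance(rf.fit)                   # which predictors the forest relies on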

Boosting

So, as we learned, random forests are a natural extension of bagging, and both are built on taking bootstrapped samples and applying a regression tree to each sample. Boosting is different, in that a boosted regression tree is built from a single training sample and the trees are built sequentially. It took me a bit longer to understand the basic idea of boosting, but what happens is this:

  1. Define the training sample
  2. Choose a tree depth d (the number of splits) and a shrinkage parameter (a shrinkage parameter as in a ridge or lasso regression; see chapter 6), and set the initial residuals equal to the outcome itself
  3. Fit a regression tree of depth d to the current residuals
  4. Add a shrunken version of that tree to the model and subtract its shrunken predictions from the residuals
  5. Repeat steps 3 and 4 until the desired number of trees has been fitted

Since each new tree is fit to the residuals left over from the previous trees, the model is said to 'learn slowly', which generally achieves a better fit.
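
To convince myself I had the loop right, here is a hand-rolled sketch of it on the same hitters data from the earlier sketches, with the tree package standing in for the 'fit a tree of depth d' step. I am not controlling the tree depth here and the values of B and lambda are arbitrary, so treat it purely as intuition; the gbm package used later in this post does this properly.

# Hand-rolled boosting sketch on 'hitters' (from the regression tree sketch above).
set.seed(1)
B      <- 100                      # number of trees
lambda <- 0.01                     # shrinkage parameter
fhat   <- rep(0, nrow(hitters))    # current model prediction, starts at zero
resid  <- hitters$logSalary        # residuals start out equal to the outcome

for (b in 1:B) {
  dat   <- data.frame(r = resid, Years = hitters$Years, Hits = hitters$Hits)
  fit   <- tree(r ~ Years + Hits, data = dat)  # fit a tree to the current residuals
  step  <- predict(fit, newdata = dat)
  fhat  <- fhat + lambda * step                # add a shrunken copy of the tree
  resid <- resid - lambda * step               # update the residuals
}

mean((hitters$logSalary - fhat)^2)   # in-sample MSE after B slow-learning rounds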

A Forecasting Example

Since bagging, random forests, and boosting are completely new techniques to me, I wanted to apply them in an area that is familiar to me, and also the very hardest environment to forecast in… the stock market. For this example I will focus on boosting, because I did some small forecasting test runs in stocks and, indeed, the basic boosting algorithm used in ISL had a lower mean squared error than bagging, random forests, or a linear model.

Here’s the forecasting set up.

I will consider the top 10 S&P 500 companies by market cap and forecast the daily returns of each. Then I choose the stock with the largest absolute value of forecasted return and buy it if the forecasted return is positive or sell it short if the forecasted return is negative. Also, I will throw in four of the biggest SPDR ('Spyder') sector exchange traded funds as predictors, just for fun.

Facebook and Berkshire Hathaway are excluded and replaced with the #11 and #12 companies – Bank of America and Wells Fargo. I wanted to use a long series and FB has only been public since 2012, and I had trouble downloading the Berkshire Hathaway data.

Performance of this approach will be measured by how much money the method makes in the test sample compared to buying and holding the SPY and a couple of other big names like Apple and Amazon. I like this way of evaluating forecast performance because it is hard to tell from a mean squared error-type measure whether a forecast is good enough to make money. Turning the forecast into a trading strategy and then back-testing its performance gives you a sense of the quality of your forecasts.

A Few Caveats Up-Front

I am going to use data going back to 2007, but I am restricting my choices to the current top 10 market cap companies. Companies like Google and Amazon were not in the top 10 by market cap in 2007, so the fact that they ascended to the top 10 by today builds in a lot of look-back bias. If you already know a company will go from outside the top 10 to inside it, you don't need a fancy forecasting model to make money. Just buy it.

I could do this better if I restricted my decision set to whichever companies were in the top 10 at the time of each forecast and updated that set as the forecasts roll forward. I could also do better if I set up my forecast to use a rolling window of, say, 50 or 100 observations and re-estimated the boosted model each day. Both of those things would reduce the look-back bias, but the purpose of this post is more for me to get a feel for the boosting algorithm than it is to find the best realistic forecasting model.

And like I always tell my students, if I ever find a forecasting model that actually works well enough to make money, I’ll never tell you about it!

The Details

The code below loads the required R packages and downloads the stocks that I want.

# If you haven't installed any of the packages listed below, do so with the "install.packages('tibble')" command. 
library(tibble)
library(tidyr)
library(ggplot2)
library(knitr)
library(quantmod)
library(randomForest)
library(PerformanceAnalytics)
library(gbm)
library(broom)

#SPY and Top 10 S&P 500 Companies by Market Cap (BRK.B replaced by #11 BAC, bc Berkshire wouldn't download. FB replaced by #12 WFC bc I want long sample and FB has only been trading since '12. Plus 4 of the biggest Spyder sector etfs.) XLF - Financials, XLK - Tech, XLE - Energy, XLY - Consumer Discretionary
SP  <- c("SPY", "AAPL", "MSFT", "AMZN", "WFC", "JPM", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY")
getSymbols(SP)
##  [1] "SPY"  "AAPL" "MSFT" "AMZN" "WFC"  "JPM"  "BAC"  "JNJ"  "GOOG" "XOM" 
## [11] "XLF"  "XLK"  "XLE"  "XLY"
SP <- cbind(SPY$SPY.Adjusted, AAPL$AAPL.Adjusted, MSFT$MSFT.Adjusted, AMZN$AMZN.Adjusted, WFC$WFC.Adjusted, JPM$JPM.Adjusted, BAC$BAC.Adjusted, JNJ$JNJ.Adjusted, GOOG$GOOG.Adjusted, XOM$XOM.Adjusted, XLF$XLF.Adjusted, XLK$XLK.Adjusted, XLE$XLE.Adjusted, XLY$XLY.Adjusted)
colnames(SP) <- c("SPY", "AAPL", "MSFT", "AMZN", "WFC", "JPM", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY")

The next chunk of code calculates log returns, creates lags 1 through 10 of each series, and combines everything into one object, SPRet.

SPRet   <- Return.calculate(SP, method = 'log')

# Create lags 1 through 10 of every return series and bind them together
laggs   <- do.call(cbind, lapply(1:10, function(k) apply(SPRet, 2, Lag, k)))

SPRet   <- cbind(SPRet, laggs)
NAMES   <- c("SPY", "AAPL", "MSFT", "AMZN", "WFC", "JPM", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY")

In the next code chunk we split the data into training and testing sub-samples. Then we use the gbm() function from the gbm package, a gradient boosting machine, to fit the boosted regression trees. For a more technical description of the gbm, see here. We get boosted predictions for each of the stocks we are considering.


#train         <- sample(12:nrow(SPRet[12:dim(SPRet)[1]]), nrow(SPRet[12:dim(SPRet)[1]])/1.5)
#temp          <- SPRet[train]

temp          <- SPRet[12:2110,]
testing       <- SPRet[2110:(dim(SPRet)[1]),]
ntrees        <- 1000
depth         <- 1

# SPY
SPRtemp       <- temp[, setdiff(colnames(SPRet), c("AAPL", "MSFT", "AMZN", "WFC", "JPM", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]

boost.SPY     <- gbm(SPY~., data= SPRtemp, distribution = "gaussian", n.trees = ntrees, interaction.depth = 4, shrinkage = .001)
yhat.SPY      <- predict(boost.SPY, newdata = testing, n.trees = ntrees)
SPRtempFull   <- SPRet[2110:(dim(SPRet)[1]), setdiff(colnames(SPRet), c("AAPL", "MSFT", "AMZN", "WFC", "JPM", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]
yhat.SPY      <- predict(boost.SPY, newdata = SPRtempFull, n.trees = ntrees)
indicator.SPY <- apply(as.matrix(Lag(yhat.SPY)[2:dim(Lag(yhat.SPY))]), 1, function(x) if(x<0) (-1) else 1)


# AAPL
SPRtemp       <- temp[, setdiff(colnames(SPRet), c("SPY", "MSFT", "AMZN", "WFC", "JPM", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]

boost.AAPL     <- gbm(AAPL~., data= SPRtemp, distribution = "gaussian", n.trees = ntrees, interaction.depth = depth, shrinkage = .001)
yhat.AAPL      <- predict(boost.AAPL, newdata = testing, n.trees = ntrees)
SPRtempFull    <- SPRet[2110:(dim(SPRet)[1]), setdiff(colnames(SPRet), c("SPY", "MSFT", "AMZN", "WFC", "JPM", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]
yhat.AAPL      <- predict(boost.AAPL, newdata = SPRtempFull, n.trees = ntrees)
indicator.AAPL <- apply(as.matrix(Lag(yhat.AAPL)[2:dim(Lag(yhat.SPY))]), 1, function(x) if(x<0) (-1) else 1)

# MSFT
SPRtemp        <- temp[, setdiff(colnames(SPRet), c("SPY", "AAPL", "AMZN", "WFC", "JPM", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]

boost.MSFT     <- gbm(MSFT~., data= SPRtemp, distribution = "gaussian", n.trees = ntrees, interaction.depth = depth, shrinkage = .001)
yhat.MSFT      <- predict(boost.MSFT, newdata = testing, n.trees = ntrees)
SPRtempFull    <- SPRet[2110:(dim(SPRet)[1]), setdiff(colnames(SPRet), c("SPY", "AAPL", "AMZN", "WFC", "JPM", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]
yhat.MSFT      <- predict(boost.MSFT, newdata = SPRtempFull, n.trees = ntrees)
indicator.MSFT <- apply(as.matrix(Lag(yhat.MSFT)[2:dim(Lag(yhat.SPY))]), 1, function(x) if(x<0) (-1) else 1)

# AMZN
SPRtemp        <- temp[, setdiff(colnames(SPRet), c("SPY", "AAPL", "MSFT", "WFC", "JPM", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]

boost.AMZN     <- gbm(AMZN~., data= SPRtemp, distribution = "gaussian", n.trees = ntrees, interaction.depth = depth, shrinkage = .001)
yhat.AMZN      <- predict(boost.AMZN, newdata = testing, n.trees = ntrees)
SPRtempFull    <- SPRet[2110:(dim(SPRet)[1]), setdiff(colnames(SPRet), c("SPY", "AAPL", "MSFT", "WFC", "JPM", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]
yhat.AMZN      <- predict(boost.AMZN, newdata = SPRtempFull, n.trees = ntrees)
indicator.AMZN <- apply(as.matrix(Lag(yhat.AMZN)[2:dim(Lag(yhat.SPY))]), 1, function(x) if(x<0) (-1) else 1)

# WFC
SPRtemp        <- temp[, setdiff(colnames(SPRet), c("SPY", "AAPL", "MSFT", "AMZN", "JPM", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]

boost.WFC     <- gbm(WFC~., data= SPRtemp, distribution = "gaussian", n.trees = ntrees, interaction.depth = depth, shrinkage = .001)
yhat.WFC      <- predict(boost.WFC, newdata = testing, n.trees = ntrees)
SPRtempFull    <- SPRet[2110:(dim(SPRet)[1]), setdiff(colnames(SPRet), c("SPY", "AAPL", "MSFT", "AMZN", "JPM", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]
yhat.WFC      <- predict(boost.WFC, newdata = SPRtempFull, n.trees = ntrees)
indicator.WFC <- apply(as.matrix(Lag(yhat.WFC)[2:dim(Lag(yhat.SPY))]), 1, function(x) if(x<0) (-1) else 1)

# JPM
SPRtemp        <- temp[, setdiff(colnames(SPRet), c("SPY", "AAPL", "MSFT", "AMZN", "WFC", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]

boost.JPM     <- gbm(JPM~., data= SPRtemp, distribution = "gaussian", n.trees = ntrees, interaction.depth = depth, shrinkage = .001)
yhat.JPM      <- predict(boost.JPM, newdata = testing, n.trees = ntrees)
SPRtempFull    <- SPRet[2110:(dim(SPRet)[1]), setdiff(colnames(SPRet), c("SPY", "AAPL", "MSFT", "AMZN", "WFC", "BAC", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]
yhat.JPM      <- predict(boost.JPM, newdata = SPRtempFull, n.trees = ntrees)
indicator.JPM <- apply(as.matrix(Lag(yhat.JPM)[2:dim(Lag(yhat.SPY))]), 1, function(x) if(x<0) (-1) else 1)

# BAC
SPRtemp        <- temp[, setdiff(colnames(SPRet), c("SPY", "AAPL", "MSFT", "AMZN", "WFC", "JPM", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]

boost.BAC     <- gbm(BAC~., data= SPRtemp, distribution = "gaussian", n.trees = ntrees, interaction.depth = depth, shrinkage = .001)
yhat.BAC      <- predict(boost.BAC, newdata = testing, n.trees = ntrees)
SPRtempFull    <- SPRet[2110:(dim(SPRet)[1]), setdiff(colnames(SPRet), c("SPY", "AAPL", "MSFT", "AMZN", "WFC", "JPM", "JNJ", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]
yhat.BAC      <- predict(boost.BAC, newdata = SPRtempFull, n.trees = ntrees)
indicator.BAC <- apply(as.matrix(Lag(yhat.BAC)[2:dim(Lag(yhat.SPY))]), 1, function(x) if(x<0) (-1) else 1)

# JNJ
SPRtemp        <- temp[, setdiff(colnames(SPRet), c("SPY", "AAPL", "MSFT", "AMZN", "WFC", "JPM", "BAC", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]

boost.JNJ     <- gbm(JNJ~., data= SPRtemp, distribution = "gaussian", n.trees = ntrees, interaction.depth = depth, shrinkage = .001)
yhat.JNJ      <- predict(boost.JNJ, newdata = testing, n.trees = ntrees)
SPRtempFull    <- SPRet[2110:(dim(SPRet)[1]), setdiff(colnames(SPRet), c("SPY", "AAPL", "MSFT", "AMZN", "WFC", "JPM", "BAC", "GOOG", "XOM", "XLF", "XLK", "XLE", "XLY"))]
yhat.JNJ      <- predict(boost.JNJ, newdata = SPRtempFull, n.trees = ntrees)
indicator.JNJ <- apply(as.matrix(Lag(yhat.JNJ)[2:dim(Lag(yhat.SPY))]), 1, function(x) if(x<0) (-1) else 1)

# GOOG
SPRtemp        <- temp[, setdiff(colnames(SPRet), c("SPY", "AAPL", "MSFT", "AMZN", "WFC", "JPM", "BAC", "JNJ", "XOM", "XLF", "XLK", "XLE", "XLY"))]

boost.GOOG     <- gbm(GOOG~., data= SPRtemp, distribution = "gaussian", n.trees = ntrees, interaction.depth = depth, shrinkage = .001)
yhat.GOOG      <- predict(boost.GOOG, newdata = testing, n.trees = ntrees)
SPRtempFull    <- SPRet[2110:(dim(SPRet)[1]), setdiff(colnames(SPRet), c("SPY", "AAPL", "MSFT", "AMZN", "WFC", "JPM", "BAC", "JNJ", "XOM", "XLF", "XLK", "XLE", "XLY"))]
yhat.GOOG      <- predict(boost.GOOG, newdata = SPRtempFull, n.trees = ntrees)
indicator.GOOG <- apply(as.matrix(Lag(yhat.GOOG)[2:dim(Lag(yhat.SPY))]), 1, function(x) if(x<0) (-1) else 1)

SPRtempFull    <- SPRet[2110:(dim(SPRet)[1]),]

Now, the next chunk takes all the predictions and, for each day, records the stock with the highest predicted return in absolute value, along with whether that prediction was positive or negative so we know whether to buy or sell.

# Picking Stock with highest absolute value of predicted return. 
all            <- as.data.frame(cbind(yhat.SPY, yhat.AAPL, yhat.AMZN, yhat.MSFT, yhat.WFC, yhat.JPM, yhat.BAC, yhat.JNJ, yhat.GOOG))
colnames(all)  <- c("SPY", "AAPL", "AMZN", "MSFT", "WFC", "JPM", "BAC", "JNJ", "GOOG")

Temp           <- apply(abs(all), 1, which.max)
return.picker  <- data.frame()
for (i in 1:length(Temp)){
  return.picker[i, 1] <- all[i, Temp[i]]
}

# Long or Short
indicator.picker <- apply(return.picker, 1, function(x) if(x<0) (-1) else 1)

# Actual Returns of Picked Stocks
return.Picker1  <- data.frame(stringsAsFactors = FALSE)
return.Picker2  <- data.frame(stringsAsFactors = FALSE)
return.Picker3  <- data.frame(stringsAsFactors = FALSE)

for (i in 1:length(Temp)){
  # Index SPRtempFull by column name: the column order of 'all' differs from SPRet,
  # so positional indexing would grab the wrong stock's return.
  return.Picker1[i, 1] <- as.numeric(SPRtempFull[i, colnames(all)[Temp[i]]])
  return.Picker2[i, 1] <- colnames(all)[Temp[i]]
  return.Picker3[i, 1] <- indicator.picker[i]
}
return.Picker1    <- xts(return.Picker1 , order.by = index(SPRtempFull))
return.Picker2    <- xts(return.Picker2 , order.by = index(SPRtempFull))
return.Picker3    <- xts(return.Picker3 , order.by = index(SPRtempFull))

OK, now the next chunk of code takes $10,000 and buys or sells every day in the test sample, which begins around September 2014.

invest         <- 10000

# Compound $10,000 in each strategy. For the picker, apply the long/short indicator
# to the picked stock's actual return (a short position earns the negative of the return).
DATA            <- cbind(invest*cumprod(1+SPRtempFull$SPY),
                         invest*cumprod(1+return.Picker1*return.Picker3),
                         invest*cumprod(1+SPRtempFull$AMZN),
                         invest*cumprod(1+SPRtempFull$AAPL))
colnames(DATA)  <- c("SPY", "Picker", "AMZN", "AAPL")

DATA            <- tidy(DATA)

How did the boosting stock picker do? I plot the cumulative returns of just buying and holding the SPY, Apple, and Amazon. Our boosted stock picker returns are in blue.

The boosting stock picker had us long Apple most of the time. Then, in the test period, Amazon was the monster. But since we didn't re-estimate the model through the forecasting period, the picker never pivoted. I half expected that the boosted forecasts might recognize the large returns in Amazon and switch, even though the model was dated.

ggplot(DATA, aes(x = index, y = value, color = series)) + 
  geom_line() + 
  theme_bw() +
  labs(title = "Buy & Hold SPY, AAPL, AMZN vs 'Boosting' Picker, $10K Invested", x = "")

This chart is just a count of the number of days in each stock.

ggplot(return.Picker2, aes(x = V1)) + geom_bar() + theme_bw()

And this chart is a count of the number of days we were long (1) versus short (-1).

ggplot(return.Picker3, aes(x = V1)) + geom_bar() + theme_bw()

The interesting thing is that this method of stock picking with a boosting algorithm on lagged returns basically gives you the same result as any trend-following strategy, and those are generally much simpler. A strategy like that works along the lines of 'buy if the price is above the 50-day moving average'. I guess it does independently suggest that trend following beats contrarian strategies (which, for example, might sell if the price is above the 50-day moving average); otherwise the boosted forecasts would have had us short a lot more.

I’m curious about what the result would be in commodities. Stocks generally trend up, unless we are in a major correction. Commodities do not have that same property. So it is not clear to me that the boosting stock picker would look like a trend following strategy.

Perhaps another day…

That’s It!

That’s it for this time! I was a little surprised at how this method came out to be so similar to other trend following strategies. If I had suspected this from the beginning I would have set up a proper horse-race between the boosting stock picker and some other common trading strategies. I may have time to do a post on support vector machines. But, it is getting to be crunch time for the SCC76 conference!

 

 
