The Problem With Depmix For Online Regime Prediction

Posted by 一切的一切 on 2016-10-6 07:00:09
(This article was first published on R – QuantStrat TradeR, and kindly contributed to R-bloggers)
  This post will be about attempting to use the depmixS4 package for online state prediction. While the package performs admirably when it comes to describing the states of the past, when used for one-step-ahead prediction, under the assumption that tomorrow’s state will be identical to today’s, the hidden Markov model found within the package does not perform to expectations.
  So, to start off, this post was motivated by Michael Halls-Moore, who recently posted some R code about using the depmixS4 library to fit hidden Markov models. Generally, I am loath to create posts on topics I don’t feel I have an absolutely front-to-back understanding of, but I’m doing this in the hope of learning from others how to appropriately do online state-space prediction, or “regime switching” detection, as it may be called in more financial parlance.
  Here’s Dr. Halls-Moore’s post.
  While I’ve seen the usual theory of hidden Markov models (that is, it can rain or it can be sunny, but you can only infer the weather from the clothes you see people wearing outside your window when you wake up), and have worked with toy examples in MOOCs (Udacity’s self-driving car course deals with them, if I recall correctly, or maybe it was the AI course), at the end of the day, theory is only as good as how well an implementation works on real data.
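  To make the weather analogy concrete, here is a minimal sketch of the forward (filtering) recursion for a two-state hidden Markov model in base R. It is not taken from the original post or from depmixS4; the state names, the observation labels, and every probability are invented for illustration, and `forward_filter` is a hypothetical helper.

```r
# Toy hidden Markov model: hidden weather, observed clothing.
# All probabilities below are made up for illustration.
states <- c("rainy", "sunny")
obs_levels <- c("coat", "tshirt")

init  <- c(rainy = 0.5, sunny = 0.5)
trans <- matrix(c(0.7, 0.3,
                  0.2, 0.8),
                nrow = 2, byrow = TRUE, dimnames = list(states, states))
emit  <- matrix(c(0.9, 0.1,
                  0.2, 0.8),
                nrow = 2, byrow = TRUE, dimnames = list(states, obs_levels))

# Forward filter: P(state_t | observations up to t), normalised at each step.
forward_filter <- function(obs, init, trans, emit) {
  n <- length(obs)
  out <- matrix(NA_real_, nrow = n, ncol = length(init),
                dimnames = list(NULL, names(init)))
  alpha <- init * emit[, obs[1]]
  out[1, ] <- alpha / sum(alpha)
  for (t in seq_len(n)[-1]) {
    # Predict one step with the transition matrix, then weight by the emission.
    alpha <- as.vector(out[t - 1, ] %*% trans) * emit[, obs[t]]
    out[t, ] <- alpha / sum(alpha)
  }
  out
}

observed <- c("coat", "coat", "tshirt", "tshirt", "coat")
filtered <- forward_filter(observed, init, trans, emit)
print(round(filtered, 3))
```

  Each row of `filtered` uses only the observations up to that day, which is the kind of information an online predictor actually has.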
  For this experiment, I decided to take SPY data since inception, and do a full in-sample “backtest” on the data. That is, given that the HMM algorithm from depmix sees the whole history of returns, with this “god’s eye” view of the data, does the algorithm correctly classify the regimes, if the backtest results are any indication?
  Here’s the code to do so, inspired by Dr. Halls-Moore’s.
    require(depmixS4)
    require(quantmod)
    require(PerformanceAnalytics)  # Return.calculate and the performance tables

    # Download SPY and compute daily returns on the adjusted close
    getSymbols('SPY', from = '1990-01-01', src = 'yahoo', adjust = TRUE)
    spyRets <- na.omit(Return.calculate(Ad(SPY)))

    set.seed(123)

    # Fit a three-state Gaussian HMM on the full history
    hmm <- depmix(SPY.Adjusted ~ 1, family = gaussian(), nstates = 3, data = spyRets)
    hmmfit <- fit(hmm, verbose = FALSE)
    post_probs <- posterior(hmmfit)
    post_probs <- xts(post_probs, order.by = index(spyRets))
    plot(post_probs$state)

    # Label each state by the sign of its mean return
    summaryMat <- data.frame(summary(hmmfit))
    colnames(summaryMat) <- c("Intercept", "SD")
    bullState <- which(summaryMat$Intercept > 0)
    bearState <- which(summaryMat$Intercept < 0)

    # Long in bull states, short in bear states, using the previous day's state.
    # %in% is used because more than one state can have the same sign.
    state <- as.numeric(post_probs$state)
    position <- xts((state %in% bullState) - (state %in% bearState), order.by = index(spyRets))
    hmmRets <- spyRets * lag(position)
    charts.PerformanceSummary(hmmRets)
    table.AnnualizedReturns(hmmRets)
While I did select three states, I treated any state with an intercept above zero as a bull state and any state with an intercept below zero as a bear state, so the model effectively reduces to two states.
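  As a small illustration of that reduction, the sketch below maps three states onto a +1/-1 signal in base R. The intercepts and the decoded state sequence are hypothetical values, not output from the fitted depmixS4 model.

```r
# Hypothetical per-state mean returns (intercepts); not output from depmixS4.
intercepts <- c(0.0009, -0.0012, 0.0004)

bull_states <- which(intercepts > 0)  # states 1 and 3
bear_states <- which(intercepts < 0)  # state 2

# A made-up sequence of decoded states, one per day.
decoded <- c(1, 1, 3, 2, 2, 3, 1)

# %in% handles several bull states at once; == would recycle the shorter vector.
signal <- ifelse(decoded %in% bull_states, 1,
                 ifelse(decoded %in% bear_states, -1, 0))
print(signal)
```

  Applied to the fitted model, the same kind of mapping drives the backtest above.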
  With the result:
   


    > table.AnnualizedReturns(hmmRets)
                              SPY.Adjusted
    Annualized Return               0.1355
    Annualized Std Dev              0.1434
    Annualized Sharpe (Rf=0%)       0.9448
So, not particularly terrible. The algorithm works, kind of, sort of, right?
  Well, let’s try online prediction now.
    require(doMC)  # parallel backend for foreach

    # Fit the HMM on the first nPoints observations and return the state of
    # the last day in that window as 1 (bull), -1 (bear), or 0 (neither).
    dailyHMM <- function(data, nPoints) {
      subRets <- data[1:nPoints,]
      hmm <- depmix(SPY.Adjusted ~ 1, family = gaussian(), nstates = 3, data = subRets)
      hmmfit <- fit(hmm, verbose = FALSE)
      post_probs <- posterior(hmmfit)
      summaryMat <- data.frame(summary(hmmfit))
      colnames(summaryMat) <- c("Intercept", "SD")
      bullState <- which(summaryMat$Intercept > 0)
      bearState <- which(summaryMat$Intercept < 0)
      if(last(post_probs$state) %in% bullState) {
        state <- xts(1, order.by=last(index(subRets)))
      } else if (last(post_probs$state) %in% bearState) {
        state <- xts(-1, order.by=last(index(subRets)))
      } else {
        state <- xts(0, order.by=last(index(subRets)))
      }
      colnames(state) <- "State"
      return(state)
    }

    # took 3 hours in parallel
    t1 <- Sys.time()
    set.seed(123)
    registerDoMC((detectCores() - 1))
    # Expanding window: refit on the first i days for every i from 500 onward
    states <- foreach(i = 500:nrow(spyRets), .combine=rbind) %dopar% {
      dailyHMM(data = spyRets, nPoints = i)
    }
    t2 <- Sys.time()
    print(t2-t1)
So what I did here was take an expanding window, starting from 500 days since SPY’s inception, and keep increasing it one day at a time. My prediction was, trivially enough, the state of the most recent day, using a 1 for a bull state and a -1 for a bear state. I ran this process in parallel (on a Linux cluster, because on Windows the doParallel library seems not to know that certain packages are loaded, and it’s messier), and the first big issue is that this process took about three hours on seven cores for about 23 years of data. Not exactly encouraging, but computing time isn’t expensive these days.
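  The expanding-window idea can be sketched cheaply in base R without the three-hour fit. In the sketch below, `classify_last` is a hypothetical stand-in for the HMM (it just takes the sign of the mean return in the window), and the returns are simulated; only the loop structure mirrors the procedure described above.

```r
# Expanding-window loop with a cheap stand-in classifier.
# classify_last() is a hypothetical placeholder for the HMM fit: it labels
# the latest day by the sign of the mean return over the window.
classify_last <- function(window_rets) {
  if (mean(window_rets) > 0) 1 else -1
}

set.seed(42)
rets <- rnorm(60, mean = 0.0005, sd = 0.01)  # simulated daily returns
min_window <- 20

# For each end point i, classify using rets[1:i] only; nothing after i is visible.
online_states <- vapply(min_window:length(rets), function(i) {
  classify_last(rets[seq_len(i)])
}, numeric(1))

length(online_states)  # one label per end point: 60 - 20 + 1 = 41
```

  Because every label is produced from data up to and including its own day, each label is only usable for trading from the following day onward.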
  So let’s see if this process actually works.
  First, let’s test whether the algorithm does what it’s supposed to do, allowing one day of look-ahead bias (that is, the algorithm tells us the state at the end of the day; how correct is it even for that day?).
    # Same-day application of the state: deliberately includes look-ahead bias
    onlineRets <- spyRets * states
    charts.PerformanceSummary(onlineRets)
    table.AnnualizedReturns(onlineRets)
With the result:
   


    > table.AnnualizedReturns(onlineRets)
                              SPY.Adjusted
    Annualized Return               0.2216
    Annualized Std Dev              0.1934
    Annualized Sharpe (Rf=0%)       1.1456
So, allegedly, the algorithm seems to do what it was designed to do, which is to classify a state for a given data set. Now, the most pertinent question: how well do these predictions do even one day ahead? You’d think that state space predictions would be stable from day to day, given the long history, correct?
    # Trade on the previous day's state: no look-ahead bias
    onlineRets <- spyRets * lag(states)
    charts.PerformanceSummary(onlineRets)
    table.AnnualizedReturns(onlineRets)
With the result:
   


    > table.AnnualizedReturns(onlineRets)
                              SPY.Adjusted
    Annualized Return               0.0172
    Annualized Std Dev              0.1939
    Annualized Sharpe (Rf=0%)       0.0888
That is, without the lookahead bias, the state space prediction algorithm is atrocious. Why is that?
  Well, here’s the plot of the states:
   

  [Figure: plot of the online state predictions over time]

  In short, the online HMM procedure built on the depmix package seems to change its mind very easily, with obvious (negative) implications for actual trading strategies.
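  One simple way to put a number on how often a label sequence changes its mind is to count day-to-day switches. The sketch below is a base R illustration on two made-up sequences, not on the actual `states` object from the run above; `count_switches` is a hypothetical helper.

```r
# Count how often a sequence of +1/-1 labels changes from one day to the next.
count_switches <- function(states) {
  sum(diff(states) != 0)
}

# Two made-up label sequences of equal length.
persistent <- c(rep(1, 10), rep(-1, 10))
jittery    <- rep(c(1, -1), times = 10)

count_switches(persistent)  # 1
count_switches(jittery)     # 19
```

  A regime signal that behaves like `jittery` flips position almost every day, which is costly to trade and hard to distinguish from noise.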
  So, that wraps it up for this post. Essentially, the main message here is this: there’s a vast difference between doing descriptive analysis (AKA “where have you been, why did things happen”) and predictive analysis (that is, “if I correctly predict the future, I get a positive payoff”). In my opinion, while descriptive statistics have their purpose in terms of explaining why a strategy may have performed how it did, ultimately, we’re always looking for better prediction tools. In this case, depmix, at least in this “out-of-the-box” demonstration, does not seem to be the tool for that.
  If anyone has had success with using depmix (or another regime-switching algorithm in R) for prediction, I would love to see work that details the procedure taken, as it’s an area I’m looking to expand my toolbox into, but I don’t have any particularly good leads. Essentially, I’d like to think of this post as me describing my own experiences with the package.
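  The gap between description and prediction can be shown with a toy look-ahead example. The sketch below uses simulated returns and a deliberately extreme, hypothetical signal (the sign of the same day’s return), so it says nothing about depmix itself. It works on plain vectors, so the one-day shift is done with `head()` rather than the `lag()` method for xts objects.

```r
# Toy demonstration of look-ahead bias with simulated returns.
set.seed(1)
rets <- rnorm(250, mean = 0, sd = 0.01)

# A "signal" that is only known once the day is over: the sign of that
# day's own return. This is an extreme, hypothetical case.
signal <- sign(rets)

# Same-day application uses information that was not available in time.
same_day <- rets * signal

# Shifting the signal by one day trades on yesterday's label instead.
lagged_signal <- c(NA, head(signal, -1))
next_day <- rets * lagged_signal

c(same_day = mean(same_day), next_day = mean(next_day, na.rm = TRUE))
```

  The same-day series is never negative by construction, while the shifted series has no such guarantee; only the shifted series corresponds to something that could have been traded.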
  Thanks for reading.
  NOTE: On Oct. 5th, I will be in New York City. On Oct. 6th, I will be presenting at The Trading Show on the Programming Wars panel.
  NOTE: My current analytics contract is up for review at the end of the year, so I am officially looking for other offers as well. If you have a full-time role which may benefit from the skills you see on my blog, please get in touch with me. My LinkedIn profile can be found here.
   
