Research

Energy Stat Arb

Back to my roots. I haven't tested outright entry/exit trading systems for a while, not since the Mechanica and Tblox days, but I aim to post more about them in the future.

I've been reading about market neutral strategies lately to expand my knowledge. Long only strategies are great, but constant outright directional exposure can leave your portfolio unprotected to the downside when all assets move in the same direction. A good reminder is May of last year, when gold took a nose dive.

Below are some tests I conducted on trading related energy pairs. Note that I haven't done any elaborate testing for whether the spread is mean reverting, etc.; I just went with my instincts. No transaction costs. Spread construction is based on the stochastic differential, with a 10 day lookback, normalized z-score entry at +/-2 standard deviations with exit at 0, and delay 1 bar execution.
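For concreteness, here is a minimal sketch of the z-score band logic, assuming `x` and `y` are plain numeric price vectors; it is not the stochastic-differential spread construction used in the tests above.

library(TTR)   # runMean, runSD
library(zoo)   # na.locf

lookback = 10
spread = log(x) - log(y)                                    # placeholder spread construction
z = (spread - runMean(spread, lookback)) / runSD(spread, lookback)

pos = rep(NA_real_, length(z))
pos[which(z >  2)] = -1                                     # short the spread when it is rich
pos[which(z < -2)] =  1                                     # long the spread when it is cheap
pos[which(c(FALSE, diff(sign(z)) != 0))] = 0                # exit when the z-score crosses zero
pos = na.locf(pos, na.rm = FALSE)                           # hold positions between signals
pos = c(NA, head(pos, -1))                                  # delay one bar for execution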

Crude Oil and Natural Gas Futures (Daily) (daily data doesn't seem to work that well anymore):

CL-NG

OIL and UNG ETF (1 Min Bar)

OIL-UNG

XLE and OIL ETF (1 Min Bar)

XLE-OIL

Pair trading is the simplest form of statistical arbitrage, but things get interesting when you start dealing with a basket of assets. For example, XLE tracks both crude oil and natural gas companies, so a potential three-legged trade would be to trade XLE against both OIL and UNG. Another well-known trade is to derive value for SPY against TLT (rates), HYG (corporate spreads), and VXX (volatility).

The intuition behind relative value strategies is to derive a fair value of an asset "relative" to another. In basic pair trading, we use one leg to derive the value of the other, or vice versa. Any deviations are considered opportunities for arbitrage. In the multi-legged portfolio case, a set of assets is combined in some way (optimization, factor analysis, PCA) to measure value. See (Avellaneda) for details.
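As a hedged sketch of the multi-legged idea, one simple way to derive a relative fair value for XLE from OIL and UNG is a linear regression; the variable names are hypothetical and this is not the model tested above.

fit = lm(xle ~ oil + ung)              # derive an XLE "fair value" from OIL and UNG prices
mispricing = scale(residuals(fit))     # standardized deviation from fair value
# a deviation beyond some band (e.g. +/- 2) would be the arbitrage signal,
# trading XLE against the fitted hedge amounts in OIL and UNG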

While the equity lines above look nice, please remember that they don't account for transaction costs and are modelled purely on adjusted last trade prices. A more realistic simulation would test the sensitivity of entries and order fills given level 1 bid-ask spreads. For that, a more structured backtesting framework should be employed.

(Special thanks to QF for tremendous insight)

Thanks for reading,

Mike


Random Subspace Optimization: Max Sharpe

I was reading David's post on the idea of Random Subspace Optimization and thought I'd provide some code to contribute to the discussion. I've always loved ensemble methods, since combining multiple streams of estimates produces more robust estimates.

In this post, I will show how an RSO overlay performs within a max Sharpe framework. To make things more comparable, I will employ the same assets as David for the backtest. One additional universe I would like to incorporate is the current day S&P 100 (survivorship bias).

The random subspace method is a generalization of the idea behind the random forest algorithm: instead of growing random decision trees, the method can employ any desired estimator. Applied to portfolio management, given N different asset classes and return streams, we randomly select `k` assets `s` times. For each of the `s` random asset combinations, we run a user defined sizing algorithm. The last step is to combine the `s` sets of weights through averaging to get the final weights. In R, the problem can be easily formulated via `lapply` or `for` loops as the base iterative procedure. For random integers, the function `sample` is employed. Note my RSO function uses functions inside the Systematic Investor Toolbox.


rso.optimization <- function(ia, k, s, list.param){
  size.fn = match.fun(list.param$weight.function)
  if(k > ia$n) stop("K is greater than number of assets.")

  # draw s random subsets of k assets each
  space = 1:ia$n
  index.samples = t(replicate(s, sample(space, size = k)))

  # one row of weights per subset; assets not sampled stay NA
  weight.holder = matrix(NA, nrow = s, ncol = ia$n)
  colnames(weight.holder) = ia$symbol.names

  hist = coredata(ia$hist.returns)

  # long-only constraints for the k-asset sub-problem: 0 <= x.i <= 1
  constraints = new.constraints(k, lb = 0, ub = 1)
  constraints = add.constraints(diag(k), type='>=', b=0, constraints)
  constraints = add.constraints(diag(k), type='<=', b=1, constraints)

  # SUM x.i = 1
  constraints = add.constraints(rep(1, k), 1, type = '=', constraints)

  # size each random subset and store its weights
  for(i in 1:s){
    ia.temp = create.historical.ia(hist[, index.samples[i, ]], 252)
    weight.holder[i, index.samples[i, ]] = size.fn(ia.temp, constraints)
  }

  # average across subsets to get the final portfolio weights
  final.weight = colMeans(weight.holder, na.rm = T)

  return(final.weight)
}

The above function takes in an `ia` object, short for input assumptions, which holds all the necessary statistics for most sizing algorithms. Also, I've opted to focus on long only.
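For reference, a minimal usage sketch might look like the following, with SIT's min.var.portfolio standing in as the sizing function (the backtests below plug a max Sharpe allocator into the same slot); this is an illustration, not the exact backtest harness.

# assumes `ia` was built with create.historical.ia on the full asset universe
weights = rso.optimization(ia, k = 4, s = 100,
                           list.param = list(weight.function = 'min.var.portfolio'))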

The following are the results for the 8 asset classes. All backtests hereafter keep `s` equal to 100 while varying `k` from 2 to N-1, where N equals the total number of assets. The base comparisons are the simple max Sharpe and equal weight portfolios.

8asset_s100

The following is for 9 sector asset classes.

9sector-asset_s100

Last but not least is the performance for current day S&P 100 stocks.

sp100-s100

The RSO method seems to improve every universe that I've thrown at it. For a pure stock universe, it is able to reduce volatility by 300-700 basis points depending on the selection of `k`. In a series of tests across different universes, I found that the biggest improvements from RSO come from applying it to a universe of instruments that belong to the same asset class. Also, I found that for a highly similar universe (stocks), a lower `k` is better than a higher `k`. One explanation: since the max Sharpe portfolio of X identical assets is equal to an equal weight portfolio, we can postulate that when the asset universe is highly similar or approaching equivalence, resampling with a lower `k` Y times, where Y approaches infinity, approaches the limit of an equally weighted portfolio. This is in line with the idea behind the curse of dimensionality: for better estimates, the data required grows exponentially as the number of assets increases. In this case, with limited data, a simple equal weight portfolio does better, which is consistent with the better performance of lower `k`.

For a well specified universe of assets, RSO with a higher `k` yields better results than a lower `k`. This is most likely because simple random sampling of such a universe with a small `k` will yield samples that form highly mis-specified sub-universes. The problem is magnified when diversifying assets like bonds are significantly outnumbered by assets like equities, as the probability of sampling an asset with diversification benefits is far lower than the probability of sampling one without. In other words, with a lower `k`, one will most likely end up with portfolios that hold a lot of risky assets relative to lower risk assets.

A possible future direction would be to find a way around having to specify `k` and `s` in RSO: for example, randomly selecting `k`, or selecting a `k` that targets a certain risk/return, or one that maximizes a user defined performance metric.

Thanks for reading,

Mike

Engineering Risks and Returns

In this post, I want to present a framework for formulating portfolios with a targeted risk or return. The basic idea was inspired by looking at risk control from a different point of view. The traditional way of controlling portfolio risk is to apply a given set of weights to historical data to calculate historical risk; if estimated portfolio risk exceeds a threshold, we peel off allocation percentages for each asset. In this framework, I focus instead on constructing portfolios that target a given risk or return on an efficient risk return frontier.

First let's get some data so we can visualize a traditional portfolio optimization's risk return characteristics. I will be using an 8 asset ETF universe.

rm(list=ls())
setInternet2(TRUE)   # Windows-only helper in older R versions

# load the Systematic Investor Toolbox
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
source(con)
close(con)

# fetch and adjust the 8 ETF universe
load.packages('quantmod')
tickers = spl('EEM,EFA,GLD,IWM,IYR,QQQ,SPY,TLT')
data <- new.env()
getSymbols(tickers, src = 'yahoo', from = '1980-01-01', env = data, auto.assign = T)
for(i in ls(data)) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)
bt.prep(data, align='keep.all', dates='2000:12::')

Here are the return streams we are working with:

Rplot

The optimization algorithms I will employ are the following:

  • Minimum Variance Portfolio
  • Risk Parity Portfolio
  • Equal Risk Contribution Portfolio
  • Maximum Diversification Portfolio
  • Max Sharpe Portfolio

To construct the risk return plane, I put together the necessary input assumptions (correlation, return, covariance, etc.). This can be done with the create.historical.ia function in the SIT toolbox.

#input Assumptions
prices = data$prices
n=ncol(prices)
ret = prices/mlag(prices)-1
ia = create.historical.ia(ret,252)
# 0 <= x.i <= 1
constraints = new.constraints(n, lb = 0, ub = 1)
constraints = add.constraints(diag(n), type='>=', b=0, constraints)
constraints = add.constraints(diag(n), type='<=', b=1, constraints)

# SUM x.i = 1
constraints = add.constraints(rep(1, n), 1, type = '=', constraints)

With the above we can feed both `ia` and `constraints` into the optimization algorithms listed above to get weights. With the weights, we can derive the portfolio risk and the portfolio return, which can then be plotted visually on a risk return plane.

# create efficient frontier
ef = portopt(ia, constraints, 50, 'Efficient Frontier')
plot.ef(ia, list(ef), transition.map=F)

Rplot01

The risk return plane in the above image shows the entire space in which a portfolio's risk and return characteristics can reside. Anything to the left of the frontier does not exist (unless leverage is used, in which case the frontier itself shifts leftward as well). Since I am more of a visual guy, I tend to construct this risk return plane whenever I am working on new allocation algorithms. This allows me to compare the expected risk and return against other portfolios.

As you can see, each portfolio algorithm has its own set of characteristics. Note that these characteristics fluctuate across the frontier were we to roll this analysis through time. A logical extension of these risk return concepts is to construct a portfolio that aims to target either a given risk or a given return on the frontier. To formulate this problem in SIT for the return component, simply modify the constraints as follows:

constraints = add.constraints(ia$expected.return,type='>=', b=target.return, constraints)

Note that the target.return variable simply stores the desired target return. After adding the constraint, run a minimum variance portfolio and you will get a target return portfolio (a minimal sketch follows below). Targeting risk, on the other hand, is a bit more complicated: if you look at the efficient frontier, you will find that for a given level of risk there are two portfolios that lie on it (the sub-optimal portion of the frontier is hidden). I solved for the weights using a multi-optimization framework which employs both linear and quadratic (dual) optimization, shown after the sketch.
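For the return-targeting case, a minimal sketch (assuming `target.return` holds the desired value) would be:

# add the return floor, then minimize variance subject to it
constraints.tr = add.constraints(ia$expected.return, type = '>=', b = target.return, constraints)
weight.tr = min.var.portfolio(ia, constraints.tr)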

target.risk.obj <- function(ia, constraints, target.risk){

  # endpoints of the efficient frontier
  max.w = max.return.portfolio(ia, constraints)
  min.w = min.var.portfolio(ia, constraints)
  max.r = sum(max.w * ia$expected.return)
  min.r = sum(min.w * ia$expected.return)
  max.risk = portfolio.risk(max.w, ia)
  min.risk = portfolio.risk(min.w, ia)

  # If target risk exists as an efficient portfolio, solve for it;
  # else return weights of 0
  if(target.risk >= min.risk && target.risk <= max.risk){
    # search over target returns for the portfolio whose risk matches target.risk
    out <- optimize(f = target.return.risk.helper,
                    interval = c(0, max.r),
                    target.risk = target.risk,
                    ia = ia,
                    constraints = constraints)$minimum
    weight = target.return.portfolio(out)(ia, constraints)
  } else {
    weight = rep(0, ia$n)
  }

  return(weight)
}
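The helper passed to optimize above isn't shown in the post; a minimal sketch of what it might look like (my assumption, not the original code) is a function that, for a candidate target return, builds the corresponding efficient portfolio and measures how far its risk is from the desired target risk:

target.return.risk.helper <- function(target.return, ia, constraints, target.risk){
  # efficient portfolio for the candidate target return
  constraints.tmp = add.constraints(ia$expected.return, type = '>=', b = target.return, constraints)
  weight = min.var.portfolio(ia, constraints.tmp)
  # distance between its realized risk and the desired risk; optimize() minimizes this
  abs(portfolio.risk(weight, ia) - target.risk)
}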

Below is a simple backtest that takes the above assets and optimizes for the target return or target risk component. Each will run with a target of 8%.

Backtest1

Now the model itself requires us to specify a return or risk component. What if instead we make that a dynamic component, extracting either the risk or the return target from an alternative sizing algorithm? Below is a sketch of the idea, followed by the performance when the dynamic risk or return target is extracted from naive risk parity.
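A hedged sketch of the extraction step, assuming naive risk parity means inverse-volatility weights; this is not the exact backtest code.

vol = apply(coredata(ia$hist.returns), 2, sd)          # per-asset volatility
rp.weight = (1 / vol) / sum(1 / vol)                   # naive risk parity weights
dynamic.target.risk = portfolio.risk(rp.weight, ia)    # risk implied by risk parity
weight = target.risk.obj(ia, constraints, dynamic.target.risk)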

Backtest2


Not surprisingly, whenever we target risk, the strategy tends to become more risky. This confirms that risk based allocations are superior if the investor's aim is to achieve low long term volatility.


Thanks for reading,

Mike

Max Decorrelation Portfolio

It's been almost two months since I posted. Finishing the school year off with exams and moving twice forced me to put the blog on hold. I hope to post more in the future!

Today I humbly attempt to formulate in R the maximum decorrelation algorithm for constructing portfolios. This method was formulated by Peter Christoffersen et al. (a fellow Canadian at the Rotman School of Management) and presented by EDHEC in a paper called "Scientific Beta Maximum Decorrelation Indices". For those interested in asset allocation and risk management, EDHEC has a treasure trove of papers and research.

In traditional mean variance optimization, we minimize the portfolio risk given an estimate of the covariance matrix. More specifically, we need to estimate both volatilities and correlations, which are used to construct the covariance matrix. The objective function to minimize is:

$$\min_{w}\; w^{\top} \Sigma\, w$$

where $\Sigma$ is the estimated covariance matrix and $w$ is the vector of portfolio weights.

The problem with portfolio optimization models is that we are making forecasts about future covariance structures. Since these are unlikely to hold in the future, what is optimal today may not be optimal in the next period. This is what most practitioners call "estimation error". Over the years, different ways of overcoming it have been proposed, with methods ranging from covariance shrinkage to re-sampled efficient frontiers being the most widely known. Some have instead scrapped the optimization process entirely and focused on simple heuristic algorithms for estimating portfolio weights.

The Maximum Decorrelation portfolio attempts to reduce the number of inputs and uses solely the correlation matrix as its input assumption. Instead of focusing on volatility, the strategy assumes that individual asset volatilities are identical. The objective function to maximize is therefore:

$$\max_{w}\; 1 - w^{\top} \rho\, w$$

where $\rho$ is the correlation matrix.

The idea is that with fewer quantities to estimate, estimation error should be lower.

In R, the objective function becomes:

max.decorr <- function(weight, correl){
    weight <- weight / sum(weight)                 # normalize weights to sum to one
    obj <- 1 - (t(weight) %*% correl %*% weight)   # decorrelation objective: 1 - w' * rho * w
    return(-obj)                                   # negate so that optim()'s minimization maximizes obj
}

I am using R's optim function. This is my first time formulating an objective function from scratch; while I am 90% sure I am correct, I am but a student and am all ears if there are any mistakes or errors (or a more efficient way of implementing it). Please leave comments below. :)
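For completeness, here is a minimal sketch of how the objective might be passed to optim, using L-BFGS-B with long-only box constraints and a final renormalization; this is my assumption of a reasonable call, not necessarily the exact setup used for the results below.

n = ncol(correl)                                       # `correl` is the asset correlation matrix
opt = optim(par = rep(1/n, n), fn = max.decorr, correl = correl,
            method = "L-BFGS-B", lower = 0, upper = 1)
weight = opt$par / sum(opt$par)                        # final max decorrelation weights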

I took the algorithm for a test drive; below are the results for the standard 10 asset classes.

(Charts: equity curve, side-by-side statistics, weight transition map, monthly return table, allocation pie)

For benchmark purposes, I have used minimum variance and equal weight portfolios. The Max Decor strategy earned higher returns but with higher volatility, hence the lower Sharpe compared to Min Var.

Code can be found here: Dropbox


Thanks for reading,

Mike

Equity Bond Exposure Management

I did a post last October (here) looking at varying the allocation between stocks and bonds, and at the end I hinted at a tactical overlay between the two asset classes. Six months later, I have finally found a decent overlay that I feel may hold value.

In a paper called "Principal Components as a Measure of Systemic Risk" (SSRN), Kritzman et al. present a method for identifying "fragile" market states by constructing the Absorption Ratio:

$$AR = \frac{\sum_{i=1}^{n} \sigma^2_{E_i}}{\sum_{j=1}^{N} \sigma^2_{A_j}}$$

The numerator sums the variances of the first n eigenvectors (principal components), while the denominator sums the variances of the N individual assets. In the paper, n is set to 1/5 of the total number of assets N. The interpretation is simple: the higher the ratio, the more "fragile" the market state. The intuition is that a high ratio implies risk is very concentrated, while a low ratio means risk is dispersed and spread out. Think weak versus strong. Following is the raw AR through time for the DJ 30 components.
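A minimal sketch of the ratio as I read the formula above, assuming `ret` is a matrix of asset returns with one column per asset (the variance of an eigenvector is its eigenvalue, and the eigenvalues of the covariance matrix sum to the total asset variance):

absorption.ratio <- function(ret, n = max(1, floor(ncol(ret) / 5))){
  cov.mat = cov(ret)                                           # asset covariance matrix
  eig.values = eigen(cov.mat, symmetric = TRUE, only.values = TRUE)$values
  sum(eig.values[1:n]) / sum(diag(cov.mat))                    # top-n eigenvector variance / total variance
}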

ar

As you can see, the ratio spikes during the tech bubble and the recent financial crisis. How would it look when used as a filter? Below are two charts comparing the signals generated by the 200 day SMA and the standardized AR.

SMA ARatio

Pretty good timing, in my opinion. In line with the paper, I reconstructed the strategy that switches between stocks (DIA) and bonds (VBMFX). When the standardized AR is between 1 and -1, we split 50/50. When it's above 1, we are in love with bonds, and when it's below -1, we are in love with stocks. Simple. A minimal sketch of the rule is below, followed by the results:
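The rule itself is a one-liner; a hedged sketch, assuming `ar.z` is a plain numeric vector of the standardized absorption ratio:

stock.weight = ifelse(ar.z > 1, 0, ifelse(ar.z < -1, 1, 0.5))   # all bonds when fragile, all stocks when calm, else 50/50
bond.weight  = 1 - stock.weight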

EQPerf
Table

And here is the code (I know it's messy, I didn't have a lot of time! :)

Note: There is survivorship bias. I used the current day DJ30.

Thanks for reading

FAA: A Lookback in Time…

In the spirit of wrapping up the FAA model investigation, I thought I would extend the backtest back to 1926. The data are all monthly total return series from proprietary databases and are the best estimates I have to work with. Looking back so far offers a lot of insight, as one is able to stress test how the strategy performed in very different environments.

I employed 7 different asset classes: commodities, emerging market equities, US equities, US 10 year bonds, US 30 year bonds, short term treasuries, and European equities. For benchmarking purposes, I constructed a simple momentum portfolio that holds the top 3 assets, an equal weight portfolio, and a traditional sixty-forty portfolio. The momentum lookback is 4 months, in line with what Keller and Putten used.

FAA-Long-SS FAA-Long-Performance

One very interesting aspect of this extended backtest is seeing how the strategies performed during the Great Depression. While equal weight and sixty-forty suffered large drawdowns, FAA and relative momentum did comparatively well. Below is a deeper analysis of the Great Depression period. As you can see, momentum strategies in general provided a great buffer against drawdown.

Depression


GD-PERF

The main reason for this is that during the drawdown period, the FAA strategy was loaded with bonds:

GD-Holdings-FAA



When I am researching trading systems, I really like to break the components apart and analyse them as much as possible. It is only by understanding how they fit together that you will be able to judge a system's future viability: when it will work and when it won't. And since TAA strategies have become so pervasive these days, it begs the question whether we are taking appropriate precautions regarding their future performance.

Flexible Asset Allocation

In my last post, I broke down the individual components to look at the performance of each factor. Although by themselves the correlation and volatility factors weren't that attractive, combined together it's a different story.

I've always been a proponent of simplistic approaches in system design, as adding too many nuts and bolts in the name of sophistication only brings overfitting. In my opinion, when you are designing the alpha portion of your portfolio, you should look to design multiple simple strategies that are different in nature (uncorrelated). Take these return streams, overlay a portfolio allocation strategy, and you will find yourself with a decent alpha generator with a return-to-risk ratio greater than 1. Ok, back to FAA...

Keller and Putten in their FAA system combine the signals of each factor through a simple meta ranking function of the form:

$$L_i = w_m \cdot m_i + w_c \cdot c_i + w_v \cdot v_i$$

where m, c and v represent the factor ranks of momentum, correlation and volatility respectively, and each factor is given its own weight. The meta rank is then ranked again and filtered based on absolute momentum to arrive at the assets to invest in. Note that any asset that doesn't pass the absolute momentum filter is invested in cash (VFISX). When coding the meta ranking function, I found that there are times when some assets share the same final meta rank. This caused problems for some rebalance periods, when the number of assets to hold would exceed the top N. I consulted with the authors and they revealed that "with rank ties, we select more than 3 funds." Below is a replication of the strategy; it is tested with daily data as opposed to the monthly data used by the authors.
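A minimal sketch of the meta rank, assuming per-period rank matrices mom.rank, cor.rank and vol.rank (rows are rebalance dates, columns are assets) and factor weights such as c(1, 0.5, 0.5); the names are hypothetical and this is not the posted code.

w = c(1, 0.5, 0.5)                                               # momentum, correlation, volatility weights
meta.score = w[1] * mom.rank + w[2] * cor.rank + w[3] * vol.rank
meta.rank  = t(apply(meta.score, 1, rank, ties.method = "min"))  # ties share a rank, so the top N can exceed N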

FAA-perf faa-perf1

The model results are pretty decent. One aspect I might change is the use of the cash proxy in the volatility ranking factor: including the theoretical risk free rate, which is supposed to have a volatility of zero, skews the results to favor cash.

A reader pointed out a little coding error I made in the last post. Don't sweat, it doesn't change the performance one bit. I've modified the code and placed everything, including the current code, into the FAA dropbox folder. Should you have any questions, please leave a comment below.

Thanks for reading,

Mike