# Random Subspace Optimization: Max Sharpe

I was reading David’s post on the idea of Random Subspace Optimization and thought I’d contribute some code to the discussion. I’ve always loved ensemble methods, since combining multiple streams of estimates tends to produce more robust outcomes.

In this post, I will show how an RSO overlay performs within a maximum Sharpe framework. To make things comparable, I will employ the same assets as David for the backtest. One additional universe I would like to incorporate is the current-day S&P 100 (note: this introduces survivorship bias).

The random subspace method is a generalization of the random forest algorithm: instead of growing random decision trees, it can employ any desired classifier. Applied to portfolio management, given N different assets or return streams, we randomly select `k` assets, repeated `s` times. For each of the `s` random asset combinations, we run a user-defined sizing algorithm, and in the final step we combine the results through averaging to get the final weights. In R, the problem is easily formulated with `lapply` or `for` loops as the base iterative procedure; the function `sample` supplies the random indices. Note that my RSO function relies on functions inside the Systematic Investor Toolbox (SIT).

```
rso.optimization <- function(ia, k, s, list.param) {
  size.fn = match.fun(list.param$weight.function)
  if(k > ia$n) stop("k is greater than the number of assets.")
  space = 1:ia$n
  index.samples = t(replicate(s, sample(space, size = k)))
  weight.holder = matrix(NA, nrow = s, ncol = ia$n)
  colnames(weight.holder) = ia$symbol.names

  hist = coredata(ia$hist.returns)

  # long-only box constraints: 0 <= x.i <= 1
  constraints = new.constraints(k, lb = 0, ub = 1)
  constraints = add.constraints(diag(k), type = '>=', b = 0, constraints)
  constraints = add.constraints(diag(k), type = '<=', b = 1, constraints)

  # SUM x.i = 1
  constraints = add.constraints(rep(1, k), 1, type = '=', constraints)

  for(i in 1:s) {
    ia.temp = create.historical.ia(hist[, index.samples[i, ]], 252)
    weight.holder[i, index.samples[i, ]] = size.fn(ia.temp, constraints)
  }

  # average across all s samples, counting unselected assets as zero weight
  # so the final weights sum to one
  final.weight = colSums(weight.holder, na.rm = TRUE) / s

  return(final.weight)
}

```

The above function takes in an `ia` object, short for input assumptions, which holds all the statistics necessary for most sizing algorithms. Also, I’ve opted to focus on long-only portfolios.
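For readers who don’t have SIT loaded, here is a minimal, dependency-free sketch of the same loop. The sizing function `toy.max.sharpe` is a stand-in I made up for illustration (an unconstrained tangency portfolio with negative weights clipped), not SIT’s actual solver, and the returns are simulated:

```r
set.seed(42)
n <- 6; s <- 200; k <- 3
hist.returns <- matrix(rnorm(500 * n, 0.0003, 0.01), ncol = n,
                       dimnames = list(NULL, paste0("A", 1:n)))

# toy stand-in for a max Sharpe sizing algorithm (hypothetical, not SIT's)
toy.max.sharpe <- function(R) {
  mu <- colMeans(R); Sigma <- cov(R)
  w <- pmax(drop(solve(Sigma, mu)), 0)   # tangency weights, long-only clip
  if (sum(w) == 0) rep(1 / ncol(R), ncol(R)) else w / sum(w)
}

weight.holder <- matrix(NA, nrow = s, ncol = n,
                        dimnames = list(NULL, colnames(hist.returns)))
for (i in 1:s) {
  idx <- sample(n, k)                    # one random subspace of k assets
  weight.holder[i, idx] <- toy.max.sharpe(hist.returns[, idx])
}
# average over all s samples, unselected assets counting as zero weight
final.weight <- colSums(weight.holder, na.rm = TRUE) / s
round(final.weight, 3)
```

Because every sampled portfolio sums to one and unselected assets count as zero, the averaged weights also sum to one.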

The following are the results for the 8 asset classes. All backtests hereafter keep `s` equal to 100 while varying `k` from 2 to N-1, where N is the total number of assets. The base comparisons are the simple max Sharpe and equal weight portfolios.

The following is for 9 sector asset classes.

Last but not least is the performance for current day S&P 100 stocks.

The RSO method seems to improve every universe I’ve thrown at it. For a pure stock universe, it reduces volatility by 300 to 700 basis points depending on the selection of `k`. In a series of tests across different universes, I have found that the biggest improvements from RSO come from applying it to instruments that belong to the same asset class. I’ve also found that for a highly similar universe (stocks), a lower `k` is better than a higher `k`. One explanation: since the max Sharpe portfolio of X identical assets equals the equal weight portfolio, we can postulate that when the asset universe is highly similar or approaching equivalence, resampling with a lower `k` Y times, where Y approaches infinity, approaches in a sense the limit of an equally weighted portfolio. This is in line with the idea behind the curse of dimensionality: for better estimates, the data required grows exponentially as the number of assets increases. In this case, with limited data, a simple equal weight portfolio will do better, which conforms with the better performance at lower `k`.
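The equal-weight limit is easy to check numerically. Below, every sampled subset is sized equally (which is what max Sharpe does for identical assets); averaging many random subspaces then drives each asset’s final weight toward 1/N:

```r
set.seed(1)
n <- 10; k <- 2; s <- 50000
w <- matrix(0, nrow = s, ncol = n)
for (i in 1:s) w[i, sample(n, k)] <- 1 / k   # equal weight inside each subset
final.w <- colMeans(w)                       # unselected assets count as zero
range(final.w)                               # everything close to 1/10
```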

For a well-specified universe of assets, RSO with a higher `k` yields better results than a lower `k`. This is most likely because simple random sampling of such a universe with a small `k` will often yield samples that form a highly mis-specified sub-universe. The problem is magnified when diversifying assets like bonds are significantly outnumbered by assets like equities, since the probability of sampling an asset with diversification benefits is far lower than that of sampling one without. In other words, with a lower `k`, one will most likely end up with portfolios that contain many risky assets relative to lower-risk assets.
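The sampling intuition can be quantified with a hypergeometric count. With N assets of which only b are diversifiers, the chance a random size-k subset contains no diversifier at all is choose(N - b, k) / choose(N, k); the numbers below use an illustrative N = 20, b = 2:

```r
# probability a random size-k subset misses every one of the b diversifiers
miss.prob <- function(N, b, k) choose(N - b, k) / choose(N, k)
miss.prob(20, 2, 2)   # ~0.81: most small subsets hold only risky assets
miss.prob(20, 2, 10)  # ~0.24: a larger k makes such samples much rarer
```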

A possible future direction would be to figure out ways of avoiding having to specify `k` and `s` by hand in RSO. For example: randomly selecting `k`, selecting a `k` that targets a certain risk/return, or selecting a `k` that maximizes a user-defined performance metric.

Mike

1. I think it’s very important to incorporate realistic transaction costs for a real comparison between different optimization algos. Without them, every conclusion about the goodness of one algo relative to another may hold only on paper.
Do you know a way to introduce percentage-based commission costs in the Systematic Investor Toolbox? I don’t think dollar commissions are the right thing to use. In the past I worked on percentage commission costs, but it’s too difficult for my current level of R knowledge.

1. Can you explain what you mean by percentage commission cost? How would you calculate it?

1. For example, you buy 100 EEM US at 42.66.
Your cost is: 100 * 42.66 = 4266 USD.
Later you sell at 42.90. Your income is 100 * 42.90 = 4290.
Without commission (broker + slippage) your profit is 4290 - 4266 = 24 USD.

Now I want to study my strategy with a 0.1% commission.

My new cost will be: 4266 + (4266 * 0.001)
My new income will be: 4290 - (4290 * 0.001)
My profit will be: 24 - 4.266 - 4.29 = 15.444

I hope that’s clear; sorry for my English.
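The commenter’s arithmetic can be wrapped in a small helper, applying the percentage commission to both the buy and the sell notional (`trade.pnl` is a name I made up for this sketch, not a SIT function):

```r
# round-trip P&L with a percentage commission on both legs of the trade
trade.pnl <- function(qty, buy, sell, comm.pct = 0) {
  cost   <- qty * buy  * (1 + comm.pct)   # buy notional plus commission
  income <- qty * sell * (1 - comm.pct)   # sell notional minus commission
  income - cost
}
trade.pnl(100, 42.66, 42.90)         # 24 (no commission)
trade.pnl(100, 42.66, 42.90, 0.001)  # 24 - 4.266 - 4.29 = 15.444
```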

2. Hello there, I don’t think it’s possible to set commission as a percentage directly in SIT. But there are definitely other ways of stress testing a strategy that are, I think, equivalent.

2. What happens if you perform the RSO on a limited sample and then intentionally walk it forward into a bear market?

1. Hey Shaun, what do you mean by limited sample? Do you mean a small `k` or `s`?

1. Your results go from 2000-2013. What happens if you optimize from 2000 to June 2008? Looking at the walk-forward performance through the financial crisis would give you a good idea of the optimization’s robustness to a major shift in market conditions.

2. For example? How do you stress a strategy in an equivalent way?
I think it’s very rational.
You have to compare two different optimization algos. It’s evident that you need to penalize the one that makes more trades, and penalize it in an effective way, not too much and not too little. Otherwise every conclusion (which one is better risk-adjusted) is unrealistic, only on paper. I wonder why I am the only one who is so sensitive to this problem!

1. When you finish backtesting and looping over all the rebalance dates, SIT returns a list of objects including (but not limited to) the equity line over time, the weights over time, etc. With this information, you can go back in time and perform any stress test you want. One thing I’ve always found pretty cool is to extract the daily returns from the equity curve and reconstruct it via Monte Carlo simulation. There are two ways to do this: simply reordering the returns, or sampling them with replacement. This method does have its drawbacks, namely that it ignores any auto-correlation in the sequence of returns, and when your sample size is small it may not reflect reality.
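A tiny self-contained sketch of that resampling idea (the equity curve here is simulated rather than pulled from a SIT backtest):

```r
set.seed(7)
equity <- cumprod(1 + rnorm(1000, 0.0004, 0.01))  # stand-in equity curve
rets   <- diff(equity) / head(equity, -1)         # daily returns

shuffled  <- cumprod(1 + sample(rets))                  # reorder only
bootstrap <- cumprod(1 + sample(rets, replace = TRUE))  # sample w/ replacement

# Reordering preserves terminal wealth exactly (same product of returns);
# bootstrapping does not, which is what gives the distribution of outcomes.
tail(shuffled, 1)
```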

But coming back to your point, you can go back in time, calculate the turnover between each rebalance, and formulate some form of loss function to penalize the return. One heuristic off the top of my head would be to take a fixed percentage of the turnover and subtract it from the portfolio return. From there you can reconstruct the equity curve and calculate the performance statistics. Again, there are lots of ways to approach this, and the above are just indirect ways of doing it. But I agree that your approach of stress testing through percentage commissions is more direct and gets to the point. I suggest you dive deeper into the core framework of SIT, see how commission is introduced, and maybe write a function for it.
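That turnover-penalty heuristic can be sketched in a few lines. `penalize.turnover` is a hypothetical helper, not part of SIT; it charges a fixed percentage of one-way turnover against each period’s return:

```r
# penalize each period's return by cost.pct times that period's turnover
penalize.turnover <- function(returns, weights, cost.pct = 0.001) {
  turnover <- c(sum(abs(weights[1, ])),        # initial buy-in
                rowSums(abs(diff(weights))))   # subsequent weight changes
  returns - cost.pct * turnover
}

w <- rbind(c(0.5, 0.5), c(0.7, 0.3), c(0.7, 0.3))  # weights at 3 rebalances
r <- c(0.010, 0.012, 0.008)                        # gross period returns
penalize.turnover(r, w)   # 0.0090 0.0116 0.0080
```

Note that `diff` on a matrix differences consecutive rows, so the second period is charged for the 0.4 total weight shift and the third (no trades) is not charged at all.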