The Man Who Solved the Market – Notes

When it comes to the world’s most secretive hedge fund any content is worthwhile to read. I finished the book is 3 days and had to re-read a couple more chapters to ensure I fully absorbed the couple nuggets in there. I would recommend this book to everyone!

The mystery behind how Simons discovered the “truth” is shrouded in mystery. Even googling about what they traded doesn’t yield many answers. This new book by Gregory Zuckerman was an eye opener. It revealed how Renaissance came to be including Simons early struggles.

One of the surprising things I’ve learned was that Simons was actually the money guy. Though he did trade and built up the business in the early years, he wasn’t the main guy leading the research breakthroughs. Instead Simons seem to be running side gigs like investing in start ups back in the day. People like Ax, Berlekamp, Carmona, Laufer, Mercer, and Brown were the main brains behind all the models.

Now to the trading models. Even though the author isn’t trained in finance, I really think he did a decent job explaining some of the broad concepts utilized by Renaissance in the early days. Before 1988 (right before Carmona joined), Renaissance was a typical CTA / point and click trading firm. They utilized breakout models/linear regression (page 83). What changed around that time was both Carmona and Laufer started to data mine for trading patterns as oppose to hand crafting them. This especially stood out for me personally as about a year ago I started to conduct research via data mining. As Renaissance thrived through the 90s, more than 50% of their models were data mined. (page 203) Their reasoning resonated with me a lot. “Recurring patterns without apparent logic to explain them had an added bonus: They were less likely to be discovered and adopted by rivals…”

Additional interesting tidbit:

  • “Laufer’s work also showed that, if markets moved higher late in a day, it often paid to buy futures contracts just before the close of trading and dump them at the markets opening the next day.” Isn’t this the overnight premium hes talking about for equity futures? Sure sounds like it…. (page 144)
  • On the subject of managing models, Laufer insisted on a single model as oppose to multiple models. (page 142) Presented with many different signals, they built a trade selection algorithm that further determines which trades to take. Strategies that did well will automatically without human intervention get allocated more money.(page 144)
  • Started out trading end of day and slowly breaking down in to 2 sessions per day. Simons then suggested going down to 5 minute bars. (page 143)
  • “Did the 188th five-minute bar in cocoa future market regularly fall on days investors got nervous, while bar 199 rebounded?” page(143) Looked at intraday seasonality and conditional signals. Edge layered over edge to increase probability of being right.
  • Mercer and Brown took over Keplers Stat arb operation. Soon stock trading pnl was greater than futures trading.

While I am sure today’s Renaissance is far from what the booked described, the broad concepts like data mining, alternative data collection, and stat arb all play a role in their continued success in some form or fashion. Please let me know if I’ve missed anything interesting. Of course, I may have interpreted it entirely wrong. Please leave a comment below!



Zuckerman, Gregory. The Man Who Solved the Market. Portfolio/Penguin, 2019.

Automated Trading System – Internal Order Matching

Most automated trading systems (ATS) are built such that there are little to no interactions between component models. This is limiting. Here I am referring to a trading system as the overarching architecture that houses multiple individual models.

Without interactions, each model is operating within an environment that it is preconceived in. For example, mean reversion can happen in different time/event frequencies. A model that is parameterized to take advantage on certain frequency will not have knowledge of others.

One component within an ATS that is rather complicated to architect is the order management system. The OMS is the component that handles all order requests generated by the prediction models. It must always be aware of outstanding orders (limit/market, etc), partial fills, and proper handling of rejects,etc. Now the complexity is increased when a portfolio of prediction models all generate an order for a given tick. (Which should be processed first?)

The general rule of thumb in dealing with this is to aggregate all orders by asset to reduce transaction costs. If there are a mix of long/shorts, the net will be the final order quantity. When it is filled, simply dis-aggregate them back in to component fills to the respective models (internal matching). The annoying part, in my opinion, is when you introduce multiple order types. For example, Limit and Market orders. How would one architect the OMS to handle both? Hmm.. This goes back to the debate of the degree of coupling between the strategy and the OMS itself….


The code below represents the CRSM algorithm. It is adopted to the SIT framework. I have refactored the code so that its is easier for the user to use and understand.

I would like to once again thank you David Varadi for his tireless effort in guiding me along this past year. My gratitude also goes out to Adam Butler and BPG Associates for their support  all along the way.

Download Code: here



Shiny Market Dashboard

I’ve been asked multiple times regarding code for the dashboard so thought I’d release it. I coded the whole thing in one night last year so it’s not the best and most efficient, buts it’s a good framework to get your own stuff up and running. It’s long, north of 700 lines of code.


On another note, I’ve recently graduated from University and am excited to be moving to Chicago in a month for work. I am looking forward to the exciting opportunity. Also I will be releasing my graduating thesis in the coming weeks on RSO optimization. David Varadi has been my thesis advisor and mentor and I want to thank him for that!


Natural Language Processing

I’ve recently updated some matlab code about machine learning on to github. Today I’ll be adding to that with some Natural Language Processing (NLP) Code. The main concepts we covered in class were ngram modelling which is a markovian process. This means that future states or values have a conditional dependence on the past values. In NLP this concept is utilized via training n gram probabilities models on given texts. For example, if we specify N to equal to 3, then each word in a given sentence depends on the last two words.

So the equation for conditional probability is given by:


Extending this to multiple sequential events, this is generalized to be (chain rule)

CodeCogsEqn (1)


This above equation is very useful for modelling sequential stuff like sentences. Extension to these concepts to finance are utilized heavily in hidden Markov models that attempts to model states in various markets.  I hope the interested reader comment below for other interesting applications.

The last topic we are covering is class is computer vision. As of now, topics like image noise reduction via Gaussian filtering, edge detection, segmentation are being covered. I will post more about them in the future.

Code Link



Artificial Intelligence

Artificial intelligence surrounds much of our lives. The aim of this branch of computer science is to build intelligent machines that is able to operate as individuals; much like humans. I am sure most of us have watched the Terminator movie series and questioned to what extent will our own society converge to that in the movie. While that may sound preposterous, much of what automated system developers do revolves around building adaptive systems that react to changes in markets. Inspired by a course I am taking at school right now, I would like to use this post as a general fundamental intro to AI.

If you ask people what intelligence is, most will initially find it hard to wrap words around the idea. We just know what it is and our gut tells us we, humans, are the pinnacle of what defines intelligence. But fact is, intelligence encompasses so much. According to the first sentence in wikipedia, there are ten different ways to define it.

“Intelligence has been defined in many different ways including logic, abstract thought, understanding, self-awareness, communication, learning, having emotional knowledge, retaining, planning, and problem solving.” -Wikipedia

Since it encompasses so much it is not easy to define it in a single sentence. What can be said is that intelligence relates to one’s ability to problem solve, reason, perceive relationships, and learn.

Now that I’ve offered a sense of what intelligence means, what, on the other hand, is artificial intelligence? Artificial intelligence is the field of designing machines that is capable of intelligent behavior; machines that is able to reason; machines that is able to learn. More precisely, the definition of AI can be organized in to four different sections:

  • Thinking Humanly
  • Thinking Rationally
  • Acting Humanly
  • Acting Rationally

The first two relates to thoughts processes while the last two relates to behavior. Thinking humanly revolves around whether the entity in question is able think and have a mind of its own. This is essentially making decision, learning and  problem solving. Acting humanly is whether a machine is able to process and communicate language, store information or knowledge and act based on what it knows and learn to adapt based on new information. These set of required traits are formulated based on the famous Turing Test which examines if a machine is able to act like a human through answering questions asked by anther human being. The machine passes the test if the person asking the question isn’t able to determine if its a machine or human. Thinking rationally closely incorporates the study of logic and logical reasoning. It was first introduced by Aristotle who attempted to provide a systematic way of inferring a proposition based on a given premise. An famous example would be “Socrates is a man; all man are mortal; therefore, Socrates is mortal.” Lastly, acting rationally is the idea of choosing the most suitable behavior that produces the best expected outcome. Another word one is rational if given all its knowledge and experience, selects the action that maximizes their own performance measure/utility.


When studying AI, the term agent is used to represent an entity/model that interacts with the environment. More precisely, an agent perceives the environment through its sensors and employ actions through actuators. Comparing this to humans, imagine sensors as eyes/ears and actuators as arms and legs. The sensors will at each time step take inputs, called percepts, which are than processed by the agent program. The agent program then passes the inputs in to an agent function. The agent function maps inputs to correct outputs (actions) which are then sent via the agent program to the actuators. This agent based framework closely relates to automated trading systems. The environment is the market and the changing prices at each time interval. The agent program would be our trading system which takes in daily price information and pipes it into the agent function, or the logic of the trading system. For example, todays new price is updated which is passed in to the trading logic. The logic specifies that if the current price is $10, it will sell. The sell action is passed back to the environment as a sell order.

The above example is a very basic type of agent known as the simple reflex agent. This type of agent only makes decision solely on the current percept (price). It doesn’t have a memory of the previous states. A more complex agent known as the model based reflex agent is one in which it has memory of the past, known as its own percept sequence. Also this agent has an internal understanding of how the environment works which is detailed in its own model. This model of the world takes inputs and identify the state it is in. Given the state, the model forecasts what the likely environment will be like in the next time step. Proper action is then recommended and executed via actuators. (Think of markov models)  So far, the agents I’ve introduced largely reflect that of a function that take input and spits out a output. To make things more humanly, the next agent I will introduce is called a goal based agent. This is similar to how given our current circumstances, we aim to maximize out objective function. The objective function can be money or anything that makes us happy. More concretely, the goal based agent is an extension of the model based reflex agent but it assigns a score for each recommended action. The agent will choose the one that maximizes its own objective function.

The reader will most likely ask how this knowledge helps them make money in the markets. What I can say is that finance is enter a brave new world where together with technology is transforming how money is being made in the markets. Having a understanding in finance and statistics in my opinion is not enough. Those are the areas where your competitors are already fishing (mostly). Knowledge in subject areas like AI, speech recognition, natural language processing, machine learning, and computer vision (just to name a few) will allow you to be more creative in design. I urge the curious minds to explore the unexplored!

Engineering Risks and Returns

In this post, I want to present a framework for formulating portfolio with targeted risk or return. The basic idea was inspired by controlling risk from a different point of view. The traditional way of controlling for portfolio risk was to apply a given set of weights to historical data to calculate historical risk. If estimated portfolio risk exceeds a threshold, we peel off allocation percentages for each asset. In this framework, I focus on constructing portfolios that target a given risk or return on a efficient risk return frontier.

First lets get some data to so we can visualize traditional portfolio optimization’s risk return characteristics. I will be using a 8 asset ETF universe.

con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
tickers = spl('EEM,EFA,GLD,IWM,IYR,QQQ,SPY,TLT')
data <- new.env()
getSymbols(tickers, src = 'yahoo', from = '1980-01-01', env = data, auto.assign = T)
for(i in ls(data)) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)
bt.prep(data, align='keep.all', dates='2000:12::')

Here are the return streams we are working with


The optimization algorithms I will employ are the following:

  • Minimum Variance Portfolio
  • Risk Parity Portfolio
  • Equal Risk Contribution Portfolio
  • Maximum Diversification Portfolio
  • Max Sharpe Portfolio

To construct the risk return plane, I will put together the necessary input assumptions (correlation, return, covariance, etc). This can be done with create.historical.ia function in the SIT tool box.

#input Assumptions
prices = data$prices
ret = prices/mlag(prices)-1
ia = create.historical.ia(ret,252)
# 0 <= x.i <= 1
constraints = new.constraints(n, lb = 0, ub = 1)
constraints = add.constraints(diag(n), type='>=', b=0, constraints)
constraints = add.constraints(diag(n), type='<=', b=1, constraints)

# SUM x.i = 1
constraints = add.constraints(rep(1, n), 1, type = '=', constraints)

With the above we can go ahead and input both ‘ia’ and ‘constraints’ in to the above optimization algorithms to get weights. With the weights, we can derive the portfolio risk and portfolio return. These then can be plotted on a risk return plain visually.

# create efficient frontier
ef = portopt(ia, constraints, 50, 'Efficient Frontier')
plot.ef(ia, list(ef), transition.map=F)


The risk return plain in the above image shows all the possible space to which a portfolio’s risk and return characteristic can reside. Anything that is beyond to the left side of the frontier do not exist (unless leverage, to which the EF will also shift leftward too). Since I am more of a visual guy, I tend to construct this risk return plain whenever I am working on new allocation algorithms. This allows me to compare with other portfolio the expected risk and return.

As you can see, each portfolio algorithm has their own set of characteristics. Note that these characteristics fluctuate across the frontier were we to frame this rolling through time. A logical extension to these risk return concepts is to construct a portfolio that aims to target ether a given risk or a given return on the frontier. To formulate this problem in SIT for the return component, simply modify the constraints as follows:

constraints = add.constraints(ia$expected.return,type='>=', b=target.return, constraints)

Note that the target.return variable is simply a variable storing the desired target return. After adding the constraint, simply run a minimum variance portfolio and you will get a target return portfolio. On the other hand, targeting risk is a bit more complicated. If you look at the efficient frontier, you will find that for a given level of risk there is two portfolios that line on it.  (The sub-optimal portion of the efficient frontier is hidden). I solved for the weights using a multi optimization framework which employed both linear and quadratic (dual) optimization.


 max.w = max.return.portfolio(ia, constraints)
 min.w = min.var.portfolio(ia, constraints)
 max.r = sum(max.w * ia$expected.return)
 min.r = sum(min.w * ia$expected.return)
 max.risk = portfolio.risk(max.w,ia)
 min.risk = portfolio.risk(min.w,ia)

 # If target risk exists as an efficient portfolio else
 # return weights of 0
 if(target.risk >= min.risk | target.risk <= max.risk){
 out <-optimize(f =target.return.risk.helper,
 interval = c(0,max.r),
 target.risk = target.risk,
 ia = ia,
 constraints = constraints)$minimum


Below is a simple backtest that takes the above assets and optimizes for the target return or target risk component. Each will run with a target of 8%.

Backtest1Now the model itself requires us to specify a return or risk component. What if instead we make that a dynamic component such that we extract ether the risk or return component of a alternative sizing algorithm. Below are the performance of the dynamic risk or return component extracted from naive risk parity.



Not surprisingly, whenever we target risk, the strategy tends to become more risky. This confirms confirms risk based allocations are superior if investors are aiming to achieve low long term volatility.


Thanks for reading,


Some Shiny Stuff

At the beginning of the summer I knew Shiny was going to be an indispensable tool for my work to connect with my readers. For those of you who don’t yet know what Shiny is, it is a web application package developed by RStudio. It integrates flawlessly with R and has been nothing but excitement playing with it.

In terms of difficultly, Shiny differs in its structure. While it may be intimidating initially to R veterans, I urge you to be patient with it. I learned it by examples, and Systematic Investors’ Michael Kapler has got numerous posts on how the basic framework come together (here).

I thought I’d share my own Market Dashboard. To access the dashboard, simply go to the follow link.


The Market Dashboard is divided into 3 Tabs. The first tab is called “Main.” This tab is entirely created for single-asset and cross asset comparisons. The charts in the entire application are interactive. This is useful when you want to rank assets based on the table at the bottom of the page. I created 6 charts (no reason for any one, just came to mind). These include:

  • normalized Equity prices given lookback
  • 12 Month Performance
  • Annualized CAGR
  • Percent Volatility Rank
  • Financial Turbulence Index
  • Efficiency Ratio
  • Table of Statistics (Can be ranked when you click on the title)


On the second tab, I created a broad based Asset Analytics tab. This tab aims to put all asset classes (ETF as proxy) together in coherent fashion for easy digestion. There are three main sections. First section is all asset comparison (a) (this is my attempt to replicate this: here 😉 ), second is individual asset class comparison (b), and lastly is a cluster chart comparison (Varadi & Kapler).





Last tab is called the “Macro Analytics” Tab. Here I aim to bring together US macro fundamentals in to a single page. Fundamentals include:

  • Real GDP
  • Inflation
  • Yield Curve
  • Inflation Expectation
  • Industrial Production

Untitled4 Untitled5 Untitled6

This app is something I just pulled together real quick. There will be design issues but I just wanted to get the idea across: Shiny can be very powerful! Hope you guys will have fun with this and find it useful.

Thanks for reading,


Note: The application is highly un-optimized (slow). It downloads all data over the internet (Yahoo and Quandl). This is entirely for educational purposes. Please do not make financial decisions based on the applications output. I do not guarantee the correctness of the code.

Principal Component Analysis in Portfolio Management

I’ve spend quite some time with PCA for the last couple of months and I thought I would share some of my notes on this subject. I must caution that this is one of the longest post I’ve done. But I find the subject very interesting as it has a lot of potential in the asset allocation process.

PCA stands for principal component analysis and it is a dimensionality reduction procedure to simpify your dataset. According to Wikipedia, “PCA is a mathematical procedure that uses orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.” It is designed in a way such that each successive principal components explains a decreasing amount of variation in your dataset. Below is a screeplot of the percentage variance explained by each factor on a four asset class universe.


PCA is done via an eigen decomposition on a square matrix which in finance is ether a covariance matrix or a correlation matrix of asset returns. Through eigen decomposition, we will get eigenvectors (loadings) and eigenvalues. Each eigenvalue, which is the variance of that factor, is associated with an eigenvector. The following equation relates both the eigenvectors and eigenvalues.


The eigen vector E of a square matrix A equals to an eigenvalue (lambda) multiplied by the eigenvector. While I must confess I was bewildered towards the potential application of such values and vectors at school, it does come to make more sense when you place it in to financial context.

In asset allocation, PCA can be used to decompose a matrix of return into statistical factors. These latent factors usually represent unobservable risk factors that are imbedded inside asset classes. Therefore allocating across these may improve one’s portfolio diversification.

The loadings (eigenvectors) of a PCA decomposition can be treated as principal factor weights. Another words, they represents asset weights towards each principal component portfolio. The total number of principal portfolios equals to the number of principal components. Not surprisingly, the variance of each principal portfolio is its corresponding eigenvalue. Note that the loadings are designed to have values ranging from +1 to -1 meaning short sales are entirely possible.

Since I am not a math major, I don’t really understand any math equations until I actually see it being programmed out. Lets get our hands dirty with some data:

sit = getURLContent('https://github.com/systematicinvestor/SIT/raw/master/sit.gz', binary=TRUE, followlocation = TRUE, ssl.verifypeer = FALSE)

con = gzcon(rawConnection(sit, 'rb'))

data <- new.env()

getSymbols(tickers, src = 'yahoo', from = '1980-01-01', env = data, auto.assign = T)
for(i in ls(data)) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)

bt.prep(data, align='remove.na', dates='1990::2013')

ret<-na.omit(prices/mlag(prices) - 1)

To calculate return we can simply


This can be represented by in R by

p.ret<-(weight) %*% t(ret)

Note that if R is a multi-row matrix return series, you will actually get back a vector of return series with the same number of rows as R. This is simply the portfolio return series, assuming daily rebalancing.

There are like 3 different way of doing a PCA in R. I will show a foundational way and a functionalized way of doing it. Such two-step process will help see what’s going on under the hood.

#PCA Foundational
demean = scale(coredata(ret), center=TRUE, scale=FALSE)
evec<-eigen(cov(demean), symmetric=TRUE)$vector[] #eigen vectors
eval<-eigen(cov(demean), symmetric=TRUE)$values #eigen values
#PCA Functional
evec<-pca$rotation[] #eigen vectors
eval <- pca$sdev^2 #eigen values

The foundational way uses the build in “eigen” function to extract the eigenvectors and eigenvalues. This way requires that you demean (scale function) the data before calculating the covariance matrix. The return values are stored in a R list data structure. On the other hand, the  functional PCA just takes in an return series vector and it does the rest.

After calculating the eigenvector, eigenvalues, and the covariance matrix, the following equation will hold true:


In the above equation, E and lambda, again, are the eigenvector and eigenvalues respectively. Sigma represents the demeaded covariance matrix. In R, this can be computed by:

diag(t(evec) %*% covm %*% evec) #reverse calculate eigenvalues

Now we have all the components to calculate principal portfolios. These return series are simply the latent factors that are embedded inside asset classes. Each asset class are exposed to each factor pretty consistently through time which may help further understand the inherent risk structure. To calculate the principal component portfolios, we will use the following formula:


This equation is very intuitive as just like earlier, the eigenvectors are “weights,” therefore applying it to the returns will yield the N different principal portfolio return streams.  In R:

inv.evec<-solve(evec) #inverse of eigenvector
pc.port<-inv.evec %*% t(ret)


In Meucci’s paper “Managing Diversification”, he showed that with the following formula one can convert a vector of weights from the asset space to the principal component space. More specifically, given a vector of asset weights, one could now show the exposure to each principal component.


Next I would like to show how from the above equation one will be able to calculate the exposure of traditional portfolio optimization weights to each principal component factors. I will be using the same universe of assets and applying Minimum Variance, Risk Parity, Max Diversification, Equal Risk Contribution, Equal Weight, and David Varadi’s et al Minimum Correlation Algorithm (mincorr2).ExposurePCA Table_exposures

As you can see, risk base portfolio optimization in the traditional asset space leads to excessive exposure towards the interest rate factor (bonds). The general equity risk factor (factor 1) accounts for the second largest concentration. Prudent investors looking at this chart will immediately notice the diversification potential if one were to allocate to factors 2 and 3.

In R, the principal component exposure equation can be calculated as follows:

factor.exposure<-inv.evec %*% t(weight)

Through simple algebra, one can convert back and forth between the asset space and the principal space easily. In the following example, I am reconstructing using equal weights.

pc.port<-as.xts(t(pc.port))  #correct dimension of principal portfolio returns
p.ret1<-t(factor.exposure) %*%  t(pc.port)

If you are following along, you will notice that the equity curves derived from the variables “p.ret” and “p.ret1” are identical. This confirms that we have correctly converted back to the asset space.

Equity Curves

Next, to calculate the variance contribution due to the N-the principal component we can simply:

A neat aspect is that from the above equation, we can arrive at the portfolio variance in the asset space with the following equation (sum because PCs are uncorrelated):

The following R code confirms that their variance are the same in the PC space and the asset space.

sum((factor.exposure^2)*eval) #variance concentration from PCA
apply(p.ret,2,var) #portfolio risk from equal weight

With this as proof, one can actually reformulate the minimum variance portfolio from the principal space. We know that portfolio variance can be calculated according to the following formula

Just to repeat, we know that given a set of asset weights we can calculate our exposure to each principal factor using:

pca-exposureIsolating asset weights, substituting and then simplifying:

Since we know that


Then portfolio variance is simply

This is intuitive as since the correlation between principal portfolios are zero, the covariance are all zero too. What is left behind are the variances of the eigen portfolios. Minimizing the above and converting the weights back to asset space should give you equivalent weights from a MVO. While I understand that there is no practicality in the above steps as it adds complexity and computing power, I just wanted to illustrate that these two spaces are tied together in many aspects.

This post is going to get out of hands if I don’t stop. As you can see PCA is a very interesting technique, it allows you to access latent factors and how they interact with the asset space. I hope curious readers can go on exploring more and post below if you find any interesting things you stumble across.

Thanks for reading,



“Return = Cash + Beta + Alpha” -Bridgewater

What a coincidence, Zerohege just posted a piece where Bridgewater identifies the origin of their All Weather framework. (Here)

Its interesting to read about their thought process and here are a few quotes I found interesting:

“Any return stream can be broken down into its component parts and analysed more accurately by first examining the drivers of those individual parts.”

“Return = Cash +Beta + Alpha”

“Betas are few in number and cheap to obtain. Alphas (ie trading strategy) are unlimited and expensive. … Betas in aggregate and over time outperform cash. There are sure things in investing. That betas rise over time relative to cash is one of them. Once one strip out the return of cash and betas, alpha is a zero sum game. ”

“there is a way of looking at things that overly complicates things in a desire to be overly precise and easily lose sight of the important basic ingredients that are making those things up”

Separately managing the beta and alpha portion of the portfolio seems like a reasonable long term framework. For example, build a stable portfolio (beta) for the majority of your wealth and then overlay that with your desired amount of alpha to spice up the return. But it is important to make sure you understand how the two return streams (beta and alpha) interact fundamentally, for example factors that contribute to the return of the beta portion should be different compared to the alpha portion. It is only through this can the uncorrelated return stream diversify away your risk.