First off, Happy Thanksgiving! If time permits in the coming months I’d like to explore more on how I look at High Frequency (HF) data. Hopefully along the way I can spark some new discussion and improve on my thought process.
HFT strategy “simulation” is no easy task. I am referring to this as a simulation because it’s purely an approximation of how a strategy would have performed given a set of execution assumptions the researcher made beforehand. Should the assumptions change, the results would also change (significantly).
In my line of work, the edge we are seeking is generally less than a tick (futures). To make this even worthwhile, the constraints are that costs must be low AND we need to trade a lot. This may sound foreign to most of my readers, as their time frames are generally much longer (days, weeks, even months). But at the end of the day, how much money we make is a simple function of our alpha * number of times we trade.
In HFT, execution is king. You can be right about where the market moves on the next tick, but if you can’t get a fill, you are not making any money. Therefore it is paramount that when we conduct HF simulations, we make accurate execution assumptions.
Queue position is worth a lot. Being first in line and getting a fill is like owning a call option in my world (where the premium is the exchange fee per contract). The worst that can happen is that you scratch, assuming you are not the slowest one and there are people behind you. The image below is an analysis of the expected edge you’d get N events out (x-axis), assuming you are in various spots within the FIFO queue (QP_0 = first in line, QP_0.1 = 10th in line if there were 100 qty). As you can see, the further back in line you are, the more you are exposed to toxic flow, a fancy term for flow from informed traders.
How does one take this into account when simulating a strategy? When you place a limit order on the bid, how do you know when you will be filled? This depends on two factors: your place in line and trade flow. As time progresses there will be people who add orders to the FIFO queue, people who cancel orders and people who take liquidity (trade). These actions need to be tracked tick by tick (or packet by packet) during a simulation. While most people assume tick data is the most fine-grained dataset one can have for such simulations, there actually exists packet data. Tick data simply gives you an aggregated snapshot of what an orderbook looks like: best bid, best offer, bid qty, ask qty (this is known as Market by Price). Packet data, on the other hand, contains all the actions taken by all the market participants. This includes trade matches and order submissions. This feed is also known as Market by Order, and it’s up to the market participant to build and maintain their own orderbook. Using packet data for simulation would be ideal, as you will know exactly where you are in line.
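With only Market by Price snapshots, the adds and cancels have to be backed out from changes in displayed size. Here is a minimal Python sketch of one way to do that, assuming you also know how much traded at the bid between two snapshots. The names `Snapshot` and `infer_bid_events` are made up for illustration, and the approach only sees the net effect: adds and cancels that happen between the same two snapshots offset each other and are invisible.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Snapshot:
    """Hypothetical top-of-book record (bid side only, for brevity)."""
    bid_price: float
    bid_qty: int


def infer_bid_events(prev: Snapshot, curr: Snapshot,
                     traded_at_bid: int) -> Optional[Tuple[int, int]]:
    """Estimate (added, cancelled) qty at the best bid between two snapshots.

    Returns None when the bid price changed, because the two sizes then
    refer to different queues and cannot be differenced.
    """
    if prev.bid_price != curr.bid_price:
        return None
    # Size change not explained by trades is attributed to adds minus cancels.
    residual = (curr.bid_qty - prev.bid_qty) + traded_at_bid
    if residual >= 0:
        return residual, 0   # net adds
    return 0, -residual      # net cancels


# Bid size fell from 100 to 90 while 4 traded at the bid,
# so the remaining 6 are treated as cancels.
events = infer_bid_events(Snapshot(2000.25, 100), Snapshot(2000.25, 90), 4)
```
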
When you only have tick data, the only way to conduct this type of simulation is to make assumptions. Here is a simple example. When you place a limit buy on the bid, you are going to be last in line. You keep track of two variables, qty_in_front and qty_behind. Additions are straightforward: just add them to qty_behind. Cancels are a little more tricky because you don’t know whether they are coming from people in front of you or people behind. A workaround is to have something I call a reduce ratio. It can take a value between 0 and 1, and it controls the percentage of cancels assumed to happen in front of you. For example, in ES simulations, I would set this to around 0.1, i.e. when there is a total of 100 qty of cancels, I’d assume 10 happen in front of me and 90 happen behind me. There are edge cases, but I’ll leave the reader to figure them out. This is just one way, not the only way, of going about simulating a FIFO queue. More complicated ways include dynamically adjusting the reduce ratio as you approach the front of the queue.
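To make the bookkeeping concrete, here is a rough Python sketch of the qty_in_front / qty_behind approach with a fixed reduce ratio. The class name `QueuePosition` and its method names are made up for illustration. Two things in it are my own assumptions rather than part of the description above: trades are assumed to eat the queue strictly from the front, and when one side of the queue is too small to absorb its share of a cancel, the overflow is pushed to the other side (one possible way to handle that edge case).

```python
class QueuePosition:
    """Estimated FIFO queue position for one resting limit order."""

    def __init__(self, order_qty: int, resting_qty: int,
                 reduce_ratio: float = 0.1):
        if not 0.0 <= reduce_ratio <= 1.0:
            raise ValueError("reduce_ratio must be between 0 and 1")
        self.order_qty = order_qty        # size of our own order
        self.qty_in_front = resting_qty   # we join last in line
        self.qty_behind = 0
        self.reduce_ratio = reduce_ratio  # share of cancels assumed in front
        self.filled = 0

    def on_add(self, qty: int) -> None:
        # New orders always queue up behind us.
        self.qty_behind += qty

    def on_cancel(self, qty: int) -> None:
        front = min(round(qty * self.reduce_ratio), self.qty_in_front)
        behind = qty - front
        if behind > self.qty_behind:
            # Assumption: cancels that cannot come from behind us
            # must have come from in front.
            front = min(front + behind - self.qty_behind, self.qty_in_front)
            behind = self.qty_behind
        self.qty_in_front -= front
        self.qty_behind -= behind

    def on_trade(self, qty: int) -> int:
        """Apply a trade at our price; return the quantity we get filled."""
        # Assumption: trades consume the queue from the front.
        consumed = min(qty, self.qty_in_front)
        self.qty_in_front -= consumed
        remaining = qty - consumed
        fill = min(remaining, self.order_qty - self.filled)
        self.filled += fill
        # Whatever is left trades against the orders behind us.
        self.qty_behind = max(0, self.qty_behind - (remaining - fill))
        return fill


# Join a 500-lot bid with a 1-lot order.
q = QueuePosition(order_qty=1, resting_qty=500, reduce_ratio=0.1)
q.on_add(200)             # in front 500, behind 200
q.on_cancel(100)          # 10 in front, 90 behind: in front 490, behind 110
q.on_trade(490)           # everyone ahead of us is filled
our_fill = q.on_trade(5)  # we are first in line now, so we get our 1 lot
```

Swapping the constant `reduce_ratio` for a function of the current qty_in_front would be one way to try the dynamic version.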
How do you guys go about this? I’d love to hear.