Queue Position Simulation

First off, Happy Thanksgiving! If time permits in the coming months I’d like to explore more on how I look at High Frequency (HF) data. Hopefully along the way I can spark some new discussion and improve on my thought process.

HFT strategy “simulation” is no easy task. I call it a simulation because it’s purely an approximation of how a strategy would have performed given a set of execution assumptions the researcher made beforehand. Should the assumptions change, the results would also change (significantly).

In my line of work, the edge we are seeking is generally less than a tick (futures). For this to be worthwhile, costs must be low AND we need to trade a lot. This may sound foreign to most of my readers, whose time frames are generally much longer (days, weeks, even months). But at the end of the day, how much money we make is a simple function of our alpha * number of times we trade.

In HFT, execution is king. You can be right about where the market moves on the next tick, but if you can’t get a fill, you are not making any money. It is therefore paramount that we make accurate execution assumptions when we conduct HF simulations.

Queue position is worth a lot. Being first in line and getting a fill is like owning a call option in my world (where the premium is exchange fees per contract). The worst that can happen is you scratch, assuming you are not the slowest one and there are people behind you. The image below is an analysis of the expected edge you’d get N events out (x-axis), assuming you are in various spots within the FIFO queue (QP_0 = first in line, QP_0.1 = 10th in line if there was 100 qty). As you can see, the further back in line you are, the more you are exposed to toxic flow, a fancy word for informed traders.

 

How does one take this into account when simulating a strategy? When you place a limit order on the bid, how do you know when you will be filled? This depends on two factors: your place in line and trade flow. As time progresses there will be people who add orders to the FIFO queue, people who cancel orders and people who take liquidity (trade). These actions need to be tracked tick by tick (or packet by packet) during a simulation.

While most people assume tick data is the most fine-grained dataset one can have for such simulations, there actually exists packet data. Tick data simply gives you an aggregated snapshot of what an orderbook looks like: best bid, best offer, bid qty, ask qty (this is known as Market by Price). Packet data, on the other hand, contains all the actions taken by all the market participants. This includes trade matches and order submissions. This feed is also known as Market by Order, and it’s up to the market participant to build and maintain their own orderbook. Using packet data for simulation would be optimal, as you will know exactly where you are in line.
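To make the "know exactly where you are in line" point concrete, here is a minimal sketch of one price level rebuilt from order-by-order events. `LevelQueue` and its method names are hypothetical, and the sketch assumes the feed identifies the resting order on every cancel and execution; it ignores order modifications that lose priority.

```python
from collections import OrderedDict


class LevelQueue:
    """FIFO queue at one price level, built from order-by-order events (hypothetical helper)."""

    def __init__(self):
        # order_id -> remaining qty; insertion order is queue priority.
        self._orders = OrderedDict()

    def add(self, order_id, qty):
        # New orders join the back of the queue.
        self._orders[order_id] = qty

    def cancel(self, order_id):
        self._orders.pop(order_id, None)

    def trade(self, order_id, qty):
        # Assumes the feed names the resting order that was executed.
        remaining = self._orders.get(order_id, 0) - qty
        if remaining > 0:
            self._orders[order_id] = remaining  # keeps its place in line
        else:
            self._orders.pop(order_id, None)

    def qty_ahead(self, order_id):
        # Sum the qty of every order that has priority over order_id.
        ahead = 0
        for oid, qty in self._orders.items():
            if oid == order_id:
                return ahead
            ahead += qty
        raise KeyError(order_id)
```

In a simulation you would insert your own simulated order with `add` when it reaches the book and read `qty_ahead` after each event; no guessing about cancels is needed.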

When you only have tick data, the only way to conduct this type of simulation is to make assumptions. Here is a simple example. When you place a limit buy on the bid you are going to be last in line. You keep track of two variables, qty_in_front and qty_behind; when you join, qty_in_front is the displayed bid qty and qty_behind is 0. Additions are straightforward: just add them to qty_behind. Trades come off the front of the queue, so they reduce qty_in_front. Cancels are a little more tricky because you don’t know whether they are coming from people in front of you or people behind. A workaround is to have something I call a reduce ratio. It can take a value between 0 and 1 and it controls the percentage of cancels that happen in front of you. For example, in ES simulations, I would set this to around 0.1, i.e. when a total of 100 qty is cancelled, I’d assume 10 happens in front of me and 90 happens behind me. There are edge cases, but I’ll leave the reader to figure those out themselves. This is just a way, not the only way, of going about simulating a FIFO queue. More complicated ways include dynamically adjusting the reduce ratio as you approach the front of the queue.
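The bookkeeping described above can be sketched roughly as follows. The class and method names are hypothetical, and the way cancels are clamped when one side of the queue runs out is my own assumption for one of the edge cases, not something prescribed in the text.

```python
from dataclasses import dataclass


@dataclass
class QueueEstimate:
    """Estimated FIFO position of one resting limit order (hypothetical helper)."""
    order_qty: int
    qty_in_front: int          # displayed level qty at the moment we join
    qty_behind: int = 0
    reduce_ratio: float = 0.1  # assumed share of cancels that happen ahead of us
    filled: int = 0

    def on_add(self, qty: int) -> None:
        # New orders always join the back of the queue, i.e. behind us.
        self.qty_behind += qty

    def on_cancel(self, qty: int) -> None:
        # Split the cancelled qty by reduce_ratio. Assumption: if the back of
        # the queue cannot absorb its share, the overflow comes off the front.
        front = min(round(qty * self.reduce_ratio), self.qty_in_front)
        back = qty - front
        if back > self.qty_behind:
            front = min(self.qty_in_front, front + back - self.qty_behind)
            back = self.qty_behind
        self.qty_in_front -= front
        self.qty_behind -= back

    def on_trade(self, qty: int) -> int:
        # Trades consume the queue from the front; anything left after the
        # qty ahead of us is exhausted fills our order.
        ahead = min(qty, self.qty_in_front)
        self.qty_in_front -= ahead
        fill = min(qty - ahead, self.order_qty - self.filled)
        self.filled += fill
        return fill
```

For example, joining a 100-lot bid with a 5-lot order, then seeing 200 added, 100 cancelled and 92 traded, leaves `qty_in_front` at 0 and gives a partial fill of 2. A dynamic reduce ratio could be added by recomputing `reduce_ratio` inside `on_cancel` from the current `qty_in_front` and `qty_behind`.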

How do you guys go about this? I’d love to hear.

 

4 comments

  1. How to handle queue position in backtests/simulations has been an important topic inside HFTs for years. First it depends on what kind of market data the exchange supplies and what you have available. If it’s order-by-order based, then you know exactly what happens to each price level’s queue as time passes. If it’s price-qty based then you need to estimate how each change to the qty at a particular price will impact the queue.

    In either situation, you need to make some assumptions about your latency in placing orders and getting into the matching engine. If you’re placing new orders into the queue right at the same time competitors are, then you’ll need a way to decide whether you get into the queue before or after them.

    1. 100% agree.

      Simulating latency is hard. The way I do it is to apply a fixed delay (which is a function of my latency) when I send orders. I’d love to know how others are doing it as my way is not a smart one given volume bursts.

      Building a proper market data replay engine is hard, as it’s not easy to re-create the time between the events (i.e. you’d get a lot more packets around numbers than at normal times).

      1. The places that are best at it use the latency from their own orders at that time (order send to ack receive for example) as a proxy for matching engine latency.

        Data replay engines usually just blast through the updates as fast as possible and all the components that care about time use the timestamps on the updates rather than “wall clock”.
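The two ideas from this thread, a fixed order delay and a replay that runs off the data timestamps instead of the wall clock, can be combined in a small sketch. Everything here is illustrative: `replay`, the `on_market_event` callback and the nanosecond units are assumptions, and a measured send-to-ack latency could replace the constant `latency_ns`.

```python
import heapq
from itertools import count


def replay(events, on_market_event, latency_ns):
    """Replay (ts_ns, update) pairs as fast as possible, using their timestamps as the clock.

    on_market_event(ts_ns, update) is a hypothetical strategy callback that
    returns a list of orders to send. Each order is delayed by a fixed
    latency_ns. Returns a log of (arrival_ts_ns, order) in arrival order.
    """
    pending = []   # min-heap of (arrival_ts_ns, seq, order)
    seq = count()  # tie-breaker so orders themselves are never compared
    arrivals = []
    for ts_ns, update in events:
        # Orders sent earlier reach the simulated matching engine once the
        # data clock has passed their arrival time.
        while pending and pending[0][0] <= ts_ns:
            arrival_ts, _, order = heapq.heappop(pending)
            arrivals.append((arrival_ts, order))
        for order in on_market_event(ts_ns, update):
            heapq.heappush(pending, (ts_ns + latency_ns, next(seq), order))
    # Flush orders still in flight when the data ends.
    while pending:
        arrival_ts, _, order = heapq.heappop(pending)
        arrivals.append((arrival_ts, order))
    return arrivals
```

Because the clock only advances with the events, bursts in the data are reproduced in simulated time without the engine ever having to sleep.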
