Archived Discussions

James Goode, Consultant Programmer

Saturday, December 20, 2014

How extensive is this error? Is it just one erroneous value, or is there a series of incorrect values reported over a period of say 10 minutes or more?

I assume you confirmed the incorrect data from independent sources.

How time critical is the data? Is it possible to pause for a second or two to check data from alternate sources? Or would that damage the strategy performance? Last time I used IB it seemed that the fastest data was sub-second, but not much faster (1/3 of a second).

Alex Krishtop, trader, researcher, consultant in forex and futures

Saturday, December 20, 2014

First off, what do you use to access their market data? TWS, Gateway, FIX API, something else? What is the architecture of your programmatic solution? Why do you need to connect to two data sources from the same vendor?

What's the market? FX? What are trading sizes in both accounts then? FX liquidity and volume at IB may be seriously different for different accounts depending on a number of factors.

What are the account types — individual, advisor, institutional? Kind of data subscription?

Unfortunately there's just too little input data to advise anything meaningful. If you elaborate on these points, I'd be happy to help if I can.

Rob Terpilowski, Software Architect

Sunday, December 21, 2014

The software trades the US equity markets, in 2 separate individual accounts, with 2 instances of the IB Gateway running on my desktop. Each gateway is talking to its own instance of my automated trading application, and so each app is subscribing to IB's market data, hence the reason for the 2 market data subscriptions.

The application will buy at the close if the volume for a security is above a certain threshold, so speed is not necessarily critical, and I could in theory try to verify the volume data from a 2nd source if I had to.

What I've been seeing is that throughout the day the volume numbers that the application reports from the 2 accounts start to diverge as the day goes on, until the end of the day when one account will show volume that is about 25% or so higher than the other account.

I was curious if this was potentially an issue with the accounts' data feeds, or with the IB Gateway itself, so yesterday I fired up 2 instances of TWS, one for each account, and pulled up quotes for QQQ. Sure enough, each instance was reporting different volume numbers, with one of the accounts showing about 20% higher volume.

I was able to chat with someone from tech support, and learned that IB has 2 methodologies they use to report volume, "Native" and "Calculated" defined below:

Native volume - Does not update with every tick, but will include delayed transactions, busts, late-reported trades and combos.

Calculated volume - Updates with every tick, but may not include delayed transactions, busts, late-reported trades and combos.

In TWS you can toggle the volume column to use Native or Calculated, but the Gateway has no such mechanism. The IB support analyst I was chatting with was at a loss as to why both gateway instances weren't reporting the same volume, and has escalated the issue to their developers. The problem first surfaced about a week ago. I haven't made any changes to my application in years, and my Gateway/TWS version hasn't been updated for about 4 months, so it appears to be an issue on IB's server side, but we'll see what their response is after they've had a chance to research this.

Gary E., Owner, Erdman Computer Consulting, Inc. and Investment Management Consultant

Sunday, December 21, 2014

I get my 5-minute quote data from TD Ameritrade (I used to get data from PC Quote or MarketSmart - that data was real bad but it made me write the logic needed to make sure I only have "clean" data). Volume received is always cumulative volume for the day.

I only use volume to determine if a stock is currently in what I call an "extreme" trading mode.

If I receive a quote with the same volume as a previous quote received the same day, then I know the quote is bad and I reject the quote. This may be what's happening with you. This particular problem has been around the industry for at least the last 10 years. The source of this problem is with the data source itself and everyone uses the same data source (Comstock?). These "bad" volume records will repeat throughout the day periodically and then that will be it for several weeks/months for that symbol.

If a quote comes in with zero volume it is rejected.

If volume is less than the previous quote's volume, then the quote is rejected.
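The three rejection rules in this post can be sketched in Python roughly as follows. This is only an illustration under assumptions: `CumulativeVolumeFilter` and its methods are hypothetical names (not part of any vendor API), each quote is assumed to carry the cumulative volume for the day, and each quote is compared against the last accepted quote rather than against every earlier quote.

```python
from typing import Optional


class CumulativeVolumeFilter:
    """Hypothetical per-symbol filter for quotes carrying cumulative daily volume."""

    def __init__(self) -> None:
        # Cumulative volume of the last accepted quote today (None before the first).
        self.last_volume: Optional[int] = None

    def reset_for_new_day(self) -> None:
        """Forget yesterday's volume so the first quote of the day can pass."""
        self.last_volume = None

    def accept(self, volume: int) -> bool:
        """Return True if the quote passes the checks, False if it is rejected."""
        if volume <= 0:
            # Zero volume is rejected; treating negatives the same way is an assumption.
            return False
        if self.last_volume is not None and volume <= self.last_volume:
            # Same as, or less than, the previous accepted quote's volume.
            return False
        self.last_volume = volume
        return True


# Made-up cumulative volumes, purely for illustration.
quote_filter = CumulativeVolumeFilter()
results = [quote_filter.accept(v) for v in [0, 1000, 1000, 900, 1500]]
print(results)  # [False, True, False, False, True]
```

One filter instance per symbol keeps the state simple; calling `reset_for_new_day` at the open avoids rejecting the first quote of a new session.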

Colin "Soup" Campbell, Trader at IFundTraders

Sunday, December 21, 2014

There is only one way to be sure. Multiple data sources. Multiple brokers may have the same data source with errors, so you have to have multiple independent data sources and a voting system. Exact reporting is unlikely, so you need a comparison with a tolerance built into a voting algo. Not cheap, so you either need it, or you don't.
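A voting scheme with a built-in tolerance might look something like the sketch below. It is not anyone's production method: `vote_on_value`, the 2% default tolerance, and the choice of the median as the reference point are all assumptions made for illustration.

```python
from statistics import median
from typing import Optional, Sequence


def vote_on_value(readings: Sequence[float], rel_tolerance: float = 0.02) -> Optional[float]:
    """Return a consensus value, or None when no strict majority of sources agree.

    A reading "agrees" when it lies within rel_tolerance of the median of all readings.
    """
    if len(readings) < 3:
        # With fewer than three sources a disagreement cannot be resolved by voting.
        return None
    mid = median(readings)
    agreeing = [r for r in readings if abs(r - mid) <= rel_tolerance * abs(mid)]
    if 2 * len(agreeing) <= len(readings):
        # No strict majority within tolerance: treat the data point as invalid.
        return None
    return median(agreeing)


# Made-up readings from three hypothetical sources.
print(vote_on_value([1000, 1010, 1300]))  # 1005.0 (the 1300 reading is outvoted)
print(vote_on_value([100, 200, 300]))     # None (no two sources agree within 2%)
```

Returning `None` rather than a best guess leaves the decision about what to do with an unresolved data point to the caller.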

Alex Krishtop, trader, researcher, consultant in forex and futures

Monday, December 22, 2014

Rob, if volume data is so critical then I believe you want to go for a direct data feed from the exchange, as no retail vendor will deliver you complete information in real time (I suppose many of them can't deliver it even in historical data). As to the particular issue with IB: do these 2 accounts have the same subscription to market data? And maybe a stupid question: what are the settings in each instance of IB Gateway, especially those marked in this screen: http://edgesense.net/showcase/ib-gateway-settings.png ?

Bill Zhimin Yang, Quantitative Researcher at Taikang Asset Management

Monday, December 22, 2014

Not an advertisement, but I read a recommendation on Caltech's website for Quantquote as a data source. I was going to buy some historical data there but haven't yet, so this is just a reference, not from my own experience (disclaimer).

Robert Simons, Associate & Senior International Market Strategist

Monday, December 22, 2014

As Colin said, and I think we all agree on this: "Multiple sources" and therefore "Multiple data". That is precisely what I've been using and trading with for the last 15 years, for the same reasons you've described in your article and then some. It doesn't fix the issue at hand, but at least I'm not just another dumb ass staring at a screen taking the data for granted.

Rob Terpilowski, Software Architect

Monday, December 22, 2014

Alex, I've verified that both accounts have the same market data subscriptions. I'm running the gateways on 2 separate API sockets and both are set to a 30 second timeout with no Master API client ID set.

Thanks for the link Bill, I'll likely be in the market for historical intraday data, so this may work out well.

Robert/Colin, what do you do when the difference in your data sources exceeds your tolerance?

Alex Krishtop, trader, researcher, consultant in forex and futures

Monday, December 22, 2014

Rob, have you tried running them on different computers? Does only one particular account report erroneous volume information, or does it happen randomly from one run to another?

Rob Terpilowski, Software Architect

Monday, December 22, 2014

Alex,

Tried it on another machine, but still no joy. The one account consistently reports the "native" volume which is considerably higher than the "Calculated" volume.

For example today QQQ:

IB native volume: 42.1M

IB calculated volume: 34.1M

Still haven't received any word from the IB devs regarding what the issue may be.

Jonathan Kinlay, Quantitative Research and Trading | Leading Expert in Quantitative Algorithmic Trading Strategies

Tuesday, December 23, 2014

I am not at all surprised by this. IB's market data feed is notoriously unreliable, with very poor granularity. The obvious solution is to get a better data feed. Amongst retail platforms, I have found Tradestation's market data to be more consistent and higher quality than IB's, for example. If you want to stick to IB, could you simply set up the two accounts as sub-accounts and allocate trades between them, rather than treating them as separate, independent accounts?

Robert Carver, Proprietary systematic trader, writer and freelance researcher.

Tuesday, December 23, 2014

I don't use volume data except to decide when to roll futures contracts, and I use the same basic checks as Gary. I do however use prices and it's true that IB feed of these can sometimes be flaky, for no particular reason.

Having said that, the error count isn't much worse than I saw using 'professional' data, e.g. BB and Reuters feeds, though it is definitely a bit worse. What IB is particularly bad at is explaining and fixing problems; their SLA isn't as high as when you are paying the likes of BB several million bucks a year for data. Not the same issue, but for example sometimes the wrong contract expiry comes back for a fill, which means I get a break. I've pointed out the problem several times but it still happens.

I take a very fatalistic attitude to this - data will sometimes be bad and it's better to handle bad data robustly than to try and find perfect data.

I don't believe multiple sources are the answer because (a) to automate collection from multiple sources you need at least 3 to resolve disagreements, (b) you will sometimes see the same error in multiple sources because the original data from the exchange is corrupted, (c) the benefits don't justify the extra cost and complexity.

If speed is not hugely important there is a lot to be said for the approach of filter and fallback to manual, which is what I use for prices. First check for zero and negative prices, which IB is fond of producing. Then a check on whether the move is more than x sigma beyond the usual will catch whatever is left. Yes, you will be a bit slow reacting to genuinely large moves, like a flash crash, but you might not have wanted to trade on those anyway.
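The two checks described here could be sketched as below. This is an illustration only: `price_needs_review`, the default of 4 sigma, and the use of the standard deviation of recent price changes as "the usual" move are assumptions, not the poster's actual code.

```python
from statistics import pstdev
from typing import Sequence


def price_needs_review(new_price: float,
                       recent_prices: Sequence[float],
                       max_sigma: float = 4.0) -> bool:
    """Return True when the new price should be held back for manual checking."""
    if new_price <= 0:
        # Zero and negative prices are rejected outright.
        return True
    if len(recent_prices) < 3:
        # Too little history to estimate a usual move; let it through (assumption).
        return False
    changes = [b - a for a, b in zip(recent_prices, recent_prices[1:])]
    sigma = pstdev(changes)
    move = abs(new_price - recent_prices[-1])
    if sigma == 0:
        # Flat history: any move at all is flagged (a conservative assumption).
        return move > 0
    return move > max_sigma * sigma


# Made-up price history, purely for illustration.
history = [100.0, 101.0, 100.0, 101.0, 100.0]
print(price_needs_review(-1.0, history))   # True: negative price
print(price_needs_review(101.5, history))  # False: a 1.5 move against sigma of 1.0
print(price_needs_review(120.0, history))  # True: a 20 move is far beyond 4 sigma
```

A `True` result is the point at which the strategy would stop and hand over to a person instead of trading on the suspect price.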

Finally, building up the rest of our systems so that one bad price doesn't kill them, e.g. by using median smooths, is worth doing. All this will be difficult in a high frequency environment, but as the OP said, that isn't what we're dealing with here.
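As a rough illustration of why a median smooth tolerates a single bad price, here is a small rolling-median sketch. `RollingMedian` and the window of 3 are hypothetical choices for the example, not a description of anyone's actual system.

```python
from collections import deque
from statistics import median


class RollingMedian:
    """Hypothetical rolling median over the last `window` prices."""

    def __init__(self, window: int = 5) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        # deque with maxlen drops the oldest price automatically.
        self._values = deque(maxlen=window)

    def update(self, price: float) -> float:
        """Add a price and return the median of the current window."""
        self._values.append(price)
        return median(self._values)


# Made-up prices with one bad zero print in the middle.
smoother = RollingMedian(window=3)
smoothed = [smoother.update(p) for p in [100, 101, 0, 102, 103]]
print(smoothed)  # [100, 100.5, 100, 101, 102]
```

The zero never reaches the smoothed series, whereas a moving average over the same window would be dragged down by it for three updates.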

Jonathan Kinlay, Quantitative Research and Trading | Leading Expert in Quantitative Algorithmic Trading Strategies

Tuesday, December 23, 2014

IB's MDF is a lot worse than that. If you look at their intraday data you will see that they were significantly off the NBBO in many of the top names for most of 2012. (Don't know why this didn't turn into a major issue for them). They also send their data in 250 millisec data packets, rather than tick by tick, which can lead to all kinds of inconsistencies and false signals. If you are doing anything that is at all latency sensitive I strongly advise you to avoid using their data feed.

There are lots of good things I can say about the IB platform, including their reasonable (for retail) execution algos and commission rates. They also have a managed account system which allows you to allocate trades from one master account across several sub-accounts. That way you can avoid discrepancies in execution between the accounts. I think there is a limit of 15 sub-accounts.

James Goode, Consultant Programmer

Tuesday, December 23, 2014

@Rob As you are using the API in both program instances, you could put some code in to cross-check the volume numbers and choose the 'best / preferred value' before trades. Given your comment that the volumes start to diverge from the commencement of trading, this could be one approach if you wish to stay with IB.

With this you could also give yourself a warning if IB were to alter their algorithms to correct the discrepancy (as a result of your complaint). Data providers have a habit of altering their algorithms, which affects trades relying on the original algorithm, and they don't always warn traders.
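A cross-check along these lines might be sketched as follows. Everything here is an assumption for illustration: the function name `cross_check_volume`, the policy of preferring the lower or higher figure, and the expected gap band are hypothetical, and how the two application instances would exchange their volume numbers is left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CrossCheck:
    preferred: float  # the volume figure chosen by the policy
    rel_gap: float    # gap between the two figures, relative to the lower one
    warn: bool        # True when the gap falls outside the expected band


def cross_check_volume(vol_a: float, vol_b: float, prefer: str = "lower",
                       expected_gap: Tuple[float, float] = (0.0, 0.30)) -> CrossCheck:
    """Compare two accounts' volume figures and pick one according to `prefer`."""
    if prefer not in ("lower", "higher"):
        raise ValueError("prefer must be 'lower' or 'higher'")
    low, high = sorted((vol_a, vol_b))
    # If the lower figure is zero the relative gap is undefined; treat it as infinite.
    rel_gap = (high - low) / low if low > 0 else float("inf")
    preferred = low if prefer == "lower" else high
    warn = not (expected_gap[0] <= rel_gap <= expected_gap[1])
    return CrossCheck(preferred, rel_gap, warn)


# The QQQ figures quoted earlier in the thread: 42.1M native, 34.1M calculated.
check = cross_check_volume(42.1e6, 34.1e6)
print(check.preferred, round(check.rel_gap, 3), check.warn)
```

Setting the lower bound of `expected_gap` above zero would also raise a warning if the two feeds suddenly started agreeing, which is one way to notice a change on the provider's side.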

Colin "Soup" Campbell, Trader at IFundTraders

Tuesday, December 23, 2014

@Rob - when you cannot determine a valid data point, i.e. the sources don't agree within your limits, there is no other choice than to mark all of the data points invalid. Because volume is cumulative and time sensitive, many of your individual data points self-correct for differences in sample time. A little thought in designing your voting system could reduce your false invalid labels.

Colin "Soup" Campbell, Trader at IFundTraders

Tuesday, December 23, 2014

@Rob - don't forget that different routes mean different equipment, and different equipment means different arrival times because of the buffering requirements of the data streams. If you tighten your limits too much, you might just be reporting the buffering of the internet.

Rob Terpilowski, Software Architect

Friday, December 26, 2014

Jonathan, I'll take a look at the managed account structure and see what (if any) additional effort would be required in order to get my app executing trades in that environment.

@Robert, for this particular strategy and the way it trades, the filter and fallback to manual mode would probably be the best way to go for dealing with discrepancies when they may arise.

Jonathan Kinlay, Quantitative Research and Trading | Leading Expert in Quantitative Algorithmic Trading Strategies

Saturday, December 27, 2014

See this on IB linked accounts: https://www.interactivebrokers.com/en/?f=%2Fen%2Fsoftware%2Fpdfhighlights%2FPDF-LinkedAccounts.php%3Fib_entity%3Dllc

John Devron, Computer Software Professional

Sunday, December 28, 2014

Hi Rob,

I solved my datafeed quality problems by comparing feeds from two different vendors.

I found that no filter will be without flaw because the range of possible valid values and patterns is so diverse. Redundant data feeds worked perfectly for me, without flaw.

For the second feed I just used an account from a different broker.

John

How to guard against market data errors?

20 comments on article "How to guard against market data errors?"