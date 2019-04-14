Vivek Gadodia

"Your Algo Quality is only as good as the Data Quality". A common mistake, made by novices and professionals alike, is to underestimate the importance of data and to be unwilling to pay for it.

Say you are a contractor building a bridge with low-quality cement or other substandard materials. What will happen? Sooner or later, the bridge will collapse.

Data in Algo trading is equivalent to the materials used in construction. Let us look at some of the important aspects to take care of while dealing with data.

Corporate Action adjustments:

There are no “Free Lunches” in the world, even though many sources on the internet offer data for free.

Yahoo! Finance or Google Finance are good free sources to start building a model on daily data. However, such data is often not adjusted for ‘Corporate Actions’ such as bonus issues or stock splits.

Let us say L&T declares a bonus and the stock price halves from Rs 2,000 to Rs 1,000 on the ex-date. If the data is not adjusted for this, and the algorithm has a running sell signal before the ex-date, the back-test will show a huge, spurious profit of 50 percent, just because the price opens around Rs 1,000 the next day.

If we are not careful enough to remove this distortion, we will have over-optimistic expectations. Conversely, if we were in a running buy signal, we will have over-pessimistic expectations.
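To make the arithmetic concrete, here is a minimal sketch of a back-adjustment for the bonus example above. The price series, the function names `back_adjust` and `short_return`, and the adjustment factor of 0.5 are all hypothetical and chosen only for illustration; a real data vendor may apply adjustments differently.

```python
def back_adjust(closes, ex_index, factor):
    """Scale every close before the ex-date by `factor` (0.5 when the price halves)."""
    return [c * factor if i < ex_index else c for i, c in enumerate(closes)]


def short_return(closes, entry, exit_):
    """Fractional profit of a short opened at closes[entry] and covered at closes[exit_]."""
    return (closes[entry] - closes[exit_]) / closes[entry]


# Hypothetical closes; the ex-date is at index 2, where the price halves.
closes = [1980.0, 2000.0, 1000.0, 1010.0]
adjusted = back_adjust(closes, ex_index=2, factor=0.5)

raw_pnl = short_return(closes, 1, 2)    # short held across the ex-date, unadjusted data
adj_pnl = short_return(adjusted, 1, 2)  # same trade on adjusted data

print(f"unadjusted: {raw_pnl:.0%}, adjusted: {adj_pnl:.0%}")
```

On the unadjusted series the short trade appears to earn 50 percent; on the adjusted series the same trade is flat, which is what actually happened to the position.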

Missing Data:

In most free sources, data may be missing for a few months or even a few years. When plotted as a chart, such data shows sudden spikes or gaps, up or down.

All technical indicators and price-action Algos rely on data. Such gaps will spoil the Algo's signals in the back-test and give erroneous results. A moving average may suddenly fall or rise too much, or ADX may spurt up or down.
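A simple check before back-testing is to scan the dates for holes. The sketch below assumes a sorted list of trading dates; the function name `find_gaps` and the threshold of five calendar days (meant to allow for a weekend plus a holiday or two) are assumptions for illustration, not a standard.

```python
from datetime import date


def find_gaps(dates, max_calendar_days=5):
    """Return (previous, current) pairs of consecutive dates that are
    more than `max_calendar_days` apart. Assumes `dates` is sorted."""
    gaps = []
    for prev, cur in zip(dates, dates[1:]):
        if (cur - prev).days > max_calendar_days:
            gaps.append((prev, cur))
    return gaps


# Hypothetical daily series with roughly three months missing.
dates = [date(2018, 1, 1), date(2018, 1, 2), date(2018, 4, 2), date(2018, 4, 3)]
gaps = find_gaps(dates)

for prev, cur in gaps:
    print(f"data missing between {prev} and {cur}")
```

Any pair reported here is a stretch where indicators computed across the hole should not be trusted.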

Intraday Data:

Daily data is still easy to obtain, but intraday data is not. Why would we need it? Remember Infibeam? The stock fell 72 percent in a single trading session.

Both heart and pocket would burn if the Algo was long on the stock before the fall and exited only at the end of the day. Therefore, to minimise risk, one may move to a lower time-frame.

The time-frame of data depends on the smallest time-frame on which we want to build Algos. If we want to build a 15-Minute Algo, we need 15-Minute or lower time-frame data.

We cannot build a 15-Minute time-frame system with Hourly data. Intraday data of the required time-frame must be obtained from authorised data-vendors.
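The asymmetry is easy to see in code: lower time-frame bars can be combined into higher time-frame bars, but not the other way round. The sketch below is one possible way to do this, assuming bars are tuples of (start time, open, high, low, close, volume) in time order and that the target time-frame divides 60 minutes evenly. The function name `resample` and the prices are hypothetical.

```python
from datetime import datetime, timedelta


def resample(bars, minutes):
    """Combine lower time-frame bars into `minutes`-long bars.
    Assumes bars are sorted and `minutes` divides 60 evenly."""
    out = []
    for ts, o, h, l, c, v in bars:
        # Start of the bucket this bar falls into.
        bucket = ts - timedelta(minutes=ts.minute % minutes, seconds=ts.second)
        if out and out[-1][0] == bucket:
            _, bo, bh, bl, _, bv = out[-1]
            # Keep the first open, extend high/low, take the latest close, sum volume.
            out[-1] = (bucket, bo, max(bh, h), min(bl, l), c, bv + v)
        else:
            out.append((bucket, o, h, l, c, v))
    return out


# Hypothetical 5-minute bars.
bars_5m = [
    (datetime(2019, 1, 2, 9, 15), 100.0, 102.0, 99.0, 101.0, 500),
    (datetime(2019, 1, 2, 9, 20), 101.0, 103.0, 100.0, 102.0, 300),
    (datetime(2019, 1, 2, 9, 25), 102.0, 102.0, 97.0, 98.0, 700),
    (datetime(2019, 1, 2, 9, 30), 98.0, 99.0, 95.0, 96.0, 400),
]
bars_15m = resample(bars_5m, 15)
```

The reverse is not possible: an hourly bar records only one open, high, low and close, so the four 15-Minute bars inside it cannot be recovered.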

Dead Stocks:

Without dead stocks, our data is not free of survivorship bias. Such data may be difficult and expensive to obtain, but it is very valuable. It would include stocks which have been de-listed or suspended, like Satyam.

It would give the true picture of the strategy as it stood at that point in time. For example, in 2009, did the Algo trade Satyam and, if yes, how did the event impact the Algo's performance?
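In practice this means the back-test should pick its stocks from the universe as it existed on each date, not from today's list. A minimal sketch, assuming we have listing and de-listing dates for every stock; the symbols, the dates and the function name `universe_on` are hypothetical.

```python
from datetime import date

# Hypothetical records: (symbol, listed_on, delisted_on or None if still trading).
LISTINGS = [
    ("AAA", date(2000, 1, 3), None),
    ("BBB", date(2001, 6, 1), date(2010, 3, 31)),
    ("CCC", date(2012, 5, 2), None),
]


def universe_on(listings, day):
    """Symbols that were actually tradable on `day`, including ones that died later."""
    return [
        symbol
        for symbol, start, end in listings
        if start <= day and (end is None or day <= end)
    ]


# What a survivors-only data set would offer for any date.
survivors_only = [symbol for symbol, _, end in LISTINGS if end is None]

# What the Algo could really have traded in early 2009.
in_2009 = universe_on(LISTINGS, date(2009, 1, 7))
```

Here the survivors-only list misses BBB, which was tradable in 2009, and wrongly offers CCC, which was not yet listed.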

Open Trade:

The last running trade, which is still open at the end of the back-test, has to be either dropped or closed (i.e. notionally squared-off) at the last closing price. Otherwise, the back-test may show a profit or loss of over 100 percent just because of this single open trade. Although a small point, it needs to be plugged to get a realistic picture.
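A sketch of the notional square-off is given below. It assumes trades are stored as dictionaries with an `exit` of `None` while the trade is open, and it assumes that one way the distortion can creep in is a missing exit price being treated as zero; the article does not say how a given back-testing tool handles this, so treat both as illustrative assumptions. The function names are hypothetical.

```python
def trade_return(side, entry, exit_price):
    """Fractional return of one trade; side is +1 for long, -1 for short."""
    return side * (exit_price - entry) / entry


def close_open_trades(trades, last_close):
    """Return a copy of `trades` with any open trade (exit is None)
    notionally squared off at `last_close`."""
    return [
        {**t, "exit": last_close} if t["exit"] is None else t
        for t in trades
    ]


# Hypothetical trade log; the second trade is still running.
trades = [
    {"side": 1, "entry": 100.0, "exit": 110.0},
    {"side": 1, "entry": 120.0, "exit": None},
]
last_close = 126.0

# Naive handling (assumed): a missing exit is read as a price of zero.
naive = [trade_return(t["side"], t["entry"], t["exit"] or 0.0) for t in trades]

# Open trade squared off at the last close instead.
closed = close_open_trades(trades, last_close)
realistic = [trade_return(t["side"], t["entry"], t["exit"]) for t in closed]
```

With the naive handling the open long shows a 100 percent loss; squared off at the last close it shows a 5 percent gain.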

(The author is Vivek Gadodia, Co-Founder at Dravyaniti Consulting LLP)

