The biggest nightmare of Indian economists is the quality of data. Sample this from the latest Index of Industrial Production (IIP) release for August.
Stainless or alloy steel grew by 160.8 percent, air conditioners grew by 80.1 percent and plastic machinery including moulding machinery grew by 59.4 percent.
At the other end, items showing high negative growth include telephone instruments (mobile phones and accessories), which shrank by 57.2 percent, while computers were down 49.5 percent.
How can air conditioners grow by 80 percent year-on-year, or plastic machinery by 60 percent, when the gross domestic product (GDP) is growing by 5 percent? How can telephone instruments fall 57 percent when companies have reported more telephone connections?
Economists and businessmen are constantly taken aback by the amazing volatility of the IIP data. Their horror grows when they are told the IIP data is used to calculate GDP and other data.
Another big gap in India’s statistics is the lack of labour data. India must be one of the few countries with no single unemployment figure, not even a lagged one. Services are a further gap: although 60 percent of India’s GDP comprises services, the data is infrequent and extremely sketchy. Financial services grew in double digits over the last several quarters while economic activity remained weak, and hence the number continues to be suspect. As several policy makers have said, the biggest constraint on good policy is the lack of good data.
CNBC-TV18’s Latha Venkatesh spoke about the quality of India's macro-economic data with TCA Anant, Chief Statistician and Pranob Sen, Chairman, National Statistical Commission.
Below is the edited transcript of the discussion:
Q: Why is the IIP data so full of extreme numbers? And since it is year-on-year, we cannot even say it is seasonal.
Anant: You may be aware that some time back the Maruti factory at Gurgaon, which produces virtually all of its cars in India, had serious labour trouble. Its output went to zero in a very short period of time. For a fixed single entity, these types of fluctuations do happen. There could be any number of reasons why a single entity sees volatility in its production.
Now in principle, when you disaggregate IIP down to individual items, the volatility you are seeing is a reflection of how many entities are producing that item and how many of them are being captured in IIP. Remember, IIP is an index that has a certain statistical construct. In other words, it has a fixed base and a fixed set of entities that are tracked through a period of time. Now you can turn around and say, why not do something else, but that is a different problem. This index is defined like this, and this definition is given very clearly in its methodology.
Q: While one-off volatile data like a minus 160 percent or plus 160 percent can be explained by a strike, my sense is, and I am pretty certain about this, that the volatility is far more than the number of strikes we have. As the head of the statistics commission, is this a problem you are addressing, and if so, how are you planning to address it?
Sen: In a sense, it links up to what Anant said. The fact of the matter is that, as a country, for a large number of individual products there are very few companies which totally dominate. Now it is not just a question of strikes. Every firm, every unit will have planned downtime; no plant can work 365 days a year on the same basis. These plant downtimes get reflected in the way the data gets captured.
Now this would have been less of a problem if there had been a large number of units, because then, hopefully, things would have averaged out across the companies.
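The averaging-out argument can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the CSO's method: the units are taken to be equal-sized and independent, each unit's growth is drawn from a normal distribution, and the function name `simulate_item_growth` and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_item_growth(n_units, n_months=240, unit_sd=0.30, seed=0):
    """Standard deviation of an item's growth rate when the item is made
    by n_units equal-sized, independent units (illustrative assumption)."""
    rng = random.Random(seed)
    item_growth = []
    for _ in range(n_months):
        # Each unit's growth is an independent draw, standing in for
        # strikes, planned downtime and restarts at that unit.
        unit_growth = [rng.gauss(0.05, unit_sd) for _ in range(n_units)]
        # The item's growth is the simple average across its units.
        item_growth.append(sum(unit_growth) / n_units)
    return statistics.pstdev(item_growth)


few = simulate_item_growth(n_units=2)
many = simulate_item_growth(n_units=50)
print(f"2 units: sd = {few:.3f}; 50 units: sd = {many:.3f}")
```

Under these assumptions the spread of the item's growth shrinks roughly with the square root of the number of units, which is why an item dominated by two or three producers looks far more volatile than one made by dozens.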
Q: Can we correct it in some fashion? These are days of such high automation. Surely we can connect this electronically to excise duty data, or capture it in a different fashion so that a range of sectors or companies in the unorganised sector are captured. Is there some way to correct it, given the evolution in electronics?
Sen: There are natural limitations to what can be done in terms of volatility. There are two ways of reducing volatility. The first, of course, is to increase the number of units, but given the fact that we have a very long-tailed industry structure for most products, you cannot get away from what is happening. There are very few companies that produce the bulk of the output, and therefore whatever volatility is there in their production is going to get reflected in the data.
The second way of doing it is to aggregate. So instead of individual product-level data, you have category-level data, which will be smoother.
But again, look at the use that people make of this information. Some take it at the totally aggregate level, that is, the IIP itself, which is used essentially by finance companies and finance departments to track the movement of the economy, and by us to calculate the quarterly GDP and so on. On the other hand, you have industry-level analysts who want industry-level data, and there is product-level data as well. So you have got both extremes and you just have to live with it. They are telling you different things, and what each is telling you is legitimate at both ends.
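A short worked example shows why aggregation smooths the picture. It assumes a fixed-weight index of the general kind described above; the weights, the year-ago production relatives and the growth of the "rest of the basket" are hypothetical, and only the three item growth rates are the ones quoted in the article.

```python
# (weight, production relative a year ago with base = 100, year-on-year growth)
# Weights, year-ago relatives and the last row are hypothetical.
items = {
    "stainless/alloy steel": (1.0, 100.0, 1.608),
    "air conditioners": (1.0, 100.0, 0.801),
    "telephone instruments": (1.0, 100.0, -0.572),
    "rest of the basket": (97.0, 100.0, 0.02),
}


def index_level(items, use_current):
    """Weighted average of item relatives using fixed weights."""
    total_weight = sum(weight for weight, _, _ in items.values())
    weighted = 0.0
    for weight, year_ago, growth in items.values():
        relative = year_ago * (1.0 + growth) if use_current else year_ago
        weighted += weight * relative
    return weighted / total_weight


def aggregate_growth(items):
    """Year-on-year growth of the aggregate index."""
    previous = index_level(items, use_current=False)
    current = index_level(items, use_current=True)
    return current / previous - 1.0


print(f"aggregate growth: {aggregate_growth(items):.2%}")
```

With these made-up weights, swings of more than 50 percent in three small items move the aggregate by only a few percentage points, which is the sense in which product-level and aggregate numbers can both be legitimate while telling different stories.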
Q: On the move from WPI to CPI: CPI has not evolved along the same lines as WPI. The same level of disaggregation is not there, and there is a lot of confusion. For instance, part of diesel gets reflected in core CPI, under transportation. In 'miscellaneous and others', the level of detail is not there. Is one of the projects before you and the CSO to improve the disaggregation in the CPI?
Sen: The data is disaggregated enough. We just don’t release it, and we do not release it for precisely the reason Anant gave: you are going to get a huge amount of volatility at the product level. These are indices, and indices are composed by taking a number of things, putting them all together and then expecting that their individual fluctuations will cancel themselves out. But in so far as the database is concerned, we have it down to the individual products.
Q: I am asking why disaggregated CPI data isn’t released.
Anant: There are two things. One, CPI is a new series, and before we release disaggregated data, we want to be sure that the data flow is smooth. Two, at the disaggregated level CPI data suffers from one limitation, and we are working to make sure that our release pattern conforms to it: in a large number of items, it comes down to issues relating to specification change. We collect data on items. Items go in and out of the market based on people’s preferences and characteristics. When an item goes out, CPI has a method of replacing it with the nearest item that has become more popular. For example, a particular type of clothing, or a particular type of fan or electrical product, or whatever is being used by the consumer, may not be available and something else has come in. However, whenever specifications change, the price level changes. CPI has a well-defined method; it is there, and you will have to read the detailed methodology as to how specification changes are handled and introduced into the index calculation. When we release disaggregated data, these will have to be flagged and pointed out. Yes, it is under consideration, but the caveats that will accompany the release have to be clearly delineated and worked out before we can start the process of release.
Q: That’s welcome if they come with caveats. My sense is that every month when you release data, even IIP data for that matter, the CSO could present such caveats. For instance, in recent months we saw a dip in IIP in July and August. It was because of the shutdown of the Nokia plant in Chennai, which could have been mentioned. So perhaps a richer list of caveats while collecting and disseminating the data would help.
Anant: Please appreciate that there is a trade-off involved in very detailed reporting of the individual problems that occur. Every item, every sector has some issue which is certainly worth noting. If all of this were to be documented and incorporated, it would be physically impossible for a small unit to produce it.
Q: As the head of the statistics commission, what are the top priority areas that you are concentrating on in terms of data quality improvement?
Sen: The fact is that there is a need to improve data quality in a number of dimensions. We have already talked about the price indices and the index of industrial production, and following from this there are implications for all sorts of other numbers that come out, such as the GDP and so on. But much more important than that, what we keep forgetting is that for a country like India it is also the responsibility of the statistical system to provide data not just on the economy but also on social indicators, and on social indicators we are lacking, to say the least.
We do have systems in place, and we do bring out a certain amount of data, but these tend to come out with fairly serious time lags, and we are also not able to track changes very well on an ongoing basis. So a lot of focus on social indicators has to come about. If you take, for instance, indicators on health and well-being, it is done through the National Family Health Survey, which happens every six years or so, and we just don’t have any device for tracking what is happening between those six years.
We have a similar problem with employment: we get our employment data every five years. So we are trying to fill in these gaps, which are important.
Q: We still don’t have an unemployment figure, which one can understand because we have such a large force of unorganised labour and such a small percentage of organised labour. Nevertheless, is there any way in which we can get some kind of high-frequency data on labour anytime soon?
Anant: I should distinguish between two things. High-frequency data on labour is something that we are working on, and we are hoping very soon to be able to introduce a survey mechanism which will generate more frequent estimates of issues relating to labour.
But having said that, I do want to return to something which was there in your opening paragraph. Measures of unemployment in India are complicated. They are complicated because the Indian labour market does not function in a manner in which the line between employment and unemployment is easily demarcated. Fifty percent of our workforce works as self-employed, for which the concept of unemployment doesn’t exist. Our share of self-employed workers is among the highest in the world. So this is something which we have to keep in mind.
Q: On rural wages: the labour ministry had one series of rural wages, there was a disconnect, and then a new series came about. Are you working on giving us a more uniform time series which will henceforth be adhered to by either the CSO or the labour ministry?
Sen: Well we haven’t actually got down to that yet but I agree that is important. If you think about the series that the labour bureau used to put out, those were collected for very specific purposes. Now we are actually working on a more general series on wages. Hopefully, we should have that out as well.
Q: What about services? We get this one heading, 'trade, hotels, transport and communication', which accounts for 25 percent of the GDP and 50 percent of the services GDP, and we have hardly any clue how it is put together. Is there any process of refining services data and making it more frequent?
Sen: As far as services data is concerned, the database is in fact extraordinarily poor. The services estimates are put out from the employment side. So you assume a pretty much fixed productivity per worker and then you project output from the number of workers. But there are some components of services which are linked to the commodity production sector, such as trade. Trade itself is actually linked to agricultural production and to the IIP. So we use a mix-and-match technique, but you are absolutely right: on services the database is extremely poor. We are at the moment trying to work on a services equivalent of the Annual Survey of Industries to see whether or not we can improve things significantly.
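The employment-side approach Sen describes can be sketched in a few lines. This is an illustration of the stated assumption (fixed output per worker carried forward from a benchmark year), not the actual national accounts procedure; the function name `project_output` and every number are hypothetical.

```python
def project_output(benchmark_output, benchmark_workers, current_workers):
    """Project services output from employment, holding output per worker
    fixed at its benchmark-year level (the assumption described above)."""
    productivity = benchmark_output / benchmark_workers
    return productivity * current_workers


# Hypothetical benchmark year: output of 1000 units from 50 workers.
# If employment rises to 55, the projection rises in the same proportion.
projected = project_output(1000.0, 50.0, 55.0)
print(projected)
```

The design makes the weakness visible: the projection moves only with employment, so any genuine change in productivity per worker goes unrecorded until a new benchmark is available.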
Q: Is there anything that we should expect from your table on these lines – services and labour?
Anant: I have already told you about labour. So far as services are concerned, there are principally two or three different sources from which data can come, and we are exploring all of them. One: at the moment most of our data on services comes either from employment or from an assessment of corporate returns, which is a second source of data on services. So far as corporate returns are concerned, we are hoping to tap into a much richer database, and that has been part of our plan. It has become available through the efforts of the ministry of corporate affairs to bring corporate reporting on to a comprehensive online platform, the MCA database. So this will be one major source.
The second source is under examination, but I cannot predict any timeframe for it because there is uncertainty at two levels. Both the service tax and excise duties (excise is more for production, while service tax has a more modernised system) are still in transition, and what we are hoping is that they will move to an integrated system with the goods and services tax, with whom we have also been working. That will give us a second source, looking at the sector from a slightly different angle, which is tax paid.
The third exercise, which Pranob referred to, is one we have been experimenting with, and we hope that we will be able to finalise and recommend on that basis: a parallel survey, similar to what we are doing in industries, to cover the service enterprises.
If all three elements are made available then our data picture on services would become very complete. At the moment what we are doing is we are working with partial data on the corporate accounts and the rest is projections from different indicators and sources available in the commodity sector in different forms.
Sen: What is happening, as Anant said, is that alternative sources of data, particularly of corporate data, are coming in, which means that hopefully our estimates of corporate production, corporate investment and so on will be significantly better. The problem continues to be the non-corporate sector, where we have to rely upon surveys. What happens is that for high-frequency estimation, such as the quarterly GDP series, we assume that the unorganised sector is mimicking the behaviour of the corporate sector. This is okay up to a point, but the fact of the matter is that, particularly in services, even more so than in manufacturing, the two may work very differently because they serve very different markets. So the assumption may not be particularly good, but there is very little that one can do. So let us do it step by step: get the organised corporate side as accurate as possible, and then try to see, through more detailed, less high-frequency survey-type instruments, how we can link the corporate data to estimates made for the non-corporate sector.