Every day we share increasing amounts of data with the government for myriad reasons, such as accessing public services, improving governance and fostering innovation. However, we rarely stop to consider whether the government may have a different motive for data collection, one that could be detrimental to our interests. The new Draft India Data Accessibility & Use Policy, 2022, whose stated intent is to make all government data shareable and open, shows how the government views citizen data. Here are five key concerns that the Draft Policy raises:
1. Lack of transparency in the consultation and drafting process: The MeitY website states that the draft “has been evolved in consultation with various stakeholders including academia, industry, and government”. However, this process has not been transparent. It is therefore difficult to gauge the intention of each stakeholder and the rationale behind their input, which compromises accountability. Further, best practices for the pre-legislative process have not been followed, including compliance with the provisions of the Pre-Legislative Consultation Policy, 2014, which calls for an accompanying note explaining the Policy’s financial implications and its impact on the environment and fundamental rights, a study of the social and financial costs of the policy, and so on.
2. Perverse revenue objective: Throughout the accompanying background note uploaded on the MeitY website, as well as in the ‘Preamble’ and ‘Objectives & Purpose’ sections of the Policy itself, revenue generation through the sale of citizen data emerges as the primary objective of the Policy. The ‘Preamble’ states that “India’s ambition of becoming a 5 trillion-dollar digital economy depends on its ability to harness the value of data”. This objective can be traced back to the 2018-19 Economic Survey, which contained an entire chapter on transforming citizen data into a public good for revenue generation. However, creating such an economic incentive for data collection and storage may encourage the government to go against settled best practices such as the principle of data minimisation, which states that data collection should be limited to what is directly relevant and necessary to accomplish the specified purpose of the collection. As a result, data may be collected simply to amass as much of it as possible and so maximise revenue.
3. Harmful effects on informational privacy of citizens: In the absence of a personal data protection law, the envisaged sharing of data across government departments may lead to a massive violation of citizens’ privacy and to 360-degree profiling of people. Even the Draft Data Protection Bill, 2021 would not be a sufficient safeguard because, as it stands, it provides wide exemptions to the government. As a result, the Policy fails to meet the threshold of legality put in place by the Supreme Court in the right to privacy decision, which states that any invasion of privacy by the state must be based on an anchoring legislation. Without such legislative backing, which would authorise the actions undertaken through the Policy and put in place a clear set of provisions on how these actions must be carried out, there is no legally enforceable remedy for a citizen whose data may be misused. Lastly, the lack of a specific parliamentary legislation also means that the function of law-making has essentially been co-opted by the Executive, which is making law through policy frameworks and bypassing the Parliament, which should ideally be drafting and enacting laws as the duly elected branch of the government tasked with law-making.
4. Lack of clear & concise definitions for key concepts: New concepts introduced by the Policy have been defined in a vague and ambiguous manner, which opens them up to misinterpretation. The Policy creates a separate category of ‘High-Value Data Sets’, which it deems essential for governance and innovation and to which access will be accelerated. However, nowhere in the Background Note or the Policy has the category been concisely defined. Clause 10 of the Policy states that ‘High-Value Data Sets’ will be defined by another framework, to be notified at a later stage, on the basis of vague and overbroad terms such as degree of importance in the market and impact on India’s AI strategy. Similarly vague is Clause 11 of the Policy, which relates to the pricing and licensing of datasets. Pricing has been left to the discretion of each government department or agency, which is referred to as the owner of the data under the Policy. This could result in arbitrarily priced datasets. Further, the section fails to lay down any specific conditions for licensing the datasets, stating only that it seeks to “incentivise data sharing through creative licensing frameworks”.
5. Incorrect and harmful understanding of the concept of data anonymisation: All definitions under the Policy have been situated in the annexure attached to the Background Note. In the annexure, the concept of ‘data anonymisation’ has been incorrectly defined as an “irreversible process” based on certain standards of irreversibility to be specified by the competent authority. For this, “reference anonymization tools” will be provided as per Clause 13 of the Policy. There are multiple issues with the data anonymisation provisions in the Policy. Firstly, according to a study published by researchers at Belgium’s Université catholique de Louvain (UCLouvain) and Imperial College, London, re-identification from anonymised datasets is possible unless strict access control provisions are put in place. Secondly, the anonymisation tools that are going to be provided may not be sufficient to protect against re-identification. It is not known whether these tools will be opened to qualitative scrutiny by independent actors who will be able to verify their effectiveness. Thirdly, the anonymisation standards that have to be complied with under Clause 13 have not been issued, so their adequacy and effectiveness cannot be determined either.
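To see why removing names alone does not make a dataset anonymous, consider the minimal Python sketch below. It is only an illustration and not the method used in the study cited above or anything prescribed by the Policy: the records, the field names (pin_code, birth_year, gender, benefit) and the helper functions are all hypothetical. It counts how many people share each combination of seemingly harmless attributes; anyone who is alone in their group can be singled out by whoever already knows those attributes about them.

```python
from collections import Counter

# Hypothetical "anonymised" records: names removed, other attributes kept.
records = [
    {"pin_code": "110001", "birth_year": 1985, "gender": "F", "benefit": "A"},
    {"pin_code": "110001", "birth_year": 1985, "gender": "F", "benefit": "B"},
    {"pin_code": "110001", "birth_year": 1990, "gender": "M", "benefit": "A"},
    {"pin_code": "560034", "birth_year": 1972, "gender": "M", "benefit": "C"},
]

# Attributes that an outsider could plausibly know about a person (assumed).
QUASI_IDENTIFIERS = ("pin_code", "birth_year", "gender")


def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def smallest_group_size(rows, quasi_identifiers):
    """Size of the smallest group; a value of 1 means someone is unique."""
    return min(group_sizes(rows, quasi_identifiers).values())


def unique_records(rows, quasi_identifiers):
    """Rows whose quasi-identifier combination appears exactly once."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [
        row for row in rows
        if sizes[tuple(row[q] for q in quasi_identifiers)] == 1
    ]


if __name__ == "__main__":
    print("Smallest group size:", smallest_group_size(records, QUASI_IDENTIFIERS))
    for row in unique_records(records, QUASI_IDENTIFIERS):
        print("Can be singled out:", row)
```

In this toy dataset, two of the four records are unique on pin code, birth year and gender, even though no names are present. A check of this kind is only a first-order indicator; it says nothing about whether a particular anonymisation tool or standard would be adequate, which is why independent scrutiny of the tools and standards matters.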