Gautam Menon, one of the leading voices in India assessing the COVID-19 pandemic's trajectory, is a professor of physics and biology at Ashoka University and an adjunct professor at the Tata Institute of Fundamental Research, Mumbai. Prior to joining Ashoka, he was associated with the Institute of Mathematical Sciences, Chennai.
Debayan Gupta is an assistant professor of computer science at Ashoka University and a visiting professor and research affiliate at the Massachusetts Institute of Technology in the USA. Prior to his assignment at Ashoka, he was a full-time faculty member at MIT.
In this interview, Menon and Gupta talk about BharatSim, India’s first ultra-large scale simulation of 100 million to 1 billion agents representing the population of India, developed in collaboration with technology company Thoughtworks, that has the potential to enable Indian researchers to formulate strategies for management of transmissible and non-transmissible diseases. Edited excerpts:
How is this disease modelling system different from other systems and how does it work?
Menon: There are many ways of modelling how a disease spreads through a population to answer questions like how many people are going to be infected, when will they be infected, how many hospital admissions do you expect, how many people will die due to the disease? Some are much more granular or detailed, and some are more general.
The better models are the more detailed models because they can really say: if you are a 70-year-old, this might happen. If you are a 20-year-old, this might happen and for a 30-year-old with diabetes, this might happen. All of these really represent the behaviour of specific populations. And in general terms, we want to ask about Pune versus Bombay (Mumbai) or Maharashtra versus Tamil Nadu etcetera. These questions need more levels of detail.
So, this approach to thinking about diseases and how they spread is as if you are basically taking the details from everybody in a city. Obviously, we can’t do that, because this would involve an invasion of privacy, private information about people. But what you can do is to make something synthetic or generated on a computer that is as closely representative of that population as possible.
For example, information such as the fact that 20 percent of the population is under 20, and 40 percent is under 30, can be fed into the description of your model or synthetic population. That can also include details like work profile, family size, availability of hospitals, etc. From there, we can impose certain interventions; you can say, if I have a lockdown, then I can control the spread of the disease by this amount. If I close schools, I can control the spread of the disease by this amount. If I vaccinate the population at a rate of 0.1 percent per day, this is the effect that I expect it might have. And if I do it at 0.5 percent per day instead, this is the effect that it might have. If my vaccines are not perfect, I can still ask all of these questions.
Now, you can only run reality once. But on the computer, you can run many different questions and ask what is the best combination of interventions that help me best suppress the growth of the disease.
There are many models being used in India but very few have the level of detail that BharatSim has.
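The idea Menon describes, a synthetic population on which interventions are tried out repeatedly, can be illustrated with a toy sketch. This is not BharatSim's code or API; every function name, parameter and number below (contact rate, transmission probability, vaccine efficacy, population size) is an assumption chosen only for illustration. The age shares follow the example in the interview.

```python
import random

def make_population(n, rng):
    """Create n synthetic agents; no real person's data is used."""
    agents = []
    for _ in range(n):
        u = rng.random()
        # Age shares from the interview's example: 20% under 20, 40% under 30.
        if u < 0.20:
            age = rng.randint(0, 19)
        elif u < 0.40:
            age = rng.randint(20, 29)
        else:
            age = rng.randint(30, 89)
        agents.append({"age": age, "state": "S", "vaccinated": False,
                       "days_infected": 0})
    return agents

def simulate(n=2000, days=120, vax_rate=0.001, efficacy=0.8, contacts=8,
             p_transmit=0.04, infectious_days=7, initial_infected=10, seed=1):
    """Return how many agents were ever infected in one simulated run.

    All default values are hypothetical, not calibrated to any disease.
    """
    rng = random.Random(seed)
    agents = make_population(n, rng)
    for a in rng.sample(agents, initial_infected):
        a["state"] = "I"
    ever_infected = initial_infected
    doses_per_day = int(round(vax_rate * n))
    for _ in range(days):
        # Intervention: vaccinate some unvaccinated susceptible agents.
        # An imperfect vaccine protects only with probability `efficacy`.
        eligible = [a for a in agents
                    if a["state"] == "S" and not a["vaccinated"]]
        for a in rng.sample(eligible, min(doses_per_day, len(eligible))):
            a["vaccinated"] = True
            if rng.random() < efficacy:
                a["state"] = "P"  # protected
        # Transmission: each infectious agent meets random agents (well mixed).
        infectious = [a for a in agents if a["state"] == "I"]
        for a in infectious:
            for other in rng.choices(agents, k=contacts):
                if other["state"] == "S" and rng.random() < p_transmit:
                    other["state"] = "I"
                    ever_infected += 1
            a["days_infected"] += 1
            if a["days_infected"] >= infectious_days:
                a["state"] = "R"  # recovered
    return ever_infected

# "Run reality" more than once: compare two vaccination rates.
for rate in (0.001, 0.005):
    print(f"vaccinating {rate:.1%} per day -> {simulate(vax_rate=rate)} ever infected")
```

A single run per setting says little, since the outcome depends on the random seed. In practice one would repeat each scenario over many seeds and compare the distributions.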
Gupta: When you are dealing with disease modelling for a country as huge as India, it is very foolish to just give a single number, because while the epidemic may be increasing in Mumbai, it may be decreasing in Kolkata. A single number gives you a very zoomed-out view that may not be particularly useful.
So, what we have done is use ward-by-ward, area-by-area, square kilometre-by-square kilometre data with the right number of people, who are in the right age groups and who have the right comorbidities, on average. If you take any sample of this population, it will give you the correct data, the correct number of people with diabetes, for example.
Once you have that, the hard part comes, which is the simulation engine. Now, you need to take this entire population and move them around. All of the people’s interactions and movements have to be somewhat realistic, with a very small margin for error.
But this is super complicated and humans can’t look at this data and make any sense of it. So, you need to have a visualisation engine on top, which takes all this data and helps you get insights from it.
Did COVID-19 have any role to play in the development of this model; how did this collaboration come about?
Menon: Long before I moved to Ashoka, I had been thinking that this is the right way to approach public health questions in India. I think COVID-19 provided the impetus to begin doing it fast and, of course, Debayan’s abilities in thinking about the synthetic population brought skills that I didn’t have. So, in many ways, this was the ideal collaboration of computer scientists, epidemiologists, students and modellers with Thoughtworks, which helped us to develop the code. Now, multiple people outside Ashoka are also involved in testing the code and running their own scenarios with it. This is what we would like to see in the future: more and more people producing innovative, exciting and imaginative applications of agent-based codes. And it is not simply for disease. One can really look upon it as a way of simulating social phenomena, and asking how a complicated interaction between the economy, society, personal decisions, etc. can play out.
Do you think that if this model had been available at the beginning of the COVID-19 pandemic we could have predicted scenarios in a more realistic way?
Menon: That’s hard to say, because I think many of our difficulties with COVID-19 came from the lack of data. Unless your data are good, your models will not be good. Right now, the models that we have for COVID-19 are better than the models that we had a year ago. And these new models are even more powerful than that. So, presumably, in the future, all of this will converge. As we realise the need for better data, better data will feed into models, and better models will give you better predictions.
Using this model is it possible to estimate COVID-19 deaths for India? The World Health Organization has estimated nearly 4.9 million coronavirus deaths in India, much higher than the government’s official estimates and the issue has caused a controversy.
Menon: For that particular question, you don’t need this model. There are other types of models that will do perfectly well. These hinge on very delicate questions about what is called the infection fatality ratio: the fraction of people in a given age group who will die, on average, if they get an infection. From there, you can work backwards and see what number of deaths you might expect. And that’s where a lot of the discrepancy comes from: the government estimates really seem a bit too low compared to what one would expect.
So, in that sense, this model probably would not have helped; there are other models that are perfectly able to do this. The issue really is we need to understand more about the disease, we need to understand more about the disease in Indian populations. And these are questions of epidemiology, of clinical medicine, and not so much of models.
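The arithmetic behind the infection fatality ratio that Menon mentions can be shown in a few lines. The sketch below only illustrates the calculation; the age bands, infection counts and ratios are invented placeholders, not estimates for India or for any real disease.

```python
def expected_deaths(infections_by_age, ifr_by_age):
    """Sum over age groups of (infections in group) x (IFR for group)."""
    return sum(infections_by_age[group] * ifr_by_age[group]
               for group in infections_by_age)

# Hypothetical inputs, for illustration only.
infections = {"0-39": 1_000_000, "40-59": 500_000, "60+": 200_000}
ifr = {"0-39": 0.0002, "40-59": 0.003, "60+": 0.03}

print(round(expected_deaths(infections, ifr)))
```

Because the ratio rises steeply with age in this made-up example, most of the expected deaths come from the smallest group, which is why the age structure of infections matters so much to such estimates.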
Gupta: When you talk about death data, it’s not just the WHO saying one thing. Their analysis is reasonable, but they could still be wrong. The issue is that the government data point to a certain level, while all of these other indicators point to a level somewhere above that, and we can feel it ourselves. You don’t need complicated science to think about it. And this is not an argument about numbers as such. It’s about knowing and understanding what happened, so that it doesn’t happen again. And I think there are no evil people in this story. Everyone wants to do the right thing. It’s just a question of things turning into arguments rather than conversations.
Can you cite some examples of diseases, currently existing or that may come up in future, where this model can help?
Menon: So, certainly, we would think about tuberculosis as one example of an infectious disease that moves between people; influenza would be another example, and H1N1 would be a third. The additional complexity needed to describe diseases like malaria or dengue, which spread through a vector, has yet to be built in. That will be the next frontier for these models: to incorporate vectors and vector-borne diseases into that description.
Gupta: The next step from my point of view would be to have more people looking at more problems using the same code. One problem with a lot of modelling exercises in India is that everyone has written their own piece of code, so it’s very hard to figure out the small differences. There are subtle differences between one person’s approach to a problem and another’s. That’s where the advantage of having one programme or framework like BharatSim really lies, because then it’s very easy to compare results between different people. And the second is just more imaginative applications, especially on the social sciences side of BharatSim, which I think is very important. No one has done this before for India. We can try to get answers to complex questions like: how many women who were born in Himachal Pradesh and have diabetes now live in Haryana and earn more than Rs 1 lakh a year? Or how many people who live within one kilometre of a river in India have TB?
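Gupta's first example question amounts to a filter over the attributes of synthetic agents. A minimal sketch follows, assuming a hypothetical `Agent` record; the field names and the four sample agents are invented here and do not reflect how BharatSim stores its population.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Agent:
    # Hypothetical attributes of one synthetic person.
    birth_state: str
    home_state: str
    sex: str
    has_diabetes: bool
    annual_income_rs: int

def count_matching(agents):
    """Count women born in Himachal Pradesh with diabetes who now live in
    Haryana and earn more than Rs 1 lakh (100,000) a year."""
    return sum(
        1 for a in agents
        if a.birth_state == "Himachal Pradesh"
        and a.home_state == "Haryana"
        and a.sex == "F"
        and a.has_diabetes
        and a.annual_income_rs > 100_000
    )

# A tiny made-up population; a real synthetic population would be generated
# to match census-style statistics.
sample = [
    Agent("Himachal Pradesh", "Haryana", "F", True, 250_000),
    Agent("Himachal Pradesh", "Haryana", "F", True, 80_000),
    Agent("Himachal Pradesh", "Haryana", "M", True, 300_000),
    Agent("Punjab", "Haryana", "F", True, 400_000),
]
print(count_matching(sample))
```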
Are you also looking to collaborate with governments on this model, to generate crucial data that may have policy implications?
Menon: Absolutely; we are completely open to collaborating. The code is open to governments, and computer experts within the government can choose to download it and run it on their own. Whatever help we can provide, we will provide. Because of the way this is designed, the barrier to using it is correspondingly lower. You don’t need trained computer scientists to use it at all. A small workshop of two or three days should be enough to prepare anyone to start off, and someone with a higher level of base competence does not even need that. It’s constructed so as to be used easily and fast.