ORIGINAL ARTICLE
Year : 2022  |  Volume : 9  |  Issue : 1  |  Page : 4-11

Modeling of COVID-19 new cases and deaths in top 10 affected countries


1 Department of Community Medicine, PCMC’s Postgraduate Institute and YCM Hospital, Pimpri, Pune, India
2 Department of Statistics, Nanded Education Society’s Science College, Nanded, India
3 Department of Computer Science, MIT Arts, Commerce & Science College, Pune, Maharashtra, India

Date of Submission: 12-Dec-2021
Date of Acceptance: 17-Feb-2022
Date of Web Publication: 23-Mar-2022

Correspondence Address:
Dr. Pandurang V Thatkar
Department of Community Medicine, PCMC’s Postgraduate Institute and YCM Hospital, Pimpri, Pune, Maharashtra.
India

Source of Support: None, Conflict of Interest: None


DOI: 10.4103/mgmj.mgmj_105_21

  Abstract 

Introduction: This study aimed to develop a model using data from the 10 countries with the highest number of infected cases as of August 22, 2020: the United States of America, Brazil, India, Russia, South Africa, Peru, Mexico, Colombia, Chile, and Spain. The model was developed using the newly infected cases, new deaths, cumulative infected cases, and cumulative deaths due to COVID-19, from the day on which the first infected case was diagnosed in each country up to August 19, 2020.

Materials and Methods: Data on newly infected cases, new deaths, cumulative infected cases, and cumulative deaths due to COVID-19 in the 10 most affected countries, from the day of each country's first diagnosed case up to August 19, 2020, were obtained from the World Health Organization (WHO) website. IBM SPSS Statistics 21.0 was used to fit the regression models. Linear, logarithmic, quadratic, and cubic curves were fitted to the newly infected COVID-19 cases and daily deaths due to COVID-19. The coefficient of determination (R-square) was used to choose the best-fitting model.

Results: The cubic regression model was the best-fitting model for both newly infected COVID-19 cases and COVID-19 deaths. It had the highest R-square value compared with the linear, logarithmic, and quadratic models.

Conclusion: To control the spread of infection, there is a need for aggressive control strategies from the administrative departments of all countries.

Keywords: COVID-19, COVID-19 modeling, cubic model, disease prediction, forecasting, trend analysis


How to cite this article:
Thatkar PV, Pawar DD, Tonde JP. Modeling of COVID-19 new cases and deaths in top 10 affected countries. MGM J Med Sci 2022;9:4-11

How to cite this URL:
Thatkar PV, Pawar DD, Tonde JP. Modeling of COVID-19 new cases and deaths in top 10 affected countries. MGM J Med Sci [serial online] 2022 [cited 2022 May 18];9:4-11. Available from: http://www.mgmjms.com/text.asp?2022/9/1/4/340577




  Introduction


The novel coronavirus disease-2019 (COVID-19) was first reported on December 31, 2019 in the city of Wuhan, Hubei Province, China.[1] The disease is highly contagious and has spread rapidly and efficiently. At present, the domestic outbreak in China has been effectively controlled, whereas the new coronavirus is spreading rapidly in other parts of the world. It was declared a “global pandemic” on March 11, 2020 by the Director-General of the World Health Organization (WHO). As of August 19, 2020, according to the WHO, the disease had spread to 213 countries and territories around the world, with 23,117,813 confirmed cases and 803,200 deaths.

The new coronavirus disease-2019 (COVID-19) poses an immense threat to the health and safety of people all over the world, as the virus is highly contagious in terms of spreading power and potential harm. For the last 8 months, the world has been facing an unprecedented disaster due to COVID-19. WHO describes coronaviruses as a family of viruses that cause illness ranging from the common cold to the Middle East Respiratory Syndrome (MERS) and the severe acute respiratory syndrome (SARS).[2] Viruses of this family can cause respiratory symptoms in humans, along with other symptoms of common cold and fever.[3] COVID-19 is spreading rapidly across the globe, leaving healthcare systems weighed down with a large number of patients in the absence of an effective treatment or vaccine. Personal hygiene, social distancing measures, containment measures (testing strategy and contact tracing), use of masks by the public, and healthcare preparedness are the key public health interventions to slow down the spread of the virus and reduce the impact of the pandemic on the healthcare system.[4] These measures can be extended to travel restrictions, screening of travelers (at borders, airports, and seaports), lockdown and social security measures, and free healthcare.

Peng et al. suggested a mathematical model that has been applied to investigate the spread of the COVID-19 pandemic in China.[5] Among other researchers, Calafiore et al. suggested a modified SIR model for COVID-19 contagion in Italy, and Danon et al. suggested a spatial model of COVID-19 transmission in England and Wales.[6],[7] The predictions obtained from these models became a platform for assessing the spread of COVID-19. Such models help public health professionals and decision-makers plan the control and/or mitigation of the COVID-19 pandemic.[8],[9]

This study aims to develop a model using data from the 10 countries with the highest number of infected cases as of August 22, 2020: the United States of America, Brazil, India, Russia, South Africa, Peru, Mexico, Colombia, Chile, and Spain. The model is developed using the newly infected cases, new deaths, cumulative infected cases, and cumulative deaths due to COVID-19, from the day on which the first infected case was diagnosed in each of these countries up to August 19, 2020.


  Materials and methods


This study includes data on the newly infected cases, new deaths, cumulative infected cases, and cumulative deaths due to COVID-19, from the day on which the first infected case was diagnosed in each country up to August 19, 2020, in the top 10 most affected countries, viz. the United States of America, Brazil, India, Russia, South Africa, Peru, Mexico, Colombia, Chile, and Spain. These countries belong to the WHO regions AFRO, AMRO, EURO, and SEARO. Together they account for 66.36% of global infected cases and 61.90% of global deaths due to COVID-19 [Table 1]. The case fatality rate (CFR) in these top 10 countries stands at 3.24%, which is slightly lower than the global CFR of 3.47%.
Table 1: Total infected cases and total deaths due to COVID-19



Indicators

The indicators used for the modeling are the daily COVID-19-infected cases and the daily deaths due to COVID-19 for the top 10 countries with the highest number of cases up to August 19, 2020. The total cases were also standardized according to the population size of each country, whereas the deaths were standardized by calculating the case fatality rate (CFR, %). The data were obtained from the WHO website.[10] IBM SPSS Statistics 21.0 was used to fit the regression models. Linear, logarithmic, quadratic, and cubic curves were fitted to the newly infected COVID-19 cases and daily deaths due to COVID-19. The coefficient of determination (R-square) was used to choose the best-fitting model.[11] For all the countries included in the study, both for newly infected cases and new deaths, the cubic model gave the best fit (highest value of R-square) [Table 2] and [Table 3]. [Figure 1] shows the fitting of linear, logarithmic, quadratic, and cubic curves to the newly infected COVID-19 cases. [Figure 2] shows the fitting of linear, logarithmic, quadratic, and cubic curves to the new deaths due to COVID-19.
Table 2: R-square values (new infected cases)

Table 3: R-square values (new deaths)

Figure 1: Curve fitting for newly infected cases (day-wise) up to August 19, 2020

Figure 2: Curve fitting for new deaths (day-wise) up to August 19, 2020



Statistical model

The cubic regression equation is as given below.

$$Y = b_0 + b_1 t + b_2 t^2 + b_3 t^3$$

where $Y$ is the number of daily COVID-19-infected cases or new deaths due to COVID-19, $b_0$ is the constant term, $b_1$, $b_2$, and $b_3$ are the regression coefficients, and $t$ is time (day).
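
As an illustration of the cubic equation above, the following minimal sketch fits the model by ordinary least squares with NumPy's `polyfit` and evaluates it a few days beyond the observed range. The daily counts are synthetic, and the generating coefficients, noise level, and the helper name `predict` are assumptions made for this example; the study itself fitted the curves in IBM SPSS Statistics, not with this code.

```python
import numpy as np

# Hypothetical daily new-case counts (NOT the article's data).
rng = np.random.default_rng(0)
t = np.arange(1, 121, dtype=float)            # day index 1..120
assumed = (50.0, 4.0, 0.6, -0.003)            # assumed b0, b1, b2, b3
y = (assumed[0] + assumed[1] * t + assumed[2] * t**2
     + assumed[3] * t**3 + rng.normal(0.0, 100.0, t.size))

# Least-squares cubic fit; polyfit returns the highest power first.
b3, b2, b1, b0 = np.polyfit(t, y, deg=3)


def predict(day):
    """Evaluate Y = b0 + b1*t + b2*t^2 + b3*t^3 at the given day(s)."""
    day = np.asarray(day, dtype=float)
    return b0 + b1 * day + b2 * day**2 + b3 * day**3


# Coefficient of determination (R-square) over the fitted range.
residuals = y - predict(t)
r_squared = 1.0 - np.sum(residuals**2) / np.sum((y - y.mean())**2)

print(f"b0={b0:.2f}, b1={b1:.3f}, b2={b2:.4f}, b3={b3:.5f}")
print(f"R-square = {r_squared:.3f}")
print("Estimates for days 121-123:", np.round(predict([121, 122, 123]), 1))
```

A cubic polynomial can diverge quickly outside the fitted range, so estimates of this kind are best restricted to a short horizon.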


  Results


[Table 1] shows country-wise total infected cases and total deaths due to COVID-19 as of August 19, 2020, together with the CFR and the proportion of global infected cases and deaths for each of the top 10 countries. The CFR in the top 10 affected countries ranged between 1.71% and 10.86%; it was highest in Mexico (10.86%) and lowest in Russia (1.71%). The proportion of global infected cases was highest in the USA (23.33%) and lowest in Spain (1.57%). The proportion of global deaths was highest in Brazil (21.10%) and lowest in Chile (1.31%).
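
The two derived indicators reported in [Table 1] are simple ratios. The sketch below shows the arithmetic, using the global totals quoted in the Introduction to reproduce the global CFR of 3.47%. The function names and the single-country figures are hypothetical and serve only to illustrate the calculation.

```python
def case_fatality_rate(deaths, cases):
    """Case fatality rate in percent: deaths per 100 confirmed cases."""
    return 100.0 * deaths / cases


def share_of_global(country_total, global_total):
    """Country total as a percentage of the global total."""
    return 100.0 * country_total / global_total


# Global totals as of August 19, 2020, as quoted in the Introduction.
GLOBAL_CASES = 23_117_813
GLOBAL_DEATHS = 803_200

global_cfr = case_fatality_rate(GLOBAL_DEATHS, GLOBAL_CASES)
print(f"Global CFR: {global_cfr:.2f}%")   # prints: Global CFR: 3.47%

# Hypothetical country (illustrative numbers, not taken from Table 1).
hyp_cases, hyp_deaths = 1_000_000, 25_000
print(f"CFR: {case_fatality_rate(hyp_deaths, hyp_cases):.2f}%")
print(f"Share of global cases: {share_of_global(hyp_cases, GLOBAL_CASES):.2f}%")
print(f"Share of global deaths: {share_of_global(hyp_deaths, GLOBAL_DEATHS):.2f}%")
```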

[Table 2] shows the R-squared values for the linear, logarithmic, quadratic, and cubic models fitted to newly infected cases in the top 10 affected countries. For all 10 countries included in the study, the R-squared value of the cubic model is the highest compared with the linear, logarithmic, and quadratic models. This indicates that the cubic model has a higher ability to predict new cases of COVID-19 in the near future. [Table 3] shows the R-squared values for the linear, logarithmic, quadratic, and cubic models fitted to new deaths in the top 10 affected countries. For all 10 countries included in the study, the R-squared value of the cubic model is again the highest compared with the linear, logarithmic, and quadratic models. This indicates that the cubic model has a higher ability to predict new deaths due to COVID-19 in the near future.
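
The model comparison summarized in [Table 2] and [Table 3] can be sketched as follows. This is an illustrative reconstruction on synthetic data, not the SPSS procedure used in the study: the series, the seed, and the function name `r_squared` are assumptions, and the logarithmic curve is taken to have the form Y = b0 + b1 ln(t).

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily series (NOT the article's data).
rng = np.random.default_rng(1)
t = np.arange(1, 151, dtype=float)
y = 200 + 30 * t + 0.8 * t**2 - 0.004 * t**3 + rng.normal(0.0, 300.0, t.size)

fits = {}

# Polynomial models of degree 1, 2, and 3 in t.
for name, degree in [("linear", 1), ("quadratic", 2), ("cubic", 3)]:
    coeffs = np.polyfit(t, y, degree)
    fits[name] = r_squared(y, np.polyval(coeffs, t))

# Logarithmic model: ordinary linear regression of y on ln(t).
log_coeffs = np.polyfit(np.log(t), y, 1)
fits["logarithmic"] = r_squared(y, np.polyval(log_coeffs, np.log(t)))

# Select the model with the highest R-square.
best = max(fits, key=fits.get)
for name, value in sorted(fits.items(), key=lambda item: item[1]):
    print(f"{name:12s} R-square = {value:.3f}")
print("Best-fitting model by R-square:", best)
```

Because the linear, quadratic, and cubic polynomials are nested, the cubic R-square on a given series cannot be lower than that of the other two polynomial fits; only the logarithmic curve is a non-nested competitor.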

[Table 4] shows the cubic model fitted to the data on newly infected COVID-19 cases for the 10 most affected countries. The table gives the model summary (R-square, F-statistic, and its significance, i.e., the P-value) as well as the estimates of the model parameters. Using these parameter estimates, the number of newly infected COVID-19 cases in the near future can be predicted. The R-square values for the 10 countries range between 0.256 and 0.980. The highest and lowest coefficients of determination were observed for India (R-square = 0.980) and Chile (R-square = 0.256), respectively. The regression coefficients b1, b2, and b3 were significant in the cubic model applied to newly infected COVID-19 cases for all 10 countries.
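
The overall F-statistic in a regression summary is related to R-square through the standard identity F = (R²/k) / ((1 − R²)/(n − k − 1)), where k is the number of predictors (3 for the cubic model) and n is the number of daily observations. The sketch below applies this identity to the highest and lowest R-square values quoted above. The value `n_days = 200` is a hypothetical series length chosen for illustration, so the resulting F values are not those of [Table 4].

```python
def overall_f_statistic(r_squared, n_obs, n_predictors=3):
    """Overall regression F-statistic derived from R-square.

    Returns (F, df_model, df_residual) for a model with an intercept
    and `n_predictors` regressors (t, t^2, t^3 for the cubic model).
    """
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    if df_resid <= 0:
        raise ValueError("need more observations than model parameters")
    f_value = (r_squared / df_model) / ((1.0 - r_squared) / df_resid)
    return f_value, df_model, df_resid


n_days = 200  # hypothetical number of daily observations

# R-square values quoted in the text for newly infected cases.
for country, r2 in [("India", 0.980), ("Chile", 0.256)]:
    f_value, df1, df2 = overall_f_statistic(r2, n_days)
    print(f"{country}: R-square = {r2:.3f}, F({df1}, {df2}) = {f_value:.2f}")
```

The identity also shows why a model with a low R-square can still have a statistically significant F-statistic when the series is long.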
Table 4: Cubic regression model fitting to the newly infected COVID-19 cases



[Table 5] shows the cubic model fitted to the data on daily COVID-19 deaths for the 10 most affected countries. The table gives the model summary (R-square, F-statistic, and its significance, i.e., the P-value) as well as the estimates of the model parameters. Using these parameter estimates, the number of deaths due to COVID-19 in the near future can be predicted. The R-square values for the 10 countries range between 0.251 and 0.950. The highest and lowest coefficients of determination were observed for Colombia (R-square = 0.950) and Chile (R-square = 0.251), respectively. The regression coefficients b1, b2, and b3 were significant in the cubic model applied to daily deaths due to COVID-19 for all 10 countries.
Table 5: Cubic regression model fitting to the daily deaths due to COVID-19



[Figure 1] shows the curves fitted to newly infected cases (day-wise) up to August 19, 2020 in the top 10 affected countries. The figure shows that, for all 10 countries, the newly infected cases are best fitted by the cubic model. [Figure 2] shows the curves fitted to new deaths (day-wise) up to August 19, 2020 in the top 10 affected countries. The figure shows that, for all 10 countries, the new deaths are also best fitted by the cubic model.

[Figure 3] depicts the country-wise COVID-19-infected cases per million population. As of August 19, 2020, among the top 10 countries with the highest number of COVID-19 cases, Chile had the highest number of infected cases per million, followed by the USA, Peru, and Brazil. India had the lowest number of cases per million population, followed by Mexico and Russia. [Figure 4] depicts the country-wise COVID-19 CFR (%). As of August 19, 2020, among the same 10 countries, Mexico had the highest CFR (%), followed by Spain, Peru, and the USA. Russia had the lowest CFR (%), followed by South Africa and India.
Figure 3: Country-wise COVID-19-infected cases per million population

Figure 4: Country-wise COVID-19 case fatality rate (%)




  Discussion


The use of statistical techniques is inevitable when dealing with large amounts of data. The COVID-19 pandemic has generated a huge amount of data about new COVID-19 cases and new deaths. The numbers of new cases and deaths are increasing rapidly all over the globe, so it is necessary to study the pattern of spread of a disease that is causing a large number of deaths. Statistical techniques are able to analyze large amounts of real-time infectious disease data.

For this study, we used linear and non-linear regression models, namely the linear, logarithmic, quadratic, and cubic models. These models were applied to the newly detected COVID-19 cases and new deaths in the top 10 affected countries. Syazali et al.[12] used multiple linear and non-linear regression analysis to study the influence of volume, quality of goods, and brand name on buying value from consumers. For disaster prediction, multiple linear regression-TOPSIS was used by Luu et al.[13] Du et al.[14] used a multiple linear model to study the relationship between the mechanical properties of the tea stem and their impact factors, in order to improve the picking efficiency of the tea plucking machine. Rath et al.[15] applied linear regression and multiple linear regression models to forecast COVID-19 cases in India and obtained R-squared values of 0.99 and 1.0, which indicate a strong prediction capability of the models.

Near-future predictions of new COVID-19 cases and new deaths may help in understanding how the situation is likely to develop, so that an appropriate advisory can be issued to the public and policy measures such as “partial lockdown,” “stay at home,” “home quarantine,” “self-isolation,” and “work from home” can be taken to control the COVID-19 outbreak in the affected countries.


  Conclusion


The global COVID-19 pandemic has already started spreading rapidly. To control the spread of infection, there is a need for aggressive control strategies from the administrative departments of all countries. This study discusses the modeling of new cases of COVID-19 infection and new deaths. The results suggest that new COVID-19 cases and new deaths follow a cubic curve. The coefficients of the cubic regression equation were obtained for the top 10 affected countries, which allows prediction of new COVID-19 cases and new deaths in the near future. For all top 10 affected countries, the coefficients of the cubic model were significant for both new COVID-19 cases and new deaths due to COVID-19. The higher R-squared values of the model indicate a strong prediction capability in the near future. Thus, near-future predictions of new COVID-19 cases and deaths due to COVID-19 will be helpful for governments, administrators, frontline health workers, and researchers in better planning and policymaking.

Ethical consideration

Institutional Ethical Committee approval and informed consent were not needed for this study, as it is based on open-source data that are available in the public domain.

Financial support and sponsorship

Nil.

Conflicts of interest

There are no conflicts of interest.



 
  References

1. Coronavirus disease (COVID-19) Situation Report–1. 2020. Available from: https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200121-sitrep-1-2019-ncov.pdf. [Last accessed 2020 Apr 5].
2. Coronavirus disease (COVID-19) Situation Report–137. Available from https://www.who.int/docs/default-source/sri-lanka-documents/what-is-coronavirus english.pdf?sfvrsn=a6b21ac_2. [Last accessed 2020 Apr 5].
3. Zhu W, Xie K, Lu H, Xu L, Zhou S, Fang S. Initial clinical features of suspected coronavirus disease 2019 in two emergency departments outside of Hubei, China. J Med Virol 2020;92:1525-32.
4. Corman VM, Landt O, Kaiser M, Molenkamp R, Meijer A, Chu DK, et al. Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR. Euro Surveill 2020;25:2000045.
5. Peng L, Yang W, Zhang D, Zhuge C, Hong L. Epidemic analysis of COVID-19 in China by dynamical modeling. medRxiv. doi: https://doi.org/10.1101/2020.02.16.20023465.
6. Calafiore GC, Novara C, Possieri C. A time-varying Sird model for the Covid-19 contagion in Italy. Annu Rev Control 2020;50:361-72.
7. Danon L, Brooks-Pollock E, Bailey M, Keeling MJ. A spatial model of CoVID-19 transmission in England and Wales: Early spread and peak timing. medRxiv. doi: https://doi.org/10.1101/2020.02.12.20022566.
8. Jewell NP, Lewnard JA, Jewell BL. Predictive mathematical models of the Covid-19 pandemic: Underlying principles and value of projections. JAMA 2020;323:1893-4.
9. Enserink M, Kupferschmidt K. With Covid-19, modeling takes on life and death importance. Science 2020;367:1414-5.
10. World Health Organization. Coronavirus disease (COVID-19) Pandemic. Geneva: WHO; 2020. Available from: https://www.who.int/emergencies/diseases/novel-coronavirus-2019. [Last accessed on 2020 Aug 20].
11. Ankaralli H, Ankaralli S, Erarslan N. COVID-19, SARS-CoV2 infection: Current epidemiological analysis and modeling of disease. Anatol Clin 2020;25:1-22.
12. Syazali M, Putra F, Rinaldi A, Utami L, Widayanti W, Umam R, et al. Partial correlation analysis using multiple linear regression: Impact on business environment of digital marketing interest in the era of industrial revolution 4.0. Manag Sci Lett 2019;9:1875-86.
13. Luu C, von Meding J, Mojtahedi M. Analyzing Vietnam’s national disaster loss database for flood risk assessment using multiple linear regression-TOPSIS. Int J Disast Risk Reduct 2019;40:101153.
14. Du Z, Hu Y, Buttar NA. Analysis of mechanical properties for tea stem using grey relational analysis coupled with multiple linear regression. Sci Hortic 2020;260.
15. Rath S, Tripathy A, Tripathy AR. Prediction of new active cases of coronavirus disease (COVID-19) pandemic using multiple linear regression model. Diabetes Metab Syndr 2020;14:1467-74.

