Development of an Assessment Method for Investigating the Impact of Climate and Urban Parameters in Confirmed Cases of COVID-19: A New Challenge in Sustainable Development.

Sustainable development has been a controversial global topic, and as a complex concept in recent years, it plays a key role in creating a favorable future for societies. However, several problems, such as epidemic diseases, hinder the implementation of this approach. Hence, in this study, the impact of climate and urban factors on confirmed cases of COVID-19 (a new type of coronavirus) has been investigated using trend analysis and multivariate linear regression (MLR) to propose a more accurate prediction model. For this purpose, some important climate parameters, including daily average temperature, relative humidity, and wind speed, in addition to urban parameters such as population density, were considered, and their impacts on confirmed cases of COVID-19 were analyzed. The analysis was performed for three case studies in Italy, and the application of the proposed method has been investigated. The impacts of the parameters have been considered with a delay time from one to nine days to find the most suitable combination. The results of the analysis demonstrate the effectiveness of the proposed model and the impact of climate parameters on the trend of confirmed cases. The research hypothesis was supported by the MLR model, and the present assessment method could be applied to additional variables to identify the delay between each of them and new confirmed cases of COVID-19.


Introduction
Sustainable development is a concept which seeks to meet the needs of today's generation without creating problems for the next generations. The sustainable development process consists of three main aspects, namely the environment, economy, and society, with the government policies to be considered as the fourth important aspect of this process from the current decade onwards [1,2]. Although the sustainable development process has a specific framework, there are many serious problems in implementing its process, and epidemic diseases are examples of this kind of problem, which can lead to temporary or permanent barriers and could affect the previous attempts on different criteria of sustainable development [3,4].
The coronavirus disease is one of the epidemic diseases that many governments have faced in recent years, and several studies have addressed it [5][6][7][8][9][10]. Kampf et al. (2020) investigated how long coronaviruses persist on inanimate surfaces and found that the virus can survive for up to 9 days [11]. Chinazzi et al. (2020) investigated the impact of travel restrictions on the spread of COVID-19. Their results show that travel restrictions can be very useful in decreasing the transmission of the disease in the community [12]. Nkengasong and Mankoula (2020) evaluated the looming threat of COVID-19 from China to Africa. They investigated transmission routes, such as trade between China and Africa, and also considered health-care systems. Their results show that, to prevent a major disaster, Africa needs to be supported collectively and quickly [13]. Pandey et al. (2020) investigated the confirmed cases in India using two mathematical approaches, the SEIR and regression models. Their models performed well in predicting the number of confirmed cases over short-term and long-term intervals, which would be useful for planning the health system in India [14]. Nazari Harmooshi (2020) investigated environmental conditions in the spread of COVID-19 and found that humidity and temperature play critical roles [15]. Pirouz and Violini (2020) assessed the outbreak patterns of COVID-19 by investigating 117 countries. Their results identify two mechanisms, normal and accumulation, in the development of the coronavirus [16]. Gilbert et al. (2020) carried out a modeling study on the readiness and vulnerability of African countries with respect to the risks of COVID-19. Based on their results, they made recommendations for African countries with a moderate to high risk of importation of COVID-19 [17]. Wen et al. (2020) investigated the role of the media and its effects during the crisis of the COVID-19 outbreak. Their results show that biased and misleading mass media coverage can constrain the positive efforts of treatment groups and of other groups involved in facing this crisis [18]. Hu et al. (2020) used artificial intelligence to predict the transmission period of COVID-19 in China. According to their results, artificial intelligence performed well in predicting the outbreak period [19]. In another study, Chen et al. (2020) proposed a time-dependent susceptible-infected-recovered (SIR) model that is applicable to the prediction of the total number of confirmed cases [20]. Li and Feng investigated the trend of the COVID-19 outbreak in China. Their results show that quick and active strategies could be very effective in reducing and controlling the epidemic of COVID-19 [21]. Another study evaluated the relationship between environmental parameters and confirmed cases of COVID-19, which is a serious challenge in the sustainable development process. The authors used an artificial intelligence algorithm to investigate several case studies and found a meaningful relationship between confirmed COVID-19 cases and environmental factors (urban and climate parameters) [22].
By reviewing the previous studies, it can be concluded that COVID-19 as an epidemic disease can create many barriers to economic, environmental, and social development, which in the involved countries might lead to a temporary or a long-term negative impact on sustainable development. The main aim of this study is to investigate the impact of climate factors on the confirmed COVID-19 cases and to propose a multivariate linear regression (MLR) model to improve the prediction.

Materials and Methods
We used two approaches to find out the possible correlations between the trends of confirmed cases, climate data, and previous positive cases, and according to the results, the final multivariate equations have been provided. The two approaches are as follows:
• The multivariate linear regression analysis in three regions in Italy with the highest confirmed cases of COVID-19;
• The trend analysis of the confirmed cases and climate parameters.
Conditions of analysis:
• The environmental and climate parameters in the analysis include population density, average temperature, relative humidity, and wind speed;
• To decrease the impact of the different start dates of lockdown, different population density, and other unforeseen parameters in each region on the correlations between the confirmed cases and climate factors, a separate MLR analysis was done in each region;
• The climate data are based on the weather stations in the center of each region;
• The weather data of the three regions are presented in Tables A1-A3 (see Appendix A);
• The analysis period is from 14 February 2020 to 14 March 2020 (1 month).
It must be mentioned that there is always some delay between the actual date when a person is infected and the registration of the case as a confirmed contagion in the media. The reasons are as follows:
• The estimated incubation period of COVID-19 is about 2-14 days [23]. However, the mean observed incubation periods were 3.0 days (study based on 1324 cases) and 5.2 days (based on 425 cases) [24,25];
• According to the testing policy for COVID-19, not everybody will be tested, especially not those with no symptoms [26]. The NHS in the UK is only testing people with symptoms such as a fever, cough, or shortness of breath [27];
• The results of the laboratory are typically available within one day (24 h) [28,29]. In Lombardy, there are some delays in testing, and the data may not reflect the actual numbers [30]. Moreover, the daily information of the confirmed cases refers to the previous 24 h [31].
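The delay components listed above can be added up as a rough worked example. The sketch below is only an illustration under stated assumptions: it takes the two mean incubation periods quoted above (3.0 and 5.2 days) and assumes exactly one day for the laboratory result and one day for the reporting window. The variable names are hypothetical, and real delays (for example the testing backlog mentioned for Lombardy) may be longer.

```python
# Rough illustration of the reporting delay, using the figures quoted in the text.
# Assumption: laboratory turnaround and reporting window are each exactly one day.
mean_incubation_days = (3.0, 5.2)   # mean observed incubation periods [24,25]
lab_turnaround_days = 1.0           # laboratory result within about 24 h
reporting_window_days = 1.0         # daily figures refer to the previous 24 h

# Total delay between infection and appearance in the daily confirmed cases
total_delay_days = tuple(
    incubation + lab_turnaround_days + reporting_window_days
    for incubation in mean_incubation_days
)

low, high = total_delay_days
print(f"Expected delay: about {low:.1f} to {high:.1f} days")
```

Under these assumptions the expected delay falls inside the one-to-nine-day window that the analysis scans.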

Case Study
To carry out the analysis, the datasets of three regions in Italy, namely Lombardy (Milan), Veneto (Venice), and Emilia-Romagna (Bologna), are displayed in Table 1, and the location of the case studies is shown in Figure 1. The lockdown could affect the trend of positive cases, and around 13-14 days after the full lockdown, the number of confirmed cases is expected to decrease, as mentioned in a previous study [22]. The dates of quarantine in Italy can be seen in Table 2.

Mathematical Modeling and Multivariate Linear Regression Analysis
Mathematical modeling has been used successfully for modeling many natural and human-made hazards [44][45][46][47][48]. One of the most important aspects of any optimization is the expression of the problem in the form of a mathematical model. Due to the complexity of the problems, different approaches may be used to express a problem as a mathematical model [49][50][51][52]. Many studies have been carried out in various fields of science to obtain mathematical models and solutions for determining problems [53][54][55]. Regression analysis is one method for developing a prediction model. With regression analysis, the effect of two or more variables on the dependent variable can be assessed. Multivariate linear regression (MLR) is one of the most widely applicable mathematical methods for determining a linear relationship between independent and dependent parameters [56,57]. The MLR model for the i-th sample is given by Equation (1):
$$ y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_n x_{in} + \varepsilon_i \qquad (1) $$
where $\varepsilon_i$ is the random error and $\beta_j$ (j = 0, 1, ..., n) are the regression coefficients. In the equation, the dependent variable, y, depends linearly on a combination of the independent variables, x.
In the analysis, the linearity, invariability, and normality of the datasets are essential, along with the independence of observations. The final correlation can be evaluated through R² and the regression beta coefficients. The beta coefficients express the degree of change in the outcome associated with each variable and can be positive or negative. The highest beta coefficient indicates the largest impact, so the beta coefficients can be used to compare the relative importance of the variables in a regression model [56,57].
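As an illustration of the quantities discussed above, the following sketch fits an MLR model by ordinary least squares and reports R² and standardized (beta) coefficients. It is not the SPSS procedure used in the study: it is a minimal NumPy sketch on synthetic data, and the function name `fit_mlr` and all numbers are assumptions made for the example.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bn*xn + error.

    Returns the coefficients (intercept first), R^2, and the
    standardized (beta) coefficients of the predictors.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Leading column of ones carries the intercept b0
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    # Standardized beta: slope rescaled by sd(x) / sd(y)
    std_beta = coef[1:] * X.std(axis=0, ddof=1) / y.std(ddof=1)
    return coef, r2, std_beta

# Synthetic example (hypothetical data, not the Italian dataset)
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = 2.0 + 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=30)

coef, r2, std_beta = fit_mlr(X, y)
print("coefficients:", np.round(coef, 2))
print("R^2:", round(r2, 3))
print("standardized betas:", np.round(std_beta, 2))
```

Because the standardized coefficients are unit-free, they can be compared across predictors measured on different scales, such as temperature and wind speed.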

The SPSS Model Set Up
To conduct advanced analyses, SPSS (IBM, Armonk, NY, USA) has been used. The critical factors in the SPSS model setup are as follows:
• The main factor that affects the future positive cases would be the positive cases of previous days up to 14 days before, since the incubation period of COVID-19 is about 2-14 days;
• The second factor in the positive cases is the weather condition, and this might be the reason why the graphs of confirmed cases exhibit short-time fluctuations;
• The variables are shifted with respect to confirmed cases by one to nine days, in accordance with the existing delay between the actual date of infection and the announcement of confirmed cases, as mentioned above in the conditions of analysis.
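The shifting of a variable with respect to confirmed cases can be sketched as follows. This is a hedged illustration rather than the SPSS workflow: the helper `align_with_lag` and the toy series are hypothetical, and the sketch simply pairs the predictor value from `lag` days earlier with the confirmed cases of each day.

```python
import numpy as np

def align_with_lag(predictor, cases, lag):
    """Pair predictor[t - lag] with cases[t] for a lag given in days."""
    predictor = np.asarray(predictor, dtype=float)
    cases = np.asarray(cases, dtype=float)
    if len(predictor) != len(cases):
        raise ValueError("series must cover the same days")
    if not 1 <= lag < len(cases):
        raise ValueError("lag must be between 1 and len(cases) - 1")
    # Drop the last `lag` predictor days and the first `lag` case days
    return predictor[:-lag], cases[lag:]

# Toy daily series (hypothetical values): 10 days each
temperature = np.arange(10.0, 20.0)   # 10, 11, ..., 19
cases = np.arange(100.0, 110.0)       # 100, 101, ..., 109

x, y = align_with_lag(temperature, cases, lag=3)
# x[0] is the temperature on day 0, y[0] the cases on day 3
print(x[0], y[0], len(x))
```

Each additional day of lag shortens the usable sample by one observation, which matters for a one-month analysis period.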
The independent and dependent variables in the model are as follows:

Results

The Multivariate Linear Regression Analysis
The results of the multivariate linear regression for the three case studies can be seen in Tables 3-5. The tables show the regression results, including the p-values and beta coefficients for the variables x1, x2, x3, and x4, as well as R² and the p-value of the overall regression models obtained by shifting the variables with respect to confirmed cases of COVID-19 from one to nine days. The results show that the effect of the independent variable x4 was significant in all models. Although the effect of the variables x1, x2, and x3 was not significant in some intervals, the R² values and the significance levels of the regression models support the validity of these models.
As is clear from Tables 3-5, the accuracy of the model in the three case studies depends on the specific delays between the independent and dependent variables.
To form a regression model that includes the maximum effect of the independent variables x1, x2, and x3 on the dependent variable, the time intervals with the highest beta coefficient (maximum possible effect) for each of the three variables are specified and presented in Table 6.
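The selection of the delay with the highest beta coefficient for each variable can be sketched as below. This is an illustrative reconstruction under assumptions, not the procedure behind Table 6: it shifts all predictors together by each lag from one to nine days, fits a least-squares model, and picks for each predictor the lag with the largest absolute standardized beta. Using the absolute value, the function names, and the synthetic data are all choices made for this example.

```python
import numpy as np

def standardized_betas(X, y):
    """Standardized slopes of a least-squares fit with intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:] * X.std(axis=0, ddof=1) / y.std(ddof=1)

def best_lag_per_variable(X, y, max_lag=9):
    """For each predictor column, return the lag (1..max_lag) whose
    fit gives the largest absolute standardized beta."""
    betas = np.array([
        standardized_betas(X[:-lag], y[lag:])   # predictors lead cases by `lag` days
        for lag in range(1, max_lag + 1)
    ])
    return np.abs(betas).argmax(axis=0) + 1     # row index 0 corresponds to lag 1

# Synthetic check (hypothetical data): predictor 0 acts after 4 days,
# predictor 1 after 6 days
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 2))
y = np.zeros(n)
y[6:] = 3.0 * X[2:-4, 0] + 2.0 * X[:-6, 1]
y += rng.normal(scale=0.1, size=n)

best = best_lag_per_variable(X, y)
print("best lag per variable:", best.tolist())
```

With only about 30 daily observations per region, as in the analysis period above, such a selection is far noisier than in this 200-day synthetic series, so the chosen lags should be read together with the significance levels.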

The Correlations between the Confirmed Cases and Climate Factors
Daily confirmed cases and the average temperature in the three case studies are presented in Figures 2 and 3, respectively. The trend of daily positive cases of COVID-19 in Lombardy, together with the average temperature and wind speed, is shown in Figures 4 and 5. The blue arrows in each graph have the same length (a specific period, for example, four days), which indicates that the impact is repeated after that period. However, since wind, humidity, and temperature act together, in some rare cases the stronger impact of one variable (for example, a severe wind) could dominate the others.

The Final Multivariate Equations
According to the best configuration of the model, the final multivariate equations have been provided, and the coefficients, the significance level of the constant, the independent variables, and the standardized beta coefficients are displayed in Table 7. The values of R² in all of these regression models are at an appropriate level, and the regression models are significant (p-value < 0.001). For the variables x1, x2, and x3, given the high R² and the significance of the regression models, and in view of the main goal of the study, more emphasis is placed on the beta coefficients. Nevertheless, the results show a significant effect of variable x2 in Milan and of variable x3 in Venice. The final multivariate equations for each of the three case studies are presented in Table 8.

Discussion
The results of the initial multivariate linear regression for the three case studies show high accuracy. However, the accuracy changed when the variables were shifted from one to nine days prior to the observations. The shifting of variables with respect to confirmed cases reflects the fact that positive cases are defined as positive-test cases with symptoms such as fever, cough, or shortness of breath, and that the symptoms of COVID-19 appear about 3 to 5 days after infection. Moreover, there are two other sources of delay, each of about one day, namely the delay in obtaining the results of the laboratory analysis and the 24-hour delay between the confirmation of the infection and its announcement. After considering the impacts over nine days, the best combination was determined, and the final MLR model was developed accordingly. Comparisons between the final multivariate equations and the first ones, in terms of R² and the coefficients and significance level of the constant, show that the accuracy was improved by the proposed method. The hypothesis of the study was supported by the MLR model, and the assessment method could be applied to any new variables to identify the delay between each of them and new confirmed cases of coronavirus.
The trend analysis showed that the number of confirmed cases in all three case studies was equal on 29 February; however, in Lombardy, it grew faster in the following days. This might be due to the population density of Lombardy (422) being larger than those of Veneto (272) and Emilia-Romagna, and to the fact that the average temperature of Lombardy had been lower than in the other two regions since 25 February.
The number of confirmed cases in Emilia-Romagna was equal to that of Veneto until 2 March, increasing with nearly the same trend, and this might be because the average temperatures of the two regions were similar before 2 March. The fluctuation of daily positive cases of COVID-19 in comparison with the average temperature and wind speed in all three case studies also showed correlations with some delay and supported the results of the regression analysis.
It must be mentioned that the equations we provided are regressions based on the multivariable technique. They present the best combination of all factors in the prediction, not each of them separately. Therefore, the impact of one factor can outweigh that of another. For example, a wind with very high speed might dominate a small decrease in average temperature. In addition, the developed prediction equations are based on the datasets of three case studies that were in an increasing trend of positive cases of COVID-19. Therefore, they cannot be used to predict the behavior of positive cases in a future decreasing trend. However, the same delays for the impact of climate factors would also exist in a decreasing trend. As mentioned in [22], due to the incubation period of COVID-19, which is about 2-14 days, the increasing trend should be expected to stop after a period which, according to the Chinese experience, would be about 13 days from the starting date of the full lockdown. However, in the Italian case, it will probably last longer because of the different features of the lockdown implementation. Finally, to improve the accuracy of the prediction, the same analysis, with the mentioned conditions and with consideration of the delays, is suggested for future studies using MnLR and artificial intelligence techniques.

Conclusions
In this study, an assessment method for investigating the impact of climate and urban parameters on confirmed cases of COVID-19, as a new challenge in sustainable development, was developed using multivariate linear regression and trend analysis. The results of the MLR for three case studies in Italy showed a highly accurate correlation between the variables and the observations. However, the accuracy changed when the impact time was shifted. The results of the analysis demonstrate the effectiveness of the proposed model and the impact of climate parameters on the trend of confirmed cases. The results of shifting the variables with respect to confirmed cases demonstrate that there is a delay of four to eight days between the impact of weather parameters and the new confirmed cases. The reason could be that the positive cases are among those with symptoms, that the symptoms of COVID-19 appear after about 3 to 5 days, and that there are two other delays, each of about one day, namely the time needed for the results of the laboratory analysis and the fact that the announced figures refer to the previous 24 h. Comparisons between the final developed MLR models and the first ones showed an improvement in the accuracy of the proposed technique. According to the trend analysis, it seems that population density and weather conditions could affect the daily positive cases of COVID-19. The fluctuations of daily positive cases in all three case studies confirmed the impact of climate factors with some delay and supported the results of the MLR. As a result, the developed prediction model can be applied as an assessment method for investigating the impact of climate and urban parameters on confirmed cases of COVID-19.

Conflicts of Interest:
The authors declare no conflict of interest.