Effect of Meteorological Conditions and Anthropogenic Factors on Air Concentrations of PM2.5 and PM10 Particulates on the Examples of the City of Kielce, Poland

The paper analyzes the influence of meteorological conditions (air temperature, wind speed, humidity, visibility) and anthropogenic factors (population in cities and in rural areas, road length, number of vehicles, emission of dusts and gases, coal consumption in industrial plants, number of air purification devices installed in industrial plants) on the concentration of PM2.5 and PM10 dusts in the air in the region of Kielce city in Poland. The Spearman correlation coefficient was used to evaluate the relationship between the mentioned independent variables and air quality indicators. The calculated values of the correlation coefficient showed statistically significant relationships between air quality and the number of air purification devices installed in industrial plants. A statistically significant effect of the population in rural settlement units on the increase in air concentrations of PM2.5 and PM10 was also found, which indicates the influence of the so-called low emission of pollutants on the air quality in the studied region. The analyses also revealed a statistically significant effect of road length on the decrease in PM2.5 and PM10 air content. This result indicates that a decrease in traffic intensity on particular road sections leads to an improvement in air quality. The analyses showed that, despite the increasing anthropogenic pressure in the Kielce city region, the air quality with respect to PM2.5 and PM10 content is improving. To verify the results obtained from statistical calculations, parametric models were also determined to predict PM2.5 and PM10 concentrations in the air, using the methods of Random Forests (RF), Boosted Trees (BT) and Support Vector Machines (SVM) for comparison purposes. The modelling results confirmed the conclusions drawn from the statistical calculations.


Introduction
A major problem of growing urban agglomerations is the deterioration of air quality [1][2][3][4][5][6][7]. Contaminated air has many components, but priority research includes the determination of concentrations of PM2.5 and PM10 particulates in the air. The finest particulate matter present in the atmospheric air poses a great danger to living organisms, as it easily enters the respiratory system and consequently the bloodstream [8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23]. In connection with increased public awareness, a number of documents have been introduced to regulate the amount of substances entering the air, including the Directive of the European Parliament and European Council on ambient air quality and clean air in Europe [24] and, in Poland, the Regulation of the Minister of Environment on the assessment of pollutant levels in the air [25].
Unfortunately, despite the above regulations, the air quality in Poland is still bad. The basic sources of emissions in urban agglomerations include mainly industry, road transport and low-rise buildings heated with hard coal [26][27][28][29][30][31][32][33]. Because of the high threat, the influence of dusts on human health has recently become a frequent subject of research publications [10][11][12][34]. In order to limit the negative impact of dusts on human health, various actions have been taken for many years, including programs limiting, in particular, the so-called low emission, such as PONE (Low Emission Reduction Program) in Poland [35], aimed at improving air quality.
Assessing the impact of launched pollution abatement programs on air quality is neither easy nor obvious. Although there are many works on air quality analysis, they usually consider only meteorological conditions and pollutant emissions from emitters [2,30,36]. In many works, authors limit themselves to the dynamics of meteorological conditions and the impact of transport on air quality, which in many cases constitutes a significant simplification of the analyses [29,31,32]. The remaining variables of anthropogenic-regional character (expansion of roads, changes in land use, introduction of new standards in air purification devices, increase in population, change in vehicle quality standards) are usually omitted in the studies, which, from the point of view of the dynamics of air quality changes in the selected area, may lead to inappropriate decisions in the field of ambient air quality management. Bearing in mind the negative impact of air pollutants on human health, numerous attempts have been made to model them, particulate matter in particular. However, due to the speed of the modelled processes and the omission of anthropogenic factors in the models, the application of those models is limited [37][38][39].
Taking into account the above considerations and the need to improve the accuracy of air quality modelling, the paper analyses in detail the influence of meteorological, anthropogenic and local development factors on the change in the dynamics of PM2.5 and PM10 air pollution on the example of the Polish city of Kielce, using data from the years 2010-2015. A statistical analysis was performed to determine the impact of the above-mentioned factors on the air content of PM2.5 and PM10. It was based on the annual Reports on the State of the Environment from 2010-2016 in Poland [40], relevant data from the national Central Statistical Office [41] (population, coal consumption, number of rural settlements in the region), meteorological conditions and information on ongoing programs to improve air quality.
At the same time, PM2.5 and PM10 concentrations in the air were modelled using the three above-mentioned parametric modelling methods (RF, BT and SVM) for comparison purposes. The modelling results obtained were used to verify the conclusions drawn from the statistical calculations.

General characteristics of the study area
The city of Kielce, with an area of 109.45 km², is located in southern Poland at the foot of the Świętokrzyskie Mountains in Świętokrzyskie voivodeship. A significant part of the city lies in a valley which is bounded from the north and south by local hills. As a result of this location, terrain elevations vary widely, ranging from 206 to 408 m above sea level. Kielce lies in an upland climate region. The city has 200,000 inhabitants. During the year, westerly winds prevail in the city (16.5%), and the least frequent are northerly winds (4.1%) [42]. The average annual air temperature is 7 °C, with July being the warmest month (average temperature 17.20 °C) and January being the coldest month (average temperature -5.20 °C). Thunderstorms occur on about 23 days per year, and the annual rainfall is 724 mm, with the highest rainfall recorded in July (96 mm) and the lowest in October (34 mm). The average annual insolation is about 4.5 hours per day.
During the analysed period, i.e., 2010-2015, a number of changes took place in and around the city to improve urban air quality. The key role is played by the S7 expressway, opened in 2012 as part of the E77 European route, which bypasses the city of Kielce from the west; it has significantly relieved congestion on city streets. Another step taken by the local authorities to reduce the amount of pollutants entering the air was the launch of the Low Emission Reduction Programme (PONE) in 2014 [35]. The main goal of the program is to reduce the level of low emissions caused by burning fuels in individual heat sources.
Based on the Statistical Yearbooks of the Świętokrzyskie Voivodeship from 2010 to 2016, data were obtained on factors that directly affect ambient air quality [42]. By 2015, the number of motor vehicles had increased by nearly 14 percent compared to the number recorded in 2010, which was then about 784 thousand. However, this increase does not have a significant impact on urban air quality. This is due to the Euro 6 European Exhaust Emission Standard. The sum of pollutants emitted to the air from fuel combustion and from the cement and lime industry has accounted for more than 85 percent of the total dust pollutants that enter the air since the beginning of the analysed period, i.e., since 2010. The dust emission from the cement and lime industry has remained at a constant, acceptable level, while the emission from fuel combustion has been decreasing.
As part of the monitoring carried out in the city of Kielce, air quality measurements are conducted at one point in the city. It is an urban background measurement station located at 275 m above sea level. Measurement data collected at the station in the period from February 2010 to December 2015 were used to perform the analyses. These measurements are available online on the website of the Regional Inspectorate for Environmental Protection in Kielce [42]. At the station, the API100, API200, API300 and BAM1020 measuring devices are used to measure the following air pollution indicators: sulfur dioxide, tropospheric ozone, carbon monoxide, nitric oxide, nitrogen dioxide, PM2.5 and PM10 particulate matter, benzene, toluene, o-xylene, m+p-xylene and ethylbenzene. The area where the measuring station is located contains low-density multi-family residential buildings, as well as commercial and service facilities, schools, etc.
The measuring station is located 64 meters from the main road, on which the average traffic volume is 8,000 vehicles per day. The distance from the measuring point to the nearest parking lot is about 20 meters, and the average number of parked vehicles is 15 per day. This type of data on a local or global scale has been collected and compiled for many years in special databases available on the Internet [44,45].

Results of statistical analyses
In order to assess the influence of selected variables (meteorological conditions, emission levels, anthropogenic factors, information on ongoing programmes aimed at improving air quality) on the air content of PM2.5 and PM10 particulate matter, the correlation matrix between these variables was determined using the STATISTICA software.
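The analysis itself was run in STATISTICA. As an illustration only, the Spearman rank correlation used in this study can be sketched in a few lines of standard-library Python: rank both variables (ties receive the mean of their ranks) and compute the Pearson correlation of the ranks. The function names and the sample values below are hypothetical and are not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical illustrative values, not measurements from the study.
temperature = [-4.0, -1.5, 3.0, 8.5, 14.0, 17.5]
pm10 = [62.0, 55.0, 41.0, 33.0, 27.0, 22.0]
rho = spearman(temperature, pm10)
```

Because the sample concentrations fall strictly as temperature rises, the coefficient for this toy series is -1; real monitoring data would give values of smaller magnitude.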
On the basis of daily measurements of particulate matter in the air, charts of monthly average concentrations of PM2.5 and PM10 were prepared for the analysed period 2010-2015, and from these charts the average and linear trend lines were determined for each particulate fraction. The resulting curves of average concentration of the studied dusts and their trend lines are presented in Figure 2.
The highest average annual concentration of PM10 particulate matter was recorded in 2011, when the permissible level of 40 μg/m³ was exceeded by more than 20 μg/m³. It can be noted that the annual average exceedance of the permissible standard for PM10 is gradually decreasing, although its concentration still does not meet the requirements imposed by the national Regulation of the Minister of Environment of 24 August 2012 on the levels of certain substances in the air [25]. Monitoring of PM2.5 dust concentration in Kielce started in 2010, and since then the mean annual concentration of dust of this fraction has each year been higher than the permissible level of 25 μg/m³. The analysis of PM10 and PM2.5 concentrations in 2010-2015 (5 years) revealed their seasonal variability. The highest concentrations of particulate matter in the atmospheric air were recorded in the winter period (December, January, February), which is associated with increased emission of pollutants from burning coal and wood for heating purposes by city residents. The lowest values were recorded in the summer period (June, July, August). The obtained seasonal dependence is shown in Fig. 3.
For the statistical analysis of the results, the sources of pollution that could have the greatest impact on the concentration of particulate matter in ambient air were selected, such as the number of rural settlements in the region, population, total length of hard-surfaced roads, number of motor vehicles, amount of coal consumed, amount of dust and gas emissions from industrial plants, and meteorological parameters. In addition, the activities of the region's authorities aimed at improving air quality were taken into account, such as the introduction of the low emission program [35] and the European emission standard (Euro6) in 2014 [43] and the construction of a ring road around Kielce in 2012.
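The step from daily measurements to monthly averages and a linear trend can be sketched as follows. This is a minimal illustration, assuming the daily data are available as (year, month, value) tuples and that the trend is an ordinary least-squares line fitted against the month index; the source does not state how the trend was fitted, and the sample values are hypothetical.

```python
from collections import defaultdict


def monthly_means(daily):
    """daily: iterable of (year, month, value); returns {(year, month): mean}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, month, value in daily:
        sums[(year, month)] += value
        counts[(year, month)] += 1
    # Sorting the keys keeps the months in chronological order.
    return {key: sums[key] / counts[key] for key in sorted(sums)}


def linear_trend(values):
    """Least-squares slope and intercept of values against index 0..n-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    return slope, v_mean - slope * t_mean


# Hypothetical daily PM10 readings in μg/m³, not data from the study.
daily = [
    (2010, 2, 60.0), (2010, 2, 50.0),
    (2010, 3, 48.0), (2010, 3, 42.0),
    (2010, 4, 36.0), (2010, 4, 34.0),
]
means = monthly_means(daily)
series = list(means.values())
slope, intercept = linear_trend(series)
```

A negative slope corresponds to a falling trend line; its unit is μg/m³ per month because the index counts months.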
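The seasonal comparison and the exceedance of the permissible levels described above can be illustrated with a short sketch. The season definitions (December to February, June to August) and the limits of 40 μg/m³ for PM10 and 25 μg/m³ for PM2.5 come from the text; the function names and the monthly values are hypothetical.

```python
# Months assigned to the two seasons compared in the text.
SEASONS = {12: "winter", 1: "winter", 2: "winter",
           6: "summer", 7: "summer", 8: "summer"}

# Permissible annual mean levels quoted in the text, in μg/m³.
LIMITS = {"PM10": 40.0, "PM2.5": 25.0}


def seasonal_means(monthly):
    """monthly: {month number: mean concentration}; returns the mean per season."""
    grouped = {}
    for month, value in monthly.items():
        season = SEASONS.get(month)
        if season is not None:  # spring and autumn months are skipped
            grouped.setdefault(season, []).append(value)
    return {season: sum(vals) / len(vals) for season, vals in grouped.items()}


def exceedance(annual_mean, pollutant):
    """Amount by which an annual mean exceeds the permissible level (0 if it does not)."""
    return max(0.0, annual_mean - LIMITS[pollutant])


# Hypothetical monthly PM10 means in μg/m³, not data from the study.
monthly_pm10 = {1: 60.0, 2: 54.0, 4: 35.0, 6: 20.0, 7: 18.0, 8: 22.0, 12: 66.0}
by_season = seasonal_means(monthly_pm10)
over_limit = exceedance(45.0, "PM10")
```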
Based on the collected information, a correlation matrix of the studied variables presented in Table 1 was prepared using the STATISTICA program.
Based on the data in Table 1, it can be concluded that there is a very high correlation between the 'Air temperature' variable and the concentration of PM2.5 and PM10. This result is also confirmed by the variable 'Months of the year', which assigns individual months to the summer and winter periods, and to a slightly lesser extent by the variable 'Air humidity', which is in turn correlated with air temperature. The obtained relations can be easily explained: in the winter period the turbulence and movement of air masses are much smaller than in the summer period, and intensive movement of air masses lowers the concentrations of components polluting the atmosphere.
A very high correlation was also found for the variable 'Visibility'. It is known that the concentration of particulate matter in the air increases as the temperature decreases. This dependence is closely related to visibility, which decreases with an increase in pollution correlated with a decrease in air temperature. For the variable 'Wind speed', the dependence found differs from the one typical of the influence of meteorological factors on atmospheric pollution. Other works [46] have shown that an increase in wind speed reduces the content of PM2.5 and PM10 particles in the atmosphere, owing to their greater dispersion in the air volume, whereas our calculations do not show this. However, this can be explained by the fact that during the whole analyzed period 2010-2015 the average wind speed did not exceed 4 m/s, which indicates mild winds (according to the Beaufort scale), and in this case the obtained low value of the correlation coefficient can be justified. Extending the study period to cover more dynamic changes in atmospheric conditions in successive years would probably confirm the results obtained by other researchers. The influence of 'Atmospheric pressure', judging from the values of the correlation coefficients, lies between the influence of air humidity and that of wind speed.
An interesting group of variables included in this analysis, and usually ignored in air quality studies, is the change in the number of air purification devices such as cyclones, multicyclones, fabric filters and electrofilters. As shown by statistical data [42], despite the reduction in the number of dust removal devices, dust emission to air is gradually decreasing. This relationship can be justified by the improvement in the efficiency of these devices in recent years.
This is confirmed by the obtained results, as a statistically significant correlation was found between the high class of installed devices (the so-called super devices) and the improvement of air quality. The PGE heat and power plant has been successively modernising its equipment for reducing emissions of pollutants to air, and at present all its boilers operate with highly efficient dedusting devices (electrofilters with an efficiency of 99.5%, preliminary axial dedusters and multicyclones with a dedusting efficiency of 90%, bag filter and cyclone with an efficiency of 99%). Most studies emphasize the role of purification equipment in improving air quality [5,6], but the present work is an example of a study in which this was demonstrated analytically for a separate urban area. The obtained results of calculations (Fig. 1 and Table 1) indicate that although numerous industrial plants located in the Kielce region lead to high concentrations of PM2.5 and PM10 dust in the air, the emission is decreasing due to the improved efficiency of the dedusting equipment, as demonstrated above.
Increasing the total length of hard-surfaced roads is another important factor that has a positive impact on reducing air pollutant concentrations. Increasing the length of such roads reduces traffic volume and improves traffic flow. The obtained relationship is also confirmed by the variable 'Ring road', which shows that the expressway built in 2012 significantly contributed to the reduction of pollutants entering the air. Primarily, it reduced traffic volume in Kielce, and the pollutants from cars are now emitted over a larger area, so that pollutant concentrations in the air disperse faster. The age structure of vehicles is also important, as can be seen from the above analysis (variable 'Vehicles'). The increase in the number of cars in 2015 by about 15 percent compared to the number registered in 2010 did not contribute to the deterioration of air quality, as a new category of vehicles was introduced in 2014 (variable 'Euro6'), which imposed stricter requirements on pollution from fuel combustion in new cars.
So-called low emission, which results from the inefficient burning of hard coal with low heating parameters in local coal-fired boiler houses and home heating stoves, has a major impact on ambient air quality. Thus, as can be seen from the results obtained, the increase in the number of 'Rural settlements' adjacent to the city of Kielce, which rely mainly on point sources of heat (heating stoves), usually fired by hard coal, causes an increase in the amount of particulate matter PM10 and PM2.5. This relationship is closely correlated with the low emission reduction program (PONE) [35], which was introduced in 2014 and minimizes the negative effects of low emission. The influence of the variable 'Population' on air quality is small, which is due to the fact that the population of Kielce has been decreasing by about 3% in recent years [42].

Mathematical modelling of particulate matter content in the air
The variability of the content of PM2.5 and PM10 dusts is very diverse and depends on many factors, such as atmospheric conditions, emission volumes and their dynamic changes, urbanization of the area, etc. Therefore, the mathematical description of changes in the content of particulate matter in the air is complicated and giving an analytical relation describing these changes is very difficult. Hence, machine learning methods are increasingly used to simulate the content of particulate matter in the air, as a result of which parametric models are determined. Usually, authors who determine parametric models of complex environmental processes deal with single methods [47,48], which makes it difficult to generalize conclusions and establish a reference method for simulation of the studied phenomenon. Therefore, in the present work, several computational methods of varying complexity were considered for comparative purposes.
Of the simpler methods, the Random Forest (RF) and Boosted Trees (BT) methods were used, with the BT method being a modification of the Regression Trees method. The inclusion of a gradient boosting algorithm in these methods and the replacement of a single tree with a forest significantly improved the simulation results. The parameter optimized when constructing the model is the number of trees, which should not exceed 300 [49].
In the more complex Support Vector Machine (SVM) method, also used in the calculations, the value of dust content in the air is determined from a regression equation built on a Kernel function. When constructing the model of the process under study, the capacity (C), the value of the insensitivity threshold (ε) and the values of the coefficients in the Kernel function are optimized in this method. A Gaussian Kernel function was considered in the calculations:

$$K(x_i, x) = \exp\left(-\gamma \lVert x_i - x \rVert^{2}\right)$$

which gives a high fit of the calculation results to the measurements (where γ is an empirical coefficient sought by the method of successive substitutions for successively assumed values of C and ε) [50].
The data used to determine the models were related to the same variables that were included in the statistical analyses performed earlier and had a significant impact on air quality. These are: parameters defining atmospheric conditions, local conditions connected with reconstruction of the transport system in the area of the city of Kielce, factors connected with reconstruction of existing air purification devices at industrial plants and anthropogenic factors connected with expansion of the city and adjacent areas. The data sets available for modelling were divided into three subsets: learning set (50%), test set (25%) and validation set (25%).
Mean absolute error (MAE), mean relative error (MAPE) and correlation coefficient (R) were used to assess the predictive ability of the determined models:
1) mean absolute error:

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_{i,obs} - y_{i,pred} \right|$$

2) mean relative error:

$$MAPE = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_{i,obs} - y_{i,pred}}{y_{i,obs}} \right|$$

3) correlation coefficient:

$$R = \frac{\sum_{i=1}^{n} \left( y_{i,obs} - \bar{y}_{obs} \right) \left( y_{i,pred} - \bar{y}_{pred} \right)}{\sqrt{\sum_{i=1}^{n} \left( y_{i,obs} - \bar{y}_{obs} \right)^{2} \sum_{i=1}^{n} \left( y_{i,pred} - \bar{y}_{pred} \right)^{2}}}$$

where: n is the size of the data set, y_{i,obs} and y_{i,pred} are the measured and calculated values of the model output, and \bar{y}_{obs} and \bar{y}_{pred} are the arithmetic means of the measured and calculated values of the model output.
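The evaluation procedure described above (a 50/25/25 split into learning, test and validation sets, followed by MAE, MAPE and R) can be sketched in standard-library Python. This is an illustration under stated assumptions: the split is taken to be a random shuffle with a fixed seed, which the source does not specify, and MAPE is computed as the mean absolute percentage error. All function names and sample values are hypothetical.

```python
import random


def split_50_25_25(rows, seed=0):
    """Shuffle rows and split them into learning (50%), test (25%), validation (rest)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # assumed: random assignment to subsets
    n_learn = len(shuffled) // 2
    n_test = len(shuffled) // 4
    return (shuffled[:n_learn],
            shuffled[n_learn:n_learn + n_test],
            shuffled[n_learn + n_test:])


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error; observed values must be non-zero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def correlation(obs, pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = sum((o - mo) ** 2 for o in obs) ** 0.5
    sp = sum((p - mp) ** 2 for p in pred) ** 0.5
    return cov / (so * sp)


# Hypothetical observed and predicted PM10 values in μg/m³.
observed = [20.0, 40.0, 50.0]
predicted = [22.0, 36.0, 55.0]
errors = (mae(observed, predicted), mape(observed, predicted),
          correlation(observed, predicted))

learning, test, validation = split_50_25_25(list(range(8)))
```

Fixing the seed keeps the three subsets reproducible between runs, which matters when the RF, BT and SVM models are compared on the same data.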
Before the models were determined on the basis of the considered independent variables (atmospheric conditions, atmospheric emissions, anthropogenic factors), the Fisher-Snedecor method [51] was applied to the available data in order to eliminate the variables having a negligible impact on the calculation results, i.e., on the content of PM2.5 and PM10 dust in the air. It was assumed that only variables for which the test probability determined for the F statistic does not exceed p = 0.05 would be included in further studies.
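The source does not give the details of this screening step, so the following is only one possible reading of it: each candidate variable is tested separately with the F statistic of a simple linear regression, F = r²(n − 2)/(1 − r²), with 1 and n − 2 degrees of freedom. A variable is kept when F exceeds the tabulated critical value at the 0.05 level, which is equivalent to p < 0.05. The function names and sample values are hypothetical; the critical value 10.13 is the standard table entry for 1 and 3 degrees of freedom.

```python
def f_statistic(x, y):
    """F statistic of a simple linear regression of y on x (1 and n-2 d.o.f.)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r2 = sxy ** 2 / (sxx * syy)
    if r2 >= 1.0:  # perfect linear fit: the statistic is unbounded
        return float("inf")
    return r2 * (n - 2) / (1.0 - r2)


def screen_variables(candidates, target, f_critical):
    """Keep the names of variables whose F statistic exceeds the critical value."""
    return [name for name, values in candidates.items()
            if f_statistic(values, target) > f_critical]


# Critical value of F(1, 3) at the 0.05 level, taken from statistical tables.
F_CRITICAL_1_3 = 10.13

# Hypothetical values for five observations, not data from the study.
pm10 = [60.0, 52.0, 41.0, 33.0, 24.0]
candidates = {
    "temperature": [-4.0, 0.0, 5.0, 10.0, 15.0],
    "pressure": [1012.0, 1009.0, 1015.0, 1008.0, 1013.0],
}
selected = screen_variables(candidates, pm10, F_CRITICAL_1_3)
```

Comparing against a tabulated critical value avoids computing the F distribution by hand; with a statistics package the p-value itself would normally be compared with 0.05.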

The analysis
The analysis of the obtained data shows that the air quality is significantly affected by atmospheric conditions, local conditions related to the reconstruction of the transport system in the city of Kielce, factors related to the reconstruction of the existing air purification devices in industrial plants and anthropogenic factors related to the expansion of the city and adjacent rural areas.
RF, BT and SVM models were determined based on these statistically significant variables. The obtained measures of fit between the calculation results and the measurements, together with the parameters describing the structure of the determined models, are presented in Table 2.

Conclusions
The paper presents the results of research on the influence of meteorological and anthropogenic factors, i.e., caused by human activity, on the atmospheric pollution with PM2.5 and PM10 particles, which are particularly harmful to human health. In the study, statistical analysis of this influence was carried out using the Spearman correlation coefficient, and the results obtained were verified with statistical models using three modelling methods.
Research involving analysis of sources polluting the atmosphere, the diseases they cause and the establishment of acceptable standards of their concentration in the air has been carried out continuously for many years until now, which is documented by a large number of earlier publications listed in the introduction of the article and also selected publications from recent years [52][53][54][55][56], included at the end of the references. However, it can be noted that these publications generally deal with the analysis of a single issue, for example, the influence of meteorological conditions on atmospheric pollution, the influence of road traffic on PM2.5 and PM10 emissions, the influence of air pollution on a specific type of diseases or the study of pollutant concentrations in the atmosphere of a specific city or region, while they do not conduct comprehensive analyses and also do not address the issue of mathematical modelling of the air pollution process.
The current paper is an attempt at a comprehensive approach to the problem of atmospheric pollution with PM2.5 and PM10 particles: it takes into account both meteorological and anthropogenic factors affecting their concentrations in the air in the statistical analysis, verifies the results of the statistical analysis with modelling results, and conducts the research on real data from a medium-sized city.
Turning to the evaluation of the results obtained, it should be stated that the statistical analysis carried out showed that when studying the concentration of particulate matter in the atmosphere, both meteorological and anthropogenic factors should be taken into account, because the influence of anthropogenic factors on the emission of particulate matter is as important as meteorological conditions.
As for the modelling results, all three applied methods of statistical modelling qualify for use in simulating concentrations of PM2.5 and PM10 particulate matter in the air, although the SVM method is slightly more accurate and at the same time more complex than the RF and BT methods.
At the same time it should be emphasized that the usefulness of the determined models consists primarily in the fact that with their help it is possible to forecast the concentration of particulate matter in the air on the basis of predicted or simulated atmospheric conditions and anthropogenic factors taken into account in the models.