Science of the Total Environment 409 (2011) 949–956

Contents lists available at ScienceDirect

Journal homepage: www.elsevier.com/locate/scitotenv

Hourly predictive artificial neural network and multivariate regression trees models of Ganoderma spore concentrations in Rzeszów and Szczecin (Poland)

Idalia Kasprzyk a, Agnieszka Grinn-Gofroń b,⁎, Agnieszka Strzelczak c, Tomasz Wolski d

a Institute of Biology and Nature Protection, University of Rzeszów, Poland
b Department of Plant Taxonomy and Phytogeography, University of Szczecin, Wąska 13, 71-415 Szczecin, Poland
c Department of Food Process Engineering, West Pomeranian University of Technology, Szczecin, Poland
d Physical Oceanography Laboratory, University of Szczecin, Poland

Article info

Article history:
Received 29 June 2010
Received in revised form 26 November 2010
Accepted 1 December 2010
Available online 23 December 2010

Keywords:
Ganoderma
Weather variables
Artificial neural networks (ANN)
Multivariate regression trees (MRT)

Abstract

Ganoderma spores are among the most abundant airspora taxa in many regions of the world and are considered important allergens. The aerobiology of Ganoderma basidiospores in two cities in Poland was examined using the volumetric method (Burkard and Lanzoni spore traps) on selected days in 2004, 2005 and 2006. Spores of Ganoderma were present in the atmosphere from June to November, with peak concentrations generally occurring from late July to mid-October. ANN (artificial neural network) and MRT (multivariate regression tree) models indicated that atmospheric phenomenon, hour and relative humidity were the most important variables influencing spore content. The remaining variables (air temperature, dew point, air pressure, wind speed and wind direction) also contributed to the high network performance (ratio above 1), but their impact was less distinct. Those results are consistent with the Spearman's rank correlation analysis.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

Ganoderma Karst. is a well-established polyporoid genus of the Aphyllophorales, easily recognized by its large, perennial basidiocarps, which produce a thick, resinous crust over the upper surface of the pileus (Pegler and Young, 1973). The basidiospores are so distinctive that Donk (1948) removed the Ganodermatoideae from the Polyporaceae to establish a new family, the Ganodermataceae. Ganoderma spores are morphologically easy to identify and are reported in studies more frequently than many other basidiospores. Immunoblot inhibition studies with Ganoderma and other basidiospores indicated that Ganoderma had a slight cross-reactivity with other basidiomycetes (Bush and Portnoy, 2001). Although Ganoderma spp. have been studied by more laboratories than any other basidiospores, no allergens have yet been isolated or characterized (Helbling et al., 2002). However, many authors have reported human sensitivity to Ganoderma extracts and considered them a potential source of allergens involved in asthma and rhinitis (Hasnain et al., 1984, 1985; Cutten et al., 1988; Hasnain, 1993). These authors also suggested that persistent exposure to Ganoderma basidiospores could stimulate a base level of allergic responsiveness in susceptible atopic patients. Because of its potentially strong sensitizing properties and cross-reactivity, it seems to be important

⁎ Corresponding author. Tel.: +48 91 4441670. E-mail address: [email protected] (A. Grinn-Gofroń). 0048-9697/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.scitotenv.2010.12.002

to predict days with high concentrations of spores in the air and to identify the meteorological parameters that have the strongest influence on these concentrations. The aim of the study was to compare hourly fluctuations of spores in two Polish cities, to determine the influence of selected weather factors on Ganoderma spore concentration, and to determine the greatest potential influence of different factors on the various fungal components of the airspora in these regions. Our essential goal was to determine, by means of statistical methods, which meteorological factor had the strongest influence on the increase of fungal spore concentration, and to create predictive models based on such weather parameters.

2. Materials and methods

In this paper we use various statistical methods to analyse the effect of selected variables (air temperature, dew point temperature, relative humidity, air pressure, wind direction and speed, and atmospheric phenomena) on hourly atmospheric concentrations of Ganoderma spores in Rzeszów and Szczecin during three consecutive years (2004, 2005 and 2006). For this study we chose days in three seasons (34 in Rzeszów and 30 in Szczecin) with optimal conditions, on which the concentration of Ganoderma spores reached its highest values (daily values above 100 spores/m3). The spore data were analysed to determine the start, end and duration of the season using the 90% method. The start of the season was defined as the date when 5% of the seasonal cumulative spore count was


I. Kasprzyk et al. / Science of the Total Environment 409 (2011) 949–956

trapped, and the end of the season as the date when 95% of the seasonal cumulative spore count was reached. As recommended by the International Association for Aerobiology, the non-viable volumetric method was employed, using a VPPS Lanzoni 2000 spore trap in Szczecin and a Burkard spore trap in Rzeszów. The instrument and method were designed by Hirst (1952). Both types of sampler can be used for continuous monitoring of airborne particle concentrations and for comparative studies. Air is drawn in by an external vacuum pump, and the pollen grains, fungal spores and other particles it carries are deposited on the tape. The air flow must be constant, usually 10 l/min. Both traps were installed on rooftops in the city centres at different heights above ground level: 21 m in Szczecin and 12 m in Rzeszów. The reason for the different sampling heights was technical and depended on (1) the absence of physical features that would affect spore movement to the trap and (2) easy access for installation and maintenance of the equipment. Spores were identified along one horizontal line of the microscope slide under ×400 magnification. The results were expressed as daily average spores/m3, with each microscope slide corresponding to one day of the year.

The city of Rzeszów (50°01′N; 22°02′E) is located in the south-eastern part of Poland. There are no natural barriers in this region, and the altitude varies between 80 and 200 m above sea level. Its climatic conditions are mainly affected by polar maritime and polar continental air masses. The average relative humidity is 76%; the highest, 87%, occurs in December and January, and the lowest, about 73%, in April and May. In the last decade the mean annual temperature was 8.1 °C, and the mean temperatures for July (the warmest month) and January (the coldest month) were 18.3 °C and −2.1 °C, respectively.
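The 90% method for delimiting the spore season can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `season_bounds` and the daily counts are hypothetical and are not taken from the study data. The season starts on the first day on which the cumulative count reaches 5% of the seasonal total and ends on the first day on which it reaches 95%.

```python
from itertools import accumulate


def season_bounds(daily_counts, lower=0.05, upper=0.95):
    """Return (start_index, end_index) of the season by the 90% method.

    daily_counts: daily spore counts for one year, in calendar order.
    The indices are positions in that list (hypothetical helper, not the
    authors' code).
    """
    total = sum(daily_counts)
    if total <= 0:
        raise ValueError("no spores counted in this series")
    start = end = None
    for day, cumulative in enumerate(accumulate(daily_counts)):
        fraction = cumulative / total
        if start is None and fraction >= lower:
            start = day  # first day with at least 5% of the seasonal total
        if fraction >= upper:
            end = day    # first day with at least 95% of the seasonal total
            break
    return start, end


# Hypothetical ten-day series, used only to exercise the function.
counts = [0, 1, 2, 10, 30, 40, 10, 5, 1, 1]
start, end = season_bounds(counts)
duration = end - start + 1
```

The season duration is then the number of days from the start index to the end index, inclusive.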

Average annual rainfall is 616 mm, and precipitation occurs on 171 days a year. The period of growth lasts from 215 to 220 days. Rzeszów is a medium-sized city with typical urban development. Its flora and vegetation are strongly influenced by human activity, and most of the vegetation has been planted by man. Several trees and shrubs have been planted near the spore trap, and synanthropic plant communities are also present. The surroundings of Rzeszów are a mosaic of forests and crop fields; agricultural land occupies a large share of the land use structure (Kożuchowski and Degrimendžić, 2005).

Szczecin is situated in the Odra river valley in the north-western part of Poland (53°26′N, 14°32′E). The altitude varies between 0.1 m below sea level and 148 m above sea level. The city is surrounded by forests, farmland and abandoned farmland, which provide suitable media for spore production. The 'Baltic' climate of Szczecin is influenced by air masses from over the North Atlantic and is characterized by mild winters and cool summers. The presence of large water bodies such as the Szczecin Lagoon, Miedwie Lake and the Odra river valley increases the humidity in the area. The average relative humidity is 75%; the highest, 88%, occurs in November, December and January, and the lowest, about 72%, in April and May. The average air temperature in Szczecin ranges from 8 to 8.4 °C. The warmest month is July, with temperatures from 15.8 °C to 20.3 °C; the coldest is January, with temperatures from −4.1 °C to 2.6 °C. Air temperatures below 0 °C occur on average on 86 days a year, usually in January and February. Average annual rainfall is 537 mm, and there are approximately 167 rainy days a year. The season of vegetative growth lasts from 210 to 220 days (Kożuchowski and Degrimendžić, 2005).

The meteorological data covering the three years of study were provided by automatic weather stations (Vaisala MAWS101) in both cities.
They included hour and meteorological variables: air

Fig. 1. Matrix scatter plots between Ganoderma spore concentration and explanatory variables in Rzeszów and Szczecin (whole data set).


temperature, dew point temperature, relative humidity, air pressure, wind direction (0–360°), wind speed and atmospheric phenomenon (coded as: clear, 1; scattered clouds, 2; partly cloudy, 3; mostly cloudy, 4; cloudy, 5; fog, 6; occasional showers, 7; storm, 8). Hour was incorporated into the data set to capture the exact hourly pattern of Ganoderma spore content. Meteorological factors are partly correlated with time of day, but the “hour” variable could also represent other, unknown factors that influence spore concentration and vary with time but were not measured or taken into account in the analyses. The hourly patterns in the external conditions affecting high spore concentrations in the air were investigated. We then created predictive models of the diurnal periodicity of spore concentrations on the basis of selected meteorological factors.

2.1. Statistical analysis

Because of the non-linearity and non-normality of the data set, as well as the presence of one categorical variable (atmospheric phenomenon), a set of appropriate statistical analyses was used. Relationships between the variables were investigated with Spearman's rank correlation analysis. To reveal differences in meteorological parameters and Ganoderma spore content between Rzeszów and Szczecin, the non-parametric Mann–Whitney U test was applied. Additionally, the significance of differences in Ganoderma abundance between consecutive hours and between atmospheric phenomena was studied with the non-parametric Kruskal–Wallis ANOVA test. Further, more advanced modelling techniques were applied: multivariate regression trees (MRT) and artificial neural networks


(ANN). Meteorological parameters and hour were used as input variables, while Ganoderma spore content was the output variable. Those methods were chosen because they allow community data to be analysed but make no assumptions about the form of the relationships between airborne mycoflora and explanatory factors. Moreover, they are applicable to complex ecological data with imbalance, non-linear relationships and high-order interactions.

The MRT method was used to reveal whether there were any threshold values of the explanatory variables that separated different levels of Ganoderma spore concentration. The general idea of MRT is to form clusters of sites by repeated splitting of the data along the axes of the explanatory environmental variables. Each split is chosen to minimise the dissimilarity (sum of squared Euclidean distances, SSD) of the data within the clusters (Breiman et al., 1984; De'ath and Fabricius, 2000; De'ath, 2002). The clusters and their dependence on the input variables are presented graphically as a tree. The overall fit of the tree is specified as the relative error (Error; SSD in the clusters divided by SSD of the undivided data), and the predictive accuracy is assessed by the cross-validated relative error (CVRE) (Breiman et al., 1984; De'ath and Fabricius, 2000). In this study, the tree finally selected was the most complex model within one standard error (1 SE) of the best predictive tree (Breiman et al., 1984), using 2000 multiple cross-validations to stabilize CVRE. Analyses were carried out in R 2.10.1 (The R Foundation for Statistical Computing, 2009) using the mvpart (Multivariate Partitioning) package (De'ath, 2002).

The ANN technique was used to create a regression model of the relationships between meteorological parameters, hour and Ganoderma spore concentrations. In this study multilayer perceptrons (MLP) were applied, which mathematically perform a

Fig. 2. Matrix scatter plots between Ganoderma spore concentration and explanatory variables in Rzeszów.


Fig. 3. Matrix scatter plots between Ganoderma spore concentration and explanatory variables in Szczecin.

stochastic approximation of multivariate functions (Osowski, 1996; Carling, 1992; Lek and Guegan, 1999). Calculations were performed using StatSoft software Statistica 9.0 (StatSoft Inc., 2009; Lula, 2000;

Tadeusiewicz, 1993, 2001). The consecutive neural networks were designed and trained using back propagation (Fausett, 1994; Haykin, 1994) and conjugate gradient algorithms (Bishop, 1995) by

Table 1. Spearman's rank correlation matrix for the whole data set (All), Rzeszów (Rz) and Szczecin (Sz). *p < 0.05. Column numbers 1–9 refer to the numbered row variables; diagonal cells are marked with a dot.

Variable                  Set  1       2       3       4       5       6       7       8       9
1 Ganoderma               All  .       −0.09*  −0.16*  −0.04   0.16*   0.02    −0.08*  −0.13*  −0.07*
                          Rz   .       −0.15*  −0.32*  −0.05   0.32*   −0.07   −0.08*  −0.19*  −0.27*
                          Sz   .       −0.02   0.01    0.04    0.03    0.15*   −0.03   −0.05   0.10*
2 Hour                    All  −0.09*  .       0.43*   0.22*   −0.40*  −0.05*  0.08*   0.23*   −0.02
                          Rz   −0.15*  .       0.43*   0.22*   −0.40*  −0.06   0.14*   0.24*   0.00
                          Sz   −0.02   .       0.43*   0.21*   −0.40*  −0.06   −0.01   0.21*   −0.05
3 Air temperature         All  −0.16*  0.43*   .       0.52*   −0.83*  −0.18*  −0.03   0.47*   −0.07*
                          Rz   −0.32*  0.43*   .       0.47*   −0.89*  −0.28*  0.06    0.58*   0.05
                          Sz   0.01    0.43*   .       0.57*   −0.78*  −0.08*  −0.18*  0.34*   −0.18*
4 Dew point temperature   All  −0.04   0.22*   0.52*   .       −0.04   −0.38*  0.00    0.12*   0.14*
                          Rz   −0.05   0.22*   0.47*   .       −0.08*  −0.43*  0.10*   0.21*   0.18*
                          Sz   0.04    0.21*   0.57*   .       −0.01   −0.31*  −0.17*  0.03    0.12*
5 Humidity                All  0.16*   −0.40*  −0.83*  −0.04   .       0.02    0.03    −0.52*  0.19*
                          Rz   0.32*   −0.40*  −0.89*  −0.08*  .       0.12*   −0.01   −0.57*  0.08*
                          Sz   0.03    −0.40*  −0.78*  −0.01   .       −0.10*  0.07    −0.46*  0.33*
6 Air pressure            All  0.02    −0.05*  −0.18*  −0.38*  0.02    .       −0.11*  −0.25*  −0.04
                          Rz   −0.07   −0.06   −0.28*  −0.43*  0.12*   .       −0.03   −0.26*  0.05
                          Sz   0.15*   −0.06   −0.08*  −0.31*  −0.10*  .       −0.23*  −0.22*  −0.12*
7 Wind direction          All  −0.08*  0.08*   −0.03   0.00    0.03    −0.11*  .       0.02    0.20*
                          Rz   −0.08*  0.14*   0.06    0.10*   −0.01   −0.03   .       −0.00   0.13*
                          Sz   −0.03   −0.01   −0.18*  −0.17*  0.07    −0.23*  .       0.05    0.32*
8 Wind speed              All  −0.13*  0.23*   0.47*   0.12*   −0.52*  −0.25*  0.02    .       −0.04
                          Rz   −0.19*  0.24*   0.58*   0.21*   −0.57*  −0.26*  −0.00   .       0.01
                          Sz   −0.05   0.21*   0.34*   0.03    −0.46*  −0.22*  0.05    .       −0.10*
9 Atmospheric phenomenon  All  −0.07*  0.02    −0.07*  0.14*   0.19*   −0.04   0.20*   −0.04   .
                          Rz   −0.27*  0.00    0.05    0.18*   0.08*   0.05    0.13*   0.01    .
                          Sz   0.10*   −0.05   −0.18*  0.12*   0.33*   −0.12*  0.32*   −0.10*  .
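As an illustration of how a coefficient such as those in Table 1 is obtained, the sketch below computes Spearman's rank correlation in plain Python: the values are replaced by their ranks (tied values share their average rank) and Pearson's correlation is computed on the ranks. The helper names and the data are hypothetical, and this is not the software used in the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson's correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical hourly values: a monotonic but non-linear relationship.
humidity = [60, 70, 80, 90, 95]
spores = [1, 2, 8, 20, 50]
rho = spearman(humidity, spores)
```

Because only the ordering of the values matters, a monotonic but strongly non-linear relationship still yields a coefficient of 1, which is why the rank-based coefficient suits non-normal spore data.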


Automatic Problem Solver. Cases were divided with the bootstrap method into three subsets:
– training (Tr), used for training a neural network (67% of the cases);
– verification (Ve), used for verifying the performance of a network during training (33% of the cases);
– testing (Te), used for assessing the predictability and accuracy of a neural model on data not presented during training and validation (the cases remaining after the training subset was created during bootstrap).

The criterion for choosing the best neural network was Correlation (Pearson's correlation coefficient between experimental and predicted data), calculated separately for the three subsets. Special emphasis was placed on sensitivity analysis, which ranks the input variables on the basis of the model error calculated when a given input variable is removed from the model. The ratio of the error of the model with the variable ignored to the error of the complete model is the basis for ordering the variables according to their importance.

3. Results

As shown in Figs. 1, 2 and 3, the Ganoderma spore data approximated an exponential distribution. Meteorological parameters, in turn, mostly approximated a normal distribution; however, the Shapiro–Wilk test confirmed significant deviations from normality (results not shown). The scatter plots indicated rather weak and sometimes non-linear dependencies between Ganoderma spore concentration and explanatory variables. There were some differences in the shape of the dependencies between Rzeszów (Fig. 2) and Szczecin (Fig. 3), which could affect the matrix scatter plots of the whole data set.

The analysis of Spearman's rank correlation coefficients also revealed differences in the relationships between Ganoderma spore content and meteorological variables (Table 1). In general, those dependencies were much weaker and mostly insignificant in Szczecin compared with Rzeszów. In Szczecin only air pressure and atmospheric phenomenon turned out to be significantly correlated with Ganoderma, while in Rzeszów those were air temperature (negative correlation), humidity (positive), wind speed (negative) and finally hour, wind direction and atmospheric phenomenon (all negative correlation coefficients). Moreover, some relationships between meteorological parameters and hour were insignificant in Szczecin but significant in Rzeszów (hour–wind direction, dew point temperature–wind speed) or significant in Szczecin but insignificant in Rzeszów (air temperature–wind direction, air temperature–atmospheric phenomenon, air pressure–wind direction). Some dependencies also differed in strength between the two cities (air temperature–air pressure, humidity–atmospheric phenomenon).

The Mann–Whitney U test revealed significant (p < 0.05) differences in Ganoderma spore concentrations, which were lower in Rzeszów than in Szczecin (Table 2). In turn, dew point temperature and humidity were significantly higher in Rzeszów, where a south-south-west wind prevailed, whereas in Szczecin the prevailing wind was south-east.

Further statistical modelling was performed for the two cities separately. First, the differences in Ganoderma abundance between consecutive hours and between atmospheric phenomena were assessed with the non-parametric Kruskal–Wallis ANOVA test. Significant results (p < 0.05) were obtained only for the data set recorded in Rzeszów, where Ganoderma spore concentrations between 9 am and 6 pm were in general lower than in the early morning and in the evening (Table 3, Fig. 4). Moreover, Ganoderma abundance under a clear sky was significantly (p < 0.05) higher than when the clouds were scattered or when it was mostly cloudy (Table 4, Fig. 5). For the data set recorded in Szczecin no significant differences were obtained.

MRT analysis of the data from Szczecin showed a high cross-validated relative error and poor overall fit, and therefore the tree was not analysed further. For Rzeszów better performance was obtained (Fig. 6). It

Table 2. Results of the non-parametric Mann–Whitney U test for Ganoderma spore concentration and meteorological parameters between Rzeszów and Szczecin. *p < 0.05.

Variable                      Median Rzeszów  Median Szczecin  U          Z       p
Ganoderma* [spores/m3]        4.0             6.0              202,582.5  −10.51  0.00
Air temperature [°C]          18.0            17.0             268,172.5  0.76    0.45
Dew point temperature* [°C]   13.0            11.0             240,483.5  4.12    0.00
Humidity* [%]                 82.0            77.0             249,132.5  3.07    0.00
Air pressure [hPa]            1017.0          1017.0           258,393.5  1.95    0.20
Wind direction* [°]           202.5           135.0            237,340.5  4.50    0.00
Wind speed [m/s]              2.1             2.6              272,921.5  −0.19   0.85

Table 3. p-values from the Kruskal–Wallis ANOVA test between hours for Rzeszów. Differences are significant for p < 0.05. The matrix is symmetric, so only the lower triangle is shown; rows and columns are hours.

Hour  1     2     3     4     5     6     7     8     9     10    11    12    13    14    15    16    17    18    19    20    21    22    23
2     1.00
3     1.00  1.00
4     1.00  1.00  1.00
5     1.00  1.00  1.00  1.00
6     1.00  1.00  1.00  1.00  1.00
7     1.00  1.00  1.00  1.00  1.00  1.00
8     0.26  0.09  0.01  1.00  1.00  1.00  1.00
9     0.00  0.00  0.00  0.00  0.00  0.00  0.07  1.00
10    0.01  0.00  0.00  0.21  0.21  0.13  1.00  1.00  1.00
11    0.05  0.01  0.00  0.60  0.60  0.37  1.00  1.00  1.00  1.00
12    0.00  0.00  0.00  0.00  0.00  0.00  0.12  1.00  1.00  1.00  1.00
13    0.00  0.00  0.00  0.00  0.00  0.00  0.04  1.00  1.00  1.00  1.00  1.00
14    0.00  0.00  0.00  0.02  0.02  0.01  0.42  1.00  1.00  1.00  1.00  1.00  1.00
15    0.04  0.01  0.00  0.47  0.46  0.29  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
16    0.00  0.00  0.00  0.02  0.02  0.01  0.46  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
17    0.00  0.00  0.00  0.01  0.01  0.00  0.16  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
18    0.00  0.00  0.00  0.07  0.06  0.04  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
19    0.38  0.13  0.02  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
20    1.00  1.00  0.49  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  0.93  1.00  1.00  1.00  1.00  1.00  1.00
21    1.00  1.00  0.80  1.00  1.00  1.00  1.00  1.00  0.86  1.00  1.00  1.00  0.57  1.00  1.00  1.00  1.00  1.00  1.00  1.00
22    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  0.00  0.15  0.44  0.00  0.00  0.01  0.34  0.01  0.00  0.04  1.00  1.00  1.00
23    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  0.09  1.00  1.00  0.15  0.06  0.51  1.00  0.56  0.20  1.00  1.00  1.00  1.00  1.00
24    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  0.00  0.26  0.75  0.01  0.00  0.02  0.58  0.03  0.01  0.08  1.00  1.00  1.00  1.00  1.00
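Table 3 reports p-values for pairs of hours, which the following sketch does not reproduce. It only shows, for hypothetical data, how the overall Kruskal–Wallis H statistic is computed from the pooled ranks of several groups, with the standard correction for ties. The function name and the three groups of spore counts are assumptions made for illustration.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups of numbers.

    Assumes the pooled data contain at least two distinct values.
    """
    pooled = sorted(v for group in groups for v in group)
    n_total = len(pooled)

    # Average rank (1-based) for each distinct value in the pooled sample.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j + 1 < n_total and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + j) / 2 + 1
        i = j + 1

    # H = 12 / (N (N + 1)) * sum(R_g^2 / n_g) - 3 (N + 1)
    weighted = 0.0
    for group in groups:
        rank_sum = sum(rank_of[v] for v in group)
        weighted += rank_sum ** 2 / len(group)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n_total ** 3 - n_total))


# Hypothetical spore counts for three periods of the day.
morning = [6, 7, 8, 9]
midday = [2, 3, 3, 4]
evening = [5, 6, 7, 7]
h_stat = kruskal_wallis_h([morning, midday, evening])
```

Under the null hypothesis H approximately follows a chi-square distribution with (number of groups − 1) degrees of freedom, so a large H indicates that at least one group differs in its rank distribution.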


Fig. 4. Box and whisker plot for Ganoderma spore concentration. Comparison between hours for Rzeszów.

Fig. 6. Multivariate regression tree for hourly Ganoderma spore concentration in Rzeszów.
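The splitting rule behind a regression tree can be sketched in a few lines of Python. With a single response variable, the within-cluster dissimilarity (SSD) is the sum of squared deviations from the cluster mean, and the relative error is the SSD remaining after the split divided by the SSD of the undivided data. The sketch below finds one split only; it does not perform the cross-validation or 1 SE selection used in the study, it is not the mvpart implementation, and the function names and data are hypothetical.

```python
def ssd(values):
    """Sum of squared deviations of values from their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, relative_error) of the best single split on x.

    The split minimises the summed SSD of y in the two resulting groups.
    Returns (None, 1.0) if no split reduces the error.
    """
    pairs = sorted(zip(x, y))
    total = ssd([v for _, v in pairs])
    if total == 0:
        return None, 0.0  # response is constant, nothing to explain
    best = (None, 1.0)
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical x values
        left = [v for _, v in pairs[:k]]
        right = [v for _, v in pairs[k:]]
        rel_error = (ssd(left) + ssd(right)) / total
        if rel_error < best[1]:
            threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
            best = (threshold, rel_error)
    return best


# Hypothetical data: hour of day and spore concentration.
hours = [1, 3, 5, 9, 12, 15]
spores = [7, 6, 7, 3, 3, 4]
threshold, rel_error = best_split(hours, spores)
```

A full tree repeats this search within each resulting group and over every explanatory variable, which is how thresholds such as times of day can emerge from the data.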

turned out that the lowest spore content values (on average 3.19 spores/m3) occurred between 7.30 am and 8.30 pm, while the highest occurred before 7.30 am (6.59 spores/m3) and after 8.30 pm (6.02 spores/m3). As with the MRT results, the ANN modelling also gave poor results for the data set from Szczecin, both for original and

Table 4. p-values from the Kruskal–Wallis ANOVA test between atmospheric phenomena for Rzeszów. Differences are significant for p < 0.05. The matrix is symmetric, so only the lower triangle is shown; rows and columns are atmospheric phenomenon codes.

Code  1     2     3     4     5     6     7
2     0.01
3     1.00  1.00
4     0.00  1.00  1.00
5     1.00  1.00  1.00  1.00
6     1.00  1.00  1.00  1.00  1.00
7     0.23  1.00  1.00  1.00  1.00  1.00
8     0.01  1.00  0.51  1.00  1.00  1.00  1.00

Fig. 5. Box and whisker plot for Ganoderma spore concentration. Comparison between atmospheric phenomenon for Rzeszów.

transformed variables (Correlation below 0.4). Satisfactory performance of the ANN regression model was obtained for Rzeszów. The best network was MLP 8:8-5-1:1, with 8 input neurons, 5 hidden neurons and 1 output neuron, trained with 100 epochs of back propagation and 8 epochs of the conjugate gradient algorithm. It reached Correlation at the level of 0.50–0.63. Direct comparison of the observed and calculated values (Fig. 7) confirmed quite good predictability of the obtained model. Sensitivity analysis (Table 5) indicated atmospheric phenomenon, hour and relative humidity to be the most important variables in the model. The remaining variables also contributed to the high network performance (ratio above 1), but their impact was less distinct. Those results are consistent with the Spearman's rank correlation analysis.

4. Discussion

Ganoderma spores have been reported to be an important and prevalent component of the fungal airspora worldwide (Halwagy, 1994; Hasnain, 1993; Lehrer et al., 1986, 1994; Levetin, 1990, 1991; Li and Kendrick, 1994, 1995; Mitakakis and Guest, 2001; Stępalska and

Fig. 7. Comparison of Ganoderma spore concentrations observed in Rzeszów and those calculated from MLP 8:8-5-1:1 neural network.
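To make the 8-5-1 architecture concrete, the sketch below builds a multilayer perceptron of that shape and runs one forward pass. It is a hedged illustration only: the weights are random rather than trained, the tanh hidden activation and linear output are assumptions (the activation functions are not stated in the text), and the function names and input values are hypothetical.

```python
import math
import random


def init_mlp(n_in=8, n_hidden=5, n_out=1, seed=0):
    """Create an untrained MLP with small random weights and zero biases."""
    rng = random.Random(seed)

    def layer(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "w1": layer(n_hidden, n_in), "b1": [0.0] * n_hidden,
        "w2": layer(n_out, n_hidden), "b2": [0.0] * n_out,
    }


def forward(net, x):
    """One forward pass: tanh hidden layer (assumed), linear output layer."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(net["w2"], net["b2"])]


def n_parameters(net):
    """Count all weights and biases in the network."""
    return (sum(len(row) for row in net["w1"]) + len(net["b1"])
            + sum(len(row) for row in net["w2"]) + len(net["b2"]))


net = init_mlp()
# Eight hypothetical, already scaled inputs (hour plus seven weather variables).
x = [0.5, 0.2, 0.1, 0.8, 0.4, 0.6, 0.3, 0.0]
prediction = forward(net, x)[0]
```

A network of this shape has 8 × 5 + 5 weights and biases in the hidden layer and 5 + 1 in the output layer, 51 adjustable parameters in total, which training would fit to the observed spore concentrations.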


Table 5. Sensitivity analysis for the MLP 8:8-5-1:1 neural network. Regression model for Ganoderma spore concentration in Rzeszów.

Variable                Rank
Atmospheric phenomenon  2.87
Hour                    1.78
Relative humidity       1.29
Air temperature         1.17
Wind direction          1.15
Air pressure            1.11
Dew point temperature   1.11
Wind speed              1.08
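The idea behind sensitivity ratios such as those in Table 5 can be sketched as follows. Each input is withheld in turn, the model error is recomputed, and that error is divided by the error of the complete model, so a ratio above 1 means the error grows when the variable is withheld. The sketch rests on stated assumptions: a variable is "removed" by replacing it with its mean, the error is measured as root mean squared error, and the model is a hypothetical function rather than the trained network.

```python
def rmse(model, rows, targets):
    """Root mean squared error of model predictions against targets."""
    squared = sum((model(r) - t) ** 2 for r, t in zip(rows, targets))
    return (squared / len(rows)) ** 0.5


def sensitivity_ratios(model, rows, targets):
    """Error with each input withheld, divided by the baseline error.

    Assumes the baseline error is non-zero. Withholding is approximated by
    substituting the column mean (an assumption made for this sketch).
    """
    baseline = rmse(model, rows, targets)
    ratios = []
    for j in range(len(rows[0])):
        mean_j = sum(r[j] for r in rows) / len(rows)
        masked = [r[:j] + [mean_j] + r[j + 1:] for r in rows]
        ratios.append(rmse(model, masked, targets) / baseline)
    return ratios


def toy_model(row):
    """Hypothetical model that uses the first input and ignores the second."""
    return 2.0 * row[0]


rows = [[1, 5], [2, 1], [3, 4], [4, 2]]
targets = [2.5, 3.5, 6.5, 7.5]
ratios = sensitivity_ratios(toy_model, rows, targets)
```

In this toy case the first ratio is well above 1 and the second equals 1, so ordering the variables by ratio ranks the informative input first.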

Wołek, 2005) but there are only few studies on the relationship between its spores concentration and meteorological parameters, (Craig and Levetin, 2000; Hasnain et al., 2004). In addition, all analysis was done using simple statistical analysis, (e.g. Pearson correlation, multiple regression), most of them usually displaying unpredictability, therefore it is difficult to compare these analyses with our results using more advanced statistical modelling. Moreover, many techniques of data analysis are based on assumptions of linearity and normality. Such requirements often cannot be fulfilled. This situation is particularly typical for time series with short spore seasons. The artificial neural network and multivariate regression trees method function as a universal approximating system with the ability to learn, adapt and generalize the knowledge acquired, and are especially applicable to multivariate data sets with non-linear dependencies. Moreover, they do not require variables to fit any theoretical distribution. Our results indicated differences in the dynamics of hourly Ganoderma spore content–meteorological condition relationships between Rzeszów and Szczecin. Similar positive correlations in Szczecin were noted using multiple regression in 2004 by Grinn-Gofroń (2008). The correlation results obtained for Rzeszów are in agreement with the results from Cracow, Poland, (Stępalska and Wołek, 2005), Mexico City, Mexico, (Calderon et al., 1995), and San Juan, Puerto Rico, (Quintero et al., 2010). Both sites in Poland, (Rzeszów and Cracow), are located in the southern part of the country with a similar climate, and this fact can explain the similarity of the results obtained. In our study, the negative relationship noted in Rzeszów was also recorded in New Zealand, (Hasnain, 1993) and Puerto Rico (Quintero et al., 2010). Strong winds which are dominant in the southern Poland region, (where Rzeszów is located), can be the reason for these negative relationships. 
Calderon et al. (1995), noted that high concentrations of basidiospores are often associated with daily mean wind speed of 2–3 m/s. These results were compared with the reports of Lopez and Salvaggio (1983) and Hasnain (1993), in which wind velocities N5 m/s were associated with decreased concentrations of basidiospores, perhaps because of the diluting effect of high wind speeds on concentrations of airborne particles. Negative hourly correlation between basidiospores and air temperature occurred in three cities in Oklahoma State, (Tulsa, Bixby and Hectorville). The basidiospores were found in the air during the cooler night and evening hours, (Burch and Levetin, 2002). These correlations reflect the diurnal rhythm of the spore release, with high basidiospore concentrations during the cooler, wetter night hours (Hirst, 1953; Inglot, 1971). In other cases, air temperature only the positive relationships were noted (Craig and Levetin, 2000; Hasnain, 1993; Oliveira et al., 2009; Quintero et al., 2010). However, it must be remembered that none of the comparative data concern deals with the hourly values as in our case, but daily values. This could be the reason for differences in the results obtained. The results of the sensitivity analysis of ANN regression model for Rzeszów are consistent with Spearman's rank correlations, and indicate atmospheric phenomenon, hour and relative humidity, to be the most important variables in the model. Positive relationships between relative humidity were also noted by Calderon et al. (1995) and Quintero et al. (2010). This agrees with the observations from New Zealand (Hasnain, 1993), where basidiospore concentrations showed distinctive seasonal differences in the highest numbers during the wet season, when weather

conditions favoured the development of sporophores and the spore release. Calderon et al. (1995) also noted the largest concentrations of basidiospores when relative humidity was between 70 and 80%. Burch and Levetin (2002) summarized that basidiospores require moist conditions for their release, and this would be expected to correlate both with relative humidity and dew point temperature. The hourly concentrations of Ganoderma spores also differ throughout the day. In Tulsa (Oklahoma), a diurnal rhythm with peak concentrations occurred at approximately 4 am and the lowest concentration levels were observed at 4 pm. Similar early morning maximum was observed by Haard and Kramer (1970) and Ho et al. (2005) at 6 am, and between 4 and 6 am, respectively. Ho et al. (2005) noted a distinct decrease in fungal spore concentration until the late afternoon. The same authors observed that Ganoderma spores followed a night pattern, with the highest concentrations occurring between sunset and sunrise. These results were in agreement with Gregory (1973). Calderon et al. (1995) reported that concentrations of airborne basidiospores showed a diurnal periodicity typical for this fungal spore type, with maximum concentrations at night, between 2 am and 4 am, when dew is sufficient for spore liberation. The reason for this diurnal pattern is probably associated in part, with the vertical mixing depth of the air and the radiation inversion at night. Basidiospores are released when humidity increases, and thus tend to be more concentrated at late night and before dawn, (Burch and Levetin, 2002). In the conditions of a temperate climate, the maximum concentration of most spores occurs in summer or early autumn. In Europe, such differences in the pattern are fairly inconspicuous. Daily concentrations may differ significantly in the following seasons and in different habitats. Such differences may appear in the time of maximum occurrence. 
In conclusion, humidity, air temperature, wind speed and hour, in combination, were the most consistent and accurate predictors of high Ganoderma spore concentrations in Rzeszów. Our study shows that the relationship between hourly Ganoderma spore concentrations and meteorological variables differs in strength and direction between places. In some cases those relationships are very weak and other factors probably influence spore dynamics, as in the case of Szczecin. In such cases, predictive models like those proposed in our study cannot be created. For each locality, a preliminary analysis should be performed in order to assess the potential performance of predictive models based on meteorological factors. Subsequently, the models can be used to predict hourly Ganoderma spore content on the basis of the weather forecast for the following day.

Weather conditions affect the sporulation, dispersal and deposition of spores, and the individual weather elements correlate with each other. Aeromycological investigations show that external parameters have a strong effect on the growth and development of colonies, the sporulation rhythm and the method of spore release, but we should also take into account the impact of endogenous factors, such as the ecological dependences of the parasite or the accessibility of substrates on which it can grow. Therefore, the statistical methods best suited to analysing multiple environmental parameters must be selected, and the parameters must be considered collectively. These methods should also tolerate data gaps and rapid changes in the concentration of spores in the air. In addition to statistical models, long-term international and interdisciplinary research is needed to fully understand the complex relationship between the life cycle of fungi and environmental parameters.
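A preliminary check of how much each meteorological variable contributes to a model can be sketched as a sensitivity ratio: the model error when one input is replaced by its mean, divided by the baseline error. A ratio above 1 suggests that the variable carries useful information. This is one common definition and is assumed here for illustration; it is not taken from the authors' software. The linear `toy_model`, its coefficients and the six data rows are hypothetical stand-ins for a trained network and real observations.

```python
from statistics import mean


def rmse(model, rows, targets):
    """Root-mean-square error of the model predictions against the targets."""
    squared = [(model(row) - t) ** 2 for row, t in zip(rows, targets)]
    return mean(squared) ** 0.5


def sensitivity_ratios(model, rows, targets):
    """Error with one variable fixed at its mean, divided by the baseline error."""
    baseline = rmse(model, rows, targets)
    ratios = {}
    for name in rows[0]:
        avg = mean(row[name] for row in rows)
        # Copy each row, overwriting only the variable under test.
        blinded = [{**row, name: avg} for row in rows]
        ratios[name] = rmse(model, blinded, targets) / baseline
    return ratios


def toy_model(row):
    """Hypothetical stand-in for a trained model; it ignores wind speed."""
    return 1.0 * row["humidity"] - 0.5 * row["temperature"] - 30.0


# Hypothetical hourly records and observed spore counts (illustration only).
rows = [
    {"humidity": 90, "temperature": 12, "wind_speed": 2.0},
    {"humidity": 80, "temperature": 16, "wind_speed": 3.5},
    {"humidity": 70, "temperature": 20, "wind_speed": 1.0},
    {"humidity": 60, "temperature": 24, "wind_speed": 4.0},
    {"humidity": 50, "temperature": 27, "wind_speed": 2.5},
    {"humidity": 85, "temperature": 14, "wind_speed": 3.0},
]
observed = [56, 40, 31, 17, 8, 47]

ratios = sensitivity_ratios(toy_model, rows, observed)
ranking = sorted(ratios, key=ratios.get, reverse=True)  # most influential first
```

In this toy setting humidity ranks first, temperature second, and wind speed has a ratio of exactly 1 because the stand-in model never uses it. For a locality where all ratios stay close to 1, a model based on meteorological factors alone would be unlikely to perform well.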


References

Bishop C. Neural networks for pattern recognition. Oxford: University Press; 1995.
Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification and regression trees. Belmont, California, USA: Wadsworth International Group; 1984.
Burch M, Levetin E. Effects of meteorological conditions on spore plumes. Int J Biometeorol 2002;46:107–17.
Bush RK, Portnoy JM. The role and abatement of fungal allergens in allergic diseases. J Allergy Clin Immunol 2001;107:430–40.
Calderon C, Lacey J, McCartney HA, Rosas I. Seasonal and diurnal variation of airborne basidiomycete spore concentrations in Mexico City. Grana 1995;34:260–8.
Carling A. Introducing neural networks. Wilmslow, UK: Sigma Press; 1992.
Craig RL, Levetin E. Multi-year study of Ganoderma aerobiology. Aerobiologia 2000;16:75–81.
Cutten AEC, Hasnain SM, Bai TR, Mckay EJ. The basidiomycetes Ganoderma and asthma: collection, quantitation and immunogenicity of the spores. N Z Med J 1988;101:361–3.
De'ath G. Multivariate regression trees: a new technique for modelling species–environment relationships. Ecology 2002;83:1105–17.
De'ath G, Fabricius KE. Classification and regression trees: a powerful and simple technique for ecological data analysis. Ecology 2000;81:3178–92.
Donk MA. New and revised nomina generica conservanda proposed for Basidiomycetes (Fungi). Bull Bot Gard Buitenzorg III 1948;25:98–9.
Fausett L. Fundamentals of neural networks. New York, USA: Prentice Hall; 1994.
Gregory PH. The air spora near the earth's surface. The microbiology of the atmosphere. Plymouth, England: Leonard Hill; 1973. p. 146–9.
Grinn-Gofroń A. The variation in spore concentrations of selected fungal taxa associated with weather conditions in Szczecin, Poland. Grana 2008;47:139–46.
Haard RT, Kramer CL. Periodicity of spore discharge in the Hymenomycetes. Mycologia 1970;62:1145–69.
Halwagy MH. Fungal airspora of Kuwait City, Kuwait. Grana 1994;33:340–5.
Hasnain SM. Influence of meteorological factors on the airspora. Grana 1993;33:340–5.
Hasnain SM, Newhook FJ, Wilson JD, Corbin JB. First report of Ganoderma allergenicity in Auckland, New Zealand. N Z J Sci 1984;27:261–7.
Hasnain SM, Wilson JD, Newhook FJ. Allergy to basidiospores: immunologic studies. N Z Med J 1985;98:393–6.
Hasnain SM, Al-Frayh A, Khatija F, Al-Sedairy S. Airborne Ganoderma basidiospores in a country with desert environment. Grana 2004;43:111–5.
Haykin S. Neural networks: a comprehensive foundation. New York, USA: Macmillan Publishing; 1994.
Helbling A, Brander KA, Horner WE, Lehrer SB. Allergy to basidiomycetes. In: Breitenbach M, Crameri R, Lehrer SB, editors. Fungal allergy and pathogenicity. Chem Immunol. Basel: Karger; 2002. p. 28–47.
Hirst JM. An automatic volumetric spore trap. Ann Appl Biol 1952;39:257–65.
Hirst JM. Changes in atmospheric spore content: diurnal periodicity and the effects of weather. Trans Br Mycol Soc 1953;36:375–93.
Ho H-M, Rao CY, Hsu H-H, Chiu Y-H, Liu Ch-M, Chao HJ. Characteristics and determinants of ambient fungal spores in Hualien, Taiwan. Atmos Environ 2005;39:5839–50.
Ingold CT. Fungal spores: their liberation and dispersal. London: Oxford University Press; 1971.
Kożuchowski K, Degirmendžić J. Contemporary changes of climate in Poland: trends and variation in thermal and solar conditions related to plant and vegetation. Pol J Ecol 2005;53(3):283–7.
Lehrer SB, Lopez M, Butcher BT, Olsen J, Reed M, Salvaggio JE. Basidiomycete mycelia and spore allergen extracts: skin reactivity in adults with symptoms of respiratory allergy. J Allergy Clin Immunol 1986;78:478–85.
Lehrer SB, Hughes JM, Altman LC, Bousquet J, Davies RJ, Gell L, et al. Prevalence of basidiomycete allergy in the USA and Europe and its relationship to allergic respiratory symptoms. Allergy 1994;49:460–5.
Lek S, Guegan JF. Artificial neural networks as a tool in ecological modelling, an introduction. Ecol Modell 1999;120:65–73.
Levetin E. Studies on airborne basidiospores. Aerobiologia 1990;6:177–80.
Levetin E. Identification and concentration of airborne basidiospores. Grana 1991;30:123–8.
Li D, Kendrick B. Functional relationships between airborne fungal spores and environmental factors in Kitchener–Waterloo, Ontario, as detected by canonical correspondence analysis. Grana 1994;33:166–76.
Li D, Kendrick B. A year-round outdoor aeromycological study in Waterloo, Ontario, Canada. Grana 1995;34:199–207.
Lopez M, Salvaggio JE. Climate–weather–air pollution. In: Middleton E, Reed CE, Ellis EF, editors. Allergy: principle and practice. 2nd ed. St. Louis: CV Mosby; 1983. p. 1203–14.
Lula P. Selected applications of artificial neural networks using STATISTICA Neural Networks. Kraków, Poland: StatSoft Polska; 2000.
Mitakakis TZ, Guest DI. A fungal spore calendar for the atmosphere of Melbourne, Australia, for the year 1993. Aerobiologia 2001;17:171–6.
Oliveira M, Ribeiro H, Delgado JL, Abreu I. The effects of meteorological factors on airborne fungal spore concentration in two areas differing in urbanization level. Int J Biometeorol 2009;53:61–73.
Osowski S. Algorithmic approach to artificial neural networks. Warszawa, Poland: WNT; 1996.
Pegler DN, Young TWK. Basidiospore form in the British species of Ganoderma Karst. Kew Bull 1973;28(3):351–64.
Quintero E, Rivera-Mariani F, Bolaños-Rosero B. Analysis of environmental factors and their effects on fungal spores in the atmosphere of a tropical urban area (San Juan, Puerto Rico). Aerobiologia 2010;26:113–24.
StatSoft, Inc. STATISTICA (data analysis software system), version 9.0. www.statsoft.com; 2009.
Stępalska D, Wołek J. Variation in fungal spore concentrations of selected taxa associated to weather conditions in Cracow, Poland, in 1997. Aerobiologia 2005;21:43–52.
Tadeusiewicz R. Neural networks. Warszawa, Poland: Akademicka Oficyna Wydawnicza; 1993.
Tadeusiewicz R. Introduction to neural networks. Kraków, Poland: Statsoft Polska; 2001.
The R Foundation for Statistical Computing. R version 2.10.1.