Site-specific early season potato yield forecast by neural network in Eastern Canada


Precision Agric (2011) 12:905–923 DOI 10.1007/s11119-011-9233-6

Jérôme G. Fortin • François Anctil • Léon-Étienne Parent • Martin A. Bolinder



Published online: 22 May 2011 © Springer Science+Business Media, LLC 2011

Abstract

Deterministic potato (Solanum tuberosum L.) growth models rarely rely on seasonal field variables that directly characterize the spatial variation of plant growth. For example, the SUBSTOR model computes the leaf area index (LAI) as an auxiliary variable from meteorological conditions and soil properties. Empirical models may account for seasonal LAI functions and accurately predict potato yield. The objective was to evaluate multiple linear regression (MLR) and neural networks (NN) as predictive models of potato yield. Using data from several replicated on-farm experiments conducted over 3 years, the models were evaluated for their capacity to forecast tuber yields 9, 10 and 11 weeks before harvest compared to SUBSTOR. A 3-input NN using LAI functions and cumulative rainfall yielded the most accurate estimations and forecasts of tuber yields. This NN showed that tuber yield of contrasting zones was mostly a function of meteorological conditions prevailing during the first 5–8 weeks after planting. Subsequent development of tubers was essentially controlled by biomass allocation to tubers. The NN models were more coherent than MLR and SUBSTOR for two reasons: (1) the use of seasonal LAI directly as input rather than computed as an auxiliary variable and (2) the non-linearity of the modeling process, resulting in more accurate estimation of the temporal discontinuities of potato tuber growth. This model showed potential for application in precision agriculture by accounting for temporal and spatial real-time climatic and crop data.

Keywords Crop growth model · Neural network · SUBSTOR · Genetic algorithm · Leaf area index

J. G. Fortin (✉) · F. Anctil
Department of Civil and Water Engineering, Université Laval, Québec, Canada
e-mail: [email protected]

Léon-Étienne Parent · M. A. Bolinder
Department of Soils and Agrifood Engineering, Université Laval, Québec, Canada

M. A. Bolinder
Department of Soils and Environment, Swedish University of Agricultural Sciences, Uppsala, Sweden


Introduction

For potato modeling, the driving environmental variables are temperature (Prange et al. 1990), photoperiod (Wheeler and Tibbetts 1986), intercepted radiation (MacKerron and Waister 1985) and precipitation (Stark and Wright 1985). Management factors can modify potato response to these variables, especially nitrogen (N) fertilization (Bélanger et al. 2001) and soil water management (Trebejo and Midmore 1990). At N application rates far in excess of those that contribute to higher tuber yields, haulm growth may be markedly stimulated and a lower percentage of total plant biomass transferred to tubers early in the season (Vos and Biemond 1992). Excessive N application early in the season thus increases the risk of nitrate leaching, particularly in coarse-textured soils, and can delay tuber initiation (Joern and Vitosh 1995). It is therefore recommended to progressively increase the N supply during the growing season using split N applications (Errebhi et al. 1998), assisted by a growth model.

The SUBSTOR model, an acronym for Simulate Underground Bulking Storage Organs (Singh et al. 1998), was devised to simulate seasonal carbon assimilation by root and tuber crops. SUBSTOR accounts for the multiple interactions in the atmosphere–soil–plant system at both regional (e.g. Supit et al. 1994) and field scales (e.g. Paz et al. 2003). Site-specific simulations with this family of models rely on the ability to detect differences in factors affecting plant growth pattern, such as soil N level and water availability. Potato growth patterns can be directly assessed by measuring the leaf area index (LAI), a measure of plant growth. However, models such as SUBSTOR do not use the LAI as an input variable but rather compute it as an auxiliary variable from climatic and soil data.
In deterministic models such as SUBSTOR, where soil attributes, cultivars and management practices are fixed input data, crop growth and yield forecasts must rely on dynamic climatic variables such as temperature and rainfall. Consequently, this type of model must rely on some constructed time-series of daily climatic data to fill the gap between the time of the simulation and harvest time. This can be problematic since this gap is usually longer (i.e. 5–10 weeks) than the common meteorological forecasts. Empirical models forecasting tuber yield from a minimum dataset and directly taking into account growth patterns (i.e. LAI) are alternative solutions (Fortin et al. 2010). The objective of this study was to compare the capacity of SUBSTOR and empirical models to forecast in-field variations in tuber yield using LAI measurements and meteorological observations as inputs. It was hypothesized that tuber development was mostly determined by conditions prevailing during the first weeks after planting, and that the LAI is representative of spatial variation of tuber growth.

Materials and methods

Data were collected from 2005 to 2007 on experimental fields at Saint-Ubalde de Portneuf, Quebec, Canada (46°45′23″ N, 72°19′57″ W). Potatoes were planted at a density of 31 000 plants ha⁻¹ and band-fertilized at planting at a rate of 160 kg N ha⁻¹. The P, K and Mg were applied together with N according to local recommendations (CRAAQ 2003). Seasonal tuber growth of three cultivars (Goldrush, Chieftain, and Eramosa) was measured to parameterize the models. Table 1 presents the planting and harvesting dates for each site of the study. A total of 15 sampling sites of 5 m by 5 m were randomly located in a single field of about 150 by 150 m which, historically, had shown in-field variation in tuber growth. Each year, marketable size tubers from 9 plants per plot were


Table 1 Planting and harvesting dates and partitioning of training and testing datasets between years and cultivars

Site  Year  Cultivar   Planting date  Harvest date
1     2005  Chieftain  25-May         12-Sept
2     2006  Chieftain  22-May         27-Sept
3     2006  Goldrush   25-May         27-Sept
4     2007  Chieftain  20-May         19-Sept
5     2007  Chieftain  20-May         19-Sept
6     2007  Chieftain  20-May         19-Sept
7     2007  Chieftain  20-May         19-Sept
8     2007  Chieftain  20-May         19-Sept
9     2007  Chieftain  20-May         19-Sept
10    2005  Goldrush   14-May         12-Sept
11    2006  Eramosa    19-May         13-Sept
12    2007  Chieftain  20-May         19-Sept
13    2007  Chieftain  20-May         19-Sept
14    2007  Chieftain  20-May         19-Sept
15    2007  Chieftain  20-May         19-Sept

Row-group labels: Training dataset; Testing dataset

collected, counted and weighed on 8–14 occasions during the season. Samples were collected every week during growth and every 2 weeks during senescence. Data were linearly interpolated to obtain daily time series. This methodology is justified by the observed linearity of potato tuber growth between tuber initiation and harvest date. The total number of observed and interpolated data for the 3 cultivars and the 15 sites over the 3 years was 1 834. We sampled 22 plants per plot at harvest. Fresh and dry (48 h at 60°C) tuber weights were determined.

Soils were randomly sampled in three replicates within each 5 m × 5 m site and were analyzed at the beginning of the season down to 400 mm (two layers of 200 mm each) to parameterize the soil inputs of the SUBSTOR model: pH, organic matter (LECO CNS2000, Leco Corporation, St. Joseph, MI, USA), P, K, Ca, Mg, Cu, Zn, Mn, Fe and extractable (Mehlich 1984). SUBSTOR also required nitrate (NO3) concentration as extracted by CaCl2 (0.01 M) (Dou et al. 2000) and analysed by ion chromatography (Dionex 4000i, Dionex Corporation, Sunnyvale, CA, USA). Inter-annual and site-specific variability of the soil parameters is presented in Table 2.

Daily rainfall and maximum and minimum air temperature time series were obtained from an on-site meteorological station. Since solar radiation was not available on site or at any nearby meteorological station, time series for solar radiation were obtained from a NN model (Fortin et al. 2008) (Table 3).

Leaf area index (LAI) of young fully grown fresh leaves was measured every week during the growing season and every 2 weeks during senescence using a LI-COR LAI2000 Plant Canopy Analyzer (LI-COR Inc., Lincoln, Nebraska, USA). The LAI was averaged across 12 plants within 5 m by 5 m permanent quadrats in each site. The LAI2000 estimated LAI based on the amount of canopy radiation transmittance measured


Table 2 Variability of the soil conditions and maximum LAI among the experimental plots

Site    pH   %C    Residual NO3 (ppm)  LAI max  Type
1       5.4  4.2   50.9                4.3      Loamy sand
2       5.0  3.0   NA                  6.1      Sandy loam
3       5.0  3.5   NA                  7.69     Sandy loam
4       4.7  1.8   170.0               4.6      Sandy loam
5       5.0  2.0   316.5               5.4      Sandy loam
6       5.4  3.3   638.5               7.0      Loamy sand
7       5.3  2.4   416.8               5.6      Sandy loam
8       5.4  1.3   165.5               4.4      Sandy loam
9       5.1  0.4   124.2               4.5      Sand
10      5.7  2.6   47.5                5.5      Sandy loam
11      4.7  2.5   NA                  5.3      Sandy loam
12      5.1  1.2   167.2               5.9      Sand
13      5.4  1.1   325.8               5.2      Sand
14      5.1  0.4   128.1               5.2      Sandy loam
15      5.3  2.5   587.2               6.6      Sandy loam
CV (%)  5.3  52.2  78.8                17.6

Table 3 Mean daily values of meteorological parameters from planting to harvest (mid-May to mid-September)

Year  Estimated solar radiation (MJ m⁻²)  Maximum temperature (°C)  Minimum temperature (°C)  Rain (mm day⁻¹)
2005  20.52                               23.99                     10.59                     3.37
2006  20.18                               22.68                     10.46                     4.17
2007  21.75                               24.67                     9.81                      3.51

across a hemispherical field by five concentrically nested sensors centered on 7°, 23°, 38°, 53° and 68° (Welles and Norman 1991). The LAI was computed as follows:

\[ \mathrm{LAI} = 2 \int_{0}^{\pi/2} -\ln\bigl(T(\Theta)\bigr)\,\cos\Theta\,\sin\Theta\,d\Theta \qquad (1) \]

where T is canopy transmittance measured across zenith angle Θ by the LI-COR sensors. The LAI is generally 3.0 at ground cover, where approximately 85% of the available radiation is intercepted. Table 4 presents average LAI and tuber yield values for each year across

Table 4 Mean maximum LAI values and tuber yields across years and cultivars

Year  Mean maximum LAI  Mean tuber yield (kg DM ha⁻¹)
2005  4.9               5317
2006  6.5               8080
2007  5.4               8031


cultivars. As expected, a smaller LAI value in 2005 compared to 2006 and 2007 corresponded to smaller tuber yields in 2005. Table 2 also presents the variability of the seasonal maximum LAI observed among the experimental plots.

SUBSTOR

The SUBSTOR simulations were run using the DSSAT v4.0.2.0 software (Decision Support System for Agrotechnology Transfer, www.icasa.net/dssat). This version does not allow the user to modify the structure of the model, but it allows the calibration of "genetic coefficients" that are cultivar-specific. A comprehensive schematic representation of the SUBSTOR model is shown in Fig. 1. SUBSTOR uses as inputs: (i) meteorological data, (ii) soil characteristics and (iii) management data. Temperature and photoperiod response functions containing cultivar-specific parameters estimate the time of tuber initiation. The state variables that define the daily status of the crop system are dry masses of plant organs (leaves, stems, tubers and roots) per unit of soil area. The state variables represent the main part of the model output, while some important auxiliary variables such as LAI, growth stages and N balances are also available as outputs. It was not possible to directly adjust the simulated LAI with seasonal field values in the DSSAT v4.0.2.0 software.

Yield forecast with SUBSTOR was performed using a 20-year meteorological chronicle from Environment Canada. For each site, twenty simulations were conducted for each time horizon (9, 10 and 11 weeks) and averaged to obtain a single value. A total of 900 simulations were conducted. Each of them required building a different meteorological data file.
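The forecasting protocol described above (observed weather up to the forecast date, completed by each year of a historical chronicle, with the simulated yields averaged) can be sketched as follows. This is only an illustration under stated assumptions: `toy_model` is a hypothetical placeholder for a crop-model run and is not a DSSAT or SUBSTOR interface, and the weather record is reduced to a single daily variable.

```python
from statistics import mean
from typing import Callable, List, Sequence


def splice_weather(observed_to_date: Sequence[float],
                   historical_season: Sequence[float]) -> List[float]:
    """Observed daily values up to the forecast date, then a historical year."""
    n_obs = len(observed_to_date)
    if n_obs > len(historical_season):
        raise ValueError("historical season is shorter than the observed record")
    return list(observed_to_date) + list(historical_season[n_obs:])


def ensemble_forecast(observed_to_date: Sequence[float],
                      historical_seasons: Sequence[Sequence[float]],
                      run_crop_model: Callable[[List[float]], float]) -> float:
    """Average the simulated yield over one model run per historical year."""
    yields = [run_crop_model(splice_weather(observed_to_date, season))
              for season in historical_seasons]
    return mean(yields)


def toy_model(daily_rain: List[float]) -> float:
    """Hypothetical placeholder: yield proportional to seasonal rainfall."""
    return 10.0 * sum(daily_rain)


# 15 sites x 20 historical years x 3 forecast horizons, as stated in the text.
n_runs = 15 * 20 * 3

# Two observed days followed by two alternative historical endings (toy data).
forecast = ensemble_forecast([1.0, 2.0],
                             [[0.0, 0.0, 3.0, 3.0], [0.0, 0.0, 5.0, 5.0]],
                             toy_model)
```

Each call to `splice_weather` corresponds to one of the separate meteorological files mentioned in the text, which is why the number of runs grows as sites × years × horizons.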

Fig. 1 The general scheme of the SUBSTOR model


Multiple linear regression

Multiple linear regression (MLR) relates p explanatory variables X to a response variable Y by fitting a linear equation to the data as follows:

\[ Y = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_p X_p \qquad (2) \]

where the Xs are the explanatory variables and the bs are regression coefficients.

Neural networks

Due to the non-linearity of most processes in biological systems, neural networks (NN) were implemented for fast and efficient modelling. The NN algorithms mimic the four basic functions of biological neurons: (1) receive inputs from other neurons or sources, (2) combine them, (3) perform operations on results, and (4) provide a final output (Klerfors 1998). A NN is characterized by its architecture, training algorithm and activation function. The architecture consists of an input layer, an output layer, and generally one or more intermediate hidden layers. A robust NN model relies on the proper selection of inputs and of representative training and testing datasets. The problem of generalization (Anctil and Lauzon 2004) must be addressed to build a model that can reliably infer the behaviour of the system under study, not only for conditions represented in the training and validating datasets, but also for conditions absent from the datasets yet inherent to the system. In this paper, NN analyses and computations were conducted using functions embedded into the MATLAB NN toolbox.

This study resorts to multilayer perceptron networks (MLPs) because of their known ability to approximate any function with a finite number of discontinuities (Hornik et al. 1989). The MLPs used here are made of a single layer of hidden neurons and a single output neuron. The MLP function Y must be optimized across the following linear combination of multivariate functions:

\[ Y(x, \omega, b) = G_2\!\left[ \sum_{j} \omega_j \, G_1\!\left( \sum_{i} \omega_{ij} x_i + b_j \right) + b \right] \qquad (3) \]

where x is the input vector with elements indexed by i, j indexes the hidden neurons, ω are neural weights and b are neural biases. The hyperbolic tangent sigmoid activation function G1 and the linear activation function G2, following many others such as Yonaba et al. (2010), are computed as follows:

\[ G_1(n) = \frac{2}{1 + \exp(-2n)} - 1 \qquad (4) \]

\[ G_2(n) = n \qquad (5) \]

where n is the weighted sum of information from the previous layer of neurons. The Levenberg–Marquardt back-propagation optimization algorithm (Coulibaly et al. 2000), coupled with the Bayesian regularization used to train MLPs, modifies the usual cost function Fe (the sum of squared errors) by considering an additional term, namely the sum of squared neural weights Fω:

\[ F = \alpha F_e + \gamma F_\omega \qquad (6) \]

where α and γ are objective function parameters automatically set at their optimum values by the Bayesian regularization proposed by MacKay (1992). Bayesian regularization reduces


variance errors because the minimization constrains the weights to small values. Hence, large fluctuations in network response due to inputs of large magnitude are less likely to occur, and generalization will improve.

Model calibration

For model calibration, the dataset was divided into training and testing subsets made of 1 103 and 731 observations, respectively (Table 1). Eramosa (an early season cultivar) was planted at only one site in 2006, and was therefore combined with Chieftain (a mid-season cultivar) for the optimization process. Because the 5 SUBSTOR genetic coefficients have to be simultaneously optimized within a non-differentiable function, a genetic algorithm (GA) was used. The basic idea of evolutionary algorithms is to mimic biological evolution. The algorithm starts with an initial population of 10 elements randomly chosen in the domain of genetic coefficients. The elements are called individuals or chromosomes. A probabilistic selection is performed based upon results of the evaluation of the cost function for each element. After selection, the crossover operator (an operator ensuring evolution of the population) is applied with a probability of 0.6, and mutation is applied at a rate of 0.03. A stop criterion of ten generations, which caps the number of evaluations of the cost function, was used to avoid over-optimization. A second optimization with 20 generations resulted in a slight reduction in performance, suggesting that a stop criterion of ten generations was appropriate.

MLR and MLP model performance is determined by the selection of input variables and, for the MLP, of an optimal number of neurons in the hidden layer. The most important environmental variables influencing tuber development and growth are temperature, photoperiod or day length, intercepted radiation, and precipitation.
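The genetic-algorithm settings reported above for the SUBSTOR genetic coefficients (population of 10, crossover probability 0.6, mutation rate 0.03, ten generations) can be illustrated with the minimal sketch below. Several choices are assumptions made for illustration and are not taken from the paper: the fitness-proportional selection weights, the single-point crossover, the uniform resampling used for mutation, and the toy cost function, which stands in for a SUBSTOR run compared against observed yields.

```python
import random
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def genetic_search(cost: Callable[[List[float]], float], bounds: Bounds,
                   pop_size: int = 10, generations: int = 10,
                   p_cross: float = 0.6, p_mut: float = 0.03,
                   seed: int = 0) -> Tuple[List[float], float]:
    """Minimize a non-negative cost over box-bounded coefficients."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    best = min(population, key=cost)
    for _ in range(generations):
        # Probabilistic selection: lower cost gives a higher selection weight.
        weights = [1.0 / (1.0 + cost(ind)) for ind in population]
        children: List[List[float]] = []
        while len(children) < pop_size:
            p1, p2 = rng.choices(population, weights=weights, k=2)
            c1, c2 = p1[:], p2[:]
            if len(bounds) > 1 and rng.random() < p_cross:
                cut = rng.randrange(1, len(bounds))  # single-point crossover
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (c1, c2):
                for g, (lo, hi) in enumerate(bounds):
                    if rng.random() < p_mut:  # mutation: resample one gene
                        child[g] = rng.uniform(lo, hi)
                children.append(child)
        population = children[:pop_size]
        best = min(population + [best], key=cost)  # remember the best so far
    return best, cost(best)


def toy_cost(coefficients: List[float]) -> float:
    """Hypothetical cost: squared distance to an arbitrary target vector."""
    return sum((c - 0.5) ** 2 for c in coefficients)


five_bounds = [(0.0, 1.0)] * 5  # five coefficients, illustrative bounds
best_coefficients, best_cost = genetic_search(toy_cost, five_bounds)
```

With these settings the cost function is evaluated only a few hundred times, which matters when each evaluation is a full crop-model simulation.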
Hence, besides cumulative and maximum LAI as direct measurements of plant growth, candidates for the input vector were cumulative temperature, cumulative estimated solar radiation and cumulative rainfall. In a preliminary analysis, the choice of the variables appeared more significant than the number of neurons in the hidden layer. The most adequate combination of variables was thus searched using three hidden neurons, and the optimum number of neurons in the hidden layer was determined afterwards. All variables were tested individually. The one that yielded the highest model performance was selected. The selected variable was combined with each of the remaining variables individually, and the pair that produced the best modelling performance was retained and combined with each of the remaining variables individually. This iterative process continued across variables until supplemental inputs decreased model performance. Since back-propagation can converge to local minima, calibrations were repeated 20 times, each starting from a different random initialization of neural weights. The cost function was fully explored by choosing a model among the best 20% of the distribution of all possible models at the 99% confidence level (Iyer and Rhinehart 1999). MLR and MLP models were trained for both estimation and forecast. In the first case, models were trained using LAI functions from planting to harvest. The use of such models with data collected only until estimation time (smaller values) could lead to a general under-estimation of tuber yield. MLR and MLP models were thus re-trained using data retrieved from planting to estimation time, leading to a different model for each time horizon tested. For example, if the harvest date was September 21 and the date for the forecast of later yield was July 13, the target associated with the input vector for training was the yield at harvest. The target associated with the input vector values on July 12 was yield on September 20, and so on.
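The stepwise input search described above can be sketched as a greedy forward selection. In this illustration the scoring function is a lookup of the testing-set MAE-based skill scores reported for the MLP in Table 5; every combination that the table does not report is given a placeholder score of 0.0, which is an assumption made only so that the sketch runs. In practice the score would come from training and evaluating a model for each candidate input set, and the variable names are hypothetical labels.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple


def forward_selection(candidates: Sequence[str],
                      score: Callable[[List[str]], float]
                      ) -> Tuple[List[str], float]:
    """Add one input at a time; stop when no addition improves the score."""
    selected: List[str] = []
    best_score = float("-inf")
    remaining = list(candidates)
    while remaining:
        trial = {var: score(selected + [var]) for var in remaining}
        best_var = max(trial, key=trial.get)
        if trial[best_var] <= best_score:
            break  # a supplemental input no longer improves performance
        selected.append(best_var)
        best_score = trial[best_var]
        remaining.remove(best_var)
    return selected, best_score


# Reported MLP skill scores (SS_MAE, testing) for the nested input sets.
REPORTED: Dict[FrozenSet[str], float] = {
    frozenset({"cum_LAI"}): 0.77,
    frozenset({"cum_LAI", "LAI_max"}): 0.81,
    frozenset({"cum_LAI", "LAI_max", "cum_rain"}): 0.88,
    frozenset({"cum_LAI", "LAI_max", "cum_rain", "cum_SRAD"}): 0.82,
}


def table_score(inputs: List[str]) -> float:
    """Placeholder scorer: unreported combinations score 0.0 (assumption)."""
    return REPORTED.get(frozenset(inputs), 0.0)


selected, best = forward_selection(
    ["cum_LAI", "LAI_max", "cum_temp", "cum_SRAD", "cum_rain"], table_score)
```

Under these placeholder scores the search stops at three inputs, because adding cumulative solar radiation lowers the testing score from 0.88 to 0.82.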


Evaluation of model performance

Model performance was evaluated using the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE). The MAE is a linear scoring rule that describes the average magnitude of errors without considering their direction (it is a linear score because all errors are equally weighted):

\[ \mathrm{MAE} = \frac{1}{n} \sum_{k=1}^{n} \left| Y_k - \hat{Y}_k \right| \qquad (7) \]

where Y is observed tuber yield, Ŷ is simulated tuber yield, n is the number of observations, and k is time step. The RMSE is very similar to MAE except for the weighting of errors. It expresses the average magnitude of the errors as follows:

\[ \mathrm{RMSE} = \sqrt{ \frac{1}{n} \sum_{k=1}^{n} \left( Y_k - \hat{Y}_k \right)^2 } \qquad (8) \]

Since errors are squared before averaging, more weight is given to large errors. Scores of the MAE and the RMSE range from 0 to ∞ (lower values are preferred), and errors are in the same unit and scale as observations. By definition, RMSE is larger than or equal to MAE. A large difference between MAE and RMSE indicates a large variation in the error time series. The RMSE is more widely used than MAE, probably because it is most useful when large errors are undesirable. Their scale dependency can be overcome using a skill score, a simple score standardization method that compares the performance of the simulation with the performance of a reference simulation. A RMSE-based skill score is computed as follows (Nash and Sutcliffe 1970):

\[ SS_{\mathrm{RMSE}} = 1 - \frac{\mathrm{RMSE}}{\sqrt{ \frac{1}{n} \sum_{k=1}^{n} \left( Y_k - \bar{Y} \right)^2 }} \qquad (9) \]

The MAE-based skill score is computed as follows:

\[ SS_{\mathrm{MAE}} = 1 - \frac{\mathrm{MAE}}{ \frac{1}{n} \sum_{k=1}^{n} \left| Y_k - \bar{Y} \right| } \qquad (10) \]

where Ȳ is the mean of observed tuber yield. Skill scores range between −∞ and 1. A SS of 1 represents a perfect fit between forecast and observed values, while a SS of 0 indicates a "no-knowledge" model, namely the mean of observations at every time step. Since scoring rules are averaged across the dataset, scatter plots were drawn to visually assess the agreement between simulated and observed tuber yields.

The NS criterion was used to compare model performance and combinations of model inputs. The NS is the proportion of initial variance unaccounted for by a reference model (here the SUBSTOR model) defined as follows (Nash and Sutcliffe 1970):

\[ NS = \frac{SS - SS_{\mathrm{ref}}}{1 - SS_{\mathrm{ref}}} \qquad (11) \]

where SSref is the SUBSTOR skill score value. Negative values for NS indicate that the alternative model has negative effects on the performance. Senbeta et al. (1999) suggested


that NS values larger than 10% indicate the significance of alternative models in the performance of the simulation. It is of course expected that a model driven by any cumulative functions (e.g. a cumulative function of random numbers) may approximately determine a cumulative process such as tuber growth, but without any capacity of generalization. It is one of the reasons why the coefficient of determination (R2) was not used during this study (Legates and McCabe 1999). Even with a high R2 for a specific year and site, a model driven by cumulative functions of random numbers would inevitably fail when simulating the interannual and spatial variation of the increasing rate of any cumulative process. The interest in cumulative functions resides in the inter-annual and spatial variations of their slopes (the area under the curves). Because the statistics in this study were computed from daily values of a cumulative process (tuber growth) that is by nature auto-correlated, it is important to note that the performance might be over-estimated for all models. On the other hand, graphical evaluation of the model, such as the scatter plot of the observed and simulated growth, will still be appropriate and should be given a larger weight.
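The scoring rules of Eqs. 7–11 can be written compactly as below. This is a minimal sketch with toy observation and simulation vectors chosen for illustration; only the two NS checks at the end use values from the paper (the testing-set MAE-based skill scores of Table 5, with SUBSTOR at 0.73 as the reference).

```python
import math
from typing import Sequence


def mae(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Mean absolute error (Eq. 7)."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Root mean square error (Eq. 8)."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def skill_rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """RMSE-based skill score against the mean of observations (Eq. 9)."""
    reference = [sum(obs) / len(obs)] * len(obs)
    return 1.0 - rmse(obs, sim) / rmse(obs, reference)


def skill_mae(obs: Sequence[float], sim: Sequence[float]) -> float:
    """MAE-based skill score against the mean of observations (Eq. 10)."""
    reference = [sum(obs) / len(obs)] * len(obs)
    return 1.0 - mae(obs, sim) / mae(obs, reference)


def ns_criterion(ss: float, ss_ref: float) -> float:
    """Gain in skill relative to a reference model (Eq. 11)."""
    return (ss - ss_ref) / (1.0 - ss_ref)


# Toy series: three perfect estimates and one large error.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.0, 2.0, 3.0, 8.0]
toy_mae = mae(observed, simulated)
toy_rmse = rmse(observed, simulated)

# Table 5, testing: 3-input MLP (0.88) and 3-input MLR (0.61) vs SUBSTOR (0.73).
ns_mlp = round(100 * ns_criterion(0.88, 0.73), 1)
ns_mlr = round(100 * ns_criterion(0.61, 0.73), 1)
```

The toy series shows why RMSE is never smaller than MAE: the single large error is squared before averaging, so it dominates the RMSE while counting only linearly in the MAE.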

Results

Comparison between models for yield estimation

Empirical models developed using data covering the complete cropping season from planting to harvest are compared to SUBSTOR in Table 5. For simplification, the term "yield" was used to qualify the daily tuber growth value in dry weight over the entire season, rather than just the final yield at harvest. Considering the validation with the optimized SUBSTOR model first, the MAE-based and RMSE-based skill score values were 0.73 and 0.63, respectively. Hence, SUBSTOR performed better than a "no-knowledge" model that would always give as prediction the average value of the phenomenon. The validation scatter plot between the observed and simulated potato yield was poorly structured (Fig. 2). This follows from the poor ability of SUBSTOR to model the effect of in-field variation on tuber growth, i.e., all replicates for the same year and genotype led to the same forecast yield value. It is important to note that tuber growth in SUBSTOR is driven by LAI, which is an auxiliary variable computed from soil parameters and meteorological variables. Because the meteorological variables do not change for a given field, the capacity of SUBSTOR to model in-field spatial variation relies mostly on its capacity to model an accurate theoretical LAI from the in-field variation of soil parameters. In this study, these soil parameters did not vary significantly within a single field, which can explain the poor results obtained with SUBSTOR. Unless the computed LAI can be adjusted during the season with seasonal LAI values, it is difficult to model spatial variation of tuber growth at the field scale with SUBSTOR. For example, the simulations for the 11 replicates in 2007 where Chieftain was planted gave the same value of 6753 kg DM/ha. Therefore, the model simulated a mean tuber yield value across the whole field.
The MLR models including LAI and rainfall functions yielded best results using a 3-input MLR with cumulative LAI, maximum LAI and cumulative rainfall as follows:

\[ Y = 800 + 14.6 \left( \int_{a}^{b} \mathrm{LAI} \right) + 13.6 \left( \mathrm{LAI}_{\max} \right) + 14.6 \left( \sum_{a}^{b} \mathrm{RAIN} \right) \qquad (12) \]


Table 5 Statistics of model performance for yield simulation (MAE and RMSE in kg DM ha⁻¹)

Model and inputs                                                      MAE    SS_MAE  NS (%)     RMSE    SS_RMSE  NS (%)

Performance on training
SUBSTOR                                                               907.3  0.62    Reference  1508.2  0.46     Reference
MLR with different inputs
  Cumulative LAI                                                      905.2  0.61    0          1127.3  0.60     25.9
  Cumulative LAI + LAI max                                            903.1  0.60    2.6        1131.0  0.60     25.9
  Cumulative LAI + LAI max + cumulative rainfall                      891.2  0.60    2.6        1093.6  0.60     27.8
  Cumulative LAI + LAI max + cumulative rainfall + cumulative SRAD    880.6  0.60    2.6        1088.5  0.60     27.8
MLP with different inputs
  Cumulative LAI                                                      489.9  0.80    47.4       842.8   0.70     44.4
  Cumulative LAI + LAI max                                            390.6  0.81    57.9       538.5   0.80     64.8
  Cumulative LAI + LAI max + cumulative rainfall                      345.0  0.90    63.2       542.0   0.80     64.8
  Cumulative LAI + LAI max + cumulative rainfall + cumulative SRAD    175.8  0.91    81.6       248.9   1.0      98.1

Performance on testing
SUBSTOR                                                               613.2  0.73    Reference  957.1   0.63     Reference
MLR with different inputs
  Cumulative LAI                                                      941.0  0.58    -55.6      1179.2  0.54     -24.3
  Cumulative LAI + LAI max                                            935.2  0.58    -55.6      1154.1  0.55     -21.6
  Cumulative LAI + LAI max + cumulative rainfall                      866.4  0.61    -44.4      1057.4  0.59     -10.8
  Cumulative LAI + LAI max + cumulative rainfall + cumulative SRAD    848.5  0.62    -40.7      1049.3  0.59     -10.8
MLP with different inputs
  Cumulative LAI                                                      519.3  0.77    14.8       840.1   0.67     10.8
  Cumulative LAI + LAI max                                            406.2  0.81    29.6       653.0   0.77     37.8
  Cumulative LAI + LAI max + cumulative rainfall                      272.4  0.88    55.6       483.3   0.81     48.6
  Cumulative LAI + LAI max + cumulative rainfall + cumulative SRAD    394    0.82    33.3       593.2   0.77     37.8

where Y is simulated yield, a is planting date, and b is harvest time. The MAE was 866 kg DM/ha, and RMSE was 1057 kg DM/ha. Skill scores for MAE and RMSE were 0.61 and 0.59, respectively. Relative to SUBSTOR, the NS criterion was negative (−44.4 and −10.8%, respectively). Adding cumulative solar radiation only slightly improved the performance. Figure 3 shows the fit between observed and simulated data. Although estimation errors were larger than with SUBSTOR, the scatter plot appeared more coherent than that of Fig. 2. The MLR models


Fig. 2 Relationship between simulated and measured tuber dry weight (kg DM ha-1) for SUBSTOR using data from planting to harvest. Each line represents a site

overestimated tuber yield at the beginning of the growing season because a linear procedure cannot model the discontinuity of tuber growth before tuber initiation. The model started to compute tuber growth immediately after planting, a stage at which the plant could not yet produce tubers. On the other hand, a single-input MLP with cumulative LAI performed better than SUBSTOR, with MAE- and RMSE-based skill score values of 0.77 and 0.67, respectively


Fig. 3 Relationship between simulated and measured tuber dry weight (kg DM ha-1) for the 3-input MLR using data from planting to harvest. Each line represents a site

(Table 5). Compared to the reference model, combining cumulative LAI, maximum LAI, and cumulative rainfall improved the MAE-based skill score by 55.6% (NS criterion). Considering the RMSE, skill score improvement was as high as 48.6%. Figure 4 shows the fit between observed and simulated tuber yield for the 3-input MLP.
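For readers who want to see what a 3-input MLP of the form given in Eqs. 3–5 computes, the sketch below implements the forward pass with a hyperbolic tangent sigmoid hidden layer and a linear output neuron. All weights, biases and input values are hypothetical numbers chosen for illustration; they are not the fitted parameters of the model evaluated here, and real inputs would normally be scaled before use.

```python
import math
from typing import List, Sequence


def tansig(n: float) -> float:
    """Hyperbolic tangent sigmoid activation G1 (Eq. 4)."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0


def mlp_forward(x: Sequence[float],
                w_hidden: Sequence[Sequence[float]],
                b_hidden: Sequence[float],
                w_out: Sequence[float],
                b_out: float) -> float:
    """Single-hidden-layer MLP with one linear output neuron (Eq. 3)."""
    hidden: List[float] = [
        tansig(sum(w * xi for w, xi in zip(row, x)) + bj)
        for row, bj in zip(w_hidden, b_hidden)
    ]
    # G2 is the identity (Eq. 5), so the output is a plain weighted sum.
    return sum(wj * h for wj, h in zip(w_out, hidden)) + b_out


# Hypothetical scaled inputs: cumulative LAI, maximum LAI, cumulative rainfall.
x = [1.0, 2.0, 3.0]
w_hidden = [[0.1, 0.2, 0.3], [0.0, 0.0, 0.0], [-0.1, 0.0, 0.1]]
b_hidden = [0.0, 0.5, 0.0]
w_out = [1.0, 2.0, 3.0]
b_out = 0.5

y = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
```

Because the hidden activation saturates, the network can stay near zero over part of the input range and then rise, which a linear combination of the same inputs cannot do.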


Fig. 4 Relationship between simulated and measured tuber dry weight (kg DM ha-1) for the 3-input MLP using data from planting to harvest. Each line represents a site

Comparison between models for yield forecast

The high MLP performance is promising for evaluation of the capacity of empirical models to forecast tuber yield without relying on constructed time-series of daily climatic input data. The term forecast is used here to qualify estimations of the tuber growth occurring


Table 6 Model performance for yield forecast in the validation dataset

Model          MAE (kg DM ha⁻¹)  SS_MAE  RMSE (kg DM ha⁻¹)  SS_RMSE

9 weeks before harvest
SUBSTOR        869.8             0.61    1441.2             0.44
3-input MLR    759.9             0.60    962.3              0.57
3-input MLP    452.8             0.76    594.2              0.73

10 weeks before harvest
SUBSTOR        596               0.73    959.5              0.63
3-input MLR    814.1             0.52    996.8              0.51
3-input MLP    651.8             0.61    799.6              0.60

11 weeks before harvest
SUBSTOR        438               0.80    742.5              0.72
3-input MLR    815.5             0.45    964.4              0.46
3-input MLP    1126.7            0.24    1323.8             0.27

Fig. 5 Relationship between observed and forecasted tuber dry weight using MLR and MLP models for 9-, 10- and 11-week forecasts on validation. Each line represents a site

later in the season, even though they are actually hindcast (i.e. they have been run retrospectively). Table 6 and Fig. 5 present MLR and MLP performance during validation for forecasting yield for 9-, 10- and 11-week horizons. A different model was built for each time


horizon, which was trained with data from planting up to the time of the forecast of the later yield. Maximum LAI in these cases refers to the maximum leaf area index value obtained up to the estimation time.

The MAE- and RMSE-based skill scores for forecasts made 9 weeks before harvest with SUBSTOR were 0.61 and 0.44, respectively (Table 6). The 10-week forecast gave MAE- and RMSE-based skill scores of 0.73 and 0.63, respectively. For the 11-week forecast, skill scores increased to 0.80 and 0.72, respectively. Therefore, SUBSTOR performed better as the time horizon increased, which is anomalous.

The MAE- and RMSE-based skill scores for forecasts 9 weeks before harvest with MLR were 0.60 and 0.57, respectively (Table 6). The 10-week forecast gave MAE- and RMSE-based skill scores of 0.52 and 0.51, respectively. For the 11-week forecast, skill scores dropped to 0.45 and 0.46, respectively. Hence, empirical models could accurately forecast yield 9 to 11 weeks before harvest.

As in the case of estimations using data across cropping seasons, the non-linear transformation performed by the 3-input MLP improved the regression. The MAE- and RMSE-based skill scores for the 9-week forecast were 0.76 and 0.73, respectively, compared to 0.61 and 0.60, respectively, for the 10-week forecast and to 0.24 and 0.27, respectively, for the 11-week forecast (Table 6). The error distributions were narrower for MLP than for MLR, except for the 11-week forecast (Fig. 5). Furthermore, for the MLP forecasts, the relationship between forecast and measured values weakened as the time horizon increased. This indicates that the MLP was more sensitive to input variables than the linear procedure. Consequently, final tuber yield depended primarily on conditions prevailing during the first ~5 weeks for early and mid-season cultivars and ~8 weeks after planting for the late cultivar. This period coincides approximately with the flowering stage under Quebec growing conditions.
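The way inputs and targets are paired for a given forecast horizon can be sketched as follows. The features use only data from planting up to the estimation day (cumulative LAI, maximum LAI to date, cumulative rainfall), and each feature vector is paired with the tuber yield observed a fixed number of days later. Assumptions made for illustration: daily series indexed from planting, a simple daily sum for cumulative LAI, and toy numbers throughout.

```python
from typing import List, Sequence, Tuple

Features = Tuple[float, float, float]


def forecast_features(lai: Sequence[float], rain: Sequence[float],
                      day: int) -> Features:
    """Inputs available `day` days after planting (days 1..day inclusive)."""
    lai_to_date = lai[:day]
    cum_lai = sum(lai_to_date)   # daily sum as a stand-in for the integral
    max_lai = max(lai_to_date)   # maximum LAI reached up to the estimation time
    cum_rain = sum(rain[:day])
    return cum_lai, max_lai, cum_rain


def lagged_pairs(lai: Sequence[float], rain: Sequence[float],
                 tuber_yield: Sequence[float],
                 horizon_days: int) -> List[Tuple[Features, float]]:
    """Pair the features on day t with the yield on day t + horizon_days."""
    n_days = len(tuber_yield)
    pairs = []
    for day in range(1, n_days - horizon_days + 1):
        target = tuber_yield[day + horizon_days - 1]  # 0-based index
        pairs.append((forecast_features(lai, rain, day), target))
    return pairs


# Toy six-day season with a two-day horizon (a 9-week horizon would be 63).
lai = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0]
rain = [1.0, 0.0, 2.0, 0.0, 0.0, 3.0]
tuber_yield = [0.0, 0.0, 1.0, 2.0, 3.0, 4.0]

pairs = lagged_pairs(lai, rain, tuber_yield, horizon_days=2)
```

The last pair links the inputs on the forecast date to the yield at harvest, and each earlier pair is shifted back by one day, which mirrors the July 13 / September 21 example given in the calibration section.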

Discussion

Comparison between models for yield estimation

The non-linear transformation performed by the MLP improved the performance of the regression due to its greater ability to simulate tuber initiation date and final yield (Fig. 4). An example of the estimated versus observed tuber initiation date for each model under study is shown in Fig. 6. The MLP was the only model that adequately simulated the beginning of tuberization, followed by SUBSTOR and then the MLR. Finally, the optimization of the number of hidden nodes of the 3-input MLP model confirmed that the 3-hidden-neuron MLP performed best.

The better performance of MLP compared to MLR is attributable to the fact that a linear model does not account for the temporal discontinuity with respect to tuber initiation. The linear model over-estimated tuber production at the beginning of the cropping season, but under-estimated it during the following growth stages (Fig. 3). Under-estimation occurred when the LAI started to decrease because the MLR falsely assumed a proportional decrease of tuber growth during senescence, not accounting for the biomass translocation from the aerial part to tubers (Ojala et al. 1990).

The use of LAI functions as a substitute for climatic parameters in the MLP should not be interpreted as a lack of sensitivity of crop response to climate factors. Considering that LAI is a direct measure of plant growth, it integrates the effect of climatic parameters, soil parameters, management factors and mother tuber age. Only cumulative rainfall contained


Fig. 6 Simulated tuber initialization date and observed tuber initialization date for Goldrush in 2005 (site 10). DAP is the day after planting

supplementary information about the behaviour of the system, probably because this variable was the only one with significant inter-annual variation. Seasonal values of solar radiation and temperature were statistically close across the 3 years of this study, making it difficult to assess their relative importance.

Comparison between models for yield forecast

The results presented in Table 6 suggest that, where no further N fertilization is applied to increase the LAI, tuber yield mostly depends on meteorological conditions prevailing during the first 5–8 weeks after planting. It is during this period that the physiological factors controlling the differentiation of the stolons leading to tuber initiation are in action. Development of tubers after the flowering stage essentially depends on the allocation rate of assimilates to the tubers and follows a linear trend until harvest. After tuber initiation, the growth rate of the aerial part decreases for the benefit of tubers and tuber filling will


Fig. 7 Relationship between observed and forecasted tuber dry weight on validation using the SUBSTOR models for different time horizons. Each line represents a site

continue at approximately the same rate during senescence (Ojala et al. 1990). The LAI at flowering stage would largely determine the slope of the time-tuber growth relationship.

As shown in Table 6 and Fig. 7, SUBSTOR forecast tuber yield more accurately than the simulation using meteorological data covering the entire crop season. Performance of the model should normally decrease as the time horizon between forecast and harvest time increases. For this reason, it is impossible to interpret the results of SUBSTOR for yield prediction. Even if the MLP showed promising results, a larger and more diverse data set would be needed to measure to what extent meteorological conditions following the flowering stage affect the development of tubers.

Conclusion

The results suggest that tuber yield for a single N application rate at planting depended primarily on meteorological conditions prevailing during the first 5–8 weeks (depending


on cultivar) after planting. Development of tubers after this period of time seemed essentially controlled by biomass allocation to tubers. In-field spatial variation of tuber yield was mainly expressed by local variations of LAI, suggesting that LAI integrates information about soils, meteorological and genetic factors. The best forecasts of final tuber yields were obtained with a 3-input MLP using cumulative LAI, maximum LAI and cumulative rainfall. The SUBSTOR model did not give coherent results, probably due to its inability to model spatial in-field tuber growth variation. Other studies involving the seasonal adjustment of LAI computed by SUBSTOR are needed to conclude whether or not the present model version can be used in the context of precision agriculture. Although NN is a promising alternative model to forecast yield and calculate optimum N rate, empirical approaches may not replace more complex deterministic models across all applications.

Acknowledgments

This research was supported by Cultures H. Dolbec Inc., Groupe Gosselin Inc., Agriparmentier Inc., Prochamps Inc., Ferme Daniel Bolduc (1980) Inc. and the Natural Sciences and Engineering Research Council of Canada (CRDPJ 305166-03). We thank Nicolas Samson and Philippe Parent for technical assistance.

References

Anctil, F., & Lauzon, N. (2004). Generalisation for neural networks through data sampling and training procedures, with applications to streamflow predictions. Hydrology and Earth System Sciences, 8(5), 940–958.
Bélanger, G., Walsh, J. R., Richards, J. E., Milburn, P. H., & Ziadi, N. (2001). Critical nitrogen curve and nitrogen nutrition index for potato in eastern Canada. American Journal of Potato Research, 78, 355–364.
Coulibaly, P., Anctil, F., & Bobée, B. (2000). Daily reservoir inflow forecasting using artificial neural networks with stopped training approach. Journal of Hydrology, 230, 244–257.
CRAAQ. (2003). Guide de Référence en Fertilisation (Reference Fertilization Guide) (p. 294). Sainte-Foy: Centre de Référence en Agriculture et en Agroalimentaire du Québec, Canada.
Dou, H., Alva, A. K., & Appel, T. (2000). An evaluation of plant-available soil nitrogen in selected sandy soils by electro-ultrafiltration, KCl, and CaCl2 extraction methods. Biology and Fertility of Soils, 30, 328–332.
Errebhi, M., Rosen, C. J., Gupta, S. C., & Birong, D. E. (1998). Potato yield response and nitrate leaching as influenced by nitrogen management. Agronomy Journal, 90, 10–15.
Fortin, J. G., Anctil, F., Parent, L. E., & Bolinder, M. A. (2008). Comparison of empirical daily surface incoming solar radiation models. Agricultural and Forest Meteorology, 148, 1332–1340.
Fortin, J. G., Anctil, F., Parent, L. E., & Bolinder, M. A. (2010). A neural network experiment on the site-specific simulation of potato tuber growth in Eastern Canada. Computers and Electronics in Agriculture, 73, 126–132.
Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2(5), 359–366.
Iyer, M. S., & Rhinehart, R. R. (1999). A method to determine the required number of neural-network training repetitions. IEEE Transactions on Neural Networks, 10(2), 427–432.
Joern, B. C., & Vitosh, M. L. (1995). Influence of applied nitrogen on potato part II: Recovery and partitioning of applied nitrogen. American Potato Journal, 72, 73–84.
Klerfors, D. (1998). Artificial neural networks. Project MISB-420–0. St Louis, USA: Saint Louis University.
Legates, D. R., & McCabe, G. J. (1999). Evaluating the use of "goodness-of-fit" measures in hydrologic and hydroclimatic model validation. Water Resources Research, 35(1), 233–241.
MacKay, D. J. C. (1992). Bayesian interpolation. Neural Computation, 4, 415–447.
MacKerron, D. K. L., & Waister, P. D. (1985). A simple model of potato growth and yield. Part 1. Model development and sensitivity analysis. Agricultural and Forest Meteorology, 34, 241–252.
Mehlich, A. (1984). Mehlich 3 soil test extractant: A modification of Mehlich 2. Communications in Soil Science and Plant Analysis, 15, 1409–1416.


Nash, J. E., & Sutcliffe, J. V. (1970). River flow forecasting through conceptual models: a discussion of principles. Journal of Hydrology, 10, 282–290.
Ojala, J. C., Stark, J. C., & Kleinkopf, G. E. (1990). Influence of irrigation and nitrogen management on potato yield and quality. American Potato Journal, 67, 29–43.
Paz, J. O., Batchelor, W. D., & Pedersen, P. (2003). WebGro: a web-based soybean management decision support system. Agronomy Journal, 96, 1771–1779.
Prange, R. K., McRae, K. B., Midmore, D. J., & Deng, R. (1990). Reduction in potato growth at high temperature: role of photosynthesis and dark respiration. American Potato Journal, 67, 357–369.
Senbeta, D. A., Shamseldin, A. Y., & O’Connor, K. M. (1999). Modification of the probability-distributed interacting storage capacity model. Journal of Hydrology, 224, 149–168.
Singh, U., Matthews, R. B., Griffin, T. S., Ritchie, J. T., Hunt, L. A., & Goenaga, R. (1998). Modeling growth and development of root and tuber crops. In G. Y. Tsuji, G. Hoogenboom, & P. K. Thornton (Eds.), Understanding options for agricultural production (pp. 129–156). Dordrecht, The Netherlands: Kluwer Academic Press.
Stark, J. C., & Wright, J. L. (1985). Relationship between foliage temperature and water stress in potatoes. American Potato Journal, 62(2), 57–68.
Supit, I., Hooijer, A. A., & van Diepen, C. A. (1994). System description of the Wofost 6.0 crop simulation model implemented in CGMS. Volume 1: Theory and algorithms. Luxembourg: European Commission.
Trebejo, I., & Midmore, D. J. (1990). Effect of water stress on potato growth, yield and water use in a hot and a cool tropical climate. The Journal of Agricultural Science, 114, 321–334.
Vos, J., & Biemond, H. (1992). Effects of nitrogen on the development and growth of the potato plant. 1. Leaf appearance, expansion growth, life spans of leaves and stem branching. Annals of Botany, 70, 27–35.
Welles, J. M., & Norman, J. M. (1991). Instrument for indirect measurement of canopy architecture. Agronomy Journal, 83, 818–825.
Wheeler, R. M., & Tibbetts, T. W. (1986). Utilization of potatoes for life support systems in space: I. Cultivar-photoperiod interactions. American Potato Journal, 63, 315–323.
Yonaba, H., Anctil, F., & Fortin, V. (2010). Comparing sigmoid transfer functions for neural network multistep ahead streamflow forecasting. Journal of Hydrologic Engineering, 15(4), 275–283.
