Enhancing tidal prediction accuracy in a deterministic model using chaos theory


Advances in Water Resources 27 (2004) 761–772 www.elsevier.com/locate/advwatres

S.A. Sannasiraj a,*, Hong Zhang b, Vladan Babovic c, Eng Soon Chan d

a Department of Ocean Engineering, Indian Institute of Technology Madras, Chennai 600036, India
b School of Engineering, Griffith University, Gold Coast Campus, QLD 4215, Australia
c Tectrasys AG, Sihleggstrasse 23, Wollerau 8832, Switzerland
d Tropical Marine Science Institute, National University of Singapore, Singapore 119223, Singapore

Received 22 January 2003; received in revised form 18 March 2004; accepted 25 March 2004
Available online 9 June 2004

Abstract

The classical deterministic approach to tidal prediction is based on barotropic or baroclinic models with boundary conditions prescribed from a global model or from measurements. The accuracy of the deterministic prediction is limited by the precision of the prescribed initial and boundary conditions. Without the correct driving forces, improving the model formulation would only marginally increase the prediction accuracy. This study describes an improvement in the forecasting capability of the tidal model obtained by combining a deterministic model with a stochastic model. The latter is overlaid on the numerical model predictions to improve the forecast accuracy. The tidal prediction is carried out using a three-dimensional baroclinic model, and the error correction is performed by a stochastic model based on a local linear approximation. The embedding theorem, based on time-lagged embedded vectors, is the basis for the stochastic model. Relative to the deterministic model estimate, the combined model achieved an efficiency of 80% for a 1 day tidal forecast and 73% for a 7 day tidal forecast.
© 2004 Elsevier Ltd. All rights reserved.

Keywords: Embedding theorem; Genetic algorithm; Tidal forecasting; Local model; Time delay

* Corresponding author. Fax: +91-44-22578625. E-mail addresses: [email protected] (S.A. Sannasiraj), hong.zhang@griffith.edu.au (H. Zhang), [email protected] (V. Babovic), [email protected] (E.S. Chan).
0309-1708/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.advwatres.2004.03.006

1. Introduction

In ship navigation and harbour operations, tidal information is extremely important. In many cases, tidal prediction over a forecast horizon from 6 h to a few days is very helpful in finalizing work schedules inside harbours and for many other coastal activities. In current practice, tides are predicted using either a deterministic tidal model or a time series forecasting model. Each of these two approaches has its own predictive capabilities. Numerical models can predict the physics of the tidal movement. However, even if the governing equations of tidal flow describe the system well, many factors diminish the prediction capability of the model. These limiting factors are the initial conditions and the external forcing, such as boundary fluxes and surface wind forcing. If the initial state of the system and the boundary conditions are not specified with good accuracy, the prediction at any later time becomes questionable, irrespective of the prognostic capability of the model. This class of problems, being sensitive to initial and boundary conditions, shows chaotic behaviour in its predicted state variables.

Alternatively, statistical forecasting models can be either linear, such as auto-regressive integrated moving average (ARIMA) models, or nonlinear, based on chaos theory. Linear models, however, fail to retain their accuracy for more than a few hours of forecast horizon in nonlinear dynamical problems and chaotic systems, so different classes of methods are called upon to bring out the chaotic behaviour. The nonlinear models based on chaos theory, such as neural networks and the embedding theorem, are based on the correlation between different state variables of the



systems using the input-output relation. During the last two decades, chaos theory has proved its applicability to a number of problems, and chaotic signal processing is a fast emerging field of application in ocean dynamics. The randomness of hydrological time series and water levels has been examined using chaos theory by many authors, e.g. Hense [18], Jayawardena and Lai [21], Sivakumar et al. [31], Rahman [27], Cristianini and Shawe-Taylor [12]. In coastal waters, Frison et al. [17] and Zaldivar et al. [34] adopted nonlinear dynamic analysis.

In this paper, a different approach is adopted that combines the advantages of the numerical and chaotic methods. Bringing the underlying dynamics of a chaotic time series into a deterministic predictive tidal model improves its prognostic capability. The prediction of nonlinear dynamics is based on the embedding theorem proposed by Takens [33] and on the work of Abarbanel [1]. The numerical model is based on the formulation of the Princeton Ocean Model (POM) [8]. The two are merged as an error correction tool at individual stations where a sufficient length of past observations is available. This discrete prediction could then be carried forth to the zone of influence in the domain by optimal interpolation using gain vectors, which is discussed elsewhere [14,32].

The remainder of the paper is organized to demonstrate the efficiency of the forecasting algorithm. Section 2 explains the deterministic tidal model and its application to tidal level prediction in the South East Asian seas. Section 3 presents a new concept of tidal level forecasting using chaos theory, called the local model. This stochastic model is based on the embedding theorem [33]. A genetic algorithm is employed to optimize the local model parameters. Appendix A briefly explains the salient features of the evolutionary principle adopted. The advantages of the deterministic model and the stochastic model are combined by amalgamating the two models, as explained in Section 4. Section 5 concludes by summarizing the procedure adopted and the efficiency of the local model.

2. Tidal predictive model

2.1. General

Tides are influenced not only by the gravitational attraction of the moon and sun, but also by the coastline configuration, local water depth, seafloor topography, winds and weather. To understand the dynamics of the tidal movement and the associated currents, barotropic models [7,11] have been applied to examine the barotropic response to tidal forcing in regional seas. Such a model adequately describes the surface elevation and tidal currents over most of the continental shelf. However, near abrupt changes in shelf topography, baroclinic tidal motions contributing to surface elevation and currents have been observed. Therefore, using the mathematical equations governing the hydrodynamic movements, three-dimensional models for the prediction of surface and internal currents have been developed and applied widely, for example in the ocean off Northern British Columbia [13], the Australian North West Shelf [20], the Hawaiian Ridge [25] and Singapore waters [35].

2.2. Governing equation

A tidal response can be predicted by a three-dimensional, prognostic, primitive equation model. The flow equations governing ocean circulation consist of the hydrostatic, Boussinesq Navier–Stokes equations. The hydrostatic assumption and the Boussinesq approximation are used in the model on the premise that the horizontal extent is much larger than the vertical extent. The governing equations, formulated in orthogonal Cartesian co-ordinates with x increasing eastward, y increasing northward and z measured vertically upwards from the undisturbed water level, are summarized as follows [8].

(a) The continuity equation.

$$\nabla \cdot \mathbf{V} + \frac{\partial w}{\partial z} = 0 \tag{1}$$

where $\mathbf{V}$ is the horizontal velocity vector with components $(u, v)$ and $w$ is the vertical component of the velocity.

(b) The Reynolds momentum equations.

$$\frac{\partial u}{\partial t} + \mathbf{V} \cdot \nabla u + w \frac{\partial u}{\partial z} - f v = -\frac{1}{\rho_0} \frac{\partial P}{\partial x} + \frac{\partial}{\partial z}\left(K_M \frac{\partial u}{\partial z}\right) + F_x \tag{2}$$

$$\frac{\partial v}{\partial t} + \mathbf{V} \cdot \nabla v + w \frac{\partial v}{\partial z} + f u = -\frac{1}{\rho_0} \frac{\partial P}{\partial y} + \frac{\partial}{\partial z}\left(K_M \frac{\partial v}{\partial z}\right) + F_y \tag{3}$$

(c) The hydrostatic assumption.

$$\rho g = -\frac{\partial P}{\partial z} \tag{4}$$

where $\rho_0$ is the reference density, $\rho$ the in situ density, $g$ the gravitational acceleration, $P$ the pressure, $K_M$ the vertical eddy diffusivity of turbulent momentum mixing and $f$ the Coriolis parameter. In the horizontal direction, the mixing terms, $F_x$ and $F_y$ (Eqs. (2)


and (3)) are parameterized using the concept of turbulent diffusion, in analogy to molecular diffusion.

For most coastal regions, the open boundaries essentially determine the tides inside the domain. The model is driven by tides forced through the boundary conditions at straits or other open boundaries [10]. For narrow straits, available tide gauge data can be used to prescribe the open boundary conditions. For wide open boundaries, the practical method is to use global ocean tide results, such as those of Schwiderski [30], or tides derived from altimetric measurements at the open boundary. In our study, the elevation field at each time step is solved using a fully implicit scheme by specifying the boundary conditions as

$$\eta = \eta_{BC}, \qquad \frac{\partial \bar{u}}{\partial t} \pm c_e \frac{\partial \bar{u}}{\partial x} = 0, \qquad \frac{\partial \bar{v}}{\partial t} \pm c_e \frac{\partial \bar{v}}{\partial y} = 0 \tag{5}$$

where $c_e = \sqrt{gH}$ is the wave speed, and $\bar{u}$ and $\bar{v}$ are the depth-averaged velocities in the x- and y-directions. The sign in each radiation condition depends on the boundary at which it is applied.

2.3. Case study

The study domain was the Southeast Asian seas, which include the South China Sea, the Java Sea, the Sulu Sea, the Gulf of Thailand and the Malacca Strait, covering an area from 99°E to 121°E longitude and 9°S to 24°N latitude. The South China Sea encompasses a portion of the Pacific Ocean stretching roughly from Singapore and the Strait of Malacca in the southwest to the Strait of Taiwan (between Taiwan and China) and the Luzon Strait in the northeast. It has a complex bathymetry, which ranges from the shallow coastal fringes of the wide continental shelf in the southern part to the Manila Trench with a maximum depth of 5377 m (Fig. 1). This part of the ocean is particularly important from the navigational point of view. The South China Sea region is the world's second busiest international sea lane, with the port of Singapore in the south-west and the port of Hong Kong on the northern side. More than half of the world's supertanker traffic passes through this region. The strategic importance of this region shows the need for accurate prediction of tides and currents for safe navigation and for operability at offshore platforms.

The seas are surrounded mainly by land boundaries, with many straits connecting to the Pacific and Indian Oceans, along with many small straits connecting individual water bodies within the domain. These complexities induce errors in the boundary specifications. Many countries surround these seas, which makes the availability of a measurement database in a unified system questionable.

Fig. 1. The study domain covers South China Sea, Malacca straits and Java Sea.

The tidal predictions in this paper focus on two strategic locations in the study domain: the stations at Hong Kong (22°18′N, 114°13′E) and Horsburgh lighthouse (1°20′N, 104°24′E) at the eastern boundary of Singapore. However, there is no restriction on the location or number of stations for this application, provided observations are available. The measurements at these stations were obtained from the TotalTide Tables published by the UK Hydrographic Office. It should be noted that the tidal observations at these stations are based only on the derived tidal constituents at these locations. For the present application, however, they were taken as the truth, being the best knowledge available. The boundary elevations and current fields were also obtained from the TotalTide Tables. The simulation period was from 1st March 2001 to 6th June 2001, and hourly tidal predictions were used to evaluate the proposed model. The hindcast period was set from 1st March 2001 to 14th May 2001 and the forecasting period from 01:00 h 15th May 2001 to 6th June 2001. The tidal prediction at the selected stations for a 10 day period from 01:00 h 15th May 2001 is shown in Fig. 2. The efficiency of the prediction is shown in terms of the scatter diagram and the statistical error estimates,

Fig. 2. Tidal prediction by the deterministic model at: (a) Horsburgh lighthouse and (b) Hong Kong stations. Each panel plots tidal elevation (m) against time (0–240 h, from 01:00 h 15 May to 01:00 h 25 May) for the measurement and the tidal model.

such as mean absolute error (MAE), root mean square error (RMSE), scatter index (SI) and correlation coefficient (c). These parameters are defined as follows:

$$\mathrm{MAE} = \frac{\sum \left|\eta_0 - \eta_f\right|}{n} \tag{6}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum \left(\eta_0 - \eta_f\right)^2}{n}} \tag{7}$$

$$\mathrm{SI} = \frac{\sqrt{\sum \left(\eta_0 - \eta_f\right)^2 / n}}{\bar{\eta}_0} \tag{8}$$

$$c = \frac{\sum \left(\eta_0 - \bar{\eta}_0\right)\left(\eta_f - \bar{\eta}_f\right)}{\sqrt{\sum \left(\eta_0 - \bar{\eta}_0\right)^2 \sum \left(\eta_f - \bar{\eta}_f\right)^2}} \tag{9}$$

$$\bar{\eta}_f = \frac{\sum \eta_f}{n}, \qquad \bar{\eta}_0 = \frac{\sum \eta_0}{n} \tag{10}$$

where $\eta_0(t)$ is the tidal observation, $\eta_f(t)$ is the tidal prediction and n is the number of records. The overbar indicates the mean value. Table 1 presents these statistical error measures for the model prediction during the 10 day period. The scatter diagram (Fig. 3) shows that the model overpredicts the high tides at both stations and underpredicts the low tidal elevations at Horsburgh lighthouse. The scatter was large, as shown by the scatter

Table 1
Mean absolute error (MAE), root mean square error (RMSE), scatter index (SI) and correlation coefficient (c) of the tidal elevation ($\eta$) prediction using the numerical model

Tidal station    Horsburgh lighthouse    Hong Kong
MAE (cm)         17.39                   25.62
RMSE (cm)        21.12                   29.52
c                0.91                    0.85
SI               0.16                    0.20

index of 0.16 at Horsburgh and 0.20 at Hong Kong. The model was also unable to predict neap tides for some periods, particularly when the low tide level remained high, above mean sea level (semi-diurnal tidal motion). This deviation is mainly due to discrepancies in the prescription of the boundary elevation at the different open boundaries, which are basically narrow straits connecting the neighbouring seas. Many parts of the study region are also characterized by mixed tides, where either diurnal or semi-diurnal tides occur owing to the complexity of the domain. Hence, the deviation in the predictive model was expected to be high. Several investigators [9,15,22,28,35] have shown significantly better comparisons in some parts of this region. However, those studies were restricted to individual seas or straits, and each adopted many variations in the boundary specifications and tidal flow directions so that the tidal elevation inside the domain of interest would match reasonably well.
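The error measures of Eqs. (6)–(10) are simple to compute from paired hourly series. The following is a minimal sketch, not the authors' code; the function name `error_measures` and the dictionary keys are illustrative assumptions.

```python
import math


def error_measures(observed, predicted):
    """Return MAE, RMSE, scatter index and correlation coefficient (Eqs. (6)-(10))."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("series must be non-empty and of equal length")
    diffs = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(d) for d in diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Scatter index: RMSE normalised by the mean observation.
    si = rmse / mean_o
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    corr = cov / math.sqrt(var_o * var_p)
    return {"mae": mae, "rmse": rmse, "si": si, "c": corr}
```

Because SI divides by the mean observation, it is only meaningful when elevations are referred to a datum that keeps that mean well away from zero.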

Fig. 3. Scatter diagram of tidal model prediction values from the observation at Horsburgh lighthouse and Hong Kong stations. Each panel plots the tidal model prediction against the observation on axes from 0 to 3, with mean sea level (MSL) marked.

3. Prediction of nonlinear dynamics by chaos theory

3.1. Embedding theorem

A landmark in chaotic signal processing was the embedding theorem of Takens [33]. This theorem built on the work of Packard et al. [26], which explored time-lagged vectors to reveal the underlying dynamics, whereby a real-time process results in a time series $\eta(t) = \{\eta(t_0 + n\tau)\}$ sampled at intervals $\tau$ and initiated at $t_0$. Consider a dynamical system with a d-dimensional space and an evolving solution, and let $\eta$ be some observation of that solution. The lag vector can be defined as

$$\boldsymbol{\eta}(t) = \left(\eta_t, \eta_{t-\tau}, \eta_{t-2\tau}, \eta_{t-3\tau}, \ldots, \eta_{t-(d-1)\tau}\right) \tag{11}$$

Then, under general conditions, the space of lag vectors $\boldsymbol{\eta}(t)$ generated by the dynamics contains all of the information of the space of solution vectors. The mapping between them is smooth and invertible. This property is referred to as embedding. Thus, the study of the time series $\eta(t)$ is also the study of the solutions of the underlying dynamical system through a particular coordinate system given by the observable $\eta$. The embedding theorem establishes that, given a scalar time series from a dynamical system, it is possible to reconstruct a phase space from this single variable; that is, in theory, an embedded space whose dimensions consist of various time lags of the variable itself. The embedded space can also be created from several dynamic variables. For example, the measured water surface elevation could be thought of as a projection of its own and other variables' time histories. Additional variables for the prediction of surface elevation may be currents and/or wind speed. In some way, this projected surface elevation contains information about all the other contributing phenomena, even though it is measured at only one point in geographic space. According to the embedding theorem, the underlying structure cannot be seen in the space of the original scalar time series, but only when it is unfolded into an embedded (or phase) space. A time series can correspondingly be forecast based on this structure in the phase space. The purpose of forecasting is to predict, from the state of the system $\boldsymbol{\eta}(t)$, the state $\boldsymbol{\eta}(t + T)$ at a time horizon T in the future. In the present study, it was observed that the tidal surface elevation carries sufficient information within its own series, and hence other variables were not included to build a multivariate phase space.

The vector of Eq. (11) represents the nonlinear dynamics in its entirety when the embedding dimension d is large enough and the time delay $\tau$ is properly selected. Good first-guess values give a feeling for the efficiency of the embedding. There are many methods available to estimate d and $\tau$. Typically, $\tau$ is selected using the average mutual information (AMI) and d using false nearest neighbour (FNN) analysis [1]. However, these methods have been shown to give generally sub-optimal selections [5], and an alternative strategy using genetic algorithms was suggested [4,6,24]. This approach is employed throughout this work, as it has been shown to give significant improvements in the choice of the embedding parameters.

3.2. Estimation of optimal time delay and embedding dimension

An optimal embedding requires the proper selection of the time delay $\tau$. The underlying principle is that each component of the vector should provide new information about the signal source at a given time. The time delay $\tau$ must be large enough so that the information carried by each component of the vector is self-contained. At the same time, $\tau$ must be small enough so that the components of the vector $\boldsymbol{\eta}(t)$ do not become completely uncorrelated with respect to each


other. The embedding theorem provides assurance that, when infinite amounts of infinitely accurate data are available, any $\tau$ would work, and these concerns about optimal embedding would vanish. However, since the amount of data is usually finite and of finite precision, there are serious practical concerns that need to be addressed. Smaller values of $\tau$ do not add significant new information about the dynamics, whereas larger values of $\tau$ create uncorrelated elements in $\boldsymbol{\eta}(t)$. The value of $\tau$ is conventionally taken as the one at which either the autocorrelation first passes through zero or the average mutual information reaches its first minimum. Such recipes provide robust but, in principle, sub-optimal choices of embedding parameters, resulting in sub-optimal embedding properties as well as sub-optimal forecast skill.

The embedding dimension d is the minimum number of time-delay coordinates needed so that the trajectories $\boldsymbol{\eta}(t)$ do not intersect in d dimensions. For fewer than the optimum dimensions d, trajectories can intersect because they are projected down into too few dimensions, and the forecast may consequently be corrupted. If d is too large, noise may corrupt the forecast, as noise fills any dimension. The idea behind determining the optimal d is to ensure that trajectories associated with close neighbours remain close for some time. The bottom line for the performance of an embedding and the subsequent prediction is the resulting forecast skill, as good forecast skill implies good embedding properties. Therefore, in an attempt to obtain more nearly optimal embedding parameters, an evolutionary algorithm is used (Appendix A). The determination of the embedding parameters is posed as a global optimisation problem, with the decision variables being the embedding dimension d and the time delay $\tau$.

3.3. Local model

Once the optimal estimates of d and $\tau$ have been identified, the phase space H is reconstructed from the elements of $\boldsymbol{\eta}(t)$. The prediction model can then be built in this d-dimensional space:

$$\boldsymbol{\eta}(t + T) = g_T(\boldsymbol{\eta}(t)) \tag{12}$$

where $\boldsymbol{\eta}(t)$ is the current state of the system, $\boldsymbol{\eta}(t + T)$ is the system state after a forecast horizon T and $g_T$ is a mapping function. The problem is now reduced to finding a good expression for $g_T$. In the local model, $g_T$ is evaluated using only the most similar points, which lie locally near the forecast point. The sub-phase space is identified from the historical time series by choosing the k nearest neighbouring points within the d-dimensional space H. The neighbours can be chosen either within a constant radius (distance) from the forecast point or by specifying the number of nearest neighbours, k.
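The lag-vector reconstruction of Eq. (11) and a degree-one local fit over the k nearest neighbours can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, similarity is measured by Euclidean distance, the fit is solved through the normal equations, and only the scalar elevation at the horizon is forecast. Neighbour coordinates are expressed as offsets from the query state, so the fitted intercept is itself the forecast.

```python
import math


def delay_embed(series, d, tau):
    """Return (vectors, indices); each vector is (x[t], x[t-tau], ..., x[t-(d-1)*tau])."""
    start = (d - 1) * tau
    idx = list(range(start, len(series)))
    vectors = [[series[t - j * tau] for j in range(d)] for t in idx]
    return vectors, idx


def solve_linear(a, b):
    """Solve the square system a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("singular neighbourhood; try another k, d or tau")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def local_linear_forecast(series, d, tau, k, horizon):
    """Forecast the value `horizon` samples after the last entry of `series`."""
    vectors, idx = delay_embed(series, d, tau)
    query = vectors[-1]
    # Only past states whose value `horizon` steps ahead is already observed qualify.
    cands = [(v, series[t + horizon]) for v, t in zip(vectors, idx)
             if t + horizon < len(series)]
    if len(cands) < k:
        raise ValueError("not enough history for k neighbours")
    cands.sort(key=lambda c: math.dist(c[0], query))
    # Design matrix rows: [1, offset of neighbour from the query state].
    rows = [[1.0] + [vi - qi for vi, qi in zip(v, query)] for v, _ in cands[:k]]
    targets = [y for _, y in cands[:k]]
    p = d + 1
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p)]
    coef = solve_linear(ata, aty)
    return coef[0]  # offsets vanish at the query, so the intercept is the forecast
```

For example, `local_linear_forecast(levels, d=3, tau=17, k=75, horizon=24)` would request a 24 h forecast from an hourly list `levels` (a hypothetical variable) using one of the parameter sets reported in this paper. A neighbourhood that does not span all d directions makes the normal equations singular, which this sketch reports as an error rather than regularising.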

Having constructed the phase space and pooled the most similar past events, the desired expected (forecast) value vector N(t) at the forecast horizon T is formed for each point in the neighbourhood domain, say H′, where H′ ∈ H. The regression is performed using the neighbourhood coordinates in the sub-domain H′ as inputs and their corresponding expected values (N) as outputs. In this study, the regression order was chosen as linear (polynomial degree one); hence the model is called a local linear model (LLM). Although an LLM makes use of a linear approximation for each separate prediction, the resulting overall model can be highly nonlinear, as a separate linear approximation is made for each neighbourhood [3]. The local approximation can, however, be of any polynomial degree, which should be tested for different cases according to the dynamics of the time series. Local models are particularly well suited to the forecasting of chaotic time series because they share many fundamental ideas with the time-delay embedding theorem. A rather effective method of simulating the evolution of a dynamical system is by means of a local approximation, using only the most similar trajectories from the past to make predictions of the future.

3.4. Forecasting by local model

Tidal forecasting was undertaken using the historical data sets available at Horsburgh lighthouse and Hong Kong, with the local linear approximation discussed in the previous section. The hourly tidal records were retrieved from the TotalTide Tables for the period from 1st March 2001 to 6th June 2001. The hindcast period was from 01:00 h 1st March 2001 to 00:00 h 15th May 2001, and the forecasting period started from 01:00 h 15th May 2001. Thus, of the total of 2200 records, 1800 were in the hindcast period (training series) and the remainder in the forecasting horizon (testing series), reserved for validation.

The present implementation uses genetic algorithms as the engine for searching for the best set of local model parameters. The optimization is constrained, since the objective function is minimized within bounds on the largest and smallest allowed parameter values, as presented in Table 2. Here, the objective function is the overall deviation between the simulated and the available (TotalTide Tables) tidal levels. In addition, a number of parameters related to the genetic algorithm set-up must be defined:

Population size: chosen to be 100.
Number of children per pair of parents: set to 2.
Maximum generation: set to 100.
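A bounded search of this kind can be sketched as follows. The population size, children per pair and generation count follow the settings listed above and the bounds follow Table 2; everything else (truncation selection, uniform crossover, uniform mutation at a rate of 0.1, the function names and the `objective` callable) is an assumption made for illustration and is not the evolutionary scheme of Appendix A.

```python
import random

# Parameter bounds taken from Table 2: time delay (h), embedding dimension, neighbours.
BOUNDS = {"tau": (1, 32), "d": (2, 9), "k": (20, 325)}


def random_individual(rng):
    """Draw one integer parameter set uniformly inside the bounds."""
    return {name: rng.randint(lo, hi) for name, (lo, hi) in BOUNDS.items()}


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return {name: rng.choice((a[name], b[name])) for name in BOUNDS}


def mutate(ind, rng, rate=0.1):
    """With probability `rate`, redraw a gene uniformly inside its bounds."""
    return {name: (rng.randint(*BOUNDS[name]) if rng.random() < rate else value)
            for name, value in ind.items()}


def optimise(objective, pop_size=100, children_per_pair=2, generations=100, seed=0):
    """Minimise `objective(individual)`; the better half survives each generation."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _generation in range(generations):
        population.sort(key=objective)
        parents = population[: pop_size // 2]  # truncation selection (assumed)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            for _child in range(children_per_pair):
                children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children[: pop_size - len(parents)]
    return min(population, key=objective)
```

In practice `objective` would run the local model over the hindcast series for a candidate parameter set and return its overall deviation from the observed levels. Keeping the better half unchanged means the best candidate found is never lost between generations.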

Table 2
Prescribed maximum and minimum parameter values for an optimisation run using genetic algorithm

Parameter                  Minimum    Maximum
Time delay, $\tau$ (h)     1          32
Embedding dimension, d     2          9
Neighbours, k              20         325

Table 3
Embedding characteristics for tidal surface elevation, $\eta$, for various lead times

Tidal station                   Horsburgh lighthouse        Hong Kong
Forecast horizon (T hours)      24      48      168         24      48      168
Neighbours (k)                  75      125     95          150     70      115
Embedding dimension (d)         3       4       2           3       4       2
Time delay ($\tau$)             17      21      7           12      16      9

The population size corresponds to the number of simulated evolving members, and the population growth is expressed through the number of children created during the automated population selection. The maximum number of generations corresponds to the length of the evolutionary cycle.

The embedding parameters ($\tau$, d) and the number of local model neighbours (k) were optimized using a genetic algorithm for each forecast horizon required. Table 3 presents the optimized parameters for the different forecast horizons. Fig. 4 shows the tidal prediction over the 10 day period (01:00 h 15th May to 00:00 h 25th May 2001) for forecast horizons of 24 h (1 day), 48 h (2 days) and 168 h (7 days). Table 4 presents the corresponding statistical error measures between the observed and the LLM-predicted tidal elevations. Even for the 7 day forecast, the LLM predicts tidal elevations better than the complex numerical tidal model. Extending the forecast horizon further would tend to accumulate errors, as the nonlinear series carries little information beyond a certain horizon. At Horsburgh lighthouse station, the numerical model gave an RMSE of 21.1 cm, whereas the stochastic model gives only 12.7 cm for 24 h forecasting. The correlation was also better for the stochastic model prediction, although it deteriorates as the forecasting period increases. Fig. 5 presents the scatter plot of the tidal elevation forecast using the LLM. The scatter is less than that of the numerical model prediction (Fig. 3). The scatter indices (SI) in Table 4 are also lower than those of the numerical model predictions.

Fig. 4. Tidal forecasting by Local Linear Model at: (a) Horsburgh lighthouse and (b) Hong Kong stations. For each station, three panels (1-day, 2-days and 7-days forecast) plot tidal elevation (m) against time (hrs) for the measurement and the local model prediction.

Table 4
Mean absolute error (MAE), root mean square error (RMSE), scatter index (SI) and correlation coefficient (c) of the tidal elevation ($\eta$) prediction using the local linear model for different lead times

Tidal station                   Horsburgh lighthouse        Hong Kong
Forecast horizon (T hours)      24      48      168         24      48      168
MAE (cm)                        9.87    8.76    14.09       8.88    5.44    17.78
RMSE (cm)                       12.68   10.85   17.60       11.44   6.91    20.92
c                               0.94    0.96    0.90        0.97    0.99    0.90
SI                              0.10    0.08    0.13        0.08    0.05    0.14

Fig. 5. Scatter diagram of the direct tidal forecast using the local linear model from the observation at Horsburgh lighthouse and Hong Kong stations. Each panel plots the local model (direct) forecast against the observation on axes from 0 to 3, with mean sea level (MSL) marked.

4. Amalgamation of stochastic and tidal models

Many of the important aspects of analyzing dynamical systems are carried out by studying observable variables of the system as a function of time. However, it may be argued that not enough attention is given to the representation of data obtained from field or laboratory experiments. As a result, observations that appear random in a time series representation are typically discarded as noise. Yet the observations contain information on the underlying dynamics of the local flow patterns, which in many cases might not be captured by the numerical model. Thus, incorporating the measurements into the model (data assimilation) would lead to stable long-term forecasts.

Tidal prediction using the LLM was found to be better than the numerical model prediction in this region, as shown in the previous section. However, time series forecasting can only be carried out where sufficient measurements are available. The numerical model can predict over the entire domain on a finer grid, and hence a good understanding of the physics of the ocean flows in regional seas is possible. The advantages of both the tidal model and the time series model (LLM) were

brought together to obtain a higher resolution model. An attempt was made here to include the measurements in the dynamic model as an error correction mechanism using stochastic theory, namely local models. The tidal prediction model described in Section 2 was executed for the simulation period from 1st March to 6th June 2001, and a time history of the error was thereby formed:

$$e_\eta = \eta_0 - \eta_m \tag{13}$$

where $\eta_0$ is the observed tidal surface elevation at the measurement station and $\eta_m$ is the model prediction. The error series is treated as purely stochastic, and hence it was embedded using optimized parameters derived with the genetic algorithm (Table 5). The error forecast was then carried out for the different forecast horizons using the corresponding time lag and embedding dimension in the LLM.

Table 5
Embedding characteristics for tidal surface elevation error, $e_\eta$ ($= \eta_0 - \eta_m$, Eq. (13)), for various lead times

Tidal station                   Horsburgh lighthouse        Hong Kong
Forecast horizon (T hours)      24      48      168         24      48      168
Neighbours (k)                  50      171     118         151     58      50
Embedding dimension (d)         9       9       9           8       8       9
Time delay ($\tau$)             7       28      32          4       4       31

Fig. 6. Tidal forecasting by the combined model at: (a) Horsburgh lighthouse and (b) Hong Kong stations. For each station, three panels (1-day, 2-days and 7-days forecast) plot tidal elevation (m) against time (hrs) for the measurement and the forecast.

Table 6
Mean absolute error (MAE), root mean square error (RMSE), scatter index (SI) and correlation coefficient (c) of the tidal elevation ($\eta$) prediction by the combination of the deterministic model and a stochastic model

Tidal station                   Horsburgh lighthouse
Forecast horizon (T hours)      24      48
MAE (cm)                        3.12    5.24
RMSE (cm)                       3.98    6.53
c                               0.99    0.99
SI                              0.03    0.05

The tidal prediction by the deterministic model in the forecast period was thus corrected with the forecast error. This tidal correction is shown in Fig. 6, and the corresponding statistical error measures against the observations are given in Table 6. It can be seen that combining the stochastic model with the deterministic model can improve the tidal forecast by as much as 80% (in terms of RMS error) for a 1 day forecast and 73% for a 7 day forecast at Horsburgh lighthouse station. A high correlation with the observations is also obtained. Fig. 7 presents the scatter diagram of the tidal prediction using the combined model. The scatter is minimal, as is also evident from the scatter indices shown in Table 6.

5. Conclusions The concept of improving the tidal prediction accuracy of a three-dimensional baroclinic model using sto-

Hong Kong 168 4.19 5.64 0.99 0.04

24 2.59 3.19 0.99 0.02

48 4.80 6.08 0.99 0.04

168 6.64 8.22 0.99 0.06

chastic theory is discussed in this paper. The tidal prediction model was based on the Princeton Ocean Model and its predictive capability was tested in the South East Asian waters. The testing was carried out with reference to two strategic locations: Horsburgh lighthouse at the eastern boundary of Singapore waters and at Hong Kong. The nonlinear time series forecasting model, local model was overlaid on the numerical predictive results as an error forecasting algorithm. The resultant combined model was shown to be more eﬃcient in terms of long term prediction with the inclusion of local eﬀects. At Hong Kong station, the RMSE of tidal prediction using the numerical model was about 30 cm and, the combined model reduced the error to 3.2 cm for a 1 day forecast and 8.2 cm for a 7 day forecast. The scatter index of data also reduced from 0.2 to 0.02 for a 1 day forecast and 0.06 for a 7 day forecast. The local model was found to be computationally eﬃcient in terms of execution time as well as memory requirements. Once the forecast at a few observation

770

S.A. Sannasiraj et al. / Advances in Water Resources 27 (2004) 761–772 3

3

Hong Kong

2

Combined model

Combined model

Horsburgh lighthouse

MSL 1

0

2

MSL 1

0 0

1 2 Observation

3

0

1 2 Observation

3

Fig. 7. Scatter diagram of the tidal forecast using combined model from the observation at Horsburgh lighthouse and Hong Kong stations.

stations were carried out with a higher accuracy, the entire domain could beneﬁt by distributing the errors from the limited number of stations using gain vectors using error covariance structure in line with the Kalman ﬁlter algorithm. Optimal interpolation would also be useful in carrying out the distribution tasks.
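The error-correction scheme summarised above (embed the error series of Eq. (13) with a time delay, forecast it with a local linear model, and add the forecast error to the deterministic prediction) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `delay_embed`, `llm_forecast` and `corrected_forecast` are hypothetical, neighbours are chosen by Euclidean distance, and the local model is fitted by ordinary least squares with NumPy.

```python
import numpy as np

def delay_embed(series, d, tau):
    """Build delay vectors [e(t), e(t - tau), ..., e(t - (d - 1) tau)].

    Returns the vectors and the time index t of each vector.
    """
    series = np.asarray(series, dtype=float)
    start = (d - 1) * tau
    idx = np.arange(start, len(series))
    lags = np.arange(d) * tau
    vectors = series[idx[:, None] - lags[None, :]]
    return vectors, idx

def llm_forecast(errors, d, tau, k, horizon):
    """Forecast the error `horizon` steps after the last sample.

    Assumption: a local linear model fitted on the k nearest
    neighbours of the latest delay vector.
    """
    errors = np.asarray(errors, dtype=float)
    vectors, idx = delay_embed(errors, d, tau)
    query = vectors[-1]
    # Only vectors whose value `horizon` steps ahead is known can be used.
    usable = idx + horizon < len(errors)
    X, t = vectors[usable], idx[usable]
    y = errors[t + horizon]
    # k nearest neighbours of the query in the embedded space.
    dist = np.linalg.norm(X - query, axis=1)
    nn = np.argsort(dist)[:k]
    # Least-squares fit of y ~ a + b . x on the neighbours only.
    A = np.column_stack([np.ones(len(nn)), X[nn]])
    coef, *_ = np.linalg.lstsq(A, y[nn], rcond=None)
    return float(coef[0] + query @ coef[1:])

def corrected_forecast(model_prediction, predicted_error):
    """Eq. (13) rearranged: elevation estimate = model prediction + error."""
    return model_prediction + predicted_error
```

With the Table 5 values for a 24 hour forecast at Horsburgh lighthouse, a call would take the form `llm_forecast(errors, d=9, tau=7, k=50, horizon=24)`, assuming an hourly error series and a time delay expressed in samples.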

Appendix A. Evolutionary algorithms

Evolutionary algorithms (EAs) are engines that simulate grossly simplified processes occurring in nature and are implemented in artificial media, such as a computer. The fundamental idea is to emulate the Darwinian theory of evolution. According to Darwin, evolution is best depicted as the process of adaptation of species to their environment through "natural selection". Seen in this way, all species inhabiting our planet are results of this process of adaptation. Evolutionary algorithms effectively provide an alternative approach to problem solving, one in which solutions to the problem are evolved rather than the problem being solved directly.

The family of evolutionary algorithms today is divided into four main streams: Evolution Strategies [29], Evolutionary Programming [16], Genetic Algorithms [19] and Genetic Programming [24]. Although different and intended for different purposes, all EAs share a common conceptual base (schematised in Fig. 8). In principle, an initial population of individuals is created in a computer and allowed to evolve using the principles of inheritance (so that offspring resemble parents), variability (the process of offspring creation is not perfect, so some mutations occur) and selection (fitter individuals are allowed to reproduce more often and less fit ones less often, so that the "genealogical" trees of the latter disappear in time).

Fig. 8. Schematic illustration of an evolutionary algorithm. The population is initialised (usually randomly). From this population, the fittest entities are selected to be altered by genetic operators, exemplified by crossover (corresponding to sexual reproduction) and mutation. Selection is performed based on certain fitness criteria, in which the more "fit" are selected more often. Crossover simply combines two genotypes by exchanging sub-strings around randomly selected points. In the illustration, parental genotypes are indicated as either all 1s or all 0s for the sake of clarity. Mutation simply flips a randomly selected bit.

One of the main advantages of EAs is their domain independence. EAs can evolve almost anything, given an appropriate representation of the evolving structures. As with processes observed in nature, one should distinguish between an evolving entity's genotype and its phenotype. The genotype is basically a code to be executed (such as the code in a DNA strand), whereas the phenotype represents the result of executing this code (such as any living being). Although the information exchange between evolving entities (parents) occurs at the level of genotypes, it is the phenotypes in which one is really interested.

The phenotype is actually an interpretation of a genotype in a problem domain. This interpretation can take the form of any feasible mapping. For example, for optimisation and constraint satisfaction purposes, genotypes are typically interpreted as independent variables of a function to be optimised. Along these lines, one can employ a mapping in which genotypes are interpreted as roughness coefficients in a free surface pipe flow model, with the genetic algorithm (GA) directed towards minimising the discrepancies between model output and measured water level and discharge values. The resulting GA represents an automatic calibration model of hydrodynamic systems [6]. Several other applications of GAs, which make use of various kinds of genotype-phenotype mappings and place a specific emphasis on water resources, are described in, for example, [2,4,23].

Following the idea of evolutionary embedding, a steady state GA has been implemented in which the evolving individuals represent the embedding vector x(t) as well as the number of nearest neighbours to be used for fitting the local models. For all the runs, the population size was set to 100. The selection mechanism was based on tournament selection, with a tournament size of 8. The fitness was based on the associated accuracy of the resulting LLMs.
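The steady state GA described above (population of 100, tournament selection with a tournament size of 8, individuals encoding the embedding and the number of neighbours) can be sketched as follows. This is an illustrative sketch only: the gene names and search ranges in `BOUNDS`, the uniform crossover, the mutation rate, the "replace the worst individual" rule and the user-supplied `error_fn` (standing in for the LLM forecast error) are all assumptions, since the paper does not specify these details.

```python
import random

# Hypothetical search ranges for embedding dimension, time delay, neighbours.
BOUNDS = {"d": (1, 10), "tau": (1, 32), "k": (10, 200)}

def random_individual(rng):
    """Draw one genotype uniformly within the bounds."""
    return {name: rng.randint(lo, hi) for name, (lo, hi) in BOUNDS.items()}

def tournament(population, fitness, rng, size=8):
    """Return the index of the lowest-error individual among `size` contestants."""
    contestants = rng.sample(range(len(population)), size)
    return min(contestants, key=lambda i: fitness[i])

def crossover(a, b, rng):
    """Uniform crossover: each gene is inherited from one of the two parents."""
    return {name: rng.choice((a[name], b[name])) for name in BOUNDS}

def mutate(ind, rng, rate=0.2):
    """Resample each gene within its bounds with probability `rate`."""
    return {name: rng.randint(*BOUNDS[name]) if rng.random() < rate else value
            for name, value in ind.items()}

def steady_state_ga(error_fn, pop_size=100, tournament_size=8, steps=2000, seed=0):
    """Minimise `error_fn` (lower is fitter) with a steady state GA."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    fitness = [error_fn(ind) for ind in population]
    for _ in range(steps):
        p1 = population[tournament(population, fitness, rng, tournament_size)]
        p2 = population[tournament(population, fitness, rng, tournament_size)]
        child = mutate(crossover(p1, p2, rng), rng)
        child_fit = error_fn(child)
        # Steady state: one offspring per step replaces the worst individual
        # if it is better (an assumed replacement rule).
        worst = max(range(pop_size), key=lambda i: fitness[i])
        if child_fit < fitness[worst]:
            population[worst], fitness[worst] = child, child_fit
    best = min(range(pop_size), key=lambda i: fitness[i])
    return population[best], fitness[best]
```

In the setting of this paper, `error_fn` would evaluate the forecast error of the LLM built from a candidate (d, τ, k) on held-out data; any such function returning a number to be minimised can be plugged in.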

References

[1] Abarbanel HDI. Analysis of observed chaotic data. New York: Springer-Verlag; 1996.
[2] Babovic V. Emergence, evolution, intelligence: hydro informatics. Rotterdam: Balkema; 1996.
[3] Babovic V, Keijzer M. Forecasting of river discharges in the presence of chaos and noise. In: Marsalek J, editor. Coping with floods: lessons learned from recent experiences. NATO ARW series. Dordrecht: Kluwer; 1999.
[4] Babovic V, Keijzer M. Genetic programming as a model induction engine. J Hydro Informatics 2000;2(1):35–60.
[5] Babovic V, Keijzer M, Bundzelm, M. From global to local modeling: a case study in error correction of deterministic models. In: Proceedings of the Fourth International Conference on Hydro Informatics, Iowa city, 2000.
[6] Babovic V, Larsen LC, Wu Z. Calibrating hydrodynamic models by means of simulated evolution. In: Proceeding of the first international conference on hydro informatics. Rotterdam: Balkema; 1994. p. 193–200.
[7] Battisti DS, Clarke AJ. A simple method for estimating barotropic tidal currents on continental margins with specific application to the M2 tides off the Atlantic and Pacific coasts of the United States. J Phys Oceanogr 1982;12:8–16.
[8] Blumberg AF, Mellor GL. A description of a three-dimensional coastal ocean model. In: Heaps N, editor. Three-dimensional coastal ocean models. Coastal and Estuarine sciences, series number 4. Washington, DC: American Geophysical Union; 1987. p. 1–16.
[9] Chau KW, Jin HS, Sin YS. A finite difference model of two-dimensional tidal flow in Tolo harbor, Hong Kong. Appl Math Model 1996;20:321–8.
[10] Chen P, Mellor GL. Determination of tidal boundary forcing using tide station data. In: Mooers CNK, editor. Coastal Ocean prediction, Coastal and Estuarine studies, vol. 56. Washington, DC: American Geophysical Union; 1999. p. 329–51.
[11] Clarke AJ. The dynamics of barotropic tides over the continental shelf and slope. In: Parker BB, editor. Advances in tidal hydrodynamics. John Wiley & Sons, Inc; 1991. p. 79–108.
[12] Cristianini N, Shawe-Taylor J. An introduction to support vector machines (and other kernel-based learning methods). Cambridge: Cambridge University Press; 2000.
[13] Cummins PF, Oey LY. Simulation of barotropic and baroclinic tides off Northern British Columbia. J Phys Oceanogr 1997;27:762–81.
[14] Derber J, Bouttier F. A reformulation of the background error covariance in the ECMWF global data assimilation system. Tellus 1999;51A:195–221.
[15] Fang G, Kwok YK, Yu K, Zhu Y. Numerical simulation of principal tidal constituents in the South China Sea, Gulf of Tonkin and Gulf of Thailand. Cont Shelf Res 1999;19:845–69.
[16] Fogel LJ, Owens AJ, Walsh MJ. Artificial intelligence through simulated evolution. Ginn, Needham Height; 1966.
[17] Frison TW, Abarbanel HDI, Earle MD, Schultz JR, Scherer W. Chaos and predictability in ocean water levels. J Geophys Res 1999;104(4):7935–51.
[18] Hense A. On the possible existence of a strange attractor for the southern oscillation. Beitr Phys Atmos 1987;60(1):34–7.
[19] Holland JH. Adaptation in natural and artificial systems. University of Michigan, Ann Arbor, 1975.
[20] Holloway P. A regional model of the semidiurnal internal tide on the Australian North West shelf. J Geophys Res 2001;106(C9):19,625–38.
[21] Jayawardena AW, Lai F. Analysis and prediction of chaos in rainfall and stream flow time series. J Hydrol 1994;153:23–52.
[22] Kang SK, Lee SR, Lie HJ. Fine grid tidal modeling of the Yellow and East China Seas. Cont Shelf Res 1998;18:739–72.
[23] Keijzer M, Babovic V. Error correction of a deterministic model in Venice lagoon by local linear models. In: Proc 'Modeli Complessie Metodi Computazional Intensivi per la Stima e la Previsione' Conference, Venice, 1999.
[24] Koza JR. Genetic programming: on the programming of computers by means of natural selection. Cambridge, MA, USA: MIT Press; 1992.
[25] Merrifield MA, Holloway PE, Johnson TMS. The generation of internal tides at the Hawaiian Ridge. Geophys Res Lett 2001;28:559–62.
[26] Packard NH, Crutchfield JP, Farmer JD, Shaw RS. Geometry from a time series. Phys Rev Lett 1980;45:712–6.
[27] Rahman M. Analysis and prediction of chaotic time series. MSc thesis, IHE/DHI, 1999.
[28] Riddle AM. Uncertainties in modeling of tidal flows off Singapore Island. J Mar Syst 1996;8:133–45.
[29] Schwefel HP. Numerical optimization of computer models. Chichester: Wiley; 1981.
[30] Schwiderski EW. On charting global ocean tides. Rev Geophys Space Phys 1980;18:243–68.
[31] Sivakumar B, Liong SY, Liaw CY, Phoon KK. Singapore rainfall behaviour: chaotic. J Hydrol Eng, ASCE 1999;4(1):38–48.
[32] Sorensen JVT, Madsen H, Madsen H. Towards an operational data assimilation system for a three-dimensional hydrodynamic model. In: Fifth International Conference on Hydroinformatics, 2002.
[33] Takens F. Detecting strange attractors in turbulence. In: Rand DA, Young L-S, editors. Lecture notes in mathematics: dynamical systems and turbulence, vol. 898. Berlin: Springer-Verlag; 1980. p. 366–81.
[34] Zaldivar JM, Gutierrez E, Galvan IM, Strozzi F, Tomasin A. Forecasting high waters at Venice Lagoon using chaotic time series analysis and non-linear neural networks. J Hydro Informatics 2000;2(1):61–84.
[35] Zhang QY, Gin KYH. Three-dimensional numerical simulation for tidal motion in Singapore's coastal waters. Coast Eng 2000;39:71–92.
