Learning the Spatiotemporal Evolution Law of Wave Field Based on Convolutional Neural Network

Journal of Ocean University of China, 2022, Issue 5


LIU Xing1), 2), GAO Zhiyi2), HOU Fang2), and SUN Jinggao1), *

1) School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
2) National Marine Environmental Forecasting Center, Beijing 100081, China

Research on the wave field evolution law is highly significant to the fields of offshore engineering and marine resource development. Numerical simulations have been used for high-precision wave field evolution and thus provide short-term wave field prediction. However, such simulations take a long time to run, and their accuracy is difficult to improve. In recent years, the use of machine learning methods to study the evolution of wave fields has received increasing attention from researchers. This paper proposes a wave field evolution method based on deep convolutional neural networks. The method effectively correlates the spatiotemporal characteristics of wave data through convolution operations and directly obtains the offshore forecast results for the Bohai Sea and the Yellow Sea. An attention mechanism, a multi-scale path design, and a hard example mining training strategy are introduced to suppress the interference caused by Weibull-distributed wave field data and to improve the accuracy of the proposed wave field evolution. The 72- and 480-h evolution experiments in the Bohai Sea and the Yellow Sea show that the proposed method has excellent forecast accuracy and timeliness.

wave evolution; machine learning; convolutional neural network; hard example mining

1 Introduction

The evolution of the wave field is a forced motion; it is mainly dominated by the wind field, although it follows its own evolution law. As early as 200 years ago, researchers noticed a strong correlation between wave height and wind speed. Sir Francis Beaufort of the British Navy first summarized the relationship between wind and wave, and the resulting Beaufort Wind Class and Wave Class Relationship Table is still of practical significance today. With the development of wave theory, researchers now have a better understanding of wave generation and dissipation. Sverdrup and Munk (1947) used characteristic waves to study the growth of waves; however, their proposed method is too simple to accurately describe the evolution of waves. Pierson et al. (1958) combined the energy balance equation with the wave spectrum and established a spectrum transmission equation to describe the growth process of wind and wave. This equation is still widely used today.

With the continuous improvement of wave theory, the current methods for wave evolution can be divided into two types: numerical methods and soft-computing methods. The numerical wave model is based on the concept of the wind-wave energy spectrum and can better characterize the evolution of the wave field with the wind field. Over the years, researchers have developed a series of numerical models, such as the Wave Model (WAM) (Komen et al., 1996), Simulating Waves Nearshore (SWAN) (Booij et al., 1999), and WAVEWATCH III (Tolman, 2009; Zheng et al., 2018; Sheng et al., 2019), to simulate the evolution of ocean waves. However, numerical simulation requires complex calculations on high-performance computers, and its timeliness also needs further improvement.

In comparison, soft-computing methods model the evolution of ocean waves through soft-computing algorithms. Machine learning (ML) algorithms, such as support vector machines, neural networks, and long short-term memory (LSTM) networks (Hochreiter and Schmidhuber, 1997), are some of the most popular soft-computing methods applied today. Deo and Naidu (1998) conducted experiments to prove the feasibility of using artificial neural networks to forecast waves. Later, other researchers improved the prediction accuracy of artificial neural networks (Tsai et al., 2002; Deka and Prahlada, 2012; Nagalingam et al., 2018). However, these works used simple, three-layer neural networks. The neurons in these networks are all fully connected, and the wave field data are regarded as isolated points for prediction, without considering the spatial characteristics of the data. Mahjoobi and Mosabbeb (2009) used support vector machines to achieve better accuracy than artificial neural networks; however, they did not consider the spatial characteristics of the data either. As shown in Fig.1, wave field data have a strong correlation in time and space. The success of deep neural networks in traffic flow prediction, rainfall prediction, and sea ice prediction shows that deeper model structures have stronger representation learning capabilities (Shi et al., 2015; Zhang et al., 2017; Jiao et al., 2020; Lim and Zohren, 2020). To date, only a few studies have combined ocean waves and deep learning, and none of them have investigated the evolution of ocean waves. Ni and Ma (2020) used LSTM to predict the significant wave height; however, LSTM was unable to mine important spatial information in the wave field. Choi et al. (2020) used ConvLSTM to recognize the sea state, which effectively correlated the temporal and spatial characteristics of the data, but the method was not used to study wave evolution laws.

As mentioned above, existing ML-based methods do not consider the spatiotemporal characteristics of the wave field, and their evolution accuracy needs further improvement. Moreover, the wave evolution process based on numerical simulation has poor timeliness and high hardware requirements. The current paper proposes a wave field evolution method based on deep convolutional neural networks (DCNN). The proposed method can effectively correlate the spatiotemporal characteristics of the data and improve the timeliness of evolution. The main contributions of this paper are as follows:

1) This work uses a DCNN to approximate the evolution of the wave field. Considering the physical properties of the wave, we propose a wave prediction model based on UNet (Ronneberger et al., 2015). Experiments show that the proposed method has better accuracy and spatiotemporal correlation than other methods.

2) The hard example mining training strategy is used to solve the problems caused by the Weibull distribution of wave element data and to further improve the accuracy of wave evolution.

3) Extensive experiments in the Bohai Sea and the Yellow Sea show that the accuracy of the evolution is better than that of the fully connected neural network and LSTM; furthermore, the evolution of the wave field can be performed on a single CPU. Finally, compared with numerical simulation methods, the timeliness of evolution is also substantially improved.

2 Dataset

The dataset used in this paper contains the ERA-5 reanalysis data generated by the European Centre for Medium-Range Weather Forecasts (ECMWF). The selected learning range for the evolution of the Bohai Sea and the Yellow Sea is 32.06°–41.91°N, 117.28°–127.12°E. The width and height of the wave field grid are each 36 pixels, and the spatial resolution in latitude and longitude is 0.28°. Part of the data is shown in Fig.1. As can be seen, the observation elements are significant wave height and wind speed data (u10m and v10m). The time resolution is 1h, and the time range is from January 1, 2000 to December 30, 2004. There are 35064 samples in total, of which 27000 are used as the training set. The remaining samples comprise the test set, which is used to verify the accuracy of the evolution result.
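The split described above can be illustrated with a minimal sketch. It assumes the samples are kept in time order and split chronologically, which the text does not state explicitly; the function name `chronological_split` and the use of integer indices in place of the real 36 × 36 fields are illustrative only.

```python
def chronological_split(samples, n_train):
    """Split a time-ordered sequence into (train, test) without shuffling."""
    if not 0 <= n_train <= len(samples):
        raise ValueError("n_train must lie between 0 and len(samples)")
    return samples[:n_train], samples[n_train:]


# Hourly sample indices stand in for the real wave and wind fields.
indices = range(35064)
train_idx, test_idx = chronological_split(indices, 27000)
print(len(train_idx), len(test_idx))
```

With the sample counts quoted in the text, this leaves 8064 hourly samples for testing.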

Fig.1 Part of the wave field and wind field element data used in the study. (a), significant wave height; (b), average wave direction; (c), wave length; (d) u10m/v10m wind field. The spatiotemporal data change continuously and have a strong correlation.

The Bohai Sea and the Yellow Sea are part of the Western Pacific; the topography of the Bohai Sea is almost closed and is not affected by the Pacific swells (Zheng et al., 2017). Mainly wind waves can be observed in the Bohai Sea area, with almost no swells. The Yellow Sea, especially its southern part, is affected by swells, making it relatively difficult to learn the evolutionary laws. Two sea areas with different characteristics, which are conducive to studying the evolution effect of DCNN methods in different regions, are used as the research objects in this paper.

3 Methods

This paper proposes a new network as a training model, which is developed based on UNet. According to the characteristics of the wave field, we introduce a multi-scale path (Sun et al., 2019) and an attention mechanism (Woo et al., 2018) to ensure the suitability of the model structure for wave data. We also introduce a hard example mining strategy (Li et al., 2019) to suppress the influence of the Weibull distribution of wave data, thereby effectively improving the accuracy of evolution. This section also studies two forecast methods (one-time forecast and recursive forecast) and presents their mathematical formulas.

3.1 Proposed Evolutionary Model

The proposed model uses a bottleneck structure similar to that of the semantic segmentation model UNet. The structure of the proposed model is shown in Fig.2. To obtain higher-dimensional feature maps, the model repeatedly doubles the number of channels of the feature map and halves its spatial size through successive convolution, activation, and pooling operations applied to the input wave field. The feature map with the highest semantic dimension has the smallest spatial size and is called the bottleneck of the model. After the bottleneck layer, the features pass through a series of convolution, activation, and upsampling operations in which the number of channels of the feature map is halved and the spatial size is doubled, thus restoring the spatial size to that of the input image. The final layer of the model is followed by a Sigmoid function that nonlinearly compresses the prediction results to [0, 1]. To prevent the loss of image details during the convolution process, UNet adds side paths that merge feature maps of different semantic dimensions and resolutions, thereby achieving accurate segmentation of the input image.
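The channel-doubling and size-halving pattern described above can be traced with a short sketch. The depth (2), the number of input channels (3, for significant wave height, u10m, and v10m), and the base channel count (32) are assumptions chosen for illustration and are not taken from the paper; only the 36-pixel grid size comes from the text.

```python
import math


def trace_shapes(in_channels, size, base_channels, depth):
    """Return (stage, channels, spatial size) tuples for an encoder-decoder."""
    if size % (2 ** depth) != 0:
        raise ValueError("size must be divisible by 2**depth")
    stages = [("input", in_channels, size)]
    channels = base_channels
    stages.append(("enc0", channels, size))
    # Encoder: each level doubles the channels and halves the spatial size.
    for level in range(1, depth + 1):
        channels *= 2
        size //= 2
        stages.append((f"enc{level}", channels, size))
    # The last encoder stage is the bottleneck; the decoder mirrors the encoder.
    for level in range(depth - 1, -1, -1):
        channels //= 2
        size *= 2
        stages.append((f"dec{level}", channels, size))
    return stages


def sigmoid(x):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


stages = trace_shapes(in_channels=3, size=36, base_channels=32, depth=2)
for name, channels, size in stages:
    print(f"{name}: {channels} channels, {size} x {size}")
```

Under these assumptions the bottleneck is the `enc2` stage (128 channels on a 9 × 9 grid), and the last decoder stage returns to the 36 × 36 input resolution before the sigmoid is applied.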

Fig.2 Model structure used in this paper.

Studies have shown that retaining the feature information of different time and space scales can effectively improve the prediction accuracy of spatiotemporal data (Wang et al., 2018). Therefore, the model used in the current paper further adds multi-scale paths, which retain the detailed information of the wave field while introducing multi-scale feature information. The multi-scale paths feed the original input wave field and the feature maps into the last attention module. Once these paths are introduced, the model can comprehensively use the wave field feature maps of various scales, thus achieving more accurate results.

3.2 Evolution Method

where the subscript 2 denotes the recursive evolution model. The final result is obtained by combining the evolution results of each part of the recursive evolution.

The experimental part will compare the prediction accuracy and efficiency of the two evolution methods.
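The difference between the two forecast methods can be sketched as follows. This is a simplified illustration, not the paper's formulation: `toy_model` is a hypothetical stand-in for a trained network, each "frame" is reduced to a single number (its hour index), and the wind forcing that a real model would also receive is not modelled.

```python
def one_time_forecast(model, history):
    """Predict the whole horizon with a single call to a one-time model."""
    return list(model(history))


def recursive_forecast(model, history, n_steps):
    """Chain a fixed-window model: each output window becomes the next input."""
    window = list(history)
    pieces = []
    for _ in range(n_steps):
        window = list(model(window))
        pieces.append(window)
    # Combine the partial results into one continuous forecast.
    return [frame for piece in pieces for frame in piece]


def toy_model(window):
    """Hypothetical 24h-in, 24h-out model: shifts every frame forward by 24h."""
    return [t + 24 for t in window]


history = list(range(24))  # hours 0..23
forecast = recursive_forecast(toy_model, history, n_steps=3)
print(forecast[0], forecast[-1], len(forecast))
```

Three recursive steps of a 24-h window cover a 72-h horizon. The recursive form only ever holds one window in memory, whereas a one-time model must emit the full horizon in a single pass.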

3.3 Attention Mechanism

where $\otimes$ represents element-wise multiplication, conv(·) refers to two consecutive convolution operations, F' is the feature map obtained by multiplying Fc and the channel attention map, and F'' is the feature map obtained by multiplying F' and the spatial attention map. The spatial map determines the weights on the spatial locations, and the channel map determines the weights on the different channels. Through training iterations, the attention mechanism gives greater weight to the spatial locations or channels that have a large impact on the evolution. Adding a spatial attention map helps to avoid the interference of meaningless areas (e.g., land). In this way, more attention is given to several key data channels, improving the convergence speed and forecast accuracy of the model.
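For orientation, one reading of the relations described above, written in the notation of CBAM (Woo et al., 2018), is given below. The symbols $F$ (input feature map), $M_c$ (channel attention map), and $M_s$ (spatial attention map) are assumptions borrowed from CBAM rather than symbols confirmed by this text.

```latex
\begin{aligned}
F_c &= \mathrm{conv}(F), \\
F'  &= M_c(F_c) \otimes F_c, \\
F'' &= M_s(F') \otimes F'.
\end{aligned}
```

Here the channel attention map is broadcast along the spatial dimensions and the spatial attention map along the channel dimension before the element-wise product is taken.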

3.4 Hard Example Mining

Hard examples are those that are difficult to classify or predict correctly in the training process and that have a large loss value (Shrivastava et al., 2016). Correspondingly, simple examples are those that can be correctly and quickly classified.

As shown in Fig.4, the histogram of the significant wave height in the dataset follows a Weibull distribution. A model trained on this type of unbalanced data tends to predict small significant wave heights to reduce the overall loss value. To counter this, this work trains the proposed model with a hard example mining strategy (Li et al., 2019), which effectively suppresses the influence of Weibull-distributed wave field data on training.

The model training process using the hard example mining strategy is shown in Fig.5. The model training strategy is divided into two parts: systematic learning and hard example mining. On the one hand, in systematic learning, the hard example selector does not work, and the wave evolution model uses both simple and hard examples for training. On the other hand, in the hard example mining, the hard example selector starts to work and filters out simple examples. Thus, the wave evolution model only uses hard examples for training.

We calculate the distance between the evolution result and the label as a quantitative indicator of example difficulty using the equation:

Fig.4 Histogram of significant wave height, which follows a Weibull distribution.

Fig.5 Training process using hard example mining.

The hard example selector only performs forward inference and does not calculate gradients to update its parameters. In addition, a gradient blocker sits between the hard example selector and the evolution model, so the parameters of the hard example selector are not updated when the error gradient of the evolution model is backpropagated. To ensure that the evolution ability of the hard example selector is always slightly weaker than that of the wave evolution model, the wave evolution model passes its network weights to the hard example selector once every epoch, that is, $\theta_s = \theta_e$, where $\theta_s$ and $\theta_e$ are the parameters of the hard example selector and the wave evolution model, respectively.
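The two training phases can be sketched as follows. The difficulty measure (mean absolute difference), the fixed threshold, and all function names are assumptions made for illustration; the paper's own distance formula and selection rule are not reproduced here.

```python
def example_distance(prediction, label):
    """Mean absolute difference between two equally sized flat fields."""
    return sum(abs(p - y) for p, y in zip(prediction, label)) / len(label)


def select_hard_examples(selector, batch, threshold):
    """Keep the (input, label) pairs that the frozen selector predicts poorly."""
    hard = []
    for x, y in batch:
        if example_distance(selector(x), y) > threshold:
            hard.append((x, y))
    return hard


def training_batch(selector, batch, threshold, mining_enabled):
    """Systematic learning uses every example; hard example mining filters first."""
    if not mining_enabled:
        return list(batch)
    return select_hard_examples(selector, batch, threshold)


def calm_selector(x):
    """Toy selector that always predicts calm seas (all zeros)."""
    return [0.0 for _ in x]


batch = [
    ([0.1, 0.2], [0.1, 0.1]),  # small waves: easy for the toy selector
    ([0.5, 0.6], [2.0, 3.0]),  # large waves: hard for the toy selector
]
hard = training_batch(calm_selector, batch, threshold=1.0, mining_enabled=True)
print(len(hard))
```

In a PyTorch implementation, the selector's forward pass would typically run under `torch.no_grad()`, and the per-epoch weight transfer could be done with `selector.load_state_dict(model.state_dict())`; both are suggestions rather than details given in the paper.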

4 Experiments

The hardware and system configurations for the model training and post-processing in this paper are as follows: AMD EPYC 7502 32-Core Processor, 2 × NVIDIA GeForce RTX 3090, and 128GB RAM. We used the deep learning framework PyTorch 1.4.0 for training and Adam (Kingma and Ba, 2014) as the optimizer.

4.1 Training Details

In addition to the ocean part, the dataset contains a large number of land grid points. The significant wave height of the land part in the dataset is set to the default value of −999, which has no physical meaning. In this paper, the land part of the dataset is set to 0, indicating that the significant wave height in this area is constant at 0. The wave element data and the wind speed element data are each normalized by dividing by the absolute value of their historical maximum. This speeds up convergence during training.
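A minimal sketch of this preprocessing is shown below, using nested Python lists in place of real arrays. The function names are illustrative, and taking the maximum over the supplied fields (rather than over a separately stored historical record) is an assumption.

```python
LAND_FILL = -999.0  # default value marking land grid points in the dataset


def mask_land(field, fill_value=LAND_FILL):
    """Replace land-default values with 0 in a 2-D list of wave heights."""
    return [[0.0 if v == fill_value else v for v in row] for row in field]


def normalize_by_abs_max(fields):
    """Scale a list of 2-D fields by the largest absolute value among them."""
    peak = max(abs(v) for field in fields for row in field for v in row)
    if peak == 0:
        return fields, peak
    scaled = [[[v / peak for v in row] for row in field] for field in fields]
    return scaled, peak


raw = [[[-999.0, 1.0], [2.0, 4.0]]]
clean = [mask_land(f) for f in raw]
scaled, peak = normalize_by_abs_max(clean)
print(peak, scaled[0])
```

Land must be masked before normalization; otherwise the −999 default would dominate the maximum absolute value. The returned `peak` is kept so that predictions can be scaled back to metres.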

Next, the root mean square error (RMSE), Pearson correlation coefficient, and standard deviation are used as quantitative evaluation indicators to verify the effectiveness of different models and training strategies. The significant wave height evolution result of the land area lacks corresponding meaning; thus, the evolution error of the land area should not be included in the calculation of the overall error. The calculation formula of the RMSE used in this paper is given by
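An RMSE that excludes land points, as described above, can be sketched as follows. The boolean `ocean_mask` argument and the flat-list layout are assumptions for illustration; the paper's exact formula is not reproduced here.

```python
import math


def masked_rmse(pred, truth, ocean_mask):
    """RMSE over the grid points where ocean_mask is True (flat lists)."""
    squared = [(p - t) ** 2 for p, t, m in zip(pred, truth, ocean_mask) if m]
    if not squared:
        raise ValueError("ocean_mask selects no grid points")
    return math.sqrt(sum(squared) / len(squared))


pred = [0.0, 1.0, 2.0, 5.0]
truth = [0.0, 1.0, 4.0, 0.0]
ocean = [False, True, True, False]
rmse = masked_rmse(pred, truth, ocean)
print(rmse)
```

Only the two ocean points contribute here, so the large error at the last (land) point does not affect the result.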

The correlation coefficient represents the degree of linear correlation between the evolution result and the true value, whereas the standard deviation represents the degree of dispersion of the evolution error. The correlation coefficient and standard deviation calculation formulas used in this paper are as follows:

where ε is the evolution error. The formula for ε is given by
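The two indicators can be sketched with the standard textbook definitions. The sketch assumes that the error is the prediction minus the true value and that the population (not sample) standard deviation is meant; neither choice is confirmed by the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def error_std(pred, truth):
    """Population standard deviation of the error pred - truth."""
    err = [p - t for p, t in zip(pred, truth)]
    mean = sum(err) / len(err)
    return math.sqrt(sum((e - mean) ** 2 for e in err) / len(err))


r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
s = error_std([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
print(r, s)
```

Note that `pearson_r` is undefined (division by zero) when either sequence is constant, which matters for a constant-output baseline.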

The spatial error matrix, which is used to study the spatial distribution of evolution error in this paper, is expressed as follows:
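One way to compute such a matrix is to take the RMSE over time separately at every grid point, as sketched below. Treating the spatial error matrix as a per-point RMSE is an assumption, and the function name is illustrative.

```python
import math


def spatial_rmse(preds, truths):
    """Per-grid-point RMSE over time for lists of equally shaped 2-D fields."""
    n_times = len(preds)
    n_rows, n_cols = len(preds[0]), len(preds[0][0])
    acc = [[0.0] * n_cols for _ in range(n_rows)]
    # Accumulate squared errors at each grid point across all time steps.
    for p, t in zip(preds, truths):
        for i in range(n_rows):
            for j in range(n_cols):
                acc[i][j] += (p[i][j] - t[i][j]) ** 2
    return [[math.sqrt(v / n_times) for v in row] for row in acc]


preds = [[[1.0, 0.0]], [[3.0, 0.0]]]   # two time steps on a 1 x 2 grid
truths = [[[0.0, 0.0]], [[0.0, 0.0]]]
error_map = spatial_rmse(preds, truths)
print(error_map)
```

The result has the same shape as one input field, so it can be plotted directly as a map.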

4.2 Comparison of Different Evolution Methods

Different evolution methods are used to predict the significant wave height in the Bohai Sea and Yellow Sea region for 72h, and their accuracies and time consumption are recorded. The errors for different evolution periods are computed, and the results are shown in Table 1.

Table 1 Root mean square error and 72-h time consumption of different forecasting methods

The results show that the recursive method requires much more time than the one-time evolution method. However, the recursive evolution method has better accuracy and accumulates less error as the evolution time increases. Moreover, the recursive evolution method significantly reduces the memory required. Therefore, the follow-up experiments in this paper use the recursive evolution method with a 24-h input and a 24-h output.

4.3 Precision Experiments of Different Structures

As a baseline, we take the RMSE, correlation coefficient, and standard deviation obtained when the output is held constant at the mean value. The UNet model, the UNet model with the attention mechanism, the UNet model with the multi-scale path, and the model proposed in this paper (with both the attention mechanism and the multi-scale paths) are each tested on the test dataset to verify the effectiveness of the attention mechanism and the multi-scale path design. The results are shown in Table 2. Under the same training strategy, the model proposed in this paper achieves the best RMSE, correlation coefficient, and standard deviation.

Table 2 Ablation experiment

Compared with training without hard example mining, training with hard example mining significantly improves the accuracy of the model, thereby demonstrating the effectiveness of the designed mining strategy. Furthermore, all three accuracy indicators reach their best values in the proposed model trained with the hard example mining strategy. This finding supports the effectiveness of the model design combined with the hard example mining strategy proposed in this paper.

4.4 Analysis of Evolution Results

The 72-h significant wave height prediction results for the Bohai Sea and the Yellow Sea using the proposed model are shown in Fig.6. The significant wave height field predicted by the model 72h later is shown in Fig.6(a). As can be seen, the model can evolve large waves accurately, and the predicted result is smooth in both space and time. This shows that the model can effectively integrate the spatiotemporal characteristics of ocean wave data.

Next, the predicted wave height curves at six points in space are drawn to show the difference between the predicted result and the real wave height. The selected locations represent the respective geographical characteristics of the Bohai Sea and the Yellow Sea and are shown in Fig.6(b). As can be seen, the forecast result of the model is consistent with the trend of the real significant wave height. This indicates that the model has learned the complex nonlinear process of wave generation and propagation. Furthermore, the prediction result of the model proposed in this paper is better than that of the UNet model, which supports the effectiveness of the proposed model structure. In the evolution results, the predicted significant wave height in the land area converges to a very small value (less than 1e−5), indicating that the model has learned the spatial characteristics of ocean waves.

Fig.6 Results of the 72-h significant wave height evolution in the Bohai Sea and the Yellow Sea. (a), comparison between the evolution result and the real wave field over time; (b), positions of the selected six points; (c), comparison of the real wave height, the evolution result of UNet, and the evolution result of the method proposed in this paper at the selected six-point positions.

4.5 Long-Term Evolution Experiment

To verify the long-term evolution effect of the method, a 480-h long-term forecast experiment was conducted using the recursive evolution method. The forecast result at point A in Fig.6(b) is shown in Fig.7. As can be seen, when the wave height is greater than 1m, the forecast result is highly consistent with the true value. Notably, the model can still accurately predict the large wave of 3m at 450h. This indicates that the error of the proposed evolution method does not diverge as the forecast time increases and that a good forecast effect can be maintained in the long-term forecast experiment.

4.6 Spatial Distribution of Error

The degree of difficulty in learning the evolutionary laws of different regions varies with the composition of the waves. Using the model trained under the hard example mining strategy proposed in this paper, the RMSE distribution of the evolution result at different positions is shown in Fig.8. As can be seen, the RMSE in the Bohai Sea region is relatively low, whereas that in the southern part of the Yellow Sea is significantly higher. The reason is that the source of the waves in the southern part of the Yellow Sea is complicated and is affected by the waves of the Northwest Pacific and the Sea of Japan. Consequently, learning the evolutionary laws of this region is relatively difficult.

Fig.7 Result of the 480-h significant wave height prediction experiments. The big waves after 400h framed by the red rectangle can still evolve accurately.

Fig.8 Spatial distribution of the RMSE of the prediction result.

4.7 Comparison with the Numerical Model

Table 3 shows the time consumption comparison between the numerical simulation and the method proposed in this paper. The numerical model is WAVEWATCH III, and the hardware computing resources used are the same as those in the convolutional neural network experiment. The operating system used in the experiment is CentOS 7.9. The compiler and parallel environment are Intel's ifort and mpiifort, and ST3 is used as the parameterization scheme. The convolutional neural network method has a large advantage in the timeliness of evolution. In addition, it can perform wave field evolution on a single CPU after the training is completed.

Table 3 Comparison of evolution timeliness

5 Conclusions

In this paper, a wave field evolution method is proposed based on a DCNN model. The proposed method effectively correlates the temporal and spatial characteristics of the given data. The hard example mining strategy is used to suppress the influence of the Weibull distribution of wave data. The accuracy of evolution is further improved by a design that incorporates attention mechanisms and multi-scale paths. The 72-h evolution RMSE reaches 0.137m, a significant improvement over the fully connected neural network and LSTM methods. In the long-term evolution of 480h, the method still maintains good accuracy.

The relationship between wind and waves in the Bohai Sea and the Yellow Sea is relatively ideal. In the follow-up work, we will further optimize the accuracy and memory based on the findings of this paper and perform the wave field evolution experiment in the Northwest Pacific.

Acknowledgements

This study is supported by the National Key Research and Development Project (No. 2018YFC1407001). We are grateful to Drs. Yunlong Dong and Qifeng Wu for their constructive comments, corrections, and inspiration.

References

Booij, N., Ris, R. C., and Holthuijsen, L. H., 1999. A third-generation wave model for coastal regions: 1. Model description and validation. Journal of Geophysical Research: Oceans, 104 (C4): 7649-7666.

Choi, H., Park, M., and Son, G., 2020. Real-time significant wave height estimation from raw ocean images based on 2D and 3D deep neural networks. Ocean Engineering, 201: 107-129.

Deka, P. C., and Prahlada, R., 2012. Discrete wavelet neural network approach in significant wave height forecasting for multi-step lead time. Ocean Engineering, 43: 32-42.

Deo, M. C., and Naidu, C. S., 1998. Real time wave forecasting using neural networks. Ocean Engineering, 26 (3): 191-203.

Hochreiter, S., and Schmidhuber, J., 1997. Long short-term memory. Neural Computation, 9 (8): 1735-1780.

Jiao, Y., Huang, F., and Gao, S., 2020. Research on extended-range forecast model of sea ice in the Liaodong Bay based on long short term memory network. Periodical of Ocean University of China, 50 (6): 1-11 (in Chinese with English abstract).

Kingma, D. P., and Ba, J., 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv: 1412.6980.

Komen, G. J., Cavaleri, L., and Donelan, M., 1996. Dynamics and Modelling of Ocean Waves. Cambridge University Press, Cambridge, 554pp.

Li, B. Y., Liu, Y., and Wang, X. G., 2019. Gradient harmonized single-stage detector. Proceedings of the AAAI Conference on Artificial Intelligence. Hawaii, 8577-8584.

Lim, B., and Zohren, S., 2020. Time series forecasting with deep learning: A survey. arXiv preprint arXiv: 2004.13408.

Mahjoobi, J., and Mosabbeb, E. A., 2009. Prediction of significant wave height using regressive support vector machines. Ocean Engineering, 36 (5): 339-347.

Nagalingam, K., Ramasamy, S., and Mamun, A. A., 2018. Ocean wave height prediction using ensemble of extreme learning machine. Neurocomputing, 277: 12-20.

Ni, C. H., and Ma, X. D., 2020. An integrated long-short term memory algorithm for predicting polar westerlies wave height. Ocean Engineering, 215: 107715.

Pierson, W. J., Neumann, G., and James, R., 1958. Practical Methods for Observing and Forecasting Ocean Waves by Means of Wave Spectra and Statistics. United States Navy Hydrographic Office, No. 603, 284pp.

Ronneberger, O., Fischer, P., and Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, 234-241.

Sheng, Y. X., Shao, W. Z., and Li, S. Q., 2019. Evaluation of typhoon waves simulated by WaveWatch-III model in shallow waters around Zhoushan Islands. Journal of Ocean University of China, 18 (2): 365-375.

Shi, X. J., Chen, Z. R., and Wang, H., 2015. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Advances in Neural Information Processing Systems. Montreal, Canada, 802-810.

Shrivastava, A., Gupta, A., and Girshick, R., 2016. Training region-based object detectors with online hard example mining. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, 761-769.

Sun, K., Xiao, B., and Liu, D., 2019. Deep high-resolution representation learning for human pose estimation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, 5693-5703.

Sverdrup, H. U., and Munk, W. H., 1947. Wind, Sea and Swell: Theory of Relations for Forecasting. United States Navy Hydrographic Office, No. 601, 44pp.

Tolman, H. L., 2009. User manual and system documentation of WAVEWATCH III TM version 3.14. Technical Note, MMAB Contribution, 276: 220.

Tsai, C. P., Lin, C., and Shen, J. N., 2002. Neural network for wave forecasting among multi-stations. Ocean Engineering, 29 (13): 1683-1695.

Wang, Y. B., Gao, Z., and Long, M., 2018. PredRNN++: Towards a resolution of the deep-in-time dilemma in spatiotemporal predictive learning. International Conference on Machine Learning. Stockholm, 5123-5132.

Woo, S., Park, J., and Lee, J. Y., 2018. CBAM: Convolutional block attention module. Proceedings of the European Conference on Computer Vision (ECCV). Munich, 3-19.

Zhang, J. B., Zheng, Y., and Qi, D. K., 2017. Deep spatio-temporal residual networks for citywide crowd flows prediction. Proceedings of the AAAI Conference on Artificial Intelligence. San Francisco, 1655-1661.

Zheng, C. W., Zhang, R., and Shi, W. L., 2017. Trends in significant wave height and surface wind speed in the China seas between 1988 and 2011. Journal of Ocean University of China, 16 (5): 717-726.

Zheng, K. W., Osinowo, A. A., and Sun, J., 2018. Long-term characterization of sea conditions in the East China Sea using significant wave height and wind speed. Journal of Ocean University of China, 17 (4): 733-743.

(Received January 25, 2021; revised March 17, 2021; accepted March 23, 2021)

© Ocean University of China, Science Press and Springer-Verlag GmbH Germany 2022

* Corresponding author. E-mail: sunjinggao@126.com

(Edited by Xie Jun)