Application of four machine-learning methods to predict short-horizon wind energy

2023-12-28 03:31
Global Energy Interconnection, 2023, Issue 6

Doha Bouabdallaoui, Touria Haidi, Faissal Elmariami, Mounir Derri, El Mehdi Mellouli

1. Laboratory LAGES, Ecole Hassania des Travaux Publics (EHTP), Casablanca 20230, Morocco

2. Laboratory LESE, Ecole Nationale Supérieure d’Electricité et de Mécanique (ENSEM), Casablanca 20100, Morocco

3. Laboratory LESA, School of Applied Sciences, Sidi Mohamed Ben Abdellah University, Fez 30000, Morocco

Abstract: Renewable energy has garnered attention due to the need for sustainable energy sources. Wind power has emerged as an alternative that has contributed to the transition towards cleaner energy. As the importance of wind energy grows, it is crucial to provide forecasts that optimize its performance potential. Artificial intelligence (AI) methods have risen in prominence because they handle complicated systems well while enhancing prediction accuracy. This study applied AI to predict wind-energy production at a wind farm in Yalova, Turkey, using four different AI approaches: support vector machines (SVMs), decision trees, adaptive neuro-fuzzy inference systems (ANFIS), and artificial neural networks (ANNs). Wind speed and direction were considered as essential input parameters, with wind energy as the target parameter, and the models were evaluated using the mean absolute percentage error (MAPE), coefficient of determination (R²), and mean absolute error (MAE). The findings highlight the superior performance of the SVM, which delivered the lowest MAPE (2.42%), the highest R² (0.95), and the lowest MAE (71.21) compared with actual values, while ANFIS was less effective in this context. The main aim of this comparative analysis was to rank the models so that the next step can improve the least efficient methods by combining them with optimization algorithms, such as metaheuristic algorithms.

Keywords: Wind Energy Prediction; Support Vector Machines; Decision Trees; Adaptive Neuro-Fuzzy Inference Systems; Artificial Neural Networks

0 Introduction

Faced with changing energy demand, electricity suppliers are increasingly switching to renewable sources [1-5]. Among the many renewable sources available, wind power has become a suitable and promising solution [6-9]. Its importance is growing: it offers many advantages and is instrumental in the transition to a greener, more ecologically friendly future [10-12].

Given its prominence as a reliable source of energy, the ability to predict wind-energy generation is becoming increasingly important to ensure smooth integration into existing distribution networks [13,14]. Predictive techniques can aid wind-power plant operators in estimating power production variations, enabling better synchronization with alternative energy sources and improving grid stability [15-18].

Diverse methods, including statistical, physical, and artificial intelligence (AI) approaches, have been used to estimate wind power production. Statistical models forecast wind power production from historical and meteorological data. Physical models employ fluid dynamics and atmospheric physics to calculate the motion of winds. As the sophistication of AI increases, neural networks and machine learning are receiving more attention. AI-based procedures rely on substantial datasets, meteorological information, and previous wind-energy data to provide more precise forecasts [19-22].

The use of machine-learning approaches to forecast wind generation has been demonstrated in the literature. In [23], a predictive model was designed using artificial neural networks (ANNs), with data collected at KL University, Andhra Pradesh. The results indicated that increasing the delay reduces errors, while increasing the number of hidden-layer neurons enhances processing capacity and further decreases predicted output errors. Another study [24] used an ANN for wind-energy prediction in the Himalayan region. The data consisted of thirty-day inputs of wind velocity, air temperature, and density. The ANN proved to be an effective prediction algorithm in this research, with a regression coefficient of 0.99.

Furthermore, [25] presented a technique using support vector machines (SVMs) to forecast wind turbine generation. The suggested method begins by predicting wind speed and then uses the properties of the turbine generators to predict wind energy. The results indicate that this approach is accurate over the very short and short horizons and is more efficient than the persistence model. In [26], wind energy was predicted using a combination of an SVM and an enhanced dragonfly algorithm. The robustness of this model was validated against actual data collected from the La Haute Borne wind park in France. Compared with two other techniques, the proposed model demonstrated superior performance.

[27] compared three methods for predicting wind energy: ANNs, autoregressive integrated moving average (ARIMA), and the adaptive neuro-fuzzy inference system (ANFIS). ARIMA provided more accurate forecasts than the other two methods, and ANFIS yielded the least accurate output. In [28], a hybrid approach combining ANFIS and particle swarm optimization (PSO) was used to predict short-horizon wind energy. With an average MAPE of 5.41%, the hybrid model performed better than the standard ANFIS and the persistence model, confirming the good performance of the optimized ANFIS.

The accuracy of decision trees was compared in depth with that of two other regression methods in [29]. The weather variables, consisting of wind velocity, wind direction, and temperature, were ranked according to their influence on wind energy output, and wind speed had the most prominent influence. Regarding the mean absolute error (MAE) and coefficient of determination (R²), the decision tree and nearest-neighbor regression approaches performed more accurately than linear regression techniques. In [30], an exhaustive comparison between ANN, SVM, and decision trees was conducted. The results showed that properly tuned ANNs provide accurate predictions of wind-energy production, but the time and processing capacity needed to find the optimum hyperparameter configuration are high. The decision-tree model is more transparent than the ANN algorithm, but its performance is inferior.

This study investigated the prediction of wind energy production for a wind farm with an installed capacity of 54,000 kW in Yalova, Turkey. In this context, four AI techniques were used to compare their performance and determine the most and least accurate models. The selected algorithms were the SVM, decision trees, ANFIS, and ANN. The general structure of the study is shown in Fig.1.

Fig.1 Global structure of the methodology of the study

This study performed a complete comparative assessment of several well-known classical AI methods on the specific task of wind-power production prediction. It goes beyond previous efforts in the literature, which often focused on single methods or on combinations of advanced methods. The study thus gives an overall picture of the performance of these standard AI prediction methods and ranks them, so that subsequent work can focus on improving the worst-performing method with one or more optimization methods.

1 Methods

The wind farm in Yalova was commissioned in 2016. It has the following coordinates: latitude 40° 34' 58.3" and longitude 28° 56' 3". Figure 2, generated by Google Maps, shows the turbines marked in red [31]. An integrated SCADA monitoring system was installed and recorded data for one of the wind generators. The database used can be freely accessed via [32]. It is an open-source dataset that contains five fields: time, wind power (kW), wind speed (m/s), wind direction (°), and wind energy (kWh). The data were recorded from January 1 to December 31, 2018.

Fig.2 Cartographic presentation of the Yalova wind power plant generated by Google Maps

To carry out wind power prediction for the Yalova wind park, the SVM, decision tree, ANFIS, and ANN methods were implemented. Specific pre-processing phases were required for each method to guarantee the compatibility and quality of the wind power dataset, notably scaling and normalization, and the handling of missing values and categorical variables.

Yalova’s wind power database was based on two inputs, wind direction and wind speed, with wind power as the target parameter. A total of 45,507 data samples were used for the training phase, which extended from 00:00 on January 1, 2018, to 23:50 on November 26, 2018, as presented in Fig.3 and Fig.4.

Fig.3 Wind speed data input (m/s) for January 1st, 2018 to November 26th, 2018

Fig.4 Wind direction data input (°) for January 1st, 2018 to November 26th, 2018

Fig.5 Wind direction against wind power

Fig.6 Wind speed against wind power

For the test phase, 144 samples were used, covering 00:00 to 23:50 on November 27, 2018. Notably, the sampling interval of the database was 10 min. The distributions of wind direction and wind speed against wind-energy output are shown in Fig.5 and Fig.6, respectively. These graphs facilitated data exploration, model selection, reporting of results, and evaluation of their quality. Table 1 lists the mean and standard deviation, the minimum and maximum values, and the confidence interval for each variable used.
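The chronological split described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical gap-free record of 10-minute timestamps; the names `ten_minute_stamps`, `train`, and `test` are not from the study. A gap-free record would give 330 × 144 = 47,520 training samples, so the reported figure of 45,507 suggests that some rows are missing from the real SCADA log.

```python
from datetime import datetime, timedelta


def ten_minute_stamps(start, end):
    """Yield timestamps from start to end (inclusive) at a 10-minute step."""
    step = timedelta(minutes=10)
    current = start
    while current <= end:
        yield current
        current += step


# Hypothetical gap-free record for 2018 (assumption: the real log has gaps).
stamps = list(ten_minute_stamps(datetime(2018, 1, 1, 0, 0),
                                datetime(2018, 12, 31, 23, 50)))

# Train on everything before the test day, test on November 27 only.
test_day = datetime(2018, 11, 27).date()
train = [t for t in stamps if t.date() < test_day]
test = [t for t in stamps if t.date() == test_day]

print(len(test))   # 144 samples: 24 h x 6 samples per hour
print(train[-1])   # last training timestamp: 2018-11-26 23:50:00
```

Splitting by date rather than at random keeps the test day strictly after the training period, which matches the forecasting setting.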

1.1 Artificial Neural Networks (ANNs)

ANNs are a category of computing system. Typically, ANNs are formed by neurons interconnected in layers [33,34]. On one side, data are captured in the input layer, while on the other, the desired prediction is delivered via the output layer. Between these layers, the hidden layers enable the system to evolve and extract features from the input signal. There are three widely employed functions for training data: Levenberg-Marquardt (LM), scaled conjugate gradient (SCG), and Bayesian regularization (BR) [35,36]. A general flowchart of the ANN design is shown in Fig.7.

The selected function was the LM algorithm, which has been identified as an effective algorithm for midsize systems. Two independent inputs were included in the model, wind velocity and direction, and the model output was the wind energy produced. During the modeling process, the number of hidden neurons was increased to 50 to round out the nonlinear ANN architecture. The details of the main model settings are listed in Table 2.
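To make the architecture concrete, the following sketch shows the forward pass of a network with two inputs, one hidden layer of 50 neurons, and one output. It is an illustration only: the tanh activation, the random weights, and the function names are assumptions, and Levenberg-Marquardt training is not implemented here.

```python
import math
import random


def init_network(n_inputs=2, n_hidden=50, seed=0):
    """Create random weights for a 2-50-1 network (untrained, illustrative)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = rng.uniform(-1, 1)
    return w1, b1, w2, b2


def forward(x, params):
    """Hidden layer with tanh activation (assumed), then a linear output."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def count_parameters(params):
    """Total number of trainable weights and biases."""
    w1, b1, w2, _ = params
    return sum(len(row) for row in w1) + len(b1) + len(w2) + 1


params = init_network()
# Inputs are assumed to be normalized wind speed and wind direction.
prediction = forward([0.4, 0.75], params)
print(count_parameters(params))  # 2*50 + 50 + 50 + 1 = 201
```

With only 201 parameters the network is small, which is consistent with the remark that LM suits midsize systems.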

Fig.7 General flow chart of an ANN

Table 1 Information about the dataset

Table 2 Principal parameters of the ANN model

1.2 Adaptive Neuro-Fuzzy Inference System (ANFIS)

By combining the advantages of fuzzy logic and neural networks, ANFIS provides an efficient hybrid approach for handling complicated forecasting applications. ANFIS involves a two-phase procedure. Initially, fuzzy logic is employed to create a collection of fuzzy rules derived from existing knowledge and training data. These rules establish relationships linking the input variables to the output variable to be predicted. Subsequently, neural networks are used to optimize the settings of the fuzzy rules [37,38]. This adaptive process enables ANFIS to learn directly from the data and sharpen its prediction ability. By incorporating fuzzy logic and neural networks, ANFIS provides a prediction approach that is versatile and responsive, capable of capturing nonlinear features and addressing the uncertainties inherent in real-time prediction challenges [39,40]. A general flowchart of the ANFIS model is shown in Fig.8.

Fig.8 General flow chart of ANFIS

In this case, nine membership functions and two inputs were used. This implies that nine fuzzy rules existed. A hybrid training algorithm that combines back-propagation and least-squares methods was used to calculate the membership function parameters. In this configuration, ANFIS uses the Sugeno fuzzy inference system, which employs if-then fuzzy rules with linear consequents. The processing method for antecedent fuzzy sets was “prod,” and “probor” for consequent fuzzy sets. The method employed for defuzzification was “wtaver.” The implication method was “prod,” and the aggregation method was “sum.” Further details are presented in Table 3.

Table 3 Principal parameters of the ANFIS model

The general expression for a Gaussian membership function (gauss2mf) used in the ANFIS is

$$\mu(x) = \exp\left(-\frac{(x - c)^2}{2\sigma^2}\right)$$

where μ(x) is the membership value for a given input value x, c is the center of the Gaussian MF, identifying the peak of the function, and σ is the standard deviation, controlling the width of the function.
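The inference step of a Sugeno system with Gaussian membership functions can be sketched as follows. This is a hedged illustration rather than the study's configuration: it assumes three membership functions per input arranged on a grid (which yields nine rules), inputs normalized to [0, 1], and hypothetical centers, widths, and consequent coefficients. Rule firing strengths use the product ("prod") and the output is their weighted average ("wtaver").

```python
import math
from itertools import product


def gaussian_mf(x, c, sigma):
    """Gaussian membership value with center c and standard deviation sigma."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def sugeno_predict(speed, direction, speed_mfs, dir_mfs, consequents):
    """First-order Sugeno inference: product firing strength, weighted average."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (i, (cs, ss)), (j, (cd, sd)) in product(enumerate(speed_mfs),
                                                enumerate(dir_mfs)):
        strength = gaussian_mf(speed, cs, ss) * gaussian_mf(direction, cd, sd)
        p, q, r = consequents[(i, j)]            # linear consequent of the rule
        weighted_sum += strength * (p * speed + q * direction + r)
        weight_total += strength
    return weighted_sum / weight_total


# Hypothetical parameters: (center, sigma) pairs on normalized inputs.
speed_mfs = [(0.0, 0.25), (0.5, 0.25), (1.0, 0.25)]
dir_mfs = [(0.0, 0.25), (0.5, 0.25), (1.0, 0.25)]
consequents = {(i, j): (1.0, 0.0, 0.1 * i) for i in range(3) for j in range(3)}

output = sugeno_predict(0.4, 0.75, speed_mfs, dir_mfs, consequents)
```

In ANFIS training, the hybrid algorithm would adjust the (center, sigma) pairs by back-propagation and the (p, q, r) coefficients by least squares; the values above are fixed placeholders.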

1.3 Decision Trees

Decision trees are effective forecasting methods [41]. The process consists of recursively splitting data according to attributes or characteristics, thus creating the decision tree structure presented in Fig.9. A specific characteristic is assessed at every node of the tree, and the data are partitioned into subgroups depending on the value of the characteristic. This process continues until certain stopping criteria are reached, such as a maximum tree depth or a minimum number of data points in a node. The final nodes, called the leaf nodes, contain predictions or outcomes. Decision trees are interpretable, can process both numerical and categorical data, and can detect nonlinear links between variables [42,43].

Fig.9 General structure of a decision tree

To optimize efficiency and precision, the coarse tree model employed to predict wind power was parameterized with specific settings. By requiring a minimum of 72 observations in each parent node and 36 observations in each leaf node, the model ensured that the nodes had sufficient data for reliable splitting while avoiding overfitting. The number of splits permitted was set to a maximum of 45,512, meaning that the model could create a complex tree structure to capture the subtle relationships within the wind power data. Pruning was activated to eliminate unnecessary nodes and improve the generalizability of the model. Moreover, the tree technique considered up to ten different categories of class predictors, offering flexibility in the processing of different wind energy features. Because predictor selection was set to "All Splits," all available predictors were considered at each node during the tree-building process. Overall, the coarse tree configuration aimed to balance complexity and generalization, enabling it to predict wind power production efficiently, precisely, and consistently.
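The effect of the minimum parent and leaf sizes can be illustrated with a single split search on one predictor. The sketch below is an assumption-laden simplification: it minimizes the sum of squared errors of the two children, handles one numeric predictor only, and uses hypothetical names (`best_split`, `sse`); it is not the toolbox routine used in the study.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys, min_parent=72, min_leaf=36):
    """Return (threshold, cost) of the best split, or None if not allowed."""
    n = len(xs)
    if n < min_parent:
        return None                      # node too small to be split
    order = sorted(range(n), key=lambda i: xs[i])
    xs_sorted = [xs[i] for i in order]
    ys_sorted = [ys[i] for i in order]
    best = None
    # Each child must keep at least min_leaf observations.
    for k in range(min_leaf, n - min_leaf + 1):
        if xs_sorted[k - 1] == xs_sorted[k]:
            continue                     # cannot split between equal values
        cost = sse(ys_sorted[:k]) + sse(ys_sorted[k:])
        if best is None or cost < best[1]:
            threshold = (xs_sorted[k - 1] + xs_sorted[k]) / 2.0
            best = (threshold, cost)
    return best


# Synthetic example: power jumps from 0 to 1000 once wind speed reaches 5 m/s.
speeds = [i / 10 for i in range(100)]
power = [0.0 if s < 5.0 else 1000.0 for s in speeds]
split = best_split(speeds, power)
```

On this synthetic step function the search places the threshold between 4.9 and 5.0 m/s with zero residual error, while a node with fewer than 72 observations is left unsplit.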

1.4 Support Vector Machines (SVMs)

SVMs are widespread machine-learning algorithms designed for predictive applications in a wide range of fields. SVMs operate according to an optimal classification or fitting function that best fits the data [44,45]. SVMs can handle both small and large-scale datasets and operate efficiently in high-dimensional spaces [46,47]. The basic SVM architecture is shown in Fig.10.

Fig.10 General structure of SVM

The SVM configuration in this study comprised specific parameters and settings. The Gaussian kernel (RBF), a widely adopted method for capturing complex relationships between data points, was selected in this case. Epsilon, the hyperparameter that sets the width of the error-insensitive margin around the fitted function, had a value of 132.65. The kernel scale, which determines the impact of each support vector on the fitted function, was set to 1.4. The details of the SVM system parameters are listed in Table 4.
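The roles of the two hyperparameters can be shown with a small sketch. It assumes one common convention in which each predictor is divided by the kernel scale before the Gaussian kernel exp(-||x - z||²) is applied, and it uses the standard epsilon-insensitive loss of support vector regression; the function names are illustrative and the exact convention of the software used in the study is not stated in the text.

```python
import math


def rbf_kernel(x, z, kernel_scale=1.4):
    """Gaussian kernel on inputs divided by the kernel scale (assumed convention)."""
    squared_distance = sum(((a - b) / kernel_scale) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance)


def epsilon_insensitive_loss(y_true, y_pred, epsilon=132.65):
    """Errors smaller than epsilon cost nothing; larger ones grow linearly."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Identical points have similarity 1; similarity decays with distance.
same = rbf_kernel([0.4, 0.75], [0.4, 0.75])
far = rbf_kernel([0.4, 0.75], [3.0, 0.75])

# A 100 kW error is inside the tube, a 200 kW error is penalized by the excess.
inside = epsilon_insensitive_loss(1000.0, 1100.0)
outside = epsilon_insensitive_loss(1000.0, 1200.0)
```

A larger kernel scale makes the kernel decay more slowly, so each support vector influences a wider region; a larger epsilon tolerates bigger errors and typically yields fewer support vectors.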

Table 4 Principal parameters of the SVM model

2 Prediction accuracy evaluation criteria

Various measures can be used to assess the performance of wind-energy forecasting models. In this study, the accuracy of the four models was evaluated using the following criteria.

2.1 Mean Absolute Percentage Error (MAPE)

This parameter calculates the forecast deviation as a percentage of the actual or target value [48].

2.2 Coefficient of determination (R²)

This corresponds to the proportion of the variance of the dependent variable that is explained by the regression model through one or more independent variables. It can be specified as a value from 0 to 1 or as a percentage [49].

2.3 Mean Absolute Error (MAE)

This parameter expresses the average deviation of the predicted values from the target values [50].

2.4 Bias (B)

This parameter is a measure of the systematic error of the forecast model.
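The four criteria can be computed as in the sketch below, which uses their standard textbook definitions. Two points are assumptions: R² is taken as 1 − SS_res/SS_tot, and the bias is the mean of (predicted − actual), so that a negative value indicates underestimation, in line with how negative errors are interpreted in the results. The sample values are made up for illustration.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error, in the units of the target variable."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def bias(actual, predicted):
    """Mean signed error; negative means the model underestimates on average."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up example values in kW.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 380.0]

print(round(mape(actual, predicted), 2))       # 5.83
print(mae(actual, predicted))                  # 12.5
print(bias(actual, predicted))                 # -2.5
print(round(r_squared(actual, predicted), 3))  # 0.986
```

Because MAPE divides by the actual value, it is undefined when the measured power is zero, so such samples would need separate handling in practice.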

3 Results and discussion

Table 5 summarizes the performance of the four forecasting methods according to the different criteria. The findings presented in Table 6 show the extent to which the SVM system provided more accurate forecasts.

The use of the SVM to predict wind energy over the short horizon produced good results, as shown in Fig.11. The SVM exhibited outstanding performance with a MAPE of 2.42%, indicating its remarkable accuracy in forecasting wind power production. The R² of 0.95, as presented in Fig.12, demonstrated a close correlation between the predicted and observed results. Moreover, an MAE of 71.21 meant that the predictions deviated from the actual values by 71.21 units on average. These convincing results underlined the performance of the SVM method in providing reliable wind-energy forecasts in both the short and very short terms.

Fig.11 SVM tracing of predicted and true data on November 27, 2018

Fig.12 Scatter plot of real versus forecast data using the SVM

The ANN results for wind-energy prediction revealed considerable promise, as shown in Fig.13. The model had a MAPE of 3.13%, indicating good accuracy in forecasting wind power production. The R² of 0.92, as shown in Fig.14, indicated a high correspondence between the expected and true values, suggesting that the ANN model effectively captured the underlying patterns of the wind-energy data. Additionally, the MAE of 96.25 signified that the model’s predictions, on average, deviated from the actual values by 96.25 units, which is comparatively modest for the scale of the wind power measures. These results indicated the ability of the ANN model to provide confident predictions of wind power over the short and very short terms. This accuracy could be further enhanced by running the model on a larger database with multiple data inputs.

Fig.13 ANN tracing of predicted and true data on November 27, 2018

Fig.14 Scatter plot of real versus forecast data using the ANN

The use of coarse trees to predict wind energy yielded notable results, as shown in Fig.15. The coarse tree model had a MAPE of 3.07%, which indicates a reasonable level of forecasting precision for wind energy output. An R² of 0.88 (Fig.16) revealed a modest correlation between the forecast and true values. Meanwhile, the MAE of 92.38 signified that the model’s predictions deviated from the actual values by an average of 92.38 units. These results demonstrate that the coarse-tree model is a viable option for short-horizon wind-energy forecasting.

Fig.15 Coarse tree tracing of predicted and true data on November 27, 2018

Fig.16 Scatter plot of real versus forecast data using the coarse tree

The ANFIS method for wind-energy prediction generated noteworthy results, as shown in Fig.17. ANFIS achieved a MAPE of 3.53%, reflecting a reasonable precision level in terms of wind power production forecasting. An R² of 0.91 (Fig.18) indicated a close correlation between the predicted and observed values, which means that the ANFIS model captured a considerable amount of the variation present in the wind power data. Furthermore, an MAE of 106.91 meant that, on average, the ANFIS model predictions deviated from the actual values by 106.91 units. Overall, these results confirmed the suitability of the ANFIS model for wind-energy prediction.

Fig.17 ANFIS tracing of predicted and true data on November 27, 2018

Fig.18 Scatter plot of real versus forecast data using ANFIS

Comparing the performance metrics of each method, the highest R² was obtained by the SVM model (0.95), reflecting the closest match with the real data, whereas the decision tree model had the lowest R² (0.88), indicating a poorer fit. In addition, the SVM model provided the lowest MAPE (2.42%), revealing the smallest percentage deviation from the real values overall. The lowest MAE was also achieved by the SVM model (71.21), representing the smallest average deviation from the true measured values. In terms of bias, the SVM model yielded the smallest bias (-66.77), indicating the least systematic underestimation of wind power output. In summary, based on the provided measures, the SVM model was the most successful of the four models, with the highest R², lowest MAPE, lowest MAE, and relatively low bias. Its predictive capability was considerable and correlated well with the real data.
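The ranking can be reproduced directly from the MAPE, R², and MAE values reported above. The short sketch below only sorts those reported numbers; the dictionary layout and the `rank` helper are illustrative choices, not part of the study.

```python
# Metric values as reported in the text for November 27, 2018.
results = {
    "SVM": {"mape": 2.42, "r2": 0.95, "mae": 71.21},
    "ANN": {"mape": 3.13, "r2": 0.92, "mae": 96.25},
    "Coarse tree": {"mape": 3.07, "r2": 0.88, "mae": 92.38},
    "ANFIS": {"mape": 3.53, "r2": 0.91, "mae": 106.91},
}


def rank(results, metric, higher_is_better=False):
    """Return model names ordered from best to worst for one metric."""
    return sorted(results, key=lambda name: results[name][metric],
                  reverse=higher_is_better)


print(rank(results, "mape"))                       # ['SVM', 'Coarse tree', 'ANN', 'ANFIS']
print(rank(results, "mae"))                        # ['SVM', 'Coarse tree', 'ANN', 'ANFIS']
print(rank(results, "r2", higher_is_better=True))  # ['SVM', 'ANN', 'ANFIS', 'Coarse tree']
```

The SVM is first on every criterion, while the order of the other three models depends on the metric: the coarse tree is second by MAPE and MAE but last by R².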

The different accuracy levels of the wind-energy prediction results were due to the intrinsic features and capacities of the models used. The inherent qualities of the SVM and ANN models enabled more accurate wind-energy forecasts. In particular, SVMs can handle complex nonlinear data relationships and dynamics, enabling them to capture complex patterns effectively. Likewise, the ability of ANN models to recognize and learn complicated relationships within data is well known and can be used to uncover patterns and provide precise predictions. However, the less precise results obtained with coarse trees and ANFIS may reflect their limitations in capturing complex relationships and processing large datasets. Fig.19 presents a histogram of the prediction errors for the four methods used.

This histogram shows that bins ranging from -300 to 0 corresponded to prediction errors in which the model underestimated the target variable, whereas those ranging from 0 to 100 corresponded to prediction errors in which the model overestimated the target variable. The peak at approximately zero indicated that the SVM model had a relatively low bias. This meant that the predictions of the model, on average, were not heavily biased towards over- or underestimation of the target.

These classical AI methods can perform the prediction task. However, achieving higher accuracy would require a richer database with more inputs and the use of regularization methods to prevent overfitting. Moreover, optimization algorithms should be used either as training functions or to optimize the choice of kernel parameters for the models.

Fig.19 Histogram of prediction error for the four AI methods used

Table 5 Evaluation of model training performance based on the criteria

Table 6 Prediction outputs per hour on November 27, 2018, using the four methods (in kW)

4 Conclusion

The shift toward sustainable energy sources has made wind energy a highly desirable alternative. Precise wind generation prediction is the key to optimizing its deployment and integration into the existing power infrastructure. In this study, a wind-energy production forecast for a wind park in Yalova, Turkey, was conducted using artificial intelligence methods. Four AI approaches were used: SVM, decision trees, ANFIS, and ANN. Simulations using the four techniques revealed that the SVM and ANN were accurate predictors for short-term wind power forecasts, recording MAPE = 2.42% and R² = 0.95 for the SVM and MAPE = 3.13% and R² = 0.92 for the ANN. Coarse trees and ANFIS were relatively less precise, with MAPE = 3.07% and R² = 0.88 for coarse trees and MAPE = 3.53% and R² = 0.91 for ANFIS. This study aimed to explore and refine forecast models to enhance the precision and robustness of wind-energy predictions. Combining these models with optimization algorithms, such as metaheuristic algorithms, can generate a more accurate configuration and may be an effective solution for improving the precision of predictive machine-learning methods.

Declaration of Competing Interest

We declare that we have no conflict of interest.