Train driver fatigue detection based on facial multi-information fusion

2024-01-08 09:12

HAO Zhengqing1, WANG Ying1,2, CHEN Xiaoqiang1,2, XIONG Ye1

(1. School of Automation & Electrical Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; 2. Key Lab of Opt-Electronic Technology and Intelligent Control, Ministry of Education, Lanzhou Jiaotong University, Lanzhou 730070, China)

Abstract: To improve the accuracy of train driver fatigue detection, a method based on facial multi-information fusion is proposed. Firstly, low-light enhancement is applied for image preprocessing, and faces are detected using local binary patterns (LBP) features. Secondly, the driver’s facial feature points are obtained with the ensemble of regression trees (ERT) algorithm, and face model matching is used to obtain the driver’s head posture angle. Finally, in view of the special driving environment of train drivers, adaptive threshold correction and eye gaze correction are applied to the eye feature quantities, which show fatigue most clearly. A fuzzy inference system serves as the fusion tool: the eye, mouth and head features are its inputs, and the driver’s fatigue value is its output. Experimental results show that the method can distinguish driver fatigue levels with accuracy rates of 95% in normal environments and 86.8% in low-light environments.

Key words:train driver; fatigue detection; feature point detection; head posture; facial multi-information fusion

0 Introduction

With the rapid development of China’s railway industry, train mileage has increased and working conditions have become more complex, so train drivers drive while fatigued more and more often. Studying the fatigue state of train drivers is therefore important for driving safety[1].

Driver fatigue detection methods are mainly divided into those based on physiological information[2], those based on driving behavior[3] and those based on facial expressions[4-5]. Because facial expression detection is easy to operate and inexpensive, and because the computing power of modern computers has increased, it has become the mainstream approach for studying driver fatigue.

Driver fatigue detection mainly consists of the following steps. (1) Face detection: commonly used methods include the Adaboost algorithm[6], the convolutional neural network (CNN)[7] and MTCNN[8]. (2) Feature point localization: methods include CNN[10], face sequence[12], template matching[13] and Haar-like features[14]. (3) Fatigue feature extraction: features include the PERCLOS value, blink frequency, yawn time, continuous eye closure time, eye opening and closing, mouth opening and closing, head posture, etc. (4) Fatigue determination: methods include CNN, feature fusion[9], the naive Bayes classifier, fuzzy algorithms, etc. Driving a train differs from driving a car. In addition to looking straight ahead, the train driver must also pay attention to various dashboard information while driving. The train also operates in low-light environments, both with alternating light and dark conditions and with long periods of darkness.

To address the low illumination experienced by trains traveling at night, a train driver fatigue detection method based on facial multi-information fusion in low-light environments is proposed. Low-light enhancement of the images collected under low-light working conditions reduces the influence of environmental factors on the subsequent detection. The eye features are corrected for the cases in which the train driver has to pay attention to the dashboard information. A fuzzy inference system is established to realize train driver fatigue detection, improve the detection accuracy and increase the reliability of fatigue determination.

1 Face detection and feature point localization

To improve the adaptability of the fatigue detection method to different driving environments, the local binary patterns (LBP) algorithm with low-light enhancement is used for face detection, and the ensemble of regression trees (ERT) algorithm is applied within the detected face region to localize 68 feature points, which facilitates the extraction of fatigue information.

1.1 Low light enhancement based LBP algorithm

The light in the driver’s cab keeps changing, and the lighting makes the brightness of the facial skin uneven, which affects the accuracy of detection. After the images are captured by the camera, they are processed with the LIME algorithm, the histogram equalization (HE) algorithm and the gamma correction (GC) algorithm for low-light enhancement. The facial feature points cannot be detected in the original images, whereas feature points can be located after processing with any of the three algorithms. The image details are clearer after processing with the LIME algorithm, and the image is smoother after processing with the HE algorithm. Compared with the HE algorithm, the GC algorithm produces a stronger exposure, and it also takes the least time. The GC algorithm is therefore used for low-light enhancement, and the processing results of the three algorithms are shown in Fig.1.
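As a minimal sketch of gamma correction for an 8-bit grayscale or color image, the following uses the usual power-law mapping out = 255 · (in / 255)^γ. The function name `gamma_correct` and the value γ = 0.5 are illustrative assumptions; the gamma value used in the experiments is not stated in the text.

```python
import numpy as np

def gamma_correct(image, gamma=0.5):
    """Brighten an 8-bit image with out = 255 * (in / 255) ** gamma (gamma < 1 brightens)."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    # Precompute the mapping for all 256 intensity levels, then index into it.
    levels = np.arange(256, dtype=np.float64) / 255.0
    lut = np.round(255.0 * levels ** gamma).astype(np.uint8)
    return lut[image]

# Tiny illustrative "image" (not data from the experiments).
dark = np.array([[0, 16, 64], [128, 200, 255]], dtype=np.uint8)
bright = gamma_correct(dark, gamma=0.5)
```

A lookup table keeps the cost at one array indexing operation per frame, which is in line with the observation that GC is the fastest of the three algorithms.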

Fig.1 Processing effect of three low-light enhancement algorithms

LBP[15] has the advantages of gray invariance and rotation invariance. The center pixel of a neighborhood is taken as the threshold and compared with the neighboring pixels to generate a string of binary digits, which is the LBP value of the center pixel. The LBP value of the center pixel reflects the texture information of the region around it. Because facial texture is highly discriminative and the LBP algorithm is fast to compute, LBP is well suited to face detection; the specific process is shown in Fig.2. The LBP algorithm is implemented in 3 steps.

(1) LBP feature extraction. In a neighborhood of size 3×3, the center pixel is used as the threshold, and the grayscale values of the 8 adjacent pixels are compared with it. If a surrounding pixel value is greater than the center pixel value, the position of that pixel is marked as 1; otherwise it is marked as 0. This gives the LBP value of the center pixel of the neighborhood, and the whole image is traversed in sequence to obtain the LBP value of every pixel.

(2) Histogram statistics of face subregions. The image is divided into subregions of size 7×7, and a histogram of the LBP values is computed in each subregion.

(3) LBP feature matching. The histogram is used as the discriminative feature and compared with that of a standard face to obtain the face region.
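Steps (1) and (2) can be sketched as follows. This is an illustrative NumPy version, not the implementation used in the experiments: the bit order of the 8 neighbors (clockwise from the top-left) and the reading of "7×7" as a 7×7 grid of cells are assumptions, and the function names are hypothetical.

```python
import numpy as np

def lbp_image(gray):
    """3x3 LBP codes for the interior pixels of a 2-D grayscale array."""
    gray = np.asarray(gray, dtype=np.int32)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Neighbor offsets, clockwise from the top-left (bit order is an assumption).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # A neighbor strictly greater than the center contributes a 1 bit.
        codes |= (neighbor > center).astype(np.int32) << (7 - bit)
    return codes.astype(np.uint8)

def lbp_histograms(codes, grid=(7, 7)):
    """Concatenate 256-bin LBP histograms over a grid of subregions."""
    feats = []
    for rows in np.array_split(codes, grid[0], axis=0):
        for cell in np.array_split(rows, grid[1], axis=1):
            feats.append(np.bincount(cell.ravel(), minlength=256))
    return np.concatenate(feats)

patch = np.array([[5, 9, 1], [4, 6, 7], [2, 3, 8]])
code = int(lbp_image(patch)[0, 0])  # bits 01011000 for this patch

image = np.random.default_rng(0).integers(0, 256, size=(30, 30))
features = lbp_histograms(lbp_image(image))
```

Step (3) would then compare `features` with the histogram of a reference face using a histogram distance; the specific distance is not stated in the text, so it is left out here.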

Fig.2 Face detection process

1.2 Feature point location

ERT[16] is a regression-forest-based algorithm for facial feature point localization. It first estimates approximate feature point locations and then uses a gradient boosting algorithm to reduce the sum of squared errors between the initial shape and the true shape until the iteration requirement is met.

(1)

Step 2: The iterative update r_t is expressed as

(2)

(3)

(4)

Step 4: Repeat steps 2 and 3 until convergence or until the maximum number of iterations is reached. The final output is

(5)

Step 5:Update the location.

(6)

(7)
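For orientation, the cascade and gradient boosting updates of ERT are commonly written as below, following the widely cited formulation of Kazemi and Sullivan (2014). The symbols (image I, shape estimate Ŝ, shrinkage ν, weak regressor g_k, K trees per stage) are assumptions taken from that general formulation and are not necessarily the notation of Eqs.(1)-(7).

```latex
% Cascade: each stage adds a shape increment predicted by the regressor r_t
\hat{S}^{(t+1)} = \hat{S}^{(t)} + r_t\left(I, \hat{S}^{(t)}\right)

% Gradient boosting inside one stage, for training samples i = 1, ..., N
f_0 = \arg\min_{\gamma} \sum_{i=1}^{N} \left\lVert \Delta S_i^{(t)} - \gamma \right\rVert^2
r_{ik} = \Delta S_i^{(t)} - f_{k-1}\left(I_i, \hat{S}_i^{(t)}\right)
f_k = f_{k-1} + \nu \, g_k , \qquad k = 1, \dots, K
r_t = f_K
```

Here ΔS_i^{(t)} is the difference between the true shape and the current estimate for sample i, and each regression tree g_k is fitted to the residuals r_{ik}.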

Fig.3 Feature point positioning effect

2 Fatigue characterization

2.1 Eye feature extraction

The eye feature parameters change most significantly when the driver feels fatigued. To increase the robustness of detection, the binocular eye aspect ratio (EAR) is used to define the degree of eye opening and closing, as shown in Fig.4, which is more accurate than the monocular EAR[17].

Fig.4 Eye feature point coordinates

(8)
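A minimal sketch of the EAR computation, assuming the widely used six-landmark definition EAR = (‖p2 − p6‖ + ‖p3 − p5‖) / (2‖p1 − p4‖) and assuming that the binocular value is the mean of the left and right eyes. Both assumptions are standard practice but may differ in detail from Eq.(8); the function names are hypothetical.

```python
import numpy as np

def eye_aspect_ratio(eye):
    """EAR of one eye from a (6, 2) array of landmarks p1..p6 ordered around the contour."""
    eye = np.asarray(eye, dtype=float)
    vertical_1 = np.linalg.norm(eye[1] - eye[5])  # ||p2 - p6||
    vertical_2 = np.linalg.norm(eye[2] - eye[4])  # ||p3 - p5||
    horizontal = np.linalg.norm(eye[0] - eye[3])  # ||p1 - p4||
    return (vertical_1 + vertical_2) / (2.0 * horizontal)

def binocular_ear(left_eye, right_eye):
    """Mean EAR of both eyes (averaging is an assumption)."""
    return 0.5 * (eye_aspect_ratio(left_eye) + eye_aspect_ratio(right_eye))

# Illustrative landmark coordinates, not measured data.
eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
ear = binocular_ear(eye, eye)
```

In the 68-point layout commonly used with ERT predictors, the two eyes are usually taken as points 37-42 and 43-48 (1-based), but the exact indices depend on the landmark model.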

Drivers have different eye sizes, so a fixed EAR threshold causes false detections. An adaptive threshold method is therefore proposed to determine the EAR threshold for each driver: the EAR sequences of three drivers with different eye sizes are clustered into two states with the k-means++ method, and the experimental results are shown in Table 1.

Table 1 Clustering data of eye opening and closing degree
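The adaptive threshold idea can be sketched as a two-cluster k-means on one driver's EAR sequence with k-means++ seeding. Taking the midpoint of the two cluster centers as the threshold is an assumption made for this sketch, since the exact rule is not given in the text; the EAR sequence below is synthetic and the function names are hypothetical.

```python
import numpy as np

def kmeans_pp_1d(values, k=2, iters=100, seed=0):
    """Cluster 1-D values with k-means++ seeding; returns the sorted centers."""
    x = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    centers = [x[rng.integers(len(x))]]
    while len(centers) < k:
        # Squared distance of every sample to its nearest chosen center.
        d2 = np.min((x[:, None] - np.array(centers)[None, :]) ** 2, axis=1)
        probs = d2 / d2.sum() if d2.sum() > 0 else np.full(len(x), 1.0 / len(x))
        centers.append(x[rng.choice(len(x), p=probs)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        updated = np.array([x[labels == j].mean() if np.any(labels == j) else centers[j]
                            for j in range(k)])
        if np.allclose(updated, centers):
            break
        centers = updated
    return np.sort(centers)

def adaptive_ear_threshold(ear_sequence):
    """Midpoint between the closed-eye and open-eye cluster centers (assumed rule)."""
    closed_center, open_center = kmeans_pp_1d(ear_sequence, k=2)
    return 0.5 * (closed_center + open_center)

# Synthetic EAR sequence for illustration only (not data from Table 1).
open_frames = 0.30 + 0.02 * np.sin(np.arange(80))
closed_frames = 0.08 + 0.01 * np.cos(np.arange(20))
threshold = adaptive_ear_threshold(np.concatenate([open_frames, closed_frames]))
```

Because the clustering runs per driver, a driver with smaller eyes gets lower cluster centers and hence a lower threshold than a driver with larger eyes.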

Considering that the driver’s attention to the various information on the dashboard affects the EAR, the normal open-eye state of the three drivers is defined as a 0° gaze, the distance between the fixed camera and the face is 0.7 m, and the camera height equals the height of the eyes. The eye-open and eye-closed data of the three drivers at six gaze angles are measured by sliding the camera up and down according to the gaze angle. The 0° open and closed data are used to derive the closed-eye data for each angle, the data are fitted, and the experimental data are shown in Table 2.

Table 2 Eye gaze and eye closure data

Let the eye gaze correction coefficient be p = 1 - f(θ), where f(θ) is the fitting function obtained by the least squares method. The corrected EAR is defined as

(9)

As shown in Fig.5, the corrected EAR stays at around 0.3 under normal conditions. When the eyes close, it drops rapidly from 0.3 and then rapidly returns to around 0.3.

Fig.5 Comparison of EAR before and after correction

When the eyes are essentially closed during fatigue, the EAR value is close to 0, and the corrected EAR of the second driver is used as the threshold value.

2.2 Mouth feature extraction

The mouth aspect ratio (MAR) is used to determine the degree of mouth opening and closing, as shown in Fig.6.

(10)

Fig.6 Mouth feature point coordinates

When the train driver is driving normally, the MAR is between 0 and 0.1. When yawning, the MAR rises from near 0 to near 0.8 and then falls. A value of 0.4 is used as the threshold to determine whether the mouth is in a normal state or a fatigue state.

2.3 Head posture extraction

When the fatigue determination based on the eyes and mouth is affected by the environment, head pose detection can be used to improve reliability. The head pose is composed of 3 superimposed rotations: pitch angle (α), roll angle (β), and yaw angle (γ).

The camera model can be represented as

(11)

where s is the scale factor; u and v are the 2D coordinates in the image coordinate system; fx, fy, cx and cy are the intrinsic parameters of the camera; tij denotes the translation vector; and X, Y and Z are the 3D coordinates in the world coordinate system, all of which are known quantities; rij are the elements of the rotation matrix, which is the quantity to be solved. The rotation matrix in terms of Euler angles can be expressed as

R=Rx(α)Ry(β)Rz(γ),

(12)

The rotation matrix is represented as a sine and cosine matrix composed of Euler angles,that is

(13)

The head pitch angle (α),roll angle (β),and yaw angle (γ) can be expressed as

α=arctan(r23/r33),

β=-arcsin(r13),

γ=arctan(r21/r11).

(14)
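The angle extraction can be checked numerically with the sketch below. It assumes that the elementary rotations in R = Rx(α)Ry(β)Rz(γ) are the passive (coordinate-frame) forms, because under that assumption the pitch and roll expressions above hold exactly. Under the same assumption the yaw angle comes out as arctan(r12/r11), so the index order in the γ expression depends on the convention of Eq.(13), which this sketch cannot confirm. The function names are hypothetical.

```python
import numpy as np

def rotation_from_euler(alpha, beta, gamma):
    """R = Rx(alpha) @ Ry(beta) @ Rz(gamma) with passive elementary rotations (assumed)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    rx = np.array([[1, 0, 0], [0, ca, sa], [0, -sa, ca]])
    ry = np.array([[cb, 0, -sb], [0, 1, 0], [sb, 0, cb]])
    rz = np.array([[cg, sg, 0], [-sg, cg, 0], [0, 0, 1]])
    return rx @ ry @ rz

def euler_from_rotation(r):
    """Recover (pitch, roll, yaw) in radians; arctan2 keeps the correct quadrant."""
    alpha = np.arctan2(r[1, 2], r[2, 2])             # arctan(r23 / r33)
    beta = -np.arcsin(np.clip(r[0, 2], -1.0, 1.0))   # -arcsin(r13)
    gamma = np.arctan2(r[0, 1], r[0, 0])             # arctan(r12 / r11) in this convention
    return alpha, beta, gamma

angles = np.radians([12.0, -5.0, 20.0])
recovered = np.array(euler_from_rotation(rotation_from_euler(*angles)))
```

In practice the rotation matrix itself would first be estimated from the 2D feature points and a 3D face model by solving the camera model above; that step is not shown here.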

Statistics of train drivers’ head posture show that changes in the roll and yaw angles are less pronounced in the fatigued state, while changes in the pitch angle, which accompany head nodding, are the most pronounced. A 20% change in pitch angle is used as the criterion for fatigue judgment, and the range of the adult head pitch angle is from -60° to 70°[18]. That is to say, when the pitch angle falls outside the range of -18° to 18°, it is judged as nodding and the driver is considered to be in a fatigue state at that moment. Facing the camera directly is defined as 0°, and 0.2 is used as the threshold for determining head fatigue.

3 Fatigue feature fusion

Since fatigue is a gradual process and each person’s notion of fatigue is vague, it cannot be described by a precise mathematical model. A fuzzy inference system is therefore used as the fusion tool for the fatigue detection features: the driver’s “experience” is converted into a control strategy through fuzzy inference, and the control strategy rules are shown in Table 3. The fuzzy inference system takes the eye opening and closing degree, the mouth opening and closing degree and the head posture angle as inputs. The driver’s state is divided into four levels: no fatigue, mild fatigue, moderate fatigue, and severe fatigue, and these four fatigue levels are the output of the fuzzy inference system, as shown in Fig.7. Triangular membership functions are chosen as the membership functions.

Table 3 Fuzzy inference rules

Fig.7 Fuzzy inference process

The eye state fuzzy set is {open,closed} and the universe of discourse is [0,1]. The threshold cut-off point is the data of the 2nd driver in Table 2. Its membership function is

(15)

The mouth state fuzzy set is {closed,talk,open} and the universe of discourse is [0,1]. The threshold cut-off point is based on the P80 criterion of PERCLOS, and its membership function is

(16)

The head pose fuzzy set is {nod,normal} and the universe of discourse is [0,1]. Its membership function is

(17)

The fuzzy set of fatigue degree is {none,mild,moderate,high} and the universe of discourse is [0,1]. Its membership function is

(18)
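The fusion step can be illustrated with a small Mamdani-style sketch using triangular membership functions, min/max inference and centroid defuzzification. Every breakpoint and every rule below is a hypothetical placeholder chosen only to make the example run: they are not the membership functions of Eqs.(15)-(18) and not the rules of Table 3, and the choice of Mamdani inference with a centroid is itself an assumption.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b; a == b or b == c gives a shoulder."""
    x = np.asarray(x, dtype=float)
    left = np.ones_like(x) if a == b else (x - a) / (b - a)
    right = np.ones_like(x) if b == c else (c - x) / (c - b)
    return np.clip(np.minimum(left, right), 0.0, 1.0)

# Output universe [0, 1] with four output sets (hypothetical breakpoints).
U = np.linspace(0.0, 1.0, 501)
OUT = {
    "none": tri(U, 0.0, 0.0, 0.33),
    "mild": tri(U, 0.0, 0.33, 0.67),
    "moderate": tri(U, 0.33, 0.67, 1.0),
    "high": tri(U, 0.67, 1.0, 1.0),
}

def fatigue_value(ear, mar, head):
    """Fuse eye, mouth and head inputs into a crisp fatigue value in [0, 1]."""
    # Input memberships (hypothetical breakpoints).
    eye_open = float(tri(ear, 0.15, 0.30, 0.30))
    eye_closed = float(tri(ear, 0.0, 0.0, 0.25))
    mouth_closed = float(tri(mar, 0.0, 0.0, 0.2))
    mouth_talk = float(tri(mar, 0.1, 0.3, 0.5))
    mouth_open = float(tri(mar, 0.4, 0.8, 0.8))
    head_normal = float(tri(head, 0.0, 0.0, 0.2))
    head_nod = float(tri(head, 0.1, 0.3, 0.3))

    # Hypothetical rule base (NOT Table 3): AND = min, OR = max.
    strength = {
        "none": min(eye_open, mouth_closed, head_normal),
        "mild": min(eye_open, mouth_talk, head_normal),
        "moderate": min(eye_open, mouth_open),
        "high": max(eye_closed, head_nod),
    }
    # Clip each output set at its rule strength, then aggregate with max.
    agg = np.max([np.minimum(s, OUT[k]) for k, s in strength.items()], axis=0)
    if agg.sum() == 0.0:
        return 0.0
    return float((U * agg).sum() / agg.sum())  # centroid defuzzification

alert = fatigue_value(ear=0.30, mar=0.05, head=0.0)
drowsy = fatigue_value(ear=0.05, mar=0.80, head=0.30)
```

With these placeholder rules, a closed eye alone already fires the "high" rule, which mirrors the behavior reported in the experiments where the fatigue value rises briefly at each blink.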

4 Result and discussion

To verify whether the proposed algorithm achieves the expected results, data are acquired in a laboratory environment that simulates a real driving environment, with 300 frames of data forming one group. Experiments and analysis are conducted to verify the feasibility of the algorithm.

Experiment 1: The eye and mouth feature quantities are used to determine the driver’s state under normal and fatigued conditions, as shown in Fig.8.

(a) Normal state

In the normal state, the MAR value fluctuates around 0, indicating that the mouth is not open for yawning or talking. The EAR value remains around 0.3, except at blinks, where it decreases and increases rapidly at certain intervals. The fatigue level rises rapidly during a blink, and the fatigue value is around 0.2 when not blinking, which is in the non-fatigue or mild fatigue range. Overall, non-fatigue accounts for 226 frames, mild fatigue for 38 frames, moderate fatigue for 2 frames, and severe fatigue for 34 frames. Non-fatigue has the largest share at 75.3%, so the fatigue level for this cycle is non-fatigue.

In the fatigue state, the MAR value increases from 0 to about 0.85, which indicates yawning. The yawn is accompanied by a decrease in the EAR value, which is consistent with the way the eyes narrow when yawning in daily life. Overall, non-fatigue accounts for 79 frames, mild fatigue for 81 frames, moderate fatigue for 44 frames, and severe fatigue for 96 frames. Fatigue accounts for 73.7%, and severe fatigue has the largest share at 32%, so the fatigue level in this cycle is severe fatigue.
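The percentages quoted above follow directly from the frame counts of a 300-frame cycle. The short sketch below reproduces that arithmetic; labelling the cycle with the level that has the most frames is how the two cases above are decided, and the function and variable names are illustrative.

```python
LEVELS = ("non-fatigue", "mild", "moderate", "severe")

def summarize_cycle(counts):
    """Return per-level shares (%), the total fatigue share (%), and the dominant level."""
    total = sum(counts)
    shares = {level: 100.0 * n / total for level, n in zip(LEVELS, counts)}
    fatigue_share = 100.0 - shares["non-fatigue"]
    dominant = max(shares, key=shares.get)
    return shares, fatigue_share, dominant

# Frame counts quoted in the text for Experiment 1.
normal = summarize_cycle((226, 38, 2, 34))
fatigued = summarize_cycle((79, 81, 44, 96))
```

For the normal cycle this gives a non-fatigue share of 75.3%; for the fatigued cycle it gives a fatigue share of 73.7% with severe fatigue at 32%.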

(a) Normal state

After the head posture is added, the fatigue degree value at blinks decreases. The head posture changes near the 150th frame in Fig.9(a), causing the fatigue degree to increase. The driver is nevertheless still in the non-fatigue state overall, which indicates that head posture has little effect on the fatigue degree in the normal state. Fig.9(b) shows that the head posture in this cycle does not exceed the threshold value and is therefore in the normal state. Overall, non-fatigue accounts for 72 frames, mild fatigue for 70 frames, moderate fatigue for 44 frames, and severe fatigue for 104 frames. Fatigue accounts for 76%, and severe fatigue has the largest share at 34.7%, so the fatigue level in this cycle is judged as severe fatigue.

Experiment 3:Comparison of fatigue values after adding head posture is shown in Fig.10.

Fig.10 Comparison before and after adding head posture

Fig.10 compares the fatigue values before and after the head posture is added in the fatigue state. Compared with the case without head posture, the overall fatigue level curve is lower when head posture is added. The effect on the overall curve is slight because the head posture is within the normal range.

The comparison between the proposed method and the PERCLOS method is shown in Table 4. The PERCLOS method only distinguishes between the fatigue and non-fatigue states, while the proposed method divides the fatigue state into four levels, which refines the fatigue degree and reduces the detection error.

Table 4 Comparison of fatigue test results

To reduce the possible error caused by a single group of data, the accuracy is averaged over five groups of data. Table 5 compares the fatigue detection accuracy in the normal environment, where excessive head deflection or a driver outside the detectable range is counted as a detection failure. Table 6 compares the accuracy with and without low-light enhancement in the dark environment. Low-light enhancement increases the image exposure, which makes the driver’s face easier to detect and improves the detection accuracy.

Table 5 Comparison of fatigue detection accuracy

Table 6 Accuracy comparison in low light environment

Fusing the three feature quantities of eyes, mouth and head posture angle for driver fatigue determination not only corrects the eye opening and closing degree of train drivers when they pay attention to the dashboard information, but also realizes fatigue classification and improves the sensitivity of detection. The method achieves accuracy rates of 95% in the normal environment and 86.8% in the dark.

5 Conclusions

A multi-information fusion fatigue detection method is proposed to detect train driver fatigue under low light.

1) Low-light enhancement is applied to driver images captured in low-light environments, and the driver’s face is detected and its feature points are localized by using the LBP and ERT algorithms.

2) Adaptive threshold method and eye gaze correction are used to reduce the detection error due to individual differences.

3) A fuzzy inference system that takes the three feature quantities of eyes, mouth and head posture angle as inputs and the driver fatigue level as output is used to fuse the features and determine the driver’s fatigue status.