Article

Classification-Based Parameter Optimization Approach of the Turning Process

by Lei Yang 1,*, Yibo Jiang 2, Yawei Yang 3, Guowen Zeng 3, Zongzhi Zhu 4 and Jiaxi Chen 5

1 ZJU-Hangzhou Global Scientific and Technological Innovation Center, Hangzhou 311215, China
2 The Jiaxing Shutuo Technology Co., Ltd., Jiaxing 314031, China
3 China Unicom Jinhua Branch, Jinhua 321013, China
4 The Zhejiang Guoli Security Technology Co., Ltd., Hangzhou 310059, China
5 Faculty of Science and Engineering, University of Nottingham, Ningbo 315000, China
* Author to whom correspondence should be addressed.
Machines 2024, 12(11), 805; https://doi.org/10.3390/machines12110805
Submission received: 26 September 2024 / Revised: 6 November 2024 / Accepted: 11 November 2024 / Published: 13 November 2024
Figure 1. The flowchart of the proposed approach.
Figure 2. The schematic diagram of the overlapped sliding window.
Figure 3. The structure of the EDAMMV model.
Figure 4. The schematic diagram of the LSTM unit.
Figure 5. The schematic diagram of major voting.
Figure 6. The schematic diagram of the optimization method.
Figure 7. The schematic diagram of the alignment task when n is 6 and m is 9.
Figure 8. The comparison between real distribution and sampling distribution (30 samples).
Figure 9. The picture of the data acquisition and turning machine. (a) The data acquisition of main shaft; (b) the data acquisition of turret; (c) the data acquisition of main shaft power; (d) the workpiece.
Figure 10. The schematic diagram of nine stages in the turning process.
Figure 11. The effects of main-relevant component number. (a) Cumulative contribution rate. (b) The performance of main-relevant component numbers on the training set and the testing set.
Figure 12. The effects of overlapped sliding window size.
Figure 13. The training process of the EDAMMV model. (a) The loss curve of the EDAMMV model on the training set. (b) MAPE curves during the training process.
Figure 14. Classification results of the turning process. (a) The classification results from time step 1 to 3400. (b) The classification results from time step 3401 to 6800. (c) The classification results from time step 6801 to 10,200. (d) The classification results from time step 10,201 to 13,600.
Figure 15. The wear status analysis of the cutting tool. (a) Effects of tool wear status on vibration. (b) Distance calculation by DTW.
Figure 16. The calculation of envelopes.
Figure 17. Envelopes of the turning process.
Figure 18. The distribution of productive time in each substage among different workpieces.
Figure 19. The optimization effect of the turning process.

Abstract

The turning process is a widely used machining process, and its productivity has a significant impact on the cost and profit in industrial enterprises. Currently, it is difficult to effectively determine the optimum process parameters under complex conditions. To address this issue, a classification-based parameter optimization approach of the turning process is proposed in this paper, which aims to provide feasible optimization suggestions of process parameters and consists of a classification model and several optimization strategies. Specifically, the classification model is used to separate the whole complex process into different substages to reduce difficulties of the further optimization, and it achieves high accuracy and strong anti-interference in the identification of substages by integrating the advantages of an encoder-decoder framework, attention mechanism, and major voting. Additionally, during the optimization process of each substage, Dynamic Time Warping (DTW) and K-Nearest Neighbor (KNN) are utilized to eliminate the negative impact of cutting tool wear status on optimization results at first. Then, the envelope curve strategy and boxplot method succeed in the adaptive calculation of a parameter threshold and the detection of optimizable items. According to these optimization strategies, the proposed approach performs well in the provision of effective optimization suggestions. Ultimately, the proposed approach is verified by a bearing production line. Experimental results demonstrate that the proposed approach achieves a significant productivity improvement of 23.43% in the studied production line.

1. Introduction

The turning process is an important manufacturing process in industry, in which the tool performs radial and axial motion to remove material from the rotating workpiece in order to obtain the required shape [1]. Since the turning process accounts for about 40% of the total machining processing, and the number of lathes constitutes 25–35% of the total machines in the workshops [2], the productivity of the turning process has a significant impact on the costs and profits of industrial enterprises. Currently, numerous optimization techniques have been developed to enhance the productivity of the turning process, such as the optimization of cutting tools [3], the improvement of production scheduling [4], the optimization of process parameters [5], etc. Among these techniques, the development of artificial intelligence algorithms has facilitated the deep analysis of the massive data generated by devices and production processes. This development opens up new opportunities for improving the productivity of the turning process and leads industrial enterprises to increasingly turn to modeling-based optimization techniques [6,7].
This paper focuses on the optimization of process parameters. Existing optimization methods of turning parameters can be roughly classified as physical model methods and data-driven methods. Specifically, the physical model method usually involves two aspects: the derivation of mathematical formulas, and the search for the optimal cutting parameters. These mathematical formulas typically rely on physical laws relating cutting parameters (cutting speed, cutting depth, etc.) to turning process productivity to derive mathematical equations of cutting constraints [8,9]. Furthermore, based on these cutting constraint equations, previous research has usually employed various search algorithms to obtain the optimal parameter solution that maximizes the productivity of the turning process. In the literature [10], the Genetic Algorithm (GA) is applied to optimize parameters (such as the feed rate, depth of cut, etc.) in a turning case, and as a result, the unit production cost is significantly reduced by 30.04%. Dewil [11] optimizes the cutting order of complex parts using the Tabu search algorithm and achieves a 7% to 24% reduction in cycle time. To obtain better calculation speed and performance, especially for processes with special requirements, fusion algorithms are used. Li [12] proposes a fusion algorithm to optimize the machining path of holes. This algorithm combines the capacity of Ant Colony Optimization (ACO) to avoid local optima with the fast convergence of GA, and comparative experiments against GA and ACO demonstrate that the fusion algorithm achieves an 18.0% reduction in the shortest path and a 5.1% decrease in simulation time. Physical model methods have achieved significant success in simple turning processes based on accurate mathematical equations.
However, when the turning process becomes increasingly complex, rapidly increasing cutting constraints and finite search capabilities limit the application of physical model methods in parameter optimization tasks. To address this problem, some researchers have attempted to use data-driven methods. The data-driven method eliminates the derivation of a mathematical formula and adopts a data analysis strategy using statistical theory. For instance, some smart boxes with adaptive algorithms have been used in some enterprises. These smart boxes can adjust the cutting speed in accordance with the actual processing load to reduce the cycle time, but because they barely focus on the processing load, the cycle time reduction is limited [13,14,15].
As mentioned above, there are two main problems with existing optimization methods. (1) Due to the limitations in the derivation of mathematical formulas, previous optimization methods always try to oversimplify the turning process; considering the requirements of production, a complete turning process in a real industrial environment is usually complex [16]. Thus, it is necessary to develop an effective approach for parameter optimization under complex conditions. (2) The wear status of a cutting tool has a significant impact on the turning process, and it is obviously dynamic [17]. Nevertheless, the wear status of a cutting tool is considered a constant or a linear variable in numerous research studies, which leads to a significant deviation between optimization results and reality.
To solve these problems, a novel parameter optimization approach to the turning process is proposed in this paper. Firstly, in order to achieve the optimization of turning process productivity under complex working conditions, a natural idea is to separate the whole complex turning process into different substages to simplify further analysis. Given that deep learning has recently been treated as the most significant breakthrough in the field of pattern recognition [18], and has been widely applied in classification and prediction tasks, such as parameter prediction [19,20] and fault diagnosis [21], deep learning approaches are introduced in this paper to develop a classification approach with high accuracy to recognize different substages. Additionally, to further enhance the robustness of the classification approach, ensemble learning methods, such as major voting [22], bagging [23,24], and adaboost [25,26], are utilized to improve its anti-interference ability. Secondly, both tool wear prediction [27,28] and tool fault diagnosis [29,30] can be utilized as solutions to eliminate the effects of cutting tool wear. However, effective tool wear prediction usually requires extra sensors and numerous training samples. For the purpose of developing a low-cost, simple, and highly reliable evaluation method for cutting tool wear status, KNN [31] and DTW [32] are adopted to classify workpieces with a similar cutting tool wear status in this paper.
Hence, the proposed parameter optimization approach to the turning process consists of a classification method and optimization methods, which are described as follows: Firstly, key sensing parameters are determined by experiments and prior knowledge. Then, overlapped sliding window and Principal Component Analysis (PCA) are utilized to extract the features and remove the negative effects of noise. Secondly, a classification model called encoder decoder-attention mechanism-major voting (EDAMMV) is proposed to classify different substages for each workpiece with high accuracy and robustness. Thirdly, in accordance with the classification results, this paper combines the KNN with the DTW in each substage to conveniently retrieve the historical dataset and filter out target samples that have a similar cutting tool wear status to the current sample. Based on features of filtered samples, the envelope curve strategy and boxplot method are utilized to detect the optimizable items. Meanwhile, physical explanations of optimizable items, such as a low feed speed or unsuitable cutting depth, will be obtained through further process analysis. Finally, by ensuring the quality of the product as a priority, the productivity of the turning process will be improved with optimizations of process parameters.
The main contributions are summarized as follows:
(1)
The proposed EDAMMV model is an accurate and robust classification model that can separate the whole turning process into different substages under background noise. Moreover, the application of the EDAMMV model contributes to simplifying the optimization problem of the complex turning process.
(2)
In order to reduce the negative impact of cutting tool wear status on optimization results, a simple and convenient search algorithm is presented by combining the KNN and the DTW. The search algorithm analyzes the historical dataset and filters out target samples whose cutting tool wear status is similar to that of the current sample.
(3)
An adaptive calculation approach of parameter threshold is proposed based on the envelope curve strategy and boxplot method in the detection of optimization potential. Experimental results indicate that this approach succeeds in the provision of effective optimization suggestions.
The remainder of this paper is arranged as below: Section 2 reviews the relevant works. Then, Section 3 introduces the problem formulation and methodology. Subsequently, the proposed approach is verified in Section 4, which consists of the experimental background and the parameter optimization, as well as the classification and optimization of the turning process. Finally, the paper is summarized in Section 5.

2. Related Work

In this section, the exploration of parameter optimization techniques is reviewed, highlighting their strengths and the challenges encountered in complex manufacturing processes. Additionally, advancements in pattern recognition techniques are discussed, providing a foundation for understanding the classification model of the proposed approach.

2.1. Parameter Optimization Method

In the field of parameter optimization, the search algorithm has received widespread attention due to its ability to obtain the near-optimal solution with superior calculation efficiency. Farrell [22] develops a predictive model for the relationship between machining parameters and product surface roughness during the milling process. Then, the GA is utilized to optimize these parameters, resulting in a 44.22% reduction in surface roughness in experimental cases. Ameur and Assas [33] propose an improved algorithm of particle swarm optimization (PSO) to search the optimal cutting parameters for solving a multi-objective optimization problem in the turning process, and the algorithm achieves a 43% decrease in production costs. Xin [34] applies the ACO algorithm to globally optimize the cutting path in the high-speed milling process, and the experiments show that the milling time is significantly shortened by more than 13%. Usually, various search algorithms have different advantages in solving problems. In the literature [35], the performance of GA, PSO, and simulated annealing (SA) has been compared in the same turning process. Experimental results show that GA has the best performance in the search for the best solution, while PSO tends to obtain the optimal solution at a faster rate. Therefore, compared to using a single search algorithm, fusion algorithms that can combine the advantages of search algorithms and other algorithms are widely studied. Fang [36] proposes an improved adaptive method to optimize parameters for maximizing productivity. This method adopts the mechanism of SA to update each particle of PSO, preventing the PSO from falling into local optima, and experimental cases show that the improved adaptive method achieves higher productivity and lower cost than PSO. 
In the literature [37], a hybrid Taguchi-genetic algorithm (HTGA) is utilized to optimize the cutting parameters of a lathe, in which the Taguchi method is used to improve the process of cross selection in traditional GA, improving the convergence speed of GA and the robustness of results.
Additionally, in some cases of parameter optimization, important parameters (cutting force, tool life, surface finish, etc.) are difficult to acquire directly from the manufacturing process. Therefore, optimization algorithms are increasingly being combined with prediction algorithms for these parameters. Recently, with the development of deep learning, excellent nonlinear fitting performance has made it possible to predict the above parameters. Zhang [38] proposed a prediction framework that consists of a Long Short-Term Memory (LSTM) and attention mechanism, which successfully estimates the remaining useful life of the rotatory machine. In the literature [39], Convolutional Neural Networks (CNN) as well as LSTM are used to extract different frequency signal characteristics to identify the fault type of the bearing. Liang [40] introduced the Bidirectional Long Short-Term Memory (Bi-LSTM) to predict the spindle rotation error in the high-precision machining process, and experimental results show that the highest accuracy achieves 93.53% under different Revolutions Per Minute (RPM).
Due to the complexity and diversity of the manufacturing process, the performance of the optimization method based on search algorithms is usually limited. Therefore, there is still a lack of a simplified approach that can effectively transform the complex problem into several low-difficulty sub-problems.

2.2. Pattern Recognition Methods

Pattern recognition is a classification method of objects based on their features. In comparison to traditional machine learning algorithms, such as Fisher Discriminant Analysis (FDA), deep learning has made tremendous advances in solving nonlinear problems, and it has proven to be particularly effective at revealing complex structures in high-dimensional data [41]. Specifically, CNN [42], Recurrent Neural Networks (RNN) [43] and Artificial Neural Networks (ANN) [44] have received widespread attention in recent years, among which the RNN has been dominantly adopted in studies with sequential data because of its cyclic connection [45]. By updating the current state from the current input data and the past states, the cyclic connection allows the RNN to effectively utilize the time series characteristics of the data and achieve better performance than other neural networks [46]. However, the problems of vanishing and exploding gradients limit the performance of the RNN over long time series [47,48]. To address this difficulty, LSTM is proposed, which has been shown to have a natural advantage in processing manufacturing data. The cell structure of the LSTM can remove noise from the data and extract the time characteristics of the data, leading to more accurate predictions [49]. Munasypov [50] proposed a diagnostic method for metal-cutting machines that uses an LSTM network to identify characteristic frequencies caused by manufacturing errors, enabling the determination of fault types in machine tools. In the literature [51], an LSTM network is used to predict the parts demand of subsequent batches in the production process to solve stock problems, and the accuracy of prediction is more than 97.66%.
As a variation of the LSTM, Bi-LSTM consists of two hidden neural layers, one of which works in the forward direction and the other in the backward direction. Hence, Bi-LSTM is superior to the LSTM in the capture of long-term dependencies in the time series or sequence data [52]. In the literature [53], Bi-LSTM is applied to Chinese word segmentation and shows expertise in creating a hierarchical feature representation of contextual information from both directions. The experiments suggest that Bi-LSTM performs well and generalizes on both simplified and traditional Chinese. Upletawala and Katratwar [1] proposed a novel framework based on CNN and Bi-LSTM for action recognition, and the experimental results indicated that the new framework achieves the best accuracy on different benchmark datasets.
As mentioned above, this paper aims to solve the optimization problem of turning processes under complex conditions by combining the advantage of deep learning algorithms in pattern recognition.

3. The Proposed Method

This section presents the proposed turning process optimization approach in detail. The proposed approach consists of three main parts, and its flowchart is shown in Figure 1:
  • Data Pre-processing: The raw data obtained from sensors and a numerical control system need to be pre-treated first to ensure data consistency. Then, the dataset will be compressed into key features;
  • Classification Method: Based on these extracted features, each stage of the turning process can be identified using the EDAMMV model;
  • Optimization Method: In accordance with classification results, the first step is to obtain historical samples whose cutting tool wear status is similar to that of the current sample. Subsequently, statistical methods are employed to find the parameter threshold and detect optimizable items. Finally, all optimization suggestions need further confirmation with specific process analysis.

3.1. Data Pre-Processing

To eliminate the negative effects of different measurement units, the raw data needs to be normalized first, ranging from 0 to 1.
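As an illustration, the min-max normalization to the range [0, 1] can be sketched as follows; this is a minimal numpy example, and the column-wise scaling plus the guard for constant columns are illustrative choices, not details from the paper:

```python
import numpy as np

def min_max_normalize(x):
    """Scale each column of a 2-D array into the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    # Guard against constant columns to avoid division by zero.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return (x - x_min) / span

raw = np.array([[10.0, 200.0], [20.0, 400.0], [30.0, 300.0]])
scaled = min_max_normalize(raw)   # each column now lies in [0, 1]
```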
Next, to ensure high accuracy and robustness, feature engineering is necessary and consists of three aspects:
1. The selection of feature parameters from process parameters has a great impact on the result of the proposed approach. Thus, feature parameters will be chosen in view of process requirements and prior knowledge of the turning process [1], such as displacement parameters, velocity parameters, vibration parameters, main shaft power parameters, etc. Among these parameters, displacement parameters and velocity parameters can be collected from the numerical control system of equipment, while vibration parameters and main shaft power parameters can be acquired using an electric power sensor and vibration sensor, respectively.
2. Subsequently, in order to improve the calculation efficiency and avoid the negative effects of noise [54], PCA [55] is used to extract the main-relevant components, which is described as follows:
The original data $X$ will be scaled to zero mean, and the covariance matrix $Q$ can be calculated as in Equation (1). Then, by means of the singular value decomposition (SVD), eigenvalues ($\lambda$) and eigenvectors ($w$) can be obtained from the decomposition of the covariance matrix, as shown in Equation (2). To compare the importance among eigenvectors, the score of each eigenvector is calculated from the eigenvalues, as shown in Equation (3). Ultimately, based on the scores of the eigenvectors, the orthogonal matrix $W$ is constituted by the top $K$ ($K \le D$) eigenvectors with the highest scores, which can be used to transform the original data $X$ into the low-dimensional data $Y$ in Equation (4).

$$Q = \frac{1}{N}\sum_{i=1}^{N}\left(x_i-\bar{X}\right)\left(x_i-\bar{X}\right)^{T} \tag{1}$$

$$Q = \sum_{j=1}^{D} w_j \lambda_j w_j^{T} \tag{2}$$

$$\mathrm{score}_j = \lambda_j \Big/ \sum_{l=1}^{D} \lambda_l \tag{3}$$

$$Y = W^{T} X \tag{4}$$

where $x_i$ represents the $i$th sample vector; $\bar{X}$ denotes the mean value of the $N$ samples; $(\cdot)^{T}$ denotes the matrix transpose; $w_j$ is the $j$th eigenvector of $Q$; and $\mathrm{score}_j$ is the contribution rate of the $j$th principal component.
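Assuming standard PCA as described by Equations (1)-(4), a minimal numpy sketch might look like the following; the function name and the eigendecomposition route (a symmetric eigendecomposition of the covariance matrix rather than an explicit SVD) are illustrative choices:

```python
import numpy as np

def pca_reduce(X, k):
    """Project zero-meaned data onto the top-k principal components.

    X: (N, D) array of N samples; returns (Y, scores), where Y is the
    (N, k) low-dimensional data and scores are the contribution rates.
    """
    X_centered = X - X.mean(axis=0)
    Q = X_centered.T @ X_centered / X.shape[0]     # covariance matrix, Eq. (1)
    eigvals, eigvecs = np.linalg.eigh(Q)           # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]              # sort by descending eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = eigvals / eigvals.sum()               # contribution rates, Eq. (3)
    W = eigvecs[:, :k]                             # top-k eigenvectors
    return X_centered @ W, scores                  # low-dimensional data, Eq. (4)

X = np.array([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0], [7.0, 14.0]])  # y = 2x exactly
Y, scores = pca_reduce(X, k=1)   # one component explains nearly all the variance
```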
3. Next, the sequential characteristics of the main-relevant components are further extracted using the overlapped sliding window strategy [56]. In this strategy, the data acquired with a certain frequency can be regarded as a series of independent data points, and adjacent data points are gathered as a whole. The schematic diagram is shown in Figure 2, where $L_{window}$ denotes the window size of each segment; $L_{overlapped}$ represents the overlapped size between the current window and the next window; and $L_{increment}$ denotes the sliding size between the current window and the next window. The overlapped sliding window aims to serve as the foundation for subsequent ensemble learning optimization.
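A possible implementation of the overlapped sliding window, with hypothetical parameter names mirroring $L_{window}$ and $L_{increment}$:

```python
import numpy as np

def sliding_windows(data, l_window, l_increment):
    """Split a (T, D) sequence into overlapped windows of length l_window.

    Consecutive windows start l_increment steps apart, so each pair of
    adjacent windows overlaps by l_window - l_increment samples.
    """
    windows = []
    for start in range(0, len(data) - l_window + 1, l_increment):
        windows.append(data[start:start + l_window])
    return np.stack(windows)

series = np.arange(10).reshape(-1, 1)          # 10 time steps, 1 feature
segs = sliding_windows(series, l_window=4, l_increment=2)
# segs.shape == (4, 4, 1): windows start at t = 0, 2, 4, 6
```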

3.2. Classification Method

The EDAMMV model is developed to classify various substages of the turning process, which adopts the encoder-decoder framework as the basic framework and utilizes the attention mechanism and major voting as key strategies to further improve its accuracy and robustness. The main structure of the EDAMMV model is shown in Figure 3, and the calculation steps can be described as follows:
Step 1: Main-relevant components are regarded as inputs of the encoder-decoder framework and encoded by the Bi-LSTM network for automatic extraction of sequential characteristics.
Step 2: The correlations among extracted features are evaluated by the attention mechanism.
Step 3: The correlations and features are used as the inputs of the LSTM network, and the encoding information will be decoded as an unnormalized classification distribution.
Step 4: The SoftMax layer is used to convert the unnormalized classification distribution into a probability distribution of the turning stages, as shown in Equation (5):
$$P_j = e^{z_j} \Big/ \sum_{k=1}^{9} e^{z_k} \tag{5}$$

where $P_j$ denotes the probability of the $j$th stage, and $z_j$ denotes the output from the $j$th neuron.
Step 5: Outputs of the SoftMax layer will be filtered using major voting to remove the error results and further improve the robustness of classification.
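The SoftMax conversion of Step 4 can be illustrated with a short numpy sketch; the max-subtraction below is a standard numerical-stability trick, not part of Equation (5), and does not change the result:

```python
import numpy as np

def softmax(z):
    """Convert unnormalized scores into a probability distribution, Eq. (5)."""
    e = np.exp(z - np.max(z))   # subtract max(z) for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])   # unnormalized outputs for three stages
probs = softmax(logits)
# probs sums to 1; the largest logit gets the highest probability
```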

3.2.1. Encoder-Decoder Framework

The framework has been widely employed to solve complex tasks [57], such as machine translation, image and video captioning, text summarization, etc. In addition, it has also been shown to be an effective way to solve sequence-to-sequence problems [58]. Therefore, the encoder-decoder framework is utilized as the basic framework of the EDAMMV model.
Furthermore, in traditional neural networks (such as CNN, ANN, etc.), since the neurons in the same hidden layer work independently and in parallel, the location characteristic of hidden neurons and the sequential characteristics of input data are ignored [59,60]. In contrast, LSTM and Bi-LSTM are special structures of the RNN where the current output of neuron is affected by previous outputs of neuron, so the sequential characteristics of data can be automatically extracted to improve the classification accuracy [61].
The schematic diagram of the LSTM unit is shown in Figure 4, and the operations of the LSTM unit are given in Equations (6)-(13). Additionally, as an extension of the LSTM network, the Bi-LSTM network comprises two parallel layers that operate in both forward and backward propagation directions, allowing it to learn past and future information in the sequential data and achieve better performance than the LSTM network [62,63].
Consequently, the Bi-LSTM network is selected as the encoder of the basic framework due to its superior ability to extract the main features. Meanwhile, to ensure the interpretability of the EDAMMV model, the output data should correspond to the input data one by one, which requires maintaining the sequential characteristics of the data in the decoding operation. Considering that the decoding operation is a unidirectional process from past to present, the LSTM network instead of the Bi-LSTM network is chosen as the decoder of the basic framework.
$$\mathrm{sigmoid}(q) = 1 \big/ \left(1+e^{-q}\right) \tag{6}$$

$$\tanh(q) = \left(e^{q}-e^{-q}\right) \big/ \left(e^{q}+e^{-q}\right) \tag{7}$$

$$f = \mathrm{sigmoid}\left(x_t W_{xf} + h_{t-1} W_{hf} + b_f\right) \tag{8}$$

$$g = \tanh\left(x_t W_{xg} + h_{t-1} W_{hg} + b_g\right) \tag{9}$$

$$i = \mathrm{sigmoid}\left(x_t W_{xi} + h_{t-1} W_{hi} + b_i\right) \tag{10}$$

$$o = \mathrm{sigmoid}\left(x_t W_{xo} + h_{t-1} W_{ho} + b_o\right) \tag{11}$$

$$c_t = f \odot c_{t-1} + g \odot i \tag{12}$$

$$h_t = o \odot \tanh\left(c_t\right) \tag{13}$$

where $x_{t-1}$ and $x_t$ denote the inputs at time step $t-1$ and time step $t$, respectively; $h_{t-1}$ and $h_t$ denote the hidden states at time step $t-1$ and time step $t$, respectively; $c_{t-1}$ and $c_t$ denote the memory cells at time step $t-1$ and time step $t$, respectively; $f$, $g$, $i$, $o$ are the outputs of point-wise nonlinear functions and represent the forget gate, add gate, input gate, and output gate, respectively; $W_{xf}$, $W_{hf}$, $W_{xg}$, $W_{hg}$, $W_{xi}$, $W_{hi}$, $W_{xo}$, $W_{ho}$ are weight matrices and $b_f$, $b_g$, $b_i$, $b_o$ are bias vectors; and $\odot$ denotes the point-wise multiplication of two vectors.
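The gate equations above can be condensed into a single LSTM step in numpy; the weight layout, the dictionary-based parameter passing, and the input/hidden sizes below are hypothetical choices made only for this sketch:

```python
import numpy as np

def sigmoid(q):
    return 1.0 / (1.0 + np.exp(-q))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM unit update following Eqs. (6)-(13).

    W maps gate name -> (W_x, W_h) weight pair; b maps gate name -> bias.
    """
    f = sigmoid(x_t @ W["f"][0] + h_prev @ W["f"][1] + b["f"])   # forget gate, Eq. (8)
    g = np.tanh(x_t @ W["g"][0] + h_prev @ W["g"][1] + b["g"])   # add gate, Eq. (9)
    i = sigmoid(x_t @ W["i"][0] + h_prev @ W["i"][1] + b["i"])   # input gate, Eq. (10)
    o = sigmoid(x_t @ W["o"][0] + h_prev @ W["o"][1] + b["o"])   # output gate, Eq. (11)
    c_t = f * c_prev + g * i                                      # memory cell, Eq. (12)
    h_t = o * np.tanh(c_t)                                        # hidden state, Eq. (13)
    return h_t, c_t

rng = np.random.default_rng(0)
d_in, d_hid = 3, 4                     # hypothetical input / hidden sizes
W = {k: (rng.normal(size=(d_in, d_hid)), rng.normal(size=(d_hid, d_hid)))
     for k in "fgio"}
b = {k: np.zeros(d_hid) for k in "fgio"}
h, c = lstm_step(rng.normal(size=d_in), np.zeros(d_hid), np.zeros(d_hid), W, b)
```

Because $h_t = o \odot \tanh(c_t)$ with $o \in (0, 1)$, every component of the hidden state stays inside $(-1, 1)$.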

3.2.2. Attention Mechanism

The attention mechanism is a novel processing technology for neural network parameters, which imitates the observation mechanism of the human brain that focuses on the important aspects among large amounts of information. Hence, it is usually applied to improve the calculation speed and the accuracy of models [64].
In this paper, to establish more precise associations between raw data and target patterns, the attention mechanism is employed between the encoder (a Bi-LSTM network) and the decoder (an LSTM network) to capture the complex mapping between inputs and outputs. The formulas of the attention mechanism are given in Equations (14)-(18).
$$K_i = V_i = h_i^{f} \,\Vert\, h_{n-i}^{b}, \qquad Q = h_n^{f} \,\Vert\, h_1^{b} \tag{14}$$

$$coef_Q = w_Q^{T} Q, \qquad coef_{K,i} = w_K^{T} K_i \tag{15}$$

$$coef_{V,i} = coef_Q \times coef_{K,i} \tag{16}$$

$$CV_i = \sum_{i=1}^{n} coef_{V,i} \times V_i \tag{17}$$

$$h_i = \begin{cases} \mathrm{LSTM}\left(CV_1 \,\Vert\, BOS\right), & i = 1 \\ \mathrm{LSTM}\left(CV_i \,\Vert\, h_{i-1}\right), & i \neq 1 \end{cases} \tag{18}$$

where $\Vert$ denotes the concatenation of two vectors; $K$, $V$ denote the key vector and value vector, respectively; $w_Q$ and $w_K$ denote the assigning weight vectors that are learned in the training process; $coef_Q$, $coef_K$, $coef_V$ denote the assigning coefficients of the query vector, key vector, and value vector, respectively; $BOS$ represents the initial vector, which is advised to be the zero vector; $CV_i$ denotes the content vector of the $i$th block; $\mathrm{LSTM}(\cdot)$ denotes the calculation process of the LSTM; and $h_i$ is the output of the $i$th block.
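A literal numpy rendering of Equations (15)-(17) might look as follows; the vector shapes, the toy key/value matrix, and the scalar form of the coefficients are illustrative assumptions for this sketch:

```python
import numpy as np

def attention_context(K, V, Q, w_Q, w_K):
    """Content vector per Eqs. (15)-(17): score each key against the query
    and take a coefficient-weighted sum of the values.

    K, V: (n, d) arrays of key/value vectors; Q: (d,) query vector;
    w_Q, w_K: (d,) assigning weight vectors (learned in training).
    """
    coef_Q = w_Q @ Q                 # scalar query coefficient, Eq. (15)
    coef_K = K @ w_K                 # (n,) key coefficients, Eq. (15)
    coef_V = coef_Q * coef_K         # assigning coefficients, Eq. (16)
    return coef_V @ V                # CV = sum_i coef_{V,i} * V_i, Eq. (17)

K = np.array([[1.0, 0.0], [0.0, 1.0]])   # toy keys; values identical, Eq. (14)
cv = attention_context(K, K, np.array([1.0, 1.0]),
                       np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```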

3.2.3. Major Voting

In industrial environments, the inevitable noise has a great impact on classification performance. Therefore, it is extremely necessary to utilize an ensemble learning algorithm to further improve the robustness of the combination of the basic framework and attention mechanism.
Compared to other ensemble learning algorithms, major voting has several advantages, including simpler operation, fewer parameters, faster training speed, and high interpretability [22]. To increase the overall classification accuracy and robustness, major voting is introduced to filter out erroneous classification results. The schematic diagram of major voting is shown in Figure 5, and its steps are described as follows:
Step 1: The strategy of the overlapped sliding window is utilized to generate numerous primary classification results; these classification results will form the high-density stream of pre-classification results.
Step 2: Each primary classification result will be considered as one vote, and the votes of different classification results for the same sample will be counted.
Step 3: The classification result with the highest number of votes will be selected as the final classification result.
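The three voting steps reduce to a simple mode computation over the primary classification results; a minimal sketch using Python's `Counter` (the stage labels below are hypothetical):

```python
from collections import Counter

def majority_vote(votes):
    """Return the label with the most votes among overlapping-window results."""
    return Counter(votes).most_common(1)[0][0]

# Five overlapping windows classify the same time step; one is a noisy outlier.
window_results = ["stage3", "stage3", "stage4", "stage3", "stage3"]
final = majority_vote(window_results)   # the lone "stage4" vote is outvoted
```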

3.3. Optimization Method

Based on the classification of substages, potential optimizable items are detected through the approaches shown in Figure 6, and optimization suggestions for turning process parameters will be given at the end.

3.3.1. Cutting Tool Wear Status Matching

The wear status of the cutting tool has a significant impact on the turning process, and it produces observable differences in the sensor data. During the whole optimization process, significant differences among the cutting tool wear statuses of samples will result in unsuitable optimization suggestions that may lead to product quality issues and a decrease in equipment productivity [17]. Hence, it is necessary to eliminate the impact of the cutting tool wear status to ensure the accuracy and reliability of the optimization advice.
The KNN is adopted as the basic matching algorithm, aiming to quickly retrieve the historical dataset to obtain target samples that have a similar cutting tool wear status to the current cutting tool. The KNN is a non-parametric algorithm; its principle is that a sample generally exists close to other samples with similar properties [31]. Thus, it can be used to search k samples with the most similar wear status to the current cutting tool in the historical database. The search steps of KNN are described as follows:
Step 1: When a current sample is given, the distance between the current sample and each historical sample will be measured first.
Step 2: Based on these distances, the top k historical samples, called k-nearest neighbors, with shorter distances will be selected.
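A minimal sketch of this two-step search, assuming a plain Euclidean distance for illustration (the paper later replaces it with DTW); the function names are assumptions:

```python
# KNN retrieval: rank historical samples by distance to the current sample
# and keep the indices of the k nearest neighbors.
import math

def euclidean(x, y):
    """Pointwise Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_search(current, history, k, dist=euclidean):
    """Step 1: measure the distance to every historical sample.
    Step 2: keep the k samples with the shortest distances."""
    ranked = sorted(range(len(history)), key=lambda i: dist(current, history[i]))
    return ranked[:k]
```

For example, `knn_search([0.9, 1.1], [[0, 0], [1, 1], [5, 5]], 2)` returns `[1, 0]`, the indices of the two closest historical samples.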
Generally, the Euclidean distance is used to evaluate the distance between samples. The formula is defined as follows:
$$D(X,Y)=\sqrt{\sum_{i=1}^{n}\left(x_i-y_i\right)^2}$$
where $x_i$ and $y_i$ denote the $i$th value of $X$ and $Y$, respectively; $n$ is the vector length of the samples; and $D(X,Y)$ is the distance between $X$ and $Y$.
However, the Euclidean distance requires that the vector length of $X$ be consistent with that of $Y$. Due to the differences among samples, this requirement is difficult to satisfy in each substage. Consequently, DTW is introduced to replace the Euclidean distance. DTW uses dynamic programming to align time series from different samples. Specifically, for $X$ with length $n$ and $Y$ with length $m$, the alignment task can be transformed into a shortest-path search problem, as shown in Figure 7.
The warping process between two time series involves three steps:
Step 1: Calculation of the Cumulative Distance Matrix (CDM). Based on $X$ and $Y$, a Distance Matrix (DM) is constructed first, as defined in Equations (20) to (23). Then, the CDM is obtained from the DM, as expressed in Equations (24) and (25).
Step 2: Search for the shortest path. As is shown in Figure 7, the path search is to find the shortest path between the start point (i.e., d 1 , 1 ) and the end point (i.e., d n , m ). Based on the C D M , the distance of the shortest path is denoted by c n , m , and each element that contributes to c n , m in D M will be marked with green blocks.
Step 3: The search results marked with green blocks are used to determine the corresponding relationship among key elements in X and Y .
$$X=\left(x_1,x_2,\ldots,x_n\right)$$
$$Y=\left(y_1,y_2,\ldots,y_m\right)$$
$$DM=\begin{pmatrix} d_{n,1} & \cdots & d_{n,m} \\ \vdots & \ddots & \vdots \\ d_{1,1} & \cdots & d_{1,m} \end{pmatrix}$$
$$d_{n,m}=\left|x_n-y_m\right|$$
$$CDM=\begin{pmatrix} c_{n,1} & \cdots & c_{n,m} \\ \vdots & \ddots & \vdots \\ c_{1,1} & \cdots & c_{1,m} \end{pmatrix}$$
$$c_{n,m}=\begin{cases} d_{n,m}+\min\left(c_{n-1,m},\,c_{n,m-1},\,c_{n-1,m-1}\right), & n,m>1 \\ d_{n,m}+c_{n,m-1}, & n=1 \\ d_{n,m}+c_{n-1,m}, & m=1 \\ d_{1,1}, & n,m=1 \end{cases}$$
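The recurrence above can be sketched directly in code; the absolute difference is assumed as the pointwise distance $d_{n,m}$:

```python
# DTW: build the distance matrix d and the cumulative matrix c following the
# recurrence (general case plus the first-row and first-column base cases),
# then read off the warping cost at the end point.
def dtw_distance(x, y):
    n, m = len(x), len(y)
    d = [[abs(x[i] - y[j]) for j in range(m)] for i in range(n)]
    c = [[0.0] * m for _ in range(n)]
    c[0][0] = d[0][0]
    for j in range(1, m):                      # first row (n = 1 case)
        c[0][j] = d[0][j] + c[0][j - 1]
    for i in range(1, n):                      # first column (m = 1 case)
        c[i][0] = d[i][0] + c[i - 1][0]
    for i in range(1, n):
        for j in range(1, m):                  # general case
            c[i][j] = d[i][j] + min(c[i - 1][j], c[i][j - 1], c[i - 1][j - 1])
    return c[n - 1][m - 1]
```

Unlike the Euclidean distance, this accepts series of different lengths: `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is 0, since the repeated 2 can be warped onto a single element.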

3.3.2. Optimizable Item Detection

In real industrial production lines, factors such as environmental noise and individual differences among workpieces cause fluctuations in the sensor data. Generally, these fluctuations conform to a normal distribution. However, if the sample size is insufficient during the optimization analysis, some statistical variables will be seriously distorted, as shown in Figure 8.
It is obvious in Figure 8 that the mean, minimum, and maximum values deviate from the real situation when the sample size is insufficient. Notably, however, the real distribution and the sampling distribution agree well at the boundaries (i.e., $\mu-3\sigma$ and $\mu+3\sigma$). Thus, in order to obtain effective optimization advice, the statistical values at the boundaries are used to calculate the upper and lower envelope curves of the sensor dataset, defined as follows:
$$V_t=\left(v_{1,t},v_{2,t},\ldots,v_{p,t}\right)$$
$$\mu_t=\frac{1}{p}\sum_{i=1}^{p}v_{i,t}$$
$$\sigma_t=\sqrt{\frac{1}{p}\sum_{i=1}^{p}\left(v_{i,t}-\mu_t\right)^2}$$
$$E_{upper,t}=\mu_t+3\sigma_t$$
$$E_{lower,t}=\mu_t-3\sigma_t$$
where $V_t$ denotes the set of samples at time $t$, and $v_{p,t}$ denotes the $p$-th sensor value at time $t$; $\mu_t$ and $\sigma_t$ are the mean and standard deviation of $V_t$, respectively; and $E_{upper,t}$ and $E_{lower,t}$ denote the values of the upper and lower envelope curves at time $t$.
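The envelope computation above can be sketched as follows (the toy input in the usage note is an illustrative assumption):

```python
# Per time step, compute the mean +/- 3 sigma across p aligned sample series
# to obtain the upper and lower envelope curves.
import math

def envelopes(samples):
    """samples: list of p equal-length series; returns (upper, lower)."""
    p, length = len(samples), len(samples[0])
    upper, lower = [], []
    for t in range(length):
        col = [samples[i][t] for i in range(p)]
        mu = sum(col) / p
        sigma = math.sqrt(sum((v - mu) ** 2 for v in col) / p)
        upper.append(mu + 3 * sigma)
        lower.append(mu - 3 * sigma)
    return upper, lower
```

For two aligned series `[[1, 1], [3, 3]]`, the mean is 2 and the standard deviation 1 at each step, so the envelopes are `[5.0, 5.0]` and `[-1.0, -1.0]`.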
Based on prior experience, even if the main process parameters (i.e., feed speed, type of cutting tool, etc.) are kept constant in each substage, non-uniform materials and irregular edges of the workpiece still lead to significant fluctuations in sensor values. Therefore, to achieve adaptive optimization of the turning process, the boxplot method is employed in this paper to determine suitable threshold values for the sensor data in each substage. Process fragments with sensor data below the threshold will be recorded as optimizable items. The determination of the threshold is expressed as follows:
$$E_{lower}=\left(E_{lower,1},E_{lower,2},\ldots,E_{lower,t}\right)$$
$$Q_3=\mathrm{upper\;quantile}\left(E_{lower}\right)$$
$$Q_1=\mathrm{lower\;quantile}\left(E_{lower}\right)$$
$$\theta=Q_1+\alpha\left(Q_3-Q_1\right)$$
where $Q_1$ and $Q_3$ are the lower and upper quartile values of the lower envelope curve in each substage; $\theta$ denotes the threshold value; and $\alpha$ is an adjustable coefficient.
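A minimal sketch of this threshold rule, assuming linear-interpolation quartiles (the paper does not specify the quartile method, so that choice is an assumption):

```python
# Boxplot-based threshold: theta = Q1 + alpha * (Q3 - Q1) over the lower
# envelope curve of a substage.
def quantile(values, q):
    """Linear-interpolation quantile of a list of numbers, 0 <= q <= 1."""
    s = sorted(values)
    pos = (len(s) - 1) * q
    lo, frac = int(pos), pos - int(pos)
    return s[lo] if frac == 0 else s[lo] * (1 - frac) + s[lo + 1] * frac

def vibration_threshold(lower_envelope, alpha=0.5):
    """Threshold below which a process fragment is flagged as optimizable."""
    q1 = quantile(lower_envelope, 0.25)
    q3 = quantile(lower_envelope, 0.75)
    return q1 + alpha * (q3 - q1)
```

For a lower envelope `[0, 1, 2, 3, 4]` with `alpha = 0.5`, Q1 is 1, Q3 is 3, and the threshold is 2.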
Ultimately, in order to ensure the quality of products, the optimizable items detected in each substage need to be further investigated with process analysis to obtain specific optimization suggestions. After experimental verification, effective optimization suggestions will be executed to optimize the parameters of the turning process.

4. Experimental Case Study

A case from a bearing company is used to verify the proposed optimization approach. The raw data is acquired from a CNC lathe (HEADMAN T65, Zhejiang Headman Machinery CO., Ltd., Taizhou, China), and datasets of process parameters are obtained by the Internet of Things (IoT) and various sensors (electric power sensor and vibration sensor), as shown in Figure 9.

4.1. Dataset Description

Based on previous experiments and demonstrations, the cutting force has a significant impact on the elastic deformation of the workpiece during the turning process. Since the cutting force is difficult to measure directly and is highly correlated with the main shaft power, the main shaft power is chosen as a feature parameter to be studied. In addition, the feed rate is highly correlated with the productivity of the turning process, and it can be directly adjusted by operators. The coordinates of the cutting tool represent the cutting track, which plays an important role in the classification of substages in the turning process. Moreover, the vibrations of the main shaft and turret have a significant impact on the surface quality of the workpiece and the lifespan of the cutting tools. Hence, eight relevant parameters are selected as feature parameters in this case: main shaft power, feed rate, x-axis coordinate, z-axis coordinate, main shaft vibration in the z-axis, main shaft vibration in the x-axis, turret vibration in the z-axis, and turret vibration in the x-axis. The dataset of these eight parameters is displayed in Table 1.
The acquisition frequencies of the eight parameters differ: the first four (main shaft power, feed rate, x-axis coordinate, and z-axis coordinate) are sampled at 10 Hz, while the other four are sampled at 4000 Hz. Thus, to simplify data preprocessing, the acquisition frequency of the last four parameters needs to be converted to 10 Hz by calculating their effective values. The formulas for the effective value conversion are presented in Equations (35) and (36).
$$v_{i,k}^{e}=\sqrt{\frac{1}{T}\int_{k}^{k+T}v_{i,t}^{2}\,dt}$$
$$v_{i,max}^{e}=\max_{1\le k\le n_i}v_{i,k}^{e},\qquad v_{i,avg}^{e}=\frac{1}{n_i}\sum_{k=1}^{n_i}v_{i,k}^{e}$$
where $T$ is the length of a signal period; $n_i$ and $v_{i,t}$ denote the data length and the raw vibration value in the $i$th stage, respectively; and $v_{i,k}^{e}$, $v_{i,max}^{e}$ and $v_{i,avg}^{e}$ represent the effective vibration value, the peak effective vibration value, and the average effective vibration value in the $i$th stage, respectively.
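A rough discrete sketch of Equations (35) and (36); the 400-samples-per-block ratio is an assumption derived from the stated 4000 Hz and 10 Hz acquisition frequencies:

```python
# Effective-value (RMS) conversion: each high-rate vibration block spanning
# one 10 Hz step is reduced to its root-mean-square value.
import math

def rms_downsample(signal, block=400):
    """Discrete analogue of Eq. (35): RMS over each period of length `block`."""
    out = []
    for k in range(0, len(signal) - block + 1, block):
        seg = signal[k:k + block]
        out.append(math.sqrt(sum(v * v for v in seg) / block))
    return out

def peak_and_avg(effective):
    """Eq. (36): peak and average of the effective values in a stage."""
    return max(effective), sum(effective) / len(effective)
```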

4.2. Classification Labels Setup

In this case, due to the specific requirements of workpieces in a heavy cross-axis turning process, the turning process consists of rough cutting, fine cutting, drilling, etc. According to different cutting operations, the turning process can be separated into nine stages: cutting tool fast forward (1st stage), rough facing cutting (2nd stage), fine facing cutting (3rd stage), fast feed motion (4th stage), rough external cylindrical cutting and drilling (5th stage), rough external cylindrical cutting (6th stage), fine external cylindrical cutting (7th stage), cutting tool fast backward (8th stage), as well as loading and unloading (9th stage). Additionally, these stages are manually marked and shown in Figure 10.
Additionally, to ensure high accuracy and robustness of the proposed classification model, the raw data are separated into a training set and a testing set, the simplest and most effective approach in the majority of cases; a common rule of thumb is about 80% for training and 20% for testing [64]. The hyper-parameters of the classification model are determined on the training set using five-fold cross-validation [65].

4.3. Experiment Setup and Evaluation Metric

All experiments are evaluated in the Python 3.8 environment. Furthermore, to obtain a stable evaluation result of performance for various models, each experiment in this research is repeated three times, and then these results are averaged to obtain the final result.
Moreover, the Mean Absolute Percentage Error (MAPE) is used to assess the performance of the classification models; as defined here, it measures the percentage of misclassified samples. The closer the value of MAPE is to 0, the better the performance of the classification model. The formula of MAPE is shown below:
$$\mathrm{MAPE}=100\%\times\frac{1}{n}\sum_{i=1}^{n}I\left(l_{p}^{i},l_{t}^{i}\right)$$
$$I\left(l_{p}^{i},l_{t}^{i}\right)=\begin{cases}1, & l_{p}^{i}\ne l_{t}^{i} \\ 0, & l_{p}^{i}=l_{t}^{i}\end{cases}$$
where $n$ denotes the sample size; and $l_{p}^{i}$ and $l_{t}^{i}$ represent the predicted stage and the observed stage of the $i$th sample, respectively.
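Since the indicator function counts label mismatches, the metric reduces to a misclassification percentage; a minimal sketch:

```python
# "MAPE" as defined in the paper: the percentage of samples whose predicted
# stage label differs from the observed stage label.
def mape(predicted, observed):
    n = len(predicted)
    errors = sum(1 for p, t in zip(predicted, observed) if p != t)
    return 100.0 * errors / n
```

For example, `mape([1, 2, 3, 4], [1, 2, 3, 5])` returns `25.0`, since one of four labels is wrong.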

4.4. Parameter Optimization of Classification Method

The parameter setup of the machine learning algorithms has a significant impact on the classification performance of substages in the turning process. Thus, the key parameter setup is discussed in detail in this section, which consists of the number of main-relevant components in PCA, the size of the overlapped sliding window, and the number of neurons and layers in the EDAMMV model.

4.4.1. The Number of Main-Relevant Components in the PCA

To improve the speed of calculation and remove redundant information, the PCA is adopted to transform the raw data into several main-relevant components. However, it needs to be noted that too few main-relevant components result in the loss of effective information, while excessive main-relevant components are detrimental to the elimination of noise [54]. Therefore, the impact of the main-relevant component number in the PCA on the EDAMMV model is discussed, and results are shown in detail in Figure 11.
According to Figure 11, when the number of main-relevant components increases from 1 to 4, there is a rapid growth in the cumulative contribution rate of the main-relevant components in Figure 11a and a noticeable decline in the MAPE of the EDAMMV model in Figure 11b. This trend is attributed to an increase in the number of effective main-relevant components. However, as the number of main-relevant components continues to increase, the curve of the cumulative contribution rate flattens, and the MAPE of the EDAMMV model on the testing set starts to increase, which means that the 5th to 8th components carry little effective information and may even introduce noise. Hence, the top 4 main-relevant components are finally selected.

4.4.2. The Size of the Overlapped Sliding Window

In industrial analysis, the sequential characteristics of a dataset have a significant impact on classification tasks. Because the dimension of these sequential characteristics is extremely high, directly learning the whole sequence with a neural network causes a serious problem: the large number of interconnected hyper-parameters requires extensive RAM and makes the calculation inefficient [66]. To address this issue, the overlapped sliding window strategy is utilized to pre-treat the raw data. This strategy divides the whole sequence into several segments through the overlapped sliding window, achieving a compromise between hardware performance and the preservation of sequential characteristics. The overlapped sliding window also provides a foundation for the subsequent ensemble learning optimization. Hence, as a key parameter of this strategy, the window size is investigated in this subsection, and the experimental results are presented in Figure 12.
As shown in Figure 12, the MAPE curve on the test set declines when the window size increases from 5 to 40, and it reaches its lowest point when the window size is 40. It illustrates that the overlapped sliding window with a small window size will result in insufficient sequential characteristics. After that, the MAPE curve on the test set starts to rebound when the window size continues to increase, which is due to the fact that the overlapped sliding window with a large window size may lead to information redundancy or oblivion in the calculation process of the neural network, as well as overfitting issues. Ultimately, a window size of 40 is chosen for further discussion.

4.4.3. The Number of Neurons and Layers in the EDAMMV Model

Generally, a neural network with a simple structure has difficulty effectively learning the complex association between process parameters and stages in the turning process. But its performance can be improved by optimizing the neural structure [67]. Specifically, increasing the width of the network (e.g., the number of neurons in each hidden layer) contributes to the diversity of extracted features [68]. In addition, improving the depth of the network (e.g., the number of hidden layers) enhances the nonlinear expression ability of the model [69]. However, it needs to be noted that excessive neurons or hidden layers may result in a reduction in the generalization of the model and the problem of overfitting. Therefore, to obtain the optimal number of neurons and layers, it is necessary to investigate their effects on the performance of the EDAMMV model:
1. The Effects of Hidden Neuron Numbers: As shown in Figure 3, due to the limitation of the neural structure in the EDAMMV model, the neuron number of the LSTM network (i.e., the feature number of new Q) needs to be twice as much as the neuron number of the Bi-LSTM network (i.e., the feature number of Q). Based on this factor, the comparisons of different hidden neuron numbers are shown in Table 2.
According to Table 2, when the number of hidden neurons in the Bi-LSTM network increases from 5 to 20, the MAPE values in both the training and testing sets decline at first, which means that increasing the number of hidden neurons helps to solve the underfitting problem. However, with a further increase in the neuron number, the MAPE value in the testing set starts to rise, indicating that an excess of hidden neurons leads to overfitting.
2. The Effects of Hidden Layer Numbers: The comparisons of different hidden layer numbers that range from 1 to 3 are presented in Table 3. It is obvious that the MAPE in the training set decreases as the neural network depth increases. However, compared to the EDAMMV model, where the hidden layer numbers of the Bi-LSTM and LSTM networks are both 1, further improving the network depth also raises the MAPE in the testing set, which indicates that the classification model has a symptom of an overfitting issue.
Based on the above discussions, the optimal values of hyperparameters have been determined as follows: The hidden neuron numbers of the Bi-LSTM and LSTM networks are 20 and 40, respectively. In addition, the number of hidden layers in the Bi-LSTM and LSTM networks is both 1.

4.5. Classification of Substages in Turning Process

4.5.1. Performance of the EDAMMV Model

With the combination of parameter optimization results, the main parameters of the proposed classification algorithm have been determined, as shown in Table 4. In addition, the loss curve and MAPE curves of the EDAMMV model during the training process are presented in Figure 13.
According to Figure 13, when the training epoch increases to 50, values of the loss and the MAPE decrease rapidly. After that, tendencies of the loss curve and the MAPE curve start to slow down. Meanwhile, the MAPE in the training set and the testing set converge to similar levels at the epoch of 200, and their final values are 1.07% and 1.66%, respectively. In addition, fluctuations on the convergence curves have been dramatically reduced, indicating that the EDAMMV model is well trained.
Ultimately, based on the raw data collected from the heavy cross-axis turning process line, the classification results of different substages are shown in Figure 14, where results with misclassification are highlighted by red points.

4.5.2. Comparison of Classification Models

To demonstrate the performance of the EDAMMV model for substage classification, the proposed model is compared with five related methods: Linear Support Vector Regression (LSVR), Kernelized Support Vector Regression (KSVR), a fusion network of a CNN and an LSTM (CNN-LSTM), an LSTM network with self-attention (LSTM-SA), and a Bi-LSTM network. Their main parameters and structures are as follows:
  • LSVR and KSVR are directly deployed using scikit-learn [70]. Penalty coefficients of LSVR and KSVR are set as 500 and 5000, respectively. The kernel trick for KSVR is set as a radial basis function (RBF). The LSVR is used as the baseline model.
  • CNN-LSTM and LSTM-SA: The CNN part in CNN-LSTM has six hidden layers: three convolutional layers, each with a 3 × 3 receptive field and 10 channels, and each followed by a pooling layer with a 5 × 5 receptive field. Both LSTM parts in CNN-LSTM and LSTM-SA adopt the same structure as the LSTM network in the EDAMMV model (i.e., a single hidden layer with 40 neurons).
  • Bi-LSTM: The Bi-LSTM adopts the same structure as the Bi-LSTM network in the EDAMMV model (i.e., a single hidden layer with 20 neurons).
Furthermore, to ensure a fair comparison, these models are evaluated using the same dataset, and all models are trained for 200 epochs with a learning rate of 0.005. The results are presented in Table 5.
As shown in Table 5, the MAPE of LSVR is 42.5%, while the MAPE of KSVR is 16.04%, which means that the KSVR model performs significantly better than the LSVR model in terms of accuracy. Additionally, the MAPE values of the CNN-LSTM, LSTM-SA, and Bi-LSTM models reach 3.81%, 3.29%, and 2.66% in the testing set, respectively, illustrating that the use of sequential characteristics enables these models to classify the whole complex process into substages with much higher accuracy. Furthermore, through the fusion of sequential characteristics, the attention mechanism, and major voting, the EDAMMV model achieves the highest classification accuracy; its MAPE values in the training and testing sets are 1.07% and 1.66%, respectively.

4.5.3. Robustness Analysis

As is well known, raw data acquired in a real industrial environment usually contain background noise. Thus, it is important to analyze the robustness of the classification model to ensure its effectiveness. To test this robustness, white Gaussian noise is added to the raw data, and the signal-to-noise ratio ($SNR$) is used as the evaluation index. The $SNR$ is defined as follows:
$$SNR=10\log_{10}\left(\sum_{i=1}^{n}A_{i,s}\bigg/\sum_{i=1}^{n}A_{i,n}\right)$$
where $n$ represents the sample size; and $A_{i,s}$ and $A_{i,n}$ denote the $i$th amplitude value of the raw data and the noise, respectively.
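A minimal sketch of this robustness setup, scaling white Gaussian noise so the mixture matches a target SNR under the amplitude-sum definition of Equation (39); the function names and seed are assumptions:

```python
# Inject white Gaussian noise at a prescribed SNR (dB), with SNR defined as
# 10*log10(sum of signal amplitudes / sum of noise amplitudes), Eq. (39).
import math
import random

def snr_db(signal, noise):
    """Equation (39) with amplitude sums."""
    return 10 * math.log10(sum(abs(v) for v in signal) /
                           sum(abs(v) for v in noise))

def add_noise_at_snr(signal, target_db, seed=0):
    rng = random.Random(seed)
    noise = [rng.gauss(0, 1) for _ in signal]
    sig_amp = sum(abs(v) for v in signal)
    noise_amp = sum(abs(v) for v in noise)
    # Scale the noise so the amplitude ratio matches the target SNR exactly.
    scale = sig_amp / (noise_amp * 10 ** (target_db / 10))
    return [s + scale * e for s, e in zip(signal, noise)]
```

Subtracting the clean signal from the noisy one recovers the scaled noise, so the achieved SNR can be verified against the target.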
According to Equation (39), the value of $SNR$ decreases as the noise power increases. Thus, the robustness of the different models is analyzed under $SNR$ values ranging from 2 dB to 10 dB. The results are shown in Table 6.
As shown in Table 6, it is clear that the accuracy of classification models decreases with the reduction of S N R , and a rapid increase of MAPE in these classification models can be observed when the S N R is lower than 4 dB. Specifically, at an S N R of 2 dB, the MAPE values of the CNN-LSTM model and the LSTM-SA model, as well as the Bi-LSTM model, are 68.99%, 30.32%, and 20.74%, respectively, which indicates that those models fail to distinguish different turning substages under strong background noise. However, the EDAMMV model still has an MAPE of 6.88%, which shows the best robustness.
In addition, the effectiveness of major voting can be observed in the comparison between the EDAM model (i.e., an EDAMMV model without major voting) and the EDAMMV model. It is obvious that the MAPE of the EDAMMV model is consistently lower than that of the EDAM model under different S N R . Meanwhile, compared to the EDAM model, the use of major voting enables the EDAMMV model to achieve a further reduction in MAPE by 11.58% at an S N R of 2 dB.
As discussed above, the experimental results demonstrate that the EDAMMV model is more robust than other models and can achieve high accuracy in a noisy environment.

4.6. Optimization of Turning Process

4.6.1. Wear Status Analysis of the Cutting Tool

With the same type of workpiece, the same type of cutting tool, and the same Computerized Numerical Control (CNC) code, part of the vibration curves in the turning process under different cutting tool wear statuses are shown in Figure 15a. According to Figure 15, the degradation of the cutting tool leads to a rise in vibration. In addition, as the wear of the cutting tool increases, the growth of vibration at the boundary is significantly higher than that at the middle, which indicates that the impact of the cutting tool wear status on the vibration of the turning process is noticeably nonlinear.
Hence, in order to eliminate the adverse effect of the cutting tool wear status before further optimization, DTW is used to evaluate the similarity between the current vibration curve and the historical vibration curve. Specifically, the vibration curve of the current workpiece is selected as the target curve, and the distance between the target curve and each vibration curve of the historical workpieces is calculated by the DTW. These results are shown in Figure 15b.
Additionally, based on these normalized DTW values, the KNN method is utilized to further filter out the required samples, with k set to 30.

4.6.2. Comparing Differences Among Different Substages in the Same Workpiece

Based on the wear status analysis of the cutting tool, this study will utilize the samples filtered using the KNN and the DTW as the primary dataset for subsequent optimizations. Specifically, the data analysis consists of two steps:
  • Calculation of Envelopes and Threshold Values: Generally, insufficient sample size will result in serious distortions in some statistical variables, such as mean value, minimum value, and maximum value. Hence, to ensure the reliability of optimization results, the upper envelope curve and lower envelope curve are employed as the basis of further analysis. Based on samples obtained by KNN and DTW, the calculation of envelope curves is shown in Figure 16, where black lines denote those samples and red lines are envelope curves. Similarly, whole envelope curves of the turning process are displayed in Figure 17. Subsequently, based on the lower envelope curve, the boxplot method is utilized to obtain a suitable curve of vibration threshold for each substage, which is outlined as the red line in Figure 17.
  • Optimization Analysis: Based on the lower envelope curve and the vibration threshold curve, optimizable items of the turning process can be detected using the following techniques:
    (1)
    In the same substage, if the lower envelope curve of part of the turning process lies below the threshold value, that part is optimizable. For instance, in the 6th stage (i.e., rough external cylindrical cutting) shown in Figure 17, the vibration value during the latter part of the process is significantly lower than the corresponding vibration threshold. Hence, the latter part of the 6th stage is optimizable, and further experiments indicate that the feed speed can be increased from 0.3 mm/r to 0.4 mm/r.
    (2)
    Compare different substages that have similar machining conditions to find the substage with a lower vibration threshold; this substage is optimizable. For example, consider the 5th stage (i.e., rough external cylindrical cutting and drilling) and the 6th stage: both involve similar machining conditions for rough external cylindrical cutting, but the vibration threshold of the similar cutting process in the 5th stage is significantly lower than that in the 6th stage. Through practical experiments, it is suggested to increase the feed speed of rough external cylindrical cutting in the 5th stage by 0.1 mm/r.
    (3)
    During the turning process, due to dimensional variations in the rough material from the supplier, the cutting tool is usually set to end its fast forward motion at a set distance far from the workpiece to avoid collisions. However, an excessive distance reduces machining efficiency. By utilizing the proposed approach, analyzing the process data of batch workpieces in the corresponding substage can help to determine the optimal distance. A case can be found in the 1st stage (i.e., cutting tool fast forward): based on process analysis, the cutting tool was set to end its fast forward motion about 12 mm away from the surface of the workpiece, subsequently reducing the feed rate for cutting. Since the first contact between the cutting tool and the workpiece causes a significant increase in the vibration value, this principle, combined with the classification results of the proposed approach, allows the dimensional deviation of the supplier's rough material to be analyzed and the fast forward motion of the cutting tool to be optimized. Further experiments demonstrate that the cutting tool can end its fast forward motion about 5 mm away from the surface of the workpiece.
Subsequently, it is worth noting that the optimization of parameters should be verified through further experiments to ensure that the quality of products meets the requirements and there is no considerable impact on the lifespan of the cutting tool.

4.6.3. Comparing Differences in the Same Substage Among Different Workpieces

Usually, due to the individual difference of workpieces, the productive time of each substage will slightly fluctuate around a constant value. Therefore, significant fluctuations may indicate that there are some optimizable items in the turning process.
Based on the above filtered samples, the distribution of productive time in each substage among different workpieces is presented in Figure 18.
According to Figure 18, the 9th stage has a wide distribution of productive time, ranging from 3.4 s to 8.6 s with an average of 5.7 s. Combined with process analysis, this drastic fluctuation is attributed to the instability of the feeding system. Subsequently, with the maintenance and optimization of the feeding system, the productive time of the 9th stage as well as its fluctuation are notably decreased, which helps to improve the productivity of the turning process.

4.6.4. Optimization Results

With the above optimizations, the results are shown in Figure 19, where the productive time of the whole heavy cross-axis turning process decreases from 148.5 s to 113.7 s (each product has 4 axes that need to be treated), and the productivity of the heavy cross-axis turning process is increased by 23.43%.

5. Conclusions

In this research, a classification-based parameter optimization approach is proposed to solve the optimization problem of a complex turning process. The proposed approach can be divided into two parts: a classification part and an optimization part. In the classification part, PCA and the overlapped sliding window are first used as feature engineering methods to pre-treat the raw data. Then, the EDAMMV model is constructed to classify the different substages of the turning process; it comprises an encoder-decoder framework with excellent performance on sequence-to-sequence problems, an attention mechanism that improves calculation speed as well as classification accuracy, and a major voting scheme with good anti-interference ability. In the optimization part, to eliminate the adverse effect of the cutting tool wear status before further optimization, KNN and DTW are used to search the historical dataset and filter out target samples whose cutting tool wear status is similar to that of the current sample. Subsequently, the envelope curve strategy and the boxplot method are applied to compare differences in the same substage among different workpieces and differences among substages in the same workpiece, aiming to evaluate suitable threshold values for each substage and then to detect optimizable items for the parameters of the turning process.
In comparative experiments, the proposed EDAMMV model outperforms other kinds of classification models (LSVR, KSVR, CNN-LSTM, LSTM-SA, and Bi-LSTM), and achieves the best MAPE value of 1.66% in the testing set. Furthermore, the robust analysis confirms the superior anti-interference ability of the EDAMMV model. In addition, based on the above optimization strategies, experimental results demonstrate that the productivity of the studied production line achieves a significant improvement of 23.43%.
In future research, the proposed approach is planned to be validated in more cases of real turning processes. Meanwhile, different machine learning algorithms will be utilized to improve the proposed approach.

Author Contributions

Conceptualization, L.Y., Y.J. and Y.Y.; methodology, L.Y. and Y.J.; software, L.Y., Y.J. and Y.Y.; validation, L.Y., Y.J., Y.Y., G.Z., Z.Z. and J.C.; formal analysis, L.Y., Y.J. and Y.Y.; investigation, L.Y., Y.J., Y.Y., G.Z., Z.Z. and J.C.; resources, L.Y., Y.Y. and G.Z.; data curation, L.Y., Y.Y., G.Z. and Z.Z.; writing—original draft preparation, L.Y. and Y.J.; writing—review and editing, L.Y., Y.J., Y.Y., G.Z., Z.Z. and J.C.; visualization, L.Y. and Y.J.; supervision, Y.Y., G.Z., Z.Z. and J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Research and Development Program of Zhejiang Province (No. 2023C01153).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

Author Yibo Jiang was employed by the company The Jiaxing Shutuo Technology Co., Ltd. Yawei Yang and Guowen Zeng were employed by the company China Unicom Jinhua Branch. Zongzhi Zhu was employed by the company The Zhejiang Guoli Security Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Upletawala, M.A.I.; Katratwar, T. A literature review on various factors affecting Turning Operation. Int. J. Eng. Technol. Manag. Sci. IJETMAS 2016, 4, 2349–4476. [Google Scholar]
  2. Nguyen, H.S.; Vo, N.U.T. Multi-Objective Optimization in Turning Process Using RIM Method. Appl. Eng. Lett. J. Eng. Appl. Sci. 2022, 7, 143–153. [Google Scholar] [CrossRef]
  3. Saleh, S.; Ranjbar, M. A Review on Cutting Tool Optimization Approaches. Comput. Res. Prog. Appl. Sci. Eng. 2020, 6, 163–172. [Google Scholar]
  4. Ojstersek, R.; Brezocnik, M.; Buchmeister, B. Multi-objective optimization of production scheduling with evolutionary computation: A review. Int. J. Ind. Eng. Comput. 2020, 11, 359–376. [Google Scholar] [CrossRef]
  5. Divya, C.; Raju, L.S.; Singaravel, B. Application of MCDM methods for process parameter optimization in turning process—A review. In Recent Trends in Mechanical Engineering: Select Proceedings of ICIME 2020; Springer: Berlin/Heidelberg, Germany, 2021; pp. 199–207. [Google Scholar]
  6. Qi, Q.; Tao, F. Digital Twin and Big Data Towards Smart Manufacturing and Industry 4.0: 360 Degree Comparison. IEEE Access 2018, 6, 3585–3593. [Google Scholar] [CrossRef]
  7. Djurović, S.; Lazarević, D.; Ćirković, B.; Mišić, M.; Ivković, M.; Stojčetović, B.; Petković, M.; Ašonja, A. Modeling and Prediction of Surface Roughness in Hybrid Manufacturing–Milling after FDM Using Artificial Neural Networks. Appl. Sci. 2024, 14, 5980. [Google Scholar] [CrossRef]
  8. Chen, M.C.; Tsai, D.M. A simulated annealing approach for optimization of multi-pass turning operations. Int. J. Prod. Res. 1996, 34, 2803–2825. [Google Scholar] [CrossRef]
  9. Rana, P.B.; Patel, J.L.; Lalwani, D.I. Parametric Optimization of Turning Process Using Evolutionary Optimization Techniques—A Review (2000–2016). In Proceedings of the Soft Computing for Problem Solving, Liverpool, UK, 2–4 September 2019; pp. 165–180. [Google Scholar]
  10. An, L.B. Optimal Selection of Machining Parameters for Multi-Pass Turning Operations. Adv. Mater. Res. 2011, 156–157, 956–960. [Google Scholar] [CrossRef]
  11. Dewil, R.; Vansteenwegen, P.; Cattrysse, D. Cutting Path Optimization Using Tabu Search. Key Eng. Mater. 2011, 473, 739–748. [Google Scholar] [CrossRef]
  12. Li, Z.Q.; Liu, X.; Duan, L.S.; Liu, L. An improved hybrid genetic algorithm for holes machining path optimization using helical milling operation. J. Phys. Conf. Ser. 2021, 1798, 012035. [Google Scholar] [CrossRef]
  13. Zhu, Z.; To, S. Adaptive tool servo diamond turning for enhancing machining efficiency and surface quality of freeform optics. Opt. Express 2015, 23, 20234–20248. [Google Scholar] [CrossRef] [PubMed]
  14. Park, K.T.; Nam, Y.W.; Lee, H.S.; Im, S.J.; Noh, S.D.; Son, J.Y.; Kim, H. Design and implementation of a digital twin application for a connected micro smart factory. Int. J. Comput. Integr. Manuf. 2019, 32, 596–614. [Google Scholar] [CrossRef]
  15. Pourmostaghimi, V.; Zadshakoyan, M. Designing and implementation of a novel online adaptive control with optimization technique in hard turning. Proc. Inst. Mech. Eng. Part I J. Syst. Control. Eng. 2020, 235, 652–663. [Google Scholar] [CrossRef]
  16. Shi, S.; Zhang, H.; Mou, P. Self-learning optimization of turning process parameters based on NSGA-II and ANNs. Int. J. Mech. Eng. Robot. Res. 2020, 841–846. [Google Scholar] [CrossRef]
  17. Marani, M.; Zeinali, M.; Kouam, J.; Songmene, V.; Mechefske, C.K. Prediction of cutting tool wear during a turning process using artificial intelligence techniques. Int. J. Adv. Manuf. Technol. 2020, 111, 505–515. [Google Scholar] [CrossRef]
  18. Zhang, Z.; Shan, S.; Fang, Y.; Shao, L. Deep Learning for Pattern Recognition. Pattern Recognit. Lett. 2019, 119, 1–2. [Google Scholar] [CrossRef]
  19. Efkolidis, N.; Markopoulos, A.; Karkalos, N.; Hernández, C.G.; Talón, J.L.H.; Kyratsis, P. Optimizing Models for Sustainable Drilling Operations Using Genetic Algorithm for the Optimum ANN. Appl. Artif. Intell. 2019, 33, 881–901. [Google Scholar] [CrossRef]
  20. Wang, M.; Zhou, J.; Gao, J.; Li, Z.; Li, E. Milling Tool Wear Prediction Method Based on Deep Learning Under Variable Working Conditions. IEEE Access 2020, 8, 140726–140735. [Google Scholar] [CrossRef]
  21. Liang, Y.C.; Li, W.D.; Lou, P.; Hu, J.M. Thermal error prediction for heavy-duty CNC machines enabled by long short-term memory networks and fog-cloud architecture. J. Manuf. Syst. 2022, 62, 950–963. [Google Scholar] [CrossRef]
  22. Farrell, T.R. Determining delay created by multifunctional prosthesis controllers. J. Rehabil. Res. Dev. 2011, 48, xxi–xxxviii. [Google Scholar] [CrossRef]
  23. Kim, P.K.; Lim, K.T. Vehicle Type Classification Using Bagging and Convolutional Neural Network on Multi View Surveillance Image. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 914–919. [Google Scholar]
  24. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
  25. Kabari, L.G. Comparison of Bagging and Voting Ensemble Machine Learning Algorithm as a Classifier. Proc. Int. J. Comput. Sci. Softw. Eng. 2019, 9, 19–23. [Google Scholar]
  26. Mienye, I.D.; Sun, Y. A Survey of Ensemble Learning: Concepts, Algorithms, Applications, and Prospects. IEEE Access 2022, 10, 99129–99149. [Google Scholar] [CrossRef]
  27. Xie, X.; Huang, M.; Liu, Y.; An, Q. Intelligent Tool-Wear Prediction Based on Informer Encoder and Bi-Directional Long Short-Term Memory. Machines 2023, 11, 94. [Google Scholar] [CrossRef]
  28. Liu, X.; Zhang, B.; Li, X.; Liu, S.; Yue, C.; Liang, S.Y. An approach for tool wear prediction using customized DenseNet and GRU integrated model based on multi-sensor feature fusion. J. Intell. Manuf. 2023, 34, 885–902. [Google Scholar] [CrossRef]
  29. Tambake, N.R.; Deshmukh, B.B.; Patange, A.D. Data Driven Cutting Tool Fault Diagnosis System Using Machine Learning Approach: A Review. J. Phys. Conf. Ser. 2021, 1969, 012049. [Google Scholar] [CrossRef]
  30. Hu, H.; Qin, C.; Guan, F.; Su, H. A Tool Wear Monitoring Method Based on WOA and KNN for Small-Deep Hole Drilling. In Proceedings of the 2021 International Symposium on Computer Technology and Information Science (ISCTIS), Guilin, China, 4–6 June 2021; pp. 284–287. [Google Scholar]
  31. Hastie, T.; Tibshirani, R. Discriminant adaptive nearest neighbor classification. IEEE Trans. Pattern Anal. Mach. Intell. 1996, 18, 607–616. [Google Scholar] [CrossRef]
  32. Rakthanmanon, T.; Campana, B.; Mueen, A.; Batista, G.; Westover, B.; Zhu, Q.; Zakaria, J.; Keogh, E. Searching and mining trillions of time series subsequences under dynamic time warping. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China, 12–16 August 2012; pp. 262–270. [Google Scholar]
  33. Ameur, T.; Assas, M. Modified PSO algorithm for multi-objective optimization of the cutting parameters. Prod. Eng. 2012, 6, 569–576. [Google Scholar] [CrossRef]
  34. Xin, Y.; Li, Y.; Li, W.; Wang, G. Towards Efficient Milling of Multi-Cavity Aeronautical Structural Parts Considering ACO-Based Optimal Tool Feed Position and Path. Micromachines 2021, 12, 88. [Google Scholar] [CrossRef]
  35. Gayatri, R.; Baskar, N. Performance analysis of non-traditional algorithmic parameters in machining operation. Int. J. Adv. Manuf. Technol. 2015, 77, 443–460. [Google Scholar] [CrossRef]
  36. Fang, Y.; Zhao, L.; Lou, P.; Yan, J. Cutting parameter optimization method in multi-pass milling based on improved adaptive PSO and SA. J. Phys. Conf. Ser. 2021, 1848, 012116. [Google Scholar] [CrossRef]
  37. Chu, W.L.; Xie, M.J.; Wu, L.W.; Guo, Y.S.; Yau, H.T. The Optimization of Lathe Cutting Parameters Using a Hybrid Taguchi-Genetic Algorithm. IEEE Access 2020, 8, 169576–169584. [Google Scholar] [CrossRef]
  38. Zhang, H.; Zhang, Q.; Shao, S.; Niu, T.; Yang, X. Attention-Based LSTM Network for Rotatory Machine Remaining Useful Life Prediction. IEEE Access 2020, 8, 132188–132199. [Google Scholar] [CrossRef]
  39. Chen, X.; Zhang, B.; Gao, D. Bearing fault diagnosis base on multi-scale CNN and LSTM model. J. Intell. Manuf. 2021, 32, 971–987. [Google Scholar] [CrossRef]
  40. Liang, J.; Wang, L.; Wu, J.; Liu, Z.; Yu, G. Prediction of Spindle Rotation Error through Vibration Signal based on Bi-LSTM Classification Network. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1043, 042033. [Google Scholar] [CrossRef]
  41. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
  42. Tulbure, A.-A.; Tulbure, A.-A.; Dulf, E.-H. A review on modern defect detection models using DCNNs—Deep convolutional neural networks. J. Adv. Res. 2022, 35, 33–48. [Google Scholar] [CrossRef]
  43. Zeebaree, D.Q.; Abdulazeez, A.M.; Abdullrhman, L.M.; Hasan, D.A.; Kareem, O.S. The Prediction Process Based on Deep Recurrent Neural Networks: A Review. Asian J. Res. Comput. Sci. 2021, 11, 29–45. [Google Scholar] [CrossRef]
  44. Abiodun, O.I.; Jantan, A.; Omolara, A.E.; Dada, K.V.; Umar, A.M.; Linus, O.U.; Arshad, H.; Kazaure, A.A.; Gana, U.; Kiru, M.U. Comprehensive Review of Artificial Neural Network Applications to Pattern Recognition. IEEE Access 2019, 7, 158820–158846. [Google Scholar] [CrossRef]
  45. Kavita, D.; Saxena, D.A.; Joshi, J. Using of Recurrent Neural Networks (RNN) Process. Int. J. Res. Anal. Rev. 2019. [Google Scholar]
  46. Zhang, X.Y.; Yin, F.; Zhang, Y.M.; Liu, C.L.; Bengio, Y. Drawing and Recognizing Chinese Characters with Recurrent Neural Network. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 849–862. [Google Scholar] [CrossRef]
  47. Yu, Y.; Si, X.; Hu, C.; Zhang, J. A Review of Recurrent Neural Networks: LSTM Cells and Network Architectures. Neural Comput. 2019, 31, 1235–1270. [Google Scholar] [CrossRef] [PubMed]
  48. Sherstinsky, A. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306. [Google Scholar] [CrossRef]
  49. Du, W.; Zhu, Z.; Wang, C.; Yue, Z. The Real-time Big Data Processing Method Based on LSTM for the Intelligent Workshop Production Process. In Proceedings of the 2020 5th IEEE International Conference on Big Data Analytics (ICBDA), Xiamen, China, 8–11 May 2020; pp. 63–67. [Google Scholar]
  50. Munasypov, R.A.; Idrisova, Y.V.; Masalimov, K.A.; Kudoyarov, R.G.; Fetsak, S.I. Real-Time Diagnostics of Metal-Cutting Machines by Means of Recurrent LSTM Neural Networks. Russ. Eng. Res. 2020, 40, 416–421. [Google Scholar] [CrossRef]
  51. Pang, K.; Zhu, B.; Zhang, H.; Liu, N.; Xu, M.; Zhang, L. An Approach Based on Demand Prediction with LSTM for Solving Multi-batch 2D Cutting Stock Problems. In Proceedings of the Advances in Artificial Intelligence and Security, Dublin, Ireland, 19–23 July 2021; pp. 3–15. [Google Scholar]
  52. Shen, S.-L.; Atangana Njock, P.G.; Zhou, A.; Lyu, H.-M. Dynamic prediction of jet grouted column diameter in soft soil using Bi-LSTM deep learning. Acta Geotech. 2021, 16, 303–315. [Google Scholar] [CrossRef]
  53. Yao, Y.; Huang, Z. Bi-directional LSTM Recurrent Neural Network for Chinese Word Segmentation. In Proceedings of the Neural Information Processing, Kyoto, Japan, 16–21 October 2016; pp. 345–353. [Google Scholar]
  54. Qiao, M.; Li, H. Application of PCA-LSTM model in human behavior recognition. J. Phys. Conf. Ser. 2020, 1650, 032161. [Google Scholar] [CrossRef]
  55. Pearson, K. LIII. On lines and planes of closest fit to systems of points in space. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1901, 2, 559–572. [Google Scholar] [CrossRef]
  56. Wahid, M.F.; Tafreshi, R.; Langari, R. A Multi-Window Majority Voting Strategy to Improve Hand Gesture Recognition Accuracies Using Electromyography Signal. IEEE Trans. Neural Syst. Rehabil. Eng. 2020, 28, 427–436. [Google Scholar] [CrossRef]
  57. Aitken, K.; Ramasesh, V.V.; Cao, Y.; Maheswaranathan, N. Understanding How Encoder-Decoder Architectures Attend. In Proceedings of the Advances in Neural Information Processing Systems, Online, 6–14 December 2021; pp. 22184–22195. [Google Scholar]
  58. Ma, X.; Tao, Z.; Wang, Y.; Yu, H.; Wang, Y. Long short-term memory neural network for traffic speed prediction using remote microwave sensor data. Transp. Res. Part C Emerg. Technol. 2015, 54, 187–197. [Google Scholar] [CrossRef]
  59. Cho, K.; Courville, A.; Bengio, Y. Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks. IEEE Trans. Multimed. 2015, 17, 1875–1886. [Google Scholar] [CrossRef]
  60. Zhang, J.; Du, J.; Dai, L. A GRU-Based Encoder-Decoder Approach with Attention for Online Handwritten Mathematical Expression Recognition. In Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, 9–15 November 2017; pp. 902–907. [Google Scholar]
  61. Kim, B.; Lee, J. A Bayesian Network-Based Information Fusion Combined with DNNs for Robust Video Fire Detection. Appl. Sci. 2021, 11, 7624. [Google Scholar] [CrossRef]
  62. Georgoulas, G.; Karvelis, P.; Loutas, T.; Stylios, C.D. Rolling element bearings diagnostics using the Symbolic Aggregate approXimation. Mech. Syst. Signal Process. 2015, 60–61, 229–242. [Google Scholar] [CrossRef]
  63. Pérez-Alvarado, M.E.; Gómez-Espinosa, A.; González-García, J.; García-Valdovinos, L.G.; Salgado-Jiménez, T. Convolutional Long Short-Term Memory Predictor for Collaborative Remotely Operated Vehicle Trajectory Tracking in a Leader–Follower Formation Subject to Communication and Sensor Latency in the Presence of External Disturbances. Machines 2024, 12, 691. [Google Scholar] [CrossRef]
  64. Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer Science Business Media: New York, NY, USA, 2013. [Google Scholar] [CrossRef]
  65. Kohavi, R. A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. In Proceedings of the International Joint Conference on Artificial Intelligence, Montreal, QC, Canada, 20–25 August 1995. [Google Scholar]
  66. Nugroho, A.; Suhartanto, H. Hyper-Parameter Tuning based on Random Search for DenseNet Optimization. In Proceedings of the 2020 7th International Conference on Information Technology, Computer, and Electrical Engineering (ICITACEE), Semarang, Indonesia, 24–25 September 2020; pp. 96–99. [Google Scholar]
  67. Nguyen, T.; Raghu, M.; Kornblith, S. Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth. arXiv 2020, arXiv:2010.15327. [Google Scholar]
  68. Lu, Z.; Pu, H.; Wang, F.; Hu, Z.; Wang, L. The expressive power of neural networks: A view from the width. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
  69. Bengio, Y.; LeCun, Y. Scaling Learning Algorithms toward AI. In Large-Scale Kernel Machines; MIT Press: Cambridge, MA, USA, 2007. [Google Scholar] [CrossRef]
  70. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
Figure 1. The flowchart of the proposed approach.
Figure 2. The schematic diagram of the overlapped sliding window.
Figure 3. The structure of the EDAMMV model.
Figure 4. The schematic diagram of the LSTM unit.
Figure 5. The schematic diagram of majority voting.
Figure 6. The schematic diagram of the optimization method.
Figure 7. The schematic diagram of the alignment task when n is 6 and m is 9.
Figure 8. The comparison between the real distribution and the sampling distribution (30 samples).
Figure 9. The data acquisition setup and turning machine. (a) The data acquisition of the main shaft; (b) the data acquisition of the turret; (c) the data acquisition of main shaft power; (d) the workpiece.
Figure 10. The schematic diagram of the nine stages in the turning process.
Figure 11. The effects of the main-relevant component number. (a) Cumulative contribution rate. (b) The performance of main-relevant component numbers on the training set and the testing set.
Figure 12. The effects of the overlapped sliding window size.
Figure 13. The training process of the EDAMMV model. (a) The loss curve of the EDAMMV model on the training set. (b) MAPE curves during the training process.
Figure 14. Classification results of the turning process. (a) The classification results from time step 1 to 3400. (b) The classification results from time step 3401 to 6800. (c) The classification results from time step 6801 to 10,200. (d) The classification results from time step 10,201 to 13,600.
Figure 15. The wear status analysis of the cutting tool. (a) Effects of tool wear status on vibration. (b) Distance calculation by DTW.
Figure 16. The calculation of envelopes.
Figure 17. Envelopes of the turning process.
Figure 18. The distribution of productive time in each substage among different workpieces.
Figure 19. The optimization effect of the turning process.
Table 1. Dataset details.

| Name | Unit | Min | Max | Mean |
|---|---|---|---|---|
| Main shaft power | W | 19 | 4262 | 969 |
| Feed rate | mm/min | 0 | 39,715 | 2434 |
| X-axis coordinate | mm | −180 | 3 | −141 |
| Z-axis coordinate | mm | −672 | 3 | −576 |
| Main shaft vibration in z-axis | mm/s² | 3 | 3574 | 532 |
| Main shaft vibration in x-axis | mm/s² | 3 | 1930 | 688 |
| Turret vibration in z-axis | mm/s² | 3 | 4627 | 528 |
| Turret vibration in x-axis | mm/s² | 3 | 4944 | 1322 |
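Before classification, the monitored signals in Table 1 are segmented with the overlapped sliding window illustrated in Figure 2. A minimal numpy sketch of that segmentation (an illustrative reimplementation, not the authors' code), assuming the window size of 40 and sliding size of 1 listed in Table 4:

```python
import numpy as np

def overlapped_windows(signals, window=40, stride=1):
    """Segment a (time_steps, channels) signal matrix into overlapped windows.

    Returns an array of shape (num_windows, window, channels), matching the
    None x 40 x C input layout used by the classification model.
    """
    n_windows = (len(signals) - window) // stride + 1
    return np.stack([signals[i * stride : i * stride + window]
                     for i in range(n_windows)])

# Example: 100 time steps of the 8 monitored signals from Table 1
x = np.random.randn(100, 8)
w = overlapped_windows(x)
print(w.shape)  # (61, 40, 8)
```

With stride 1, consecutive windows share 39 of their 40 samples, which is what makes per-time-step classification with majority voting possible downstream.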
Table 2. The effects of hidden neuron numbers.

| Hidden Neurons (Bi-LSTM Network) | Hidden Neurons (LSTM Network) | MAPE, Training Set (%) | MAPE, Testing Set (%) |
|---|---|---|---|
| 5 | 10 | 2.91 | 4.24 |
| 10 | 20 | 1.75 | 3.40 |
| 15 | 30 | 1.06 | 1.76 |
| 20 | 40 | 1.07 | 1.66 |
| 25 | 50 | 1.17 | 2.32 |
| 30 | 60 | 1.06 | 3.18 |
| 35 | 70 | 1.16 | 3.24 |
Table 3. The effects of hidden layer numbers.

| Hidden Layers (Bi-LSTM Network) | Hidden Layers (LSTM Network) | MAPE, Training Set (%) | MAPE, Testing Set (%) |
|---|---|---|---|
| 1 | 1 | 1.07 | 1.66 |
| 1 | 2 | 1.01 | 2.48 |
| 1 | 3 | 0.97 | 2.10 |
| 2 | 1 | 0.96 | 1.88 |
| 2 | 2 | 0.93 | 1.86 |
| 2 | 3 | 0.91 | 2.18 |
| 3 | 1 | 0.70 | 1.74 |
| 3 | 2 | 0.67 | 1.80 |
| 3 | 3 | 0.82 | 2.26 |
Table 4. Main parameters of the classification algorithm.

| Name | Parameter | Value |
|---|---|---|
| PCA | Main-relevant component number | 4 |
| Overlapped sliding window | Window size | 40 |
| Overlapped sliding window | Sliding size | 1 |
| Input layer | Output size | None × 40 × 4 |
| Encoder | Network type | Bi-LSTM |
| Encoder | Hidden layers | 1 |
| Encoder | Hidden neurons | 20 |
| Encoder | Output size | None × 40 × 40 |
| Decoder | Network type | LSTM |
| Decoder | Hidden layers | 1 |
| Decoder | Hidden neurons | 40 |
| Decoder | Output size | None × 40 × 40 |
| SoftMax layer | Output size | None × 40 × 9 |
| Majority voting | Output size | None × 1 |
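The last two rows of Table 4 describe how the per-time-step softmax output (None × 40 × 9) is collapsed to a single stage label per window (None × 1). A hedged numpy sketch of that majority-voting step (an illustrative reimplementation under the shapes in Table 4, not the authors' code):

```python
import numpy as np

def majority_vote(softmax_out):
    """Collapse per-time-step class probabilities into one label per window.

    softmax_out: (num_windows, window_size, num_classes), e.g. None x 40 x 9.
    Returns:     (num_windows, 1) window labels, as in Table 4.
    """
    num_classes = softmax_out.shape[-1]
    per_step = softmax_out.argmax(axis=-1)          # (num_windows, window_size)
    votes = np.array([np.bincount(row, minlength=num_classes).argmax()
                      for row in per_step])          # most frequent class per window
    return votes.reshape(-1, 1)

probs = np.random.rand(5, 40, 9)    # 5 windows, 40 time steps, 9 turning stages
labels = majority_vote(probs)
print(labels.shape)  # (5, 1)
```

Because neighboring windows overlap heavily (sliding size 1), voting over the 40 per-step predictions smooths isolated misclassifications at stage boundaries.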
Table 5. The comparison among classification models.

| Model | MAPE, Training Set (%) | MAPE, Testing Set (%) |
|---|---|---|
| LSVR | 42.26 | 42.50 |
| KSVR | 13.38 | 16.04 |
| CNN-LSTM | 1.52 | 3.81 |
| LSTM-SA | 1.51 | 3.29 |
| Bi-LSTM | 1.05 | 2.66 |
| EDAMMV | 1.07 | 1.66 |
Table 6. The performance of classification models under different background noise levels (MAPE of test set, %).

| SNR (dB) | CNN-LSTM | LSTM-SA | Bi-LSTM | EDAM | EDAMMV |
|---|---|---|---|---|---|
| 2 | 68.99 | 30.32 | 20.74 | 18.46 | 6.88 |
| 3 | 22.96 | 8.05 | 3.64 | 3.12 | 1.98 |
| 4 | 5.04 | 4.00 | 2.78 | 2.35 | 1.68 |
| 5 | 3.95 | 3.51 | 2.73 | 2.33 | 1.70 |
| 6 | 3.87 | 3.35 | 2.69 | 2.27 | 1.70 |
| 8 | 3.83 | 3.30 | 2.67 | 2.27 | 1.68 |
| 10 | 3.81 | 3.29 | 2.66 | 2.27 | 1.66 |
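Table 6 evaluates robustness by adding background noise at a prescribed signal-to-noise ratio. A common way to generate such test signals is to scale white Gaussian noise to the target SNR in dB; this sketch illustrates that standard procedure (the paper's exact noise-injection method is an assumption here):

```python
import numpy as np

def add_noise(signal, snr_db):
    """Add white Gaussian noise so the result has the requested SNR in dB."""
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / (10 ** (snr_db / 10))   # SNR(dB) = 10*log10(Ps/Pn)
    noise = np.random.randn(*signal.shape) * np.sqrt(p_noise)
    return signal + noise

# Example: corrupt a clean vibration-like trace at the harshest level in Table 6
clean = np.sin(np.linspace(0, 100, 10_000))
noisy = add_noise(clean, snr_db=2)
```

At 2 dB the noise power is about 63% of the signal power, which is why all models in Table 6 degrade sharply at that level.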
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

