
CN105809203B - A system steady-state detection algorithm based on hierarchical clustering - Google Patents

Info

Publication number
CN105809203B
Authority
CN
China
Prior art keywords
class
matrix
value
clustering
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201610146318.XA
Other languages
Chinese (zh)
Other versions
CN105809203A (en
Inventor
赵均
季丁
季一丁
邵之江
徐祖华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201610146318.XA
Publication of CN105809203A
Application granted
Publication of CN105809203B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Complex Calculations (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a system steady-state detection algorithm based on hierarchical clustering. For industrial data in a continuous time interval, the distances between all pairs of classes are computed to obtain a matrix; the two classes with the minimum distance in the matrix are merged into a new class and their entries deleted; the matrix is then updated, and the iteration yields the expression matrix of the hierarchical clustering tree. For the expression matrix, a cluster result rationality value is computed for each iteration, proceeding from the value of the last iteration toward the value of the first. The final threshold is derived from these rationality values and is used to cluster the industrial data, giving the final cluster result sequence; the steady state of the system is then determined by combining this sequence with the time order of the data. The invention is widely applicable and avoids the sharp increase in computation caused by data dimension when processing data.

Description

A system steady-state detection algorithm based on hierarchical clustering
Technical field
The present invention relates to system detection methods, and more particularly to a system steady-state detection algorithm based on hierarchical clustering.
Background art
In the study of a process system, steady state is the most important and most common assumption. Whether the system is at steady state directly determines which methods can subsequently be used to model, control, and optimize it. The internal mechanisms and structure of a process system are complex: its physical quantities are strongly coupled and the system is highly nonlinear. When the system is not at steady state, the data of its variables change sharply, and the numerical input/output relationship is unstable or abnormal and deviates considerably from that of the real system. Only under steady operating conditions do the parameters and variables show strong state consistency. For this reason, evaluating equipment performance, analyzing plant characteristics, and assessing controller performance all presuppose that the steady operating state of the system has been identified.
As the process industries develop, systems and production methods in real plants grow more complex. The process systems involved are often multivariable, high-dimensional, and strongly coupled, and as a whole they are nonlinear, time-varying, uncertain, and incomplete. Although this complexity makes mechanism studies and modeling very difficult, the adoption of DCS control systems and intelligent instruments means that more and more process data are recorded. The strong and complex coupling among the variables of a real process system makes it possible to study the stability of the whole system from measured data. With the continued development of data mining and statistical learning theory, the process industries have gradually begun to apply such algorithms to practical problems, giving rise to fields such as statistical process control. For steady-state detection, the big-data approach also differs from traditional process-control methods: the latter usually judge whether a system is stable from criteria built on a few key variables, whereas the former analyzes the system from all of its data. In theory, results based on all the data reflect the actual condition of the system more comprehensively and faithfully, and can therefore be more accurate and reliable.
Since the steady-state detection problem was raised in the 1980s, scholars at home and abroad have proposed many detection methods. However, because field data are complex, many existing methods whose validity was demonstrated in simulation tests do not give fully accurate results in practice. In addition, each method is constrained by its own characteristics and by the specific requirements of the application, so the occasions to which the methods apply also differ.
Clustering is an important technique in unsupervised learning. It divides the individuals in a data sample into classes so that individuals within a class are as similar as possible and individuals in different classes are as different as possible. The idea of hierarchical clustering was first proposed by S. C. Johnson in 1967. Unlike other clustering methods such as EM and K-means, hierarchical clustering does not require the number of classes to be given in advance and needs no iterative optimization; the clustering result follows from a given similarity function and threshold. Because no optimization problem has to be solved, it can avoid the "curse of dimensionality". However, hierarchical clustering must compute the distance between every pair of samples, so its complexity grows quadratically with the number of samples.
The conventional method is shown in the flow chart of Fig. 1. Assuming the data set to be clustered contains N samples, the hierarchical clustering algorithm proceeds as follows:
1) Define the similarity function and the threshold of the termination condition (usually the maximum within-class distance or the minimum between-class distance);
2) Treat each individual in the data set as its own class, giving N classes in total;
3) With the current data set divided into k classes, compute the similarity between every pair of classes, where d(i, j) denotes the similarity between classes i and j. If d(i, j) is the minimum of the current pairwise similarity set, merge classes i and j. The number of classes then changes from k to k − 1;
4) Compute the current value of the termination condition and stop the algorithm if the threshold is met;
5) Otherwise return to step 3). When k = 1 all sample points belong to one class and clustering cannot continue, so the algorithm stops.
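The five steps above can be sketched in Python. This is a minimal illustration rather than the patent's own code: it assumes Euclidean distance as the (dis)similarity, uses the minimum between-class distance as the termination condition, and the names `euclidean`, `class_distance`, and `agglomerate` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def class_distance(c1, c2, points):
    """Minimum distance between any member of c1 and any member of c2."""
    return min(euclidean(points[i], points[j]) for i in c1 for j in c2)


def agglomerate(points, threshold):
    """Merge the closest pair of classes until the closest pair is farther
    apart than `threshold` (assumed termination rule) or one class is left."""
    classes = [[i] for i in range(len(points))]        # step 2: N singletons
    while len(classes) > 1:                            # step 5: stop at k = 1
        best = None
        for a in range(len(classes)):                  # step 3: closest pair
            for b in range(a + 1, len(classes)):
                d = class_distance(classes[a], classes[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:                              # step 4: termination
            break
        classes[a] = classes[a] + classes[b]           # k becomes k - 1
        del classes[b]
    return classes
```

For example, `agglomerate([(0.0,), (0.1,), (5.0,), (5.2,)], 1.0)` groups the first two and the last two points, because the gap between the groups exceeds the threshold.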
The main drawbacks of the conventional methods are that their results depend strongly on the process object, so they generalize poorly, and that they mainly perform steady-state detection on a single variable. Their applicability is therefore limited, and they cannot avoid the sharp increase in computation caused by data dimension.
Summary of the invention
In order to solve the problems described in the background art, the invention proposes a steady-state detection algorithm based on hierarchical clustering.
The technical solution adopted by the invention is as follows:
STEP 1: generate the clustering tree:
1.1) For industrial data {d_i}, i = 1, 2, 3, ..., N, containing N sampling points within a continuous time interval, take each sampling point in the set as a class, where d_i denotes the industrial data of that class;
1.2) Compute the distance between every pair of classes to obtain an N × N matrix A. The element in row i and column j of A is denoted a_ij, with a_ii = 0; a_ij is the distance between class d_i and class d_j. The resulting matrix is A = [a_ij], i, j = 1, ..., N;
1.3) Find the minimum distance value a_mn in matrix A; a_mn is the class spacing of that pair, meaning that class m and class n are the two closest classes. Merge class m and class n into a new class, delete row m, row n, column m, and column n from matrix A, and compute the distances between the remaining classes and the new merged class in the same way as in step 1.2) to obtain the updated matrix A;
1.4) Repeat steps 1.2) to 1.3) until matrix A becomes a 1 × 1 matrix, recording m, n, and a_mn at each iteration; these records form the N × 3 expression matrix Z of the hierarchical clustering tree;
STEP 2: choose the threshold:
2.1) For the expression matrix Z, proceed in order from the value of the last iteration toward the value of the first iteration, computing each time as follows:
Using the current minimum distance value a_mn as the threshold, cluster the industrial data {d_i}. The resulting cluster result sequence is T_k, where k is the iteration ordinal. T_k is a 1 × N integer sequence in which T_k(i) = p is the clustering result (class number) of the i-th sampling point. Compute the difference sequence D of T_k by subtracting each pair of adjacent elements, and take the number of zeros in D as the cluster result rationality value D_zero(k).
2.2) The values D_zero(k) from all computations form the cluster result rationality sequence D_zero. Compute the difference sequence of D_zero by subtracting each pair of adjacent elements, find the maximum difference value and its index k in that difference sequence, and take z_3k as the final threshold.
In STEP 2, proceeding from the value of the last iteration toward the value of the first iteration means computing until either the first iteration (k = 1) or the middle iteration (k = N/2) is reached.
When k is too small, the clustering result is meaningless for the final steady-state identification, so k = N/2 is generally chosen as the stopping condition of the algorithm to reduce computation.
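The rationality value and the choice of step in 2.1) and 2.2) can be illustrated with a short Python sketch. It assumes the label sequences T_k have already been produced, ordered from the largest threshold to the smallest, and it interprets "maximum difference" as the largest decrease in the number of zeros between consecutive steps. The function names `count_zero_diffs` and `select_step` are hypothetical.

```python
def count_zero_diffs(labels):
    """D_zero for one label sequence: how many adjacent pairs are equal."""
    return sum(1 for a, b in zip(labels, labels[1:]) if b - a == 0)


def select_step(label_sequences):
    """Return (k, d_zero), where k is the 0-based index of the step just
    before the largest drop in D_zero (assumed reading of step 2.2)."""
    d_zero = [count_zero_diffs(t) for t in label_sequences]
    # Decrease in the zero count between consecutive steps.
    drops = [d_zero[k] - d_zero[k + 1] for k in range(len(d_zero) - 1)]
    if not drops:
        raise ValueError("need at least two label sequences")
    k = max(range(len(drops)), key=lambda i: drops[i])
    return k, d_zero
```

With the sequences `[1]*8`, `[1, 1, 1, 3, 2, 2, 2, 2]`, and `[1, 1, 1, 3, 2, 4, 2, 4]`, the zero counts are 7, 5, and 2. The largest drop occurs between the second and third sequences, so the second sequence (index 1) is selected; its threshold would be the one to keep.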
STEP 3: judge steady state jointly with the time sequence:
If the elements T_i of the final cluster result sequence T satisfy the following condition, the system is considered to be at steady state between the m-th sampling point and the (m+k−1)-th sampling point:
T_i = c, i = m, m+1, m+2, ..., m+k−1
where T_i is the clustering result of the i-th sampling point in the final cluster result sequence T, c is a constant result, k = τ/T_s (here k is the window length in samples, not the iteration ordinal of STEP 2), τ is the time-length threshold, and T_s is the sampling interval of the data.
The size of τ is determined by the time response of the object under study; generally τ = 3t*, where t* is the settling time of the system's unit step response.
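The condition of STEP 3 amounts to finding runs of identical labels that last at least k = τ/T_s samples. A minimal Python sketch follows, assuming the caller has already converted τ/T_s to an integer number of samples `min_len`. The function name `steady_state_flags` is hypothetical.

```python
def steady_state_flags(labels, min_len):
    """Return a 0/1 list: 1 where the sample lies in a run of identical
    cluster labels that is at least `min_len` samples long."""
    flags = [0] * len(labels)
    start = 0                                   # start index of current run
    for i in range(1, len(labels) + 1):
        # A run ends at the end of the data or when the label changes.
        if i == len(labels) or labels[i] != labels[start]:
            if i - start >= min_len:
                for j in range(start, i):
                    flags[j] = 1
            start = i
    return flags
```

For instance, with labels `[1, 1, 1, 1, 2, 3, 2, 4, 4, 4]` and `min_len = 3`, the first four and the last three samples are flagged as steady, and the three isolated samples in between are not.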
The new class obtained by merging in step 1.3) is placed at the end of the updated matrix A: a new row and column are appended, which can be written [a_1(N+1), a_2(N+1), ...]^T.
The row and column numbers of the remaining classes in matrix A stay unchanged. The merged class receives a new row/column number that differs from the numbers of all earlier classes, so that every class throughout the whole procedure has a unique row/column number.
The distance between the class obtained by merging in step 1.3) and each remaining class in matrix A is computed with the following formula:
a_si = α a_mi + (1 − α) a_ni
where α is a weight parameter with 0 ≤ α ≤ 1, a_si is the spacing between the new class s and class i, a_mi is the spacing between class m and class i, and a_ni is the spacing between class n and class i.
Each sampling point of the system is regarded as a "feature vector" (or feature point) in state space. When the system is at steady state, the feature points fluctuate within a certain region and follow a high-dimensional Gaussian distribution centered on a particular point. When the system is in a transition or unsteady state, the state points leave the original Gaussian distribution and become more scattered. Because the states at successive moments differ greatly when the system is unsteady, steady state of a process system can be detected by comparing the degree of difference between the state feature vectors. Clustering is introduced for this purpose: in steady-state detection, steady-state data are highly similar and are clustered into one class, whereas dynamic data points differ greatly from steady-state data and are assigned to other classes.
Owing to the nature of the system, the state distribution is guaranteed to be concentrated only when the system is at steady state; when the system is fluctuating, its distribution in state space varies widely. That is, while steady-state sampling points are similar to each other, the similarity between dynamic sampling points is very small, and in the clustering result the dynamic points within a continuous period may fall into many different classes. The number of classes therefore cannot be determined before clustering, which is why the invention uses a hierarchical clustering algorithm.
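Steps 1.1) to 1.4), together with the unique class numbering and the weighted distance update a_si = α a_mi + (1 − α) a_ni, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: it uses Euclidean distance between sampling points, stores distances in a dictionary instead of physically deleting matrix rows and columns, and records one row per merge (N − 1 rows, whereas the text describes Z as N × 3). The name `build_cluster_tree` is hypothetical.

```python
import math


def build_cluster_tree(points, alpha=0.5):
    """Return the merge records (m, n, a_mn), one tuple per merge.

    Classes 1..N are the sampling points; each merged class gets the next
    unused number (N+1, N+2, ...), so every class number is unique.
    """
    n_points = len(points)
    active = list(range(1, n_points + 1))      # class numbers still in A
    dist = {}                                  # (i, j) with i < j -> a_ij
    for idx, i in enumerate(active):
        for j in active[idx + 1:]:
            dist[(i, j)] = math.dist(points[i - 1], points[j - 1])

    def a(i, j):
        return dist[(i, j)] if i < j else dist[(j, i)]

    tree = []
    next_id = n_points + 1
    while len(active) > 1:
        # Step 1.3: closest pair among the active classes.
        pairs = [(i, j) for idx, i in enumerate(active)
                 for j in active[idx + 1:]]
        m, n = min(pairs, key=lambda p: a(*p))
        tree.append((m, n, a(m, n)))
        rest = [c for c in active if c != m and c != n]
        for i in rest:
            # a_si = alpha * a_mi + (1 - alpha) * a_ni, with s = next_id
            dist[(i, next_id)] = alpha * a(m, i) + (1 - alpha) * a(n, i)
        active = rest + [next_id]              # new class goes at the end
        next_id += 1
    return tree
```

For the three one-dimensional points 0, 1, and 10 with α = 0.5, classes 1 and 2 merge first at distance 1.0 and become class 4; its distance to class 3 is 0.5 · 10 + 0.5 · 9 = 9.5, which is the distance recorded for the second merge.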
The beneficial effects of the invention are:
The invention introduces the ideas and techniques of "big data" into steady-state detection: it compares the similarity between the sampling points in the data set, combines this with the temporal characteristics of the data to detect steady state, and proposes a method for determining the clustering threshold.
The invention is widely applicable and avoids the sharp increase in computation caused by data dimension when processing data.
Description of the drawings
Fig. 1 is the flow chart of the hierarchical clustering algorithm.
Fig. 2 is the flow chart of the threshold selection process of the method.
Fig. 3 is the flow chart of steady-state judgment combined with the time sequence.
Fig. 4 shows the input and output curves of the simulated second-order system of Embodiment 1.
Fig. 5 shows the clustering result of the state points of the second-order system of Embodiment 1.
Fig. 6 shows the clustering detection result of Embodiment 1.
Fig. 7 shows the input and output curves of the simulated second-order system of Embodiment 2.
Fig. 8 compares the clustering results obtained with different thresholds in Embodiment 2.
Fig. 9 shows the curves of the input data to be processed in Embodiment 3.
Fig. 10 shows the clustering result at the 50th iteration of Embodiment 3.
Fig. 11 shows the number of zeros in D and its change over all 50 iterations of Embodiment 3.
Fig. 12 shows representative partial clustering results of Embodiment 3.
Fig. 13 shows the steady-state detection result of Embodiment 3.
Specific embodiments
The present invention is further explained below with reference to the drawings and embodiments.
The method of the invention is applied to steady-state detection of process systems. As shown in Fig. 2, the procedure is as follows: for the data points {d_i} to be classified, first define how the distance between points in state space and the distance between classes are measured. Because the state points of a process system are randomly distributed in the corresponding state space with a certain probability and degree of aggregation, the steady-state detection method measures point distance with the Euclidean distance. For the between-class distance, each steady state of interest should form one class in the clustering result, so the aggregation of each class should be made as high as possible; choosing the maximum distance best meets this requirement.
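The two distance measures named above can be written down briefly in Python: Euclidean distance between state points, and the maximum pairwise distance between the members of two classes. This is a sketch under the assumption that classes are given as lists of 0-based point indices; `distance_matrix` and `max_class_distance` are hypothetical names.

```python
import math


def distance_matrix(points):
    """N x N matrix of Euclidean distances between state points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)]
            for i in range(n)]


def max_class_distance(A, class_a, class_b):
    """Between-class distance taken as the largest point-to-point distance
    between a member of class_a and a member of class_b."""
    return max(A[i][j] for i in class_a for j in class_b)
```

Using the maximum distance keeps a class from growing unless all of its members are close to all members of the class it absorbs, which matches the stated goal of tightly aggregated classes.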
Input of the algorithm: the historical state points {d_i} of the system in time order, where each d_i is a p-dimensional vector, i = 1, 2, 3, ..., N, and N is the number of sampling points.
Output of the algorithm: the class T of each sampling point. T is a one-dimensional sequence of length N, i.e. the sequence of class numbers of the sampling points. The class numbers only distinguish classes and carry no physical or numerical meaning.
Following the steps of hierarchical clustering, the algorithm iterates by computing the distances between points in the data set. At each step it checks whether the termination condition has been reached; if so, the algorithm ends and outputs the clustering result T.
Because the data analyzed form a time series, when the system is at steady state over a period of time its spatial distribution is also relatively concentrated. Using the temporal order of the data, the system is regarded as being at steady state when both of the following conditions are met: 1) the data sampling points are continuous in time; 2) the similarity of the data sampling points is high (i.e. the clustering algorithm assigns them to one class).
To obtain the steady state from the clustering result, the clustering threshold and the minimum stable time period T must be determined. For a chosen threshold δ, when the state points of the system over a period of length no less than T are assigned to the same class, the system is at steady state during that period. The minimum time period T can be determined from the time response of the system object. The following mainly discusses the choice of the clustering threshold δ and the influence of the threshold on the algorithm.
During clustering, the data set to be clustered contains N samples and the data are gradually aggregated from N classes into 1 class. The process is recorded in the N × 3 cluster tree matrix Z, where Z_i1 records the number of the first of the two classes merged at step i, Z_i2 records the number of the second, and Z_i3 records the distance between the two merged classes. Obtaining the optimal clustering threshold therefore amounts to finding one row of Z to serve as the final row of the algorithm output, with Z_i3 of that row as the threshold. Here the optimal threshold is found by enumerating the result for every row.
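Choosing a row of Z as the final row means applying only the merges recorded up to that row and reading off the class of each sampling point. A minimal Python sketch follows. It assumes Z is a list of (first class, second class, distance) rows, that sampling points are numbered 1..N, and that the class created by the r-th merge is numbered N + r; the name `labels_after_merges` is hypothetical.

```python
def labels_after_merges(Z, n_points, last_row):
    """Apply the first `last_row` merges of Z and return one class number
    per sampling point (the sequence T for that enumeration step)."""
    members = {c: [c] for c in range(1, n_points + 1)}  # class -> samples
    next_id = n_points + 1
    for m, n, _distance in Z[:last_row]:
        # The merged class takes a new, unique number.
        members[next_id] = members.pop(m) + members.pop(n)
        next_id += 1
    labels = [0] * n_points
    for class_id, samples in members.items():
        for s in samples:
            labels[s - 1] = class_id
    return labels
```

Enumerating `last_row` from the number of rows of Z downward gives the sequences T_k for progressively smaller thresholds; the threshold belonging to a given enumeration is the distance stored in row `last_row` of Z.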
A further innovation of the invention is that a well-defined quantitative index is computed to find a reasonable threshold. The principle is as follows: T_k is the clustering result of the k-th enumeration, and its difference D = diff(T_k) is used to measure the overall clustering quality. As the threshold decreases, the number of zero elements in D gradually decreases. In the ordinary case, as in cases (1) to (9), the number of zeros in D decreases slowly. But when one steady state is split, the system points switch frequently between two classes over a period of time, which makes the number of zeros in D drop sharply. The clustering results therefore need to be enumerated only until the threshold becomes too small to be meaningful for steady-state identification. The step at which the number of zeros in D changes most is taken as the step at which threshold reduction ends, and its threshold is the suitable steady-state threshold for clustering the current data. The flow chart of threshold selection is shown in Fig. 3.
The embodiments of the invention are as follows:
Embodiment 1
Embodiment 1 concerns a second-order single-input single-output system. The simulated input and output are shown in Fig. 4; the simulation length is 100 s and the sampling interval is 0.1 s. Fig. 4(a) shows the input variable curve and Fig. 4(b) the output variable curve.
No noise is added to the variables in Embodiment 1. The curves in Fig. 4 show that the system is at steady state in the initial stage (t = 0 to 300). At t = 300 the input starts a ramp adjustment and the system enters a transition state; after t = 600 the ramp signal ends, the output eventually settles, and the system enters another steady state. Because no noise is added, the state points aggregate very tightly at steady state. The threshold selection method above is not used here for the time being; instead a very small threshold t = 0.01 is selected manually and the data are clustered. The result is shown in Fig. 5.
The curves in Fig. 4 show that the system passes through two different steady states over the whole process. In the clustering result of Fig. 5, data points within the same steady state carry the same class label: Fig. 5 contains two horizontal segments, which correspond in time to the two steady states of the system.
The clustering result describes how far apart the state points are in space; judging whether the system is at steady state also requires the temporal order of the data. When the system is at steady state over a period of time, the data in that interval are assigned to the same class because of their high similarity. Conversely, this can serve as the steady-state criterion: when all data within a continuous time span τ are assigned to one class by the clustering algorithm, the system is considered to be at steady state. The length τ is determined by the time response of the object under study; here τ = 2t*, where t* is the settling time of the system's unit step response.
The resulting steady-state detection is shown in Fig. 6. Fig. 6(a) shows the clustering result of the state data points of the second-order system and Fig. 6(b) the final steady-state detection result, where 1 means stable and 0 means unstable. The result shows that the system leaves the first steady state near t = 300 and enters a transition state, then enters steady state again near t = 700. Comparing input and output: although the input finishes its ramp and becomes stable at t = 600, the output y settles only after a further period of adjustment.
Embodiment 2
Embodiment 2 adds noise to the simulation and, using the single-input single-output system of Embodiment 1, further illustrates the threshold selection process. The input and output of the simulated system with noise are shown in Fig. 7; Fig. 7(a) shows the input variable curve and Fig. 7(b) the output variable curve:
Fig. 7 shows that the system is at one steady state in the interval t = 0 to 300; the input u then produces a ramp change and, after a period of transition, the system enters another steady state. In the state space of the whole process, the state points of the two steady states are tightly aggregated, whereas the transition-state points are more scattered. Thresholds are enumerated in descending order; as the clustering threshold shrinks, the compactness within each class becomes higher and higher. When step j is reached, the threshold has shrunk so far that points originally belonging to one steady state are split into several classes. The threshold is then considered too small, beyond what is wanted, and shrinking it further is unnecessary. The threshold reached at the previous step, step j − 1, is the suitable value.
The first 12 steps of the enumerated clustering of the single-input single-output second-order system of Embodiment 2 are shown in Fig. 8. In Fig. 8(1) the threshold is large and the algorithm divides the data into only two classes. In Fig. 8(2) to Fig. 8(9) the data in the intermediate transition state are progressively separated by the algorithm. As the clustering threshold shrinks, the aggregation within each class becomes higher and the result of the invention more accurate. In Fig. 8(10), however, for t from 700 to 1000 the state points of the system originally belonged to one class, but after the threshold shrinks at this step many points are stripped out, and the two resulting classes interleave in time. If both classes were regarded as steady states of the system, Fig. 8(10) would mean that during this period the system switches frequently between two steady states with no transition state, which does not occur in natural process systems.
The state in Fig. 8(10) is therefore considered to have reached the limit of threshold reduction, and the threshold corresponding to the result in Fig. 8(9) should be taken as the final, reasonable distance threshold.
Embodiment 3:
Embodiment 3 applies the method to historical data generated by a real process: boiler data from a 60 MW unit of a power plant are used for the data experiment. The data mainly include the boiler load command, generated power, coal input, intake, and the steam and water temperature and pressure at each measurement point, 180 dimensions in total. A segment of 10000 data points is selected for steady-state detection with the clustering algorithm above. The behavior of the main variables during this period is shown in Fig. 9:
Fig. 9 shows that the system load undergoes two large adjustments in the period studied, so the interval as a whole contains three segments of steady-state data. The main steam temperature and main steam pressure fluctuate considerably, and the steam temperature curve alone does not even reveal how the system fluctuates. The state points formed from all variables of the system are clustered below. Following the threshold selection method above, the total number of iterations is set to 50; the result of the 50th iteration is shown in Fig. 10. The data experiment shows that no useful information can be obtained from this result, so iteration stops there. The resulting number of zeros in D and its variation are shown in Fig. 11: Fig. 11(a) shows the number of zeros in D over the 50 iterations and Fig. 11(b) shows how that number changes over the 50 iterations.
In this embodiment, the number of zero elements in D changes abruptly at step 16. To show the change in clustering more intuitively, selected representative iteration results are given in Fig. 12. Fig. 12 shows that, when iteration moves from step 16 to step 17, the points in the period 0 to 2000 split into two classes. The clustering picture agrees with the abrupt change in the number of zeros in D, which supports the reasonableness of the method.
Based on the clustering result, a suitable time length is selected for steady-state detection. Given the characteristics of the industrial object, the system is regarded as being at steady state if it stays stable for 500 sampling points. In the clustering result of this embodiment, the system is therefore considered to be at steady state wherever all points in an interval of no fewer than 500 points share the same cluster number. The resulting steady-state detection output is shown in Fig. 13; in Fig. 13(c), 1 means stable and 0 means unstable.
The invention thus has a notable technical effect: its steady-state detection is widely applicable and avoids the sharp increase in computation caused by data dimension when processing data.
The embodiments above do not limit the invention, and the invention is not restricted to them; anything that meets the requirements of the invention falls within its scope of protection.

Claims (4)

1. A system steady-state detection algorithm based on hierarchical clustering, characterized in that:

STEP 1, generate a clustering tree:
1.1) For industrial data {d_i}, i = 1, 2, 3, …, N, containing N sampling points in a continuous time interval, take each sampling point in the set as a class, where d_i denotes the industrial data of the class;
1.2) Calculate the distance between every pair of classes to obtain an N×N matrix A; the element in row p and column q of matrix A is denoted a_pq and represents the distance between class d_p and class d_q, with a_pp = 0;
1.3) Find the minimum inter-class distance a_mn in matrix A, i.e. class m and class n are currently the two closest classes; merge class m and class n into a new class, delete rows m and n and columns m and n from matrix A, and calculate the distances between the remaining classes and the new class in the same way as in step 1.2) to obtain the updated matrix A;
1.4) Repeat steps 1.2) to 1.3) until matrix A becomes a 1×1 matrix, recording m, n and a_mn at each iteration to form the N×3 expression matrix Z of the hierarchical clustering tree, with m, n and a_mn as the first, second and third column values of matrix Z respectively;

STEP 2, threshold selection:
For the expression matrix Z, calculate the clustering-result rationality value D_zero(k) of each iteration in order from the last iteration value to the first iteration value, derive the final threshold from the rationality values obtained, and cluster the industrial data {d_i} with z_3f as the final threshold to obtain the final clustering result sequence T;

STEP 3, judge steady state jointly with the time sequence:
If the elements T_i of the final clustering result sequence T satisfy the following condition, the system is considered to be in steady state from the m-th sampling point to the (m+l-1)-th sampling point:
T_i = c, i = m, m+1, m+2, …, m+l-1
where T_i is the clustering result of the i-th sampling point in the final clustering result sequence T, c is a constant, l = τ/T_s, τ is the time-length threshold, and T_s is the sampling interval of the data;

each calculation in STEP 2 is carried out as follows: cluster the industrial data {d_i} using the current minimum distance value a_mn as the threshold to obtain the clustering result sequence T_k, where k is the iteration index and T_k is a 1×N integer sequence; compute the difference sequence D of T_k; and take the number of zeros in D as the clustering-result rationality value D_zero(k);

in STEP 2, the final threshold is derived from the rationality values as follows: the values D_zero(k) obtained from each calculation form the rationality value sequence D_zero; compute the difference sequence of D_zero, find the largest difference value in it and the index j at which it occurs, and take z_3j as the final threshold.

2. The system steady-state detection algorithm based on hierarchical clustering according to claim 1, characterized in that: the new class obtained by merging in step 1.3) is placed at the end of the updated matrix A; the row and column indices of the remaining classes in matrix A remain unchanged, and the new class is assigned a new row and column index that differs from the indices of all previous classes.

3. The system steady-state detection algorithm based on hierarchical clustering according to claim 1, characterized in that: the distance between the class obtained by merging in step 1.3) and each remaining class in matrix A is calculated as follows. Let s be the label of the class formed by merging class m and class n, and let i be any class other than s; then the distance between class s and class i is:
a_si = α·a_mi + (1-α)·a_ni
where α is a weight parameter, 0 ≤ α ≤ 1, a_si is the distance between class s and class i, a_ni is the distance between class n and class i, and a_mi is the distance between class m and class i.

4. The system steady-state detection algorithm based on hierarchical clustering according to claim 1, characterized in that: in STEP 2, calculating in order from the last iteration value to the first iteration value means calculating until the first iteration value at k = 1 or the intermediate iteration value at k = N/2.
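The merge loop of STEP 1, the weighted distance update of claim 3 and the zero count used in STEP 2 can be sketched in Python as follows. This is an illustrative sketch only, under several assumptions that are not stated in the claims: samples are scalars compared by absolute difference, classes are numbered from 0, merged classes receive the labels N, N+1, … in order of creation, and the function names `build_cluster_tree` and `rationality_value` are hypothetical. The sketch records one row per merge, so N samples yield N - 1 rows.

```python
def build_cluster_tree(data, alpha=0.5):
    """Merge the two closest classes until one remains.

    Returns one (m, n, a_mn) row per merge, i.e. N - 1 rows for N samples.
    """
    n_points = len(data)
    # dist[p][q] holds a_pq for the currently active classes p != q.
    # Assumption: scalar samples compared by absolute difference.
    dist = {
        p: {q: abs(data[p] - data[q]) for q in range(n_points) if q != p}
        for p in range(n_points)
    }
    next_label = n_points  # assumed numbering for merged classes (0-based)
    tree = []
    while len(dist) > 1:
        # Step 1.3: find the closest pair (m, n) among the active classes.
        a_mn, m, n = min(
            (dist[p][q], p, q) for p in dist for q in dist[p] if p < q
        )
        tree.append((m, n, a_mn))
        # Claim 3: a_si = alpha * a_mi + (1 - alpha) * a_ni for each other class i.
        merged = {
            i: alpha * dist[m][i] + (1 - alpha) * dist[n][i]
            for i in dist
            if i not in (m, n)
        }
        # Drop rows and columns m and n, then add the new class s at the end.
        del dist[m], dist[n]
        for i, row in dist.items():
            del row[m], row[n]
            row[next_label] = merged[i]
        dist[next_label] = merged
        next_label += 1
    return tree


def rationality_value(labels):
    """Count the zeros in the first difference of a cluster-label sequence."""
    return sum(1 for prev, cur in zip(labels, labels[1:]) if cur - prev == 0)
```

For example, `build_cluster_tree([0.0, 0.1, 5.0])` merges classes 0 and 1 first and then joins the result with class 2. With `alpha = 0.5` the update coincides with weighted-average (WPGMA) linkage; other values of `alpha` weight the two merged classes unequally.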
CN201610146318.XA 2016-03-15 2016-03-15 A kind of systematic steady state detection algorithm based on hierarchical clustering Expired - Fee Related CN105809203B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610146318.XA CN105809203B (en) 2016-03-15 2016-03-15 A kind of systematic steady state detection algorithm based on hierarchical clustering

Publications (2)

Publication Number Publication Date
CN105809203A CN105809203A (en) 2016-07-27
CN105809203B true CN105809203B (en) 2019-01-18

Family

ID=56467412

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610146318.XA Expired - Fee Related CN105809203B (en) 2016-03-15 2016-03-15 A kind of systematic steady state detection algorithm based on hierarchical clustering

Country Status (1)

Country Link
CN (1) CN105809203B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190118