WO2024098553A1 - Analysis and recognition method, system, and storage medium for electrocardiograms
- Publication number: WO2024098553A1
- Application number: PCT/CN2023/072195
- Authority: WO, WIPO (PCT)
- Prior art keywords: ecg, electrocardiogram, data, prior, arrhythmia
Classifications
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/24—Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
- A61B5/316—Modalities, i.e. specific diagnostic methods
- A61B5/318—Heart-related electrical modalities, e.g. electrocardiography [ECG]
- A61B5/346—Analysis of electrocardiograms
- A61B5/349—Detecting specific parameters of the electrocardiograph cycle
- A61B5/363—Detecting tachycardia or bradycardia
Definitions
- The present invention relates to the field of computer technology, and in particular to a method, system, and storage medium for analyzing and recognizing electrocardiograms.
- Arrhythmias are often accompanied by a series of clinical symptoms and complications and can even be life-threatening. With population aging and changes in lifestyle, the incidence of arrhythmias is rising rapidly and shows an age-related, growing trend. The electrocardiogram (ECG) is a basic tool in medical practice: more than 300 million ECGs are recorded around the world each year, and the ECG plays a key role in diagnosing arrhythmias because it reflects their nature and extent accurately. Detecting arrhythmias from ECG records is a challenging and significant task.
- Deep learning is a family of machine-learning algorithms that model the implicit distribution of data.
- Deep learning algorithms automatically extract the low-level and high-level features required for classification. Therefore, deep learning can better represent the characteristics of data.
- Because the models have many layers and parameters, and therefore sufficient capacity, deep learning models can represent large-scale data. For difficult problems such as images and speech, where the features are not obvious, deep learning can achieve better results given large-scale training data.
- Because deep learning combines feature extraction and classification in one framework and learns the features from data, it removes much of the workload of extracting features manually. It therefore tends to give better results and is also convenient to apply.
- The embodiments of the present invention provide a method, system, and storage medium for analyzing and recognizing electrocardiograms with high accuracy and efficiency.
- An aspect of an embodiment of the present invention provides an analysis and recognition method for an electrocardiogram, comprising:
- the electrocardiogram to be analyzed is analyzed and identified according to the target recognition model, and an identification result of the electrocardiogram to be analyzed is determined.
- performing data enhancement processing on the arrhythmia electrocardiogram dataset according to the concept tree to construct prior data includes:
- the P-QRS-T characteristic waves and bands are divided;
- the K-means method is used to cluster the features of each ECG segment to form prior data
- the prior data of each segment are concatenated to obtain the expanded and enhanced prior data.
- the step of constructing an ECG prior model according to the prior data includes a step of constructing an ECG arrhythmia data set, which includes:
- Data distribution and class similarity analysis is performed on the training data set and the test data set to improve the discrimination of the heart rhythm data in the training data set and the test data set.
- the step of constructing an ECG prior model based on the prior data further includes a step of preprocessing the prior data, which includes:
- the electromyographic signal in the prior data is removed by using a Butterworth low-pass filter
- the power frequency interference signal in the prior data is removed by a 50 Hz finite impulse response notch filter with a Kaiser window function;
- the ECG baseline drift in the prior data is removed by an infinite impulse response zero-phase shift digital filter.
- the method of dividing the P-QRS-T characteristic waves and bands according to the characteristic points obtained by positioning includes:
- the position of the R peak is determined by the R peak detection algorithm
- the time window is traversed starting from the QRS complex to detect the P wave and T wave;
- the ECG signal is decomposed into multiple periodic heartbeats
- each heartbeat cycle is divided into 6 segments: P wave, P-Q interval, QRS complex, S-T interval, T wave, T-P interval; among them, P wave is defined as the signal between P start point and P end point;
- the method of performing time window traversal starting from the QRS complex based on the characteristic points obtained by positioning to detect the P wave and the T wave includes:
- the starting point and the end point of the P wave and the T wave are found by computing, for each point on the signal, its distance to an auxiliary line segment and taking the point at which this distance is largest;
- a 200 ms time window is established before the Q start point of the QRS complex, and a 400 ms time window is established after the S end point;
- the P and T peaks are detected using the same R peak detection algorithm as that used for R peak detection, and the starting and ending points of the P and T waves are determined using local distance transformation.
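The onset and offset search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the helper names (`farthest_from_chord`, `wave_bounds`, `p_window`, `t_window`) are hypothetical, the peak is taken as the plain maximum in the window as a stand-in for the unspecified R peak detection algorithm, and distances mix sample index and amplitude units without scaling.

```python
import math

def farthest_from_chord(signal, start, end):
    """Index in [start, end] whose sample lies farthest from the auxiliary
    line segment joining (start, signal[start]) and (end, signal[end])."""
    if end <= start:
        return start
    dx, dy = end - start, signal[end] - signal[start]
    length = math.hypot(dx, dy)
    best_idx, best_dist = start, 0.0
    for i in range(start, end + 1):
        # Perpendicular distance from the point (i, signal[i]) to the line.
        dist = abs(dy * (i - start) - dx * (signal[i] - signal[start])) / length
        if dist > best_dist:
            best_idx, best_dist = i, dist
    return best_idx

def wave_bounds(signal, lo, hi):
    """Return (onset, peak, offset) of an upright wave inside window [lo, hi]."""
    # Assumption: the wave peak is the largest sample in the window.
    peak = max(range(lo, hi + 1), key=lambda i: signal[i])
    onset = farthest_from_chord(signal, lo, peak)
    offset = farthest_from_chord(signal, peak, hi)
    return onset, peak, offset

def p_window(q_onset, fs):
    """200 ms window ending at the Q start point (fs in Hz, integer)."""
    return max(0, q_onset - (200 * fs) // 1000), q_onset

def t_window(s_end, fs, n_samples):
    """400 ms window starting at the S end point."""
    return s_end, min(n_samples - 1, s_end + (400 * fs) // 1000)

# Synthetic example: a triangular "P wave" from sample 100 to 160, peak at 130.
fs = 1000
signal = [0.0] * 300
for i in range(100, 161):
    signal[i] = float(i - 100) if i <= 130 else float(160 - i)
lo, hi = p_window(250, fs)
bounds = wave_bounds(signal, lo, hi)
```

An inverted wave would need the minimum instead of the maximum as the peak; the sketch does not handle that case.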
- determining the control point of each waveform to form ECG features includes:
- the number of control points and the number of indexes, the control points in each band are calculated, and the corresponding ECG features are determined.
- the K-means method is used to cluster the features of each ECG segment to form prior data, including:
- the distance between the instance and the cluster centroid is calculated using the Euclidean distance
- the bands are clustered using a K-means method
- each band is assigned a cluster number based on the cluster category to which it belongs.
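A minimal sketch of this clustering step is given below. It assumes each band has already been reduced to a fixed-length feature vector (the two-dimensional vectors here are made up), and it uses a deterministic farthest-point initialisation, which is an assumption because the source does not say how the centroids are initialised.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iterations=100):
    """Cluster feature vectors; return (cluster number per point, centroids)."""
    # Assumed initialisation: first point, then repeatedly the point that is
    # farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(euclidean(p, c) for c in centroids))
        )
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical feature vectors for six instances of one band (e.g. six P waves).
band_features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
cluster_numbers, centroids = kmeans(band_features, k=2)
```

The returned cluster numbers are what would be attached to each band as its code.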
- the method further comprises:
- the concatenated ECG signals are classified using a naive Bayes classifier.
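The classification step can be illustrated with a small categorical naive Bayes classifier over concatenated cluster numbers. This is a sketch under assumptions: the class `CategoricalNB`, the Laplace smoothing parameter `alpha`, and the toy three-band codes are illustrative and do not come from the source, which does not specify the naive Bayes variant.

```python
import math
from collections import Counter, defaultdict

class CategoricalNB:
    """Naive Bayes over tuples of categorical codes with Laplace smoothing."""

    def fit(self, rows, labels, alpha=1.0):
        self.alpha = alpha
        self.classes = sorted(set(labels))
        self.class_count = Counter(labels)
        self.total = len(labels)
        self.counts = defaultdict(Counter)   # (class, position) -> code counts
        self.values = defaultdict(set)       # position -> codes seen in training
        for row, label in zip(rows, labels):
            for pos, code in enumerate(row):
                self.counts[(label, pos)][code] += 1
                self.values[pos].add(code)
        return self

    def log_posterior(self, row, label):
        """Unnormalised log posterior: log prior plus per-position log likelihoods."""
        score = math.log(self.class_count[label] / self.total)
        for pos, code in enumerate(row):
            numerator = self.counts[(label, pos)][code] + self.alpha
            denominator = self.class_count[label] + self.alpha * len(self.values[pos])
            score += math.log(numerator / denominator)
        return score

    def predict(self, row):
        return max(self.classes, key=lambda label: self.log_posterior(row, label))

# Hypothetical heartbeats, each encoded as the cluster numbers of three bands.
train_rows = [(0, 1, 0), (0, 1, 1), (2, 0, 1), (2, 0, 0)]
train_labels = ["normal", "normal", "PVC", "PVC"]
model = CategoricalNB().fit(train_rows, train_labels)
```

In the described method each heartbeat would carry one code per band (six in total); three are used here only to keep the example short.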
- Another aspect of the embodiment of the present invention further provides an analysis and recognition system for an electrocardiogram, comprising:
- the first module is used to construct a concept tree based on the definition of electrocardiogram and arrhythmia knowledge, and establish an arrhythmia electrocardiogram data set;
- the second module is used to perform data enhancement processing on the arrhythmia electrocardiogram data set according to the concept tree to construct prior data;
- the third module is used to construct an ECG prior model according to the prior data
- the fourth module is used to optimize the ECG prior model to obtain a target recognition model
- the fifth module is used to analyze and identify the electrocardiogram to be analyzed according to the target recognition model, and determine the recognition result of the electrocardiogram to be analyzed.
- Another aspect of the embodiments of the present invention further provides a computer-readable storage medium, wherein the storage medium stores a program, and the program is executed by a processor to implement the method described above.
- the embodiment of the present invention also discloses a computer program product or a computer program, which includes a computer instruction stored in a computer-readable storage medium.
- a processor of a computer device can read the computer instruction from the computer-readable storage medium, and the processor executes the computer instruction, so that the computer device executes the above method.
- the embodiment of the present invention constructs a concept tree according to the definition of electrocardiogram and arrhythmia knowledge, and establishes an arrhythmia electrocardiogram data set; performs data enhancement processing on the arrhythmia electrocardiogram data set according to the concept tree, and constructs prior data; constructs an ECG prior model according to the prior data; optimizes the ECG prior model to obtain a target recognition model; analyzes and recognizes the electrocardiogram to be analyzed according to the target recognition model, and determines the recognition result of the electrocardiogram to be analyzed.
- the present invention has high accuracy and high efficiency.
- FIG. 1 is an overall technical roadmap provided by an embodiment of the present invention.
- FIG. 2 is a flowchart of processing an arrhythmia data set provided by an embodiment of the present invention.
- FIG. 3 is a schematic diagram of different heartbeat positioning points provided by an embodiment of the present invention.
- FIG. 4 is a flowchart of heartbeat splicing and encoding provided by an embodiment of the present invention.
- FIG. 5 is a flowchart of the overall steps of an embodiment of the present invention.
- the ECG categories classified and identified by the present invention are as follows:
- AFL: atrial flutter
- VT: ventricular tachycardia
- AT: atrial tachycardia
- JT: atrioventricular junctional tachycardia
- SA block: sinoatrial conduction block
- IV block: intraventricular conduction block
- AV block: atrioventricular conduction block
- PVC: premature ventricular contraction
- PJC: atrioventricular junctional premature contraction
- JER: atrioventricular junctional escape rhythm
- Electrocardiogram is a technique that uses an electrocardiograph to record the graph of electrical activity changes produced by each cardiac cycle from the body surface. It reflects a series of changes in the occurrence, propagation and recovery of cardiac excitement.
- the electrical excitation of a normal heart starts from the sinoatrial node. Since the sinoatrial node is located at the junction of the right atrium and the superior vena cava, the excitation of the sinoatrial node is first transmitted to the right atrium, and then to the left atrium through the atrial bundle, forming the P wave on the electrocardiogram.
- the P wave represents the excitation of the atria, the first half represents the excitation of the right atrium, and the second half represents the excitation of the left atrium.
- The normal P wave lasts less than 0.12 seconds and is less than 0.25 mV high. When the atria are enlarged or conduction between the two atria is abnormal, the P wave may appear as a tall, peaked P wave or a double-peaked P wave.
- the PR interval represents the time required for the excitement generated by the sinoatrial node to reach the ventricles via the atria, atrioventricular junction and atrioventricular bundle and cause the ventricular muscle to begin to be excited, so it is also called atrioventricular conduction time.
- the normal PR interval is 0.12 to 0.20 seconds. When the conduction from the atria to the ventricles is blocked, it manifests as a prolongation of the PR interval or the disappearance of the ventricular wave after the P wave.
- QRS complex: the excitation travels down through the His bundle and the left and right bundle branches and excites the left and right ventricles synchronously, forming the QRS complex.
- The QRS complex represents the depolarization of the ventricles, and its duration is normally less than 0.11 seconds.
- When ventricular conduction is abnormal, for example in bundle branch block, the QRS complex widens, deforms, and lasts longer.
- J point: the junction where the QRS complex ends and the ST segment begins. It represents the completion of depolarization of all ventricular myocytes.
- ST segment: the period during which the ventricular muscle is completely depolarized but repolarization has not yet begun. At this time, the ventricular muscle in all parts is in a depolarized state, and there is no potential difference between cells. Therefore, under normal circumstances, the ST segment should lie on the isoelectric line. When part of the myocardium is ischemic or necrotic, a potential difference remains in the ventricle after depolarization, which appears as a deviation of the ST segment on the electrocardiogram.
- T wave: the T wave that follows represents the repolarization of the ventricles.
- In leads where the main deflection of the QRS complex is upward, the T wave should point in the same direction as that deflection.
- Changes in the T wave on the electrocardiogram are affected by many factors. For example, myocardial ischemia may show a flat or inverted T wave. Tall, peaked T waves can be seen in hyperkalemia, the hyperacute phase of acute myocardial infarction, etc.
- U wave: a U wave can be seen after the T wave in some leads and is currently believed to be related to ventricular repolarization.
- QT interval: represents the time from the start of ventricular depolarization to the end of repolarization.
- The normal QT interval is up to about 0.44 seconds. Since the QT interval is affected by heart rate, the concept of the corrected QT interval (QTc) is introduced.
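The source does not state which heart-rate correction is meant. As an illustration only, the sketch below uses Bazett's formula, QTc = QT / sqrt(RR), one widely used correction, with QT and RR in seconds; the function names are hypothetical.

```python
import math

def rr_from_heart_rate(beats_per_minute):
    """RR interval in seconds for a given heart rate."""
    return 60.0 / beats_per_minute

def qtc_bazett(qt_seconds, rr_seconds):
    """Corrected QT interval by Bazett's formula (an assumed choice)."""
    return qt_seconds / math.sqrt(rr_seconds)

# At 60 beats/minute the RR interval is 1 s, so QTc equals the measured QT.
qtc_at_60 = qtc_bazett(0.40, rr_from_heart_rate(60))
# At 100 beats/minute the same correction lengthens a measured QT of 0.36 s.
qtc_at_100 = qtc_bazett(0.36, rr_from_heart_rate(100))
```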
- Electrocardiogram leads: the heart is a three-dimensional structure. In order to reflect the electrical activity of different planes of the heart, electrodes are placed on different parts of the human body to record the electrical activity of the heart. During a routine electrocardiogram examination, usually only 4 limb lead electrodes and 6 chest lead electrodes from V1 to V6 are placed to record a conventional 12-lead electrocardiogram.
- Cardiac conduction system is composed of special myocardial cells located in the myocardium that can generate and conduct impulses, including the sinoatrial node, internodal bundle, atrioventricular node, atrioventricular bundle, right bundle branch, left bundle branch and Purkinje fibers.
- the sinoatrial node is the pacemaker of normal heart rate, located below the epicardium between the entrance of the superior vena cava and the right auricle;
- the internodal bundle is the conduction pathway between the sinoatrial node and the atrioventricular node, which is divided into three conduction bundles: the anterior internodal bundle, the middle internodal bundle and the posterior internodal bundle.
- the anterior internodal bundle sends a branch to the left atrium called the interatrial bundle (Bachmann's bundle).
- the atrioventricular node is located below the endocardium on the right side of the atrial septum, lying horizontally in the area between the coronary sinus ostium, the oval fossa and the upper edge of the tricuspid valve septum, and extends downward as the atrioventricular bundle.
- the atrioventricular node and the atrioventricular bundle (His bundle) form the atrioventricular junction, and then extend forward and downward to the lower end of the interventricular septum membrane, and are divided into left and right bundle branches, which are located below the endocardium on the left and right sides of the interventricular septum, respectively.
- the left bundle branch starts at the left side of the interventricular septum and is divided into two bundles of fibers, the anterior branch and the posterior branch; the right bundle branch descends along the right side of the interventricular septum until it begins to branch into Purkinje fibers at the apex.
- the right bundle branch is connected to the Purkinje fiber network below the endocardium and finally to the ventricular muscle.
- the function of the cardiac conduction system is to generate impulses and transmit them to various parts of the heart, causing the atrial and ventricular muscles to contract in a certain rhythm.
- the conduction system of the heart includes the sinoatrial node, atrioventricular node, atrioventricular bundle, left and right atrioventricular bundle branches, and many fine branches distributed to the ventricular papillary muscles and ventricular walls.
- pacemaker cells participating in the formation of the sinoatrial node and atrioventricular node
- transitional cells playing the role of conducting impulses
- Purkinje fibers which can transmit impulses quickly.
- the Purkinje fibers at the end of the atrioventricular bundle branches are connected to the ventricular muscle.
- the function of the cardiac conduction system is to generate and conduct impulses to maintain the rhythmic beating of the heart.
- Arrhythmia is caused by abnormal excitation of the sinus node or excitation generated outside the sinus node, slow conduction, blockage or conduction through abnormal channels, that is, the origin and (or) conduction disorder of cardiac activity leads to abnormal frequency and (or) rhythm of heart beats.
- Arrhythmia is an important group of cardiovascular diseases. It can occur alone or with other cardiovascular diseases. Its prognosis is related to the cause, inducement, evolution trend of arrhythmia, and whether it leads to severe hemodynamic disorders. It can occur suddenly and cause sudden death, or it can continue to affect the heart and cause its failure.
- Premature beats: also known as premature contractions, these are heart beats caused by premature impulses arising in a certain part of the heart. They are classified as atrial, junctional, or ventricular according to where they arise. Premature beats may not cause symptoms. If there is no organic heart disease, the prognosis is good. Some patients may experience palpitations, dizziness, and fatigue, which can be treated symptomatically. If there is organic heart disease, the underlying heart disease should be treated.
- Atrial flutter and atrial fibrillation: during atrial flutter, the atrial rate is often between 220 and 360 beats/minute and generally cannot be fully transmitted to the ventricles. Because of physiological atrioventricular block, 2:1 or 3:1 conduction results, and occasionally there is 1:1 atrioventricular conduction. Atrial fibrillation is a very fast arrhythmia with multiple micro-reentry circuits in the atria, with an atrial rate of 350 to 600 beats/minute and an irregular ventricular rate of 120 to 160 beats/minute. Atrial flutter and atrial fibrillation are common in rheumatic heart disease, hyperthyroidism, coronary heart disease, cardiomyopathy, and hypertensive heart disease. In many patients with atrial fibrillation the cause is unknown.
- Paroxysmal supraventricular tachycardia: a paroxysmal, rapid, and regular ectopic rhythm, with a heart rate of generally 160 to 220 beats/minute, although it can be as slow as 130 beats/minute or as fast as 300 beats/minute. According to its mechanism, it is divided into three categories: atrial, atrioventricular node reentry, and atrioventricular bypass reentry. It is common in people without organic heart disease and with no known cause, and can also be seen in rheumatic heart disease, cardiomyopathy, coronary heart disease, etc. Clinically, attacks start suddenly, last from a few seconds or minutes to hours or even days, and stop suddenly. Severe attacks can cause insufficient blood supply to organs such as the heart and brain, leading to a drop in blood pressure, dizziness, nausea, angina pectoris, or fainting.
- Ventricular tachycardia: three or more consecutive ventricular premature beats constitute ventricular tachycardia, which is more common in patients with organic heart disease. Sustained ventricular tachycardia lasts for more than 30 seconds or causes severe hemodynamic disorders within 30 seconds, and non-sustained ventricular tachycardia terminates on its own within 30 seconds. Torsades de pointes is a special type of ventricular tachycardia that is more common in long QT syndrome, which is divided into congenital and acquired types. If ventricular tachycardia is not treated in time, it can turn into ventricular fibrillation. Ventricular fibrillation is the most serious arrhythmia and requires immediate electrical defibrillation to convert the heart rhythm.
- Bradycardia: an adult heart rate lower than 60 beats/minute is called bradycardia, which may be caused by sick sinus syndrome or atrioventricular block.
- The myocardial cells in the sinoatrial node, the interatrial bundle, the vicinity of the coronary sinus ostium, the distal end of the atrioventricular node, and the His-Purkinje system have automaticity. Changes in the excitability of the autonomic nervous system, or intrinsic lesions of these tissues, can lead to inappropriate impulse formation.
- Myocardial cells that originally have no automaticity, such as atrial and ventricular cells, can also develop abnormal automaticity under pathological conditions such as myocardial ischemia, drugs, electrolyte disorders, and increased catecholamines.
- Reentry is the most common mechanism of all tachyarrhythmias.
- arrhythmias are classified into the following two categories:
- (1) Sinus tachycardia (2) Sinus bradycardia (3) Sinus arrhythmia (4) Sinus arrest
- An embodiment of the present invention provides an electrocardiogram analysis and recognition method. As shown in FIG. 5, the method includes the following steps:
- the overall implementation process of the present invention includes five major steps:
- ECG and arrhythmia concept tree: sort out and redefine the knowledge of ECG and arrhythmia to form a concept tree.
- the characteristic points in the electrocardiogram data are located, and the P-QRS-T characteristic waves and bands are segmented according to the characteristic points;
- 1. Electrocardiogram and arrhythmia concept tree:
- the embodiment of the present invention systematically sorts out the categories of arrhythmias to form an arrhythmia concept tree. It can be understood that the arrhythmia concept tree can be classified according to the location of the occurrence of arrhythmias.
- the present invention can diagnose and classify 18 ECG categories.
- The data annotation, processing, and expression process of an embodiment of the present invention is shown in FIG. 2.
- The upper part of FIG. 2 is the process of subdividing and annotating samples, and the lower part is the process of processing and expressing ECG data.
- ECG annotations and ECG data together constitute the data set used by the subsequent ECG model.
- The present invention obtains ECG data from the relevant hospital database and labels the ECG data of each participant as normal or as one of 17 arrhythmias.
- the 17 arrhythmias are: ventricular premature beats (PVC), intraventricular block (IV block), ventricular tachycardia (VT), ventricular escape (VE), atrial flutter (AFL), atrial tachycardia (AT), atrial fibrillation (AF), atrial premature beats (PAC), junctional premature beats (PJC), junctional escape (JE), junctional tachycardia (JT), junctional escape rhythm (JER), atrioventricular block (AV block), sinoatrial block (SA block), sinus tachycardia (ST), sinus bradycardia (SB), sinus arrhythmia (SA).
- The 17 arrhythmias can be divided into 4 supercategories:
- sinus: SA block, ST, SB, and SA
- atrial: AFL, AT, AF, and PAC
- atrioventricular node: PJC, JE, JT, JER, and AV block
- ventricular: PVC, IV block, VT, and VE
- The embodiment of the present invention excludes some ECG cases for the following reasons: (1) The ECG is severely distorted by missing signals or excessive noise. Because an ECG records the electrical activity of the heart through electrodes placed on the skin, large movements and noisy surroundings may add noise that cannot be removed. This distortion is more common in children's electrocardiograms. (2) The label of the ECG case is unavailable or uncertain. (3) For participants with multiple ECG tests, only the last ECG test is used and the other tests are excluded, to avoid introducing bias among participants. In total, 48,063 participants were excluded. The entire dataset was randomly divided into a training dataset and a test dataset at a ratio of 4:1 at the label level.
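One reading of "at a ratio of 4:1 at the label level" is a stratified split, in which every label is divided 4:1 on its own so that rare arrhythmias appear in both sets. The sketch below follows that interpretation, which is an assumption; the function name, the seed, and the toy labels are illustrative.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices into train and test sets, label by label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)
    train, test = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)                       # random assignment within a label
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)

# Toy data set: 10 normal records and 5 PVC records.
labels = ["normal"] * 10 + ["PVC"] * 5
train_idx, test_idx = stratified_split(labels)
```

With very small classes, rounding can leave a label with no test samples, so a real pipeline would need a rule for that case.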
- the embodiment of the present invention also analyzes the data distribution and class similarity of the constructed training data set and test data set.
- the standard deviation of the sample size for all arrhythmia types is 5249.63.
- the mean and median of the sample size are 2670 and 106, respectively.
- the imbalance of the data set in the sample size may prevent the model from learning all arrhythmia categories equally.
- The embodiment of the present invention constructs a category-level similarity matrix in which each element is the similarity between two arrhythmia categories.
- Category-level similarity is the average similarity over all pairs of ECG signals drawn from the two arrhythmia categories.
- Dynamic time warping (DTW) is used to measure the similarity between two ECG signals.
- the similarity matrix is normalized to the range of [0,1] and sorted from the minimum to the maximum.
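The category-level matrix described above can be sketched as follows. Several details are assumptions: the classic DTW recurrence with absolute difference as the local cost, the exclusion of a signal's comparison with itself, and the use of min-max scaling for the normalization to [0, 1]. DTW yields a distance, so in this sketch smaller normalized values mean more similar categories; the source does not say how distance is turned into similarity.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def category_matrix(signals_by_class):
    """Average pairwise DTW distance per category pair, min-max scaled to [0, 1]."""
    names = sorted(signals_by_class)
    raw = {}
    for p in names:
        for q in names:
            pairs = [
                (x, y)
                for i, x in enumerate(signals_by_class[p])
                for j, y in enumerate(signals_by_class[q])
                if p != q or i != j          # skip a signal paired with itself
            ]
            total = sum(dtw_distance(x, y) for x, y in pairs)
            raw[(p, q)] = total / len(pairs) if pairs else 0.0
    lowest, highest = min(raw.values()), max(raw.values())
    span = (highest - lowest) or 1.0
    return {key: (value - lowest) / span for key, value in raw.items()}

# Toy signals: class "A" beats sit near 0, class "B" beats sit near 5.
toy = {"A": [[0, 1, 0], [0, 1, 1, 0]], "B": [[5, 6, 5], [5, 5, 6, 5]]}
matrix = category_matrix(toy)
```

This quadratic-time DTW is only practical for short signals or small samples of each category.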
- Arrhythmias are unevenly distributed in the representation space. SB, JER, SA, JE, AV block, PJC, PVC, AT and AF are relatively similar. Those similar arrhythmia types may be more likely to fool the model. However, other arrhythmia types are more dispersed and easier to distinguish.
- The ECG signal is a weak physiological signal that is easily disturbed during acquisition. Therefore, the ECG signal needs to be preprocessed before analysis.
- The three most common types of interference in the ECG signal are electromyographic (EMG) interference, power frequency interference, and baseline drift.
- The preprocessing in the embodiment of the present invention targets these three types of interference.
- EMG is also a physiological signal and is the main noise in ECG.
- the frequency of EMG is related to the muscle type and is generally in the range of 30-300HZ, while the frequency of ECG is mainly in the range of 5-20HZ. Therefore, EMG signals can overlap with ECG signals.
- a Butterworth low-pass filter is used to remove EMG.
- the Butterworth low-pass filter has the flattest passband frequency response curve and gradually decreases to zero with the adjustment of the stopband.
- the amplitude of the diagonal frequency decreases monotonically, and the higher the filter order, the faster the amplitude decays in the stopband.
- the interference signal with a frequency of 50Hz is the most common interference signal.
- a 50Hz finite impulse response (FIR) notch filter with Kaiser window function is used to eliminate the power frequency signal.
- the FIR filter has the linear phase characteristics required for ECG signal processing and can obtain the best filtering performance with minimal waveform distortion.
- the Kaiser window is a near-optimal window function whose parameters can be adjusted to tune the filter for different requirements.
- an infinite impulse response zero-phase shift digital filter is used to remove ECG baseline drift.
- as a routine preprocessing step for ECG analysis, it prevents the introduction of artifacts that may distort the true oscillation phase.
- after preprocessing, the main noise is removed while the key information of the ECG signal is retained.
- the embodiment of the present invention locates characteristic points such as P-QRS-T in the electrocardiogram data and divides the bands according to the characteristic points.
- the main steps include:
- This embodiment uses an adaptive and efficient R peak detection algorithm to determine the position of the R peak. Then, the Q peak and S peak are found by searching on both sides of the R peak. Since multiple peaks rarely appear in a QRS complex, a 250ms moving window is used to iteratively query both sides. The minimum value in the first window on the left is the position of the Q peak. The minimum value of the first window on the right is the S peak.
- this embodiment proposes an improved boundary detection algorithm based on local distance transformation.
- the local distance transformation finds the starting and ending points of the wave by calculating the maximum distance between the starting and ending points of the auxiliary line segments at each point on the signal. From a morphological point of view, this point is the point with the largest curvature, which is consistent with the doctor's subjective judgment.
- a 200ms time window is established before the start of Q and a 400ms time window is established after the end of S.
- the P and T peaks are detected using the same detection algorithm as the R peak. Local distance transform is also used here to determine the start and end of the P and T waves.
- FIG. 3 shows an example where each heartbeat is divided into 6 bands by 11 positioning points.
- a total of 11 points are identified for each heartbeat, namely P start point, P point, P end point, Q start point, Q point, R point, S point, S end point, T start point, T point, T end point.
- the division of the signal into heartbeats and the division of the waves within each heartbeat are both based on the positioning of these feature points.
- this embodiment decomposes the ECG signal into multiple periodic heartbeats.
- the entire data set of this embodiment generates a total of 459,818 heartbeats.
- the number of ECG samples and the number of heartbeat cycles in each category are shown in Table 1.
- Table 1 shows the number of heartbeats decomposed by the ECG signal in each category.
- Each heartbeat cycle is further divided into 6 segments based on the positioning points: P wave, PQ interval, QRS complex, ST interval, T wave, and TP interval.
- the P wave is defined as the signal between the P start point and the P end point, and so on for the definitions of other segments.
- Each heartbeat corresponds to 6 segments, so the number of instances in each band is also 459818, which is equal to the number of samples of the heartbeat.
- This embodiment extracts control points for each band, and uses the extracted control points and their corresponding indexes as representations of each band.
- the nth control point of the mth instance can be obtained by the following equation: $c_m^{(n)} = \mathrm{sig}_m\big(\mathrm{idx}_m^{(n)}\big)$
- $c_m^{(n)}$ represents the value of the nth control point of the mth instance, $\mathrm{sig}_m(*)$ represents the signal value of the mth instance at index *, and $\mathrm{idx}_m^{(n)}$ indicates the corresponding index value of the nth control point.
- 12 control points will be extracted for each P wave, 6 control points for the P-Q interval, 20 control points for the QRS wave, 6 control points for the S-T interval, 14 control points for the T wave, and 18 control points for the T-P interval. 82 control points will be extracted for each heartbeat cycle.
- each cycle has 6 bands, and there are 459818*6 instances in total.
- the index and its ECG signal value are calculated separately, and finally 459818*82 pairs of control points (index, ECG signal value) are obtained.
- the K-means method is used to cluster the bands to form prior data.
- the distance between the instance and the cluster centroid is measured using the Euclidean distance.
- the algorithm stops when the centroid does not change significantly in the iteration.
- the number of clusters is one of the hyperparameters, which has 18 options: 3, 5, 8, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80.
- the number of clusters is set to 25. After the number of clusters is determined, when clustering is completed, a cluster number is assigned to each band according to the cluster category to which each band belongs.
- This section splices the data fragments and expands the prior data.
- multiple original heartbeats are spliced together as a spliced ECG signal.
- the spliced ECG signal can be regarded as a replacement of the original heartbeat, and repetition is allowed in the replacement.
- the spliced ECG signal can simulate the ECG signal that does not exist in the training data set, further enhancing the comprehensiveness of the training data set.
- No, the number of original heartbeats to be spliced, is another controlled hyperparameter, which has 4 options: 2, 3, 4, 5.
- for these four options of No, the N values in the training dataset are 2869966, 3021293, 11486438, and 25508046, respectively. Analyzing such a large number of ECG signals exceeds the computing resources of this embodiment, so the upper limit of the number of spliced ECG signals in each class is set to 1500000 in the training dataset.
- in the test dataset, 20 spliced ECG signals are randomly selected for each original ECG signal. The aggregated prediction over the 20 spliced ECG signals is used as the final prediction of the original ECG signal.
- the spliced ECG signal can be encoded as a vector.
- Each heartbeat has 6 segments on each lead.
- a cluster number is assigned to each band.
- Each heartbeat has 6 bands and 6 cluster numbers.
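- Formula 8 can be derived as: $P(y \mid x_1, \ldots, x_n) = \dfrac{P(y)\prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, \ldots, x_n)}$
- $N_{iac}$ represents the number of spliced ECG signals of class c whose i-th feature takes the value a
- $N_c$ represents the number of spliced ECG signals in class c.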
- the predicted label of the original ECG signal is the average predicted label of the spliced ECG signals.
- a feature engineering + classifier mode is used to construct a control model.
- 114 features were extracted from the ECG signal. Among them, 10 features (such as the average value of the P wave peak) were defined on each of the 9 individual lead signals, resulting in 90 (10 × 9) scalar features. The remaining 24 features relate to the recording time (that is, the position in the time dimension) at which the segment key points appear.
- KNN K-nearest neighbor
- RF random forest
- XGBoost extreme gradient boosting
- the first one is 1D CNN, which achieves state-of-the-art performance in multi-classification of arrhythmia subtypes.
- the second one is LSTM, which is proposed for ECG signal classification.
- recall (sensitivity)
- macro-recall refers to the average recall of all individual classes.
- each class has the same weight when calculating the average recall, rather than setting the weight according to the sample size of each class. Therefore, macro-recall is fair at the class level and more sensitive to the performance of minority classes.
- the recall rate with a 95% confidence interval was calculated using a non-parametric bootstrap method with 1000 iterations.
- the linear correlation between the recall rate of 18 categories and the sample size or class similarity is measured by the correlation coefficient (CC). The relationship of the CC value is interpreted as: very weak (0.00-0.19), weak (0.20-0.39), moderate (0.40-0.59), strong (0.60-0.79) and very strong (0.80-1.00).
- arrhythmias can be further divided into strong arrhythmias and weak arrhythmias in computer-aided diagnosis.
- the crowding out of weak arrhythmias by strong arrhythmias may lead to underdiagnosis of patients with weak arrhythmias.
- the present invention proposes an arrhythmia diagnosis method that combines electrocardiogram segment clustering and Bayesian theory.
- the GDPH ECG arrhythmia dataset is used to verify the method of the present invention. Through hyperparameter optimization, the optimal configuration of the method of the present invention in heartbeat splicing and segment clustering was determined.
- the method of the present invention has comparable performance in strong arrhythmias, but better performance in weak arrhythmias.
- the method of the present invention can still make accurate diagnosis of weak arrhythmias.
- the morphological features of ECG signals are key information for cardiologists to diagnose arrhythmias.
- the present invention uses segmented clustering to distinguish ECG signals with different morphological features.
- the arrangement of ECG signals at the beat level can effectively enrich the sample size of weak arrhythmias.
- the splicing of multiple heartbeats increases the dimensionality of the ECG signal, which may increase the distance between different arrhythmias in the representation space.
- the method of the present invention can alleviate the squeeze of weak arrhythmias by strong arrhythmias to a certain extent.
- An interpretable ECG computer-assisted interpretation system will be more trusted by cardiologists and therefore easier to use.
- features are encoded by segment cluster numbers with clear practical meanings. Diagnostic decisions are made based on the conditional probabilities of arrhythmia subsegments, which are simplified mathematical descriptions of the cardiologist's previous diagnostic experience. Therefore, some morphological features of arrhythmias may be implicit in segment clusters with higher conditional probabilities. Some of the discovered morphological features match the current arrhythmia diagnostic criteria. Other findings may not match the current diagnostic criteria. However, these mismatched findings may hide new diagnostic markers for arrhythmias.
- the joint conditional probability of multiple segments may be a feasible method for discovering new diagnostic markers.
- the present invention provides a complete set of systems, platforms and storage media including electrocardiogram processing methods and intelligent diagnosis algorithms for arrhythmias.
- the methods include: 1. Electrocardiogram interference elimination algorithm based on signal filtering technology; 2. P-QRS-T feature point positioning and band segmentation algorithm based on extreme cycle iteration; 3. Band control point selection and band unsupervised classification algorithm; 4. Cardiac cycle splicing and encoding algorithm based on cutting-splicing data augmentation.
- the present invention can convert and enhance the original electrocardiogram into modeling code.
- the intelligent diagnosis algorithm for arrhythmias of the present invention models the learning and cognitive abilities of humans through a Bayesian-based dynamic programming algorithm, and accurately diagnoses 17 types of arrhythmias beyond previous technologies; the model also has the ability to learn with a single sample or a small sample, can identify and judge rare or unseen arrhythmias, and generalizes applications in a manner similar to that of humans.
- the method of the present invention can also be seamlessly connected with multiple systems in a hospital to form a system, platform and storage medium for an electrocardiogram processing method and an intelligent arrhythmia diagnosis algorithm, which can be applied to clinical electrocardiogram diagnosis practice.
- the function/operation mentioned in the block diagram may not occur in the order mentioned in the operation diagram.
- the two boxes shown in succession can actually be executed substantially simultaneously or the boxes can sometimes be executed in reverse order.
- the embodiment presented and described in the flow chart of the present invention is provided by way of example, for the purpose of providing a more comprehensive understanding of the technology. The disclosed method is not limited to the operation and logic flow presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of a larger operation are performed independently.
- if the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
- the computer software product is stored in a storage medium, including several instructions for a computer device (which can be a personal computer, server, or network device, etc.) to perform all or part of the steps of the methods described in each embodiment of the present invention.
- the aforementioned storage media include: U disk, mobile hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), disk or optical disk, and other media that can store program codes.
- the logic and/or steps represented in the flowchart or otherwise described herein, for example, can be considered as an ordered list of executable instructions for implementing logical functions, and can be embodied in any computer-readable medium for use by an instruction execution system, device or apparatus (such as a computer-based system, a system including a processor, or other system that can fetch instructions from an instruction execution system, device or apparatus and execute instructions), or in conjunction with such instruction execution systems, devices or apparatuses.
- "computer-readable medium” can be any device that can contain, store, communicate, propagate or transmit a program for use by an instruction execution system, device or apparatus, or in conjunction with such instruction execution systems, devices or apparatuses.
- computer-readable media include the following: an electrical connection with one or more wires (electronic device), a portable computer disk case (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable and programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disk read-only memory (CDROM).
- the computer-readable medium may even be a paper or other suitable medium on which the program is printed, since the program may be obtained electronically, for example, by optically scanning the paper or other medium, followed by editing, deciphering or, if necessary, processing in another suitable manner, and then stored in a computer memory.
- a plurality of steps or methods can be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system.
- a discrete logic circuit having a logic gate circuit for implementing a logic function for a data signal
- a dedicated integrated circuit having a suitable combination of logic gate circuits
- PGA programmable gate array
- FPGA field programmable gate array
Abstract
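An analysis and recognition method, system and storage medium for electrocardiograms. The method comprises: constructing a concept tree from the definitions of electrocardiogram and arrhythmia knowledge, and establishing an arrhythmia electrocardiogram dataset; performing data augmentation on the arrhythmia electrocardiogram dataset according to the concept tree to construct prior data; constructing an ECG prior model from the prior data; optimizing the ECG prior model to obtain a target recognition model; and analyzing and recognizing an electrocardiogram to be analyzed with the target recognition model to determine its recognition result. The method is accurate and efficient and can be widely applied in the field of computer technology.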
Description
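The present invention relates to the field of computer technology, and in particular to an analysis and recognition method, system and storage medium for electrocardiograms.
Arrhythmia is often accompanied by a series of clinical symptoms and complications and can even be life-threatening. With population ageing and lifestyle changes, the incidence of arrhythmia is rising rapidly and increasingly affects younger people. The ECG is a basic tool in medical practice: more than 300 million electrocardiograms are recorded worldwide each year, and the ECG plays a key role in diagnosing arrhythmia. An electrocardiogram reflects the nature and severity of an arrhythmia fairly accurately. Detecting arrhythmia from ECG recordings is a challenging and highly significant task.
At present the error rate of machine ECG interpretation is still very high, and clinical ECG examination still relies on visual inspection by experienced physicians. Manual interpretation is limited by the skill and number of available specialists; the process is tedious, time-consuming and inefficient, lacks a unified objective standard, and is prone to misdiagnosis and missed diagnosis.
In recent years the rapid development of deep learning has allowed computers to show cognitive abilities close to, or even at, human level in many tasks. Deep learning is a class of machine learning algorithms that model multi-layer representations of the latent distribution of data. In other words, deep learning algorithms automatically extract the low-level or high-level features needed for classification. Deep learning can therefore represent the features of data better, and because such models have many layers and parameters and thus ample capacity, they are able to represent large-scale data. For difficult problems such as images and speech, where features are not obvious, deep learning can achieve better results given large-scale training data. Moreover, because deep learning combines features and classifier in a single framework and learns the features from data, it removes much of the workload of manual feature extraction; it therefore not only performs better but is also convenient to apply.
Summary of the Invention
In view of this, embodiments of the present invention provide an accurate and efficient analysis and recognition method, system and storage medium for electrocardiograms.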
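One aspect of the embodiments of the present invention provides an analysis and recognition method for electrocardiograms, comprising:
constructing a concept tree from the definitions of electrocardiogram and arrhythmia knowledge, and establishing an arrhythmia electrocardiogram dataset;
performing data augmentation on the arrhythmia electrocardiogram dataset according to the concept tree to construct prior data;
constructing an ECG prior model from the prior data;
optimizing the ECG prior model to obtain a target recognition model;
analyzing and recognizing an electrocardiogram to be analyzed with the target recognition model, and determining the recognition result of the electrocardiogram to be analyzed.
Optionally, performing data augmentation on the arrhythmia electrocardiogram dataset according to the concept tree to construct prior data comprises:
locating the feature points in the electrocardiogram data;
segmenting the P-QRS-T characteristic waves and bands according to the located feature points;
determining the control points of each waveform segment to form ECG features;
clustering each segment of ECG features with the K-means method to form prior data;
splicing the prior data of the segments to obtain expanded and augmented prior data.
Optionally, the step of constructing an ECG prior model from the prior data includes a step of constructing an ECG arrhythmia dataset, which comprises:
extracting ECG data samples;
screening and filtering the ECG data samples to obtain a training dataset and a test dataset;
analyzing the data distribution and class similarity of the training dataset and the test dataset, so as to improve the discriminability of the rhythm data in the training dataset and the test dataset.
Optionally, the step of constructing an ECG prior model from the prior data further includes a step of preprocessing the prior data, which comprises:
removing EMG signals from the prior data with a Butterworth low-pass filter;
removing power frequency interference from the prior data with a 50 Hz finite impulse response notch filter using a Kaiser window function;
removing ECG baseline drift from the prior data with an infinite impulse response zero-phase shift digital filter.
Optionally, segmenting the P-QRS-T characteristic waves and bands according to the located feature points comprises:
determining the position of the R peak with an R peak detection algorithm, according to the located feature points;
iteratively searching both sides of the R peak with a 250 ms moving window, taking the minimum in the first window on the left as the position of the Q peak and the minimum in the first window on the right as the S peak;
traversing time windows starting from the QRS complex, according to the located feature points, to detect the P wave and the T wave;
decomposing the ECG signal into multiple periodic heartbeats according to the detected key points;
dividing each heartbeat cycle into 6 segments according to the positioning points: P wave, P-Q interval, QRS complex, S-T interval, T wave and T-P interval, where the P wave is defined as the signal between the P start point and the P end point.
Specifically, traversing time windows starting from the QRS complex to detect the P wave and the T wave comprises:
using a boundary detection method based on the local distance transform, which finds the start and end points of the P wave and the T wave by computing, at each point on the signal, the maximum distance between the start and end points of auxiliary line segments;
establishing a 200 ms time window before the Q start point of the QRS complex and a 400 ms time window after the S end point;
detecting the P peak and the T peak with the same detection algorithm as used for the R peak, and determining the start and end points of the P wave and the T wave with the local distance transform.
Optionally, determining the control points of each waveform segment to form ECG features comprises:
collecting the length distribution of each band separately;
computing the average length of each type of band;
determining the number of control points and the number of indices of each band from the average length;
computing the control points in each band from the average length, the number of control points and the number of indices, and determining the corresponding ECG features.
Optionally, clustering each segment of ECG features with the K-means method to form prior data comprises:
computing the distance between an instance and a cluster centroid with the Euclidean distance;
configuring the number of clusters;
clustering each type of band, represented by its control points, with the K-means method according to the distance and the number of clusters;
when clustering is complete, assigning each band a cluster number according to the cluster to which it belongs.
Optionally, the method further comprises:
classifying the spliced ECG signals with categorical naive Bayes, based on the encoded features.
Another aspect of the embodiments of the present invention provides an analysis and recognition system for electrocardiograms, comprising:
a first module, configured to construct a concept tree from the definitions of electrocardiogram and arrhythmia knowledge, and to establish an arrhythmia electrocardiogram dataset;
a second module, configured to perform data augmentation on the arrhythmia electrocardiogram dataset according to the concept tree to construct prior data;
a third module, configured to construct an ECG prior model from the prior data;
a fourth module, configured to optimize the ECG prior model to obtain a target recognition model;
a fifth module, configured to analyze and recognize an electrocardiogram to be analyzed with the target recognition model, and to determine the recognition result of the electrocardiogram to be analyzed.
Another aspect of the embodiments of the present invention provides a computer-readable storage medium storing a program which, when executed by a processor, implements the method described above.
The embodiments of the present invention also disclose a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device can read the computer instructions from the computer-readable storage medium and execute them, causing the computer device to perform the foregoing method.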
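The embodiments of the present invention construct a concept tree from the definitions of electrocardiogram and arrhythmia knowledge and establish an arrhythmia electrocardiogram dataset; perform data augmentation on the arrhythmia electrocardiogram dataset according to the concept tree to construct prior data; construct an ECG prior model from the prior data; optimize the ECG prior model to obtain a target recognition model; and analyze and recognize an electrocardiogram to be analyzed with the target recognition model to determine its recognition result. The present invention is accurate and efficient.
To explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is the overall technical roadmap provided by an embodiment of the present invention;
FIG. 2 is the processing flowchart of the arrhythmia dataset provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the positioning points of different heartbeats provided by an embodiment of the present invention;
FIG. 4 is the flowchart of heartbeat splicing and encoding provided by an embodiment of the present invention;
FIG. 5 is the flowchart of the overall steps of an embodiment of the present invention.
To make the purpose, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present application and do not limit it.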
Explanation of terms:
The present invention classifies and recognizes the following 18 ECG categories:
Tachycardia and bradycardia:
1. Atrial flutter (AFL)
2. Atrial fibrillation (AF)
3. Ventricular tachycardia (VT)
4. Atrial tachycardia (AT)
5. Sinus tachycardia (ST)
6. Atrioventricular junction tachycardia (JT)
7. Sinus bradycardia (SB)
8. Sinus arrhythmia (SA)
Conduction abnormalities:
9. Sinoatrial conduction block (SA block)
10. Intraventricular conduction block (IV block)
11. Atrioventricular conduction block (AV block)
Abnormalities of origin:
12. Premature atrial contraction (PAC)
13. Premature ventricular contraction (PVC)
14. Atrioventricular junctional premature contraction (PJC)
15. Ventricular escape (VE)
16. Atrioventricular junction escape (JE)
17. Atrioventricular junction escape rhythm (JER)
18. Normal.
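Electrocardiography (electrocardiogram, ECG) is the technique of recording, from the body surface with an electrocardiograph, the pattern of changes in electrical activity produced by the heart in each cardiac cycle. It reflects the series of changes in the initiation, propagation and recovery of cardiac excitation.
Components of the ECG waves and segments:
P wave: The electrical excitation of a normal heart starts at the sinoatrial node. Because the sinoatrial node lies at the junction of the right atrium and the superior vena cava, its excitation first spreads to the right atrium and then reaches the left atrium through the interatrial bundle, forming the P wave on the ECG. The P wave represents atrial excitation: the first half represents right atrial excitation and the second half left atrial excitation. The P wave duration is 0.12 s and its height is 0.25 mV. When the atria are enlarged or conduction between the two atria is abnormal, the P wave may appear tall and peaked or double-peaked.
PR interval: The PR interval represents the time needed for the excitation generated by the sinoatrial node to travel through the atria, the atrioventricular junction and the atrioventricular bundle to the ventricles and begin to excite the ventricular myocardium; it is therefore also called the atrioventricular conduction time. The normal PR interval is 0.12 to 0.20 s. When conduction from the atria to the ventricles is blocked, the PR interval is prolonged or the ventricular wave after the P wave disappears.
QRS complex: Excitation travels down the bundle of His and the left and right bundle branches and excites both ventricles simultaneously, forming the QRS complex. The QRS complex represents ventricular depolarization and lasts less than 0.11 s. With left or right bundle branch block, ventricular enlargement or hypertrophy, the QRS complex becomes wider, deformed and longer in duration.
J point: the junction where the QRS complex ends and the ST segment begins. It marks the completion of depolarization of all ventricular myocytes.
ST segment: the period after ventricular depolarization is complete and before repolarization begins. At this time all parts of the ventricular myocardium are depolarized and there is no potential difference between cells, so the ST segment normally lies on the isoelectric line. When part of the myocardium is ischemic or necrotic, a potential difference persists after ventricular depolarization is complete, which appears on the ECG as a deviation of the ST segment.
T wave: The subsequent T wave represents ventricular repolarization. In leads where the main QRS deflection is upright, the T wave should point in the same direction as the main QRS deflection. T wave changes on the ECG are influenced by many factors. For example, myocardial ischemia may show flat or inverted T waves, and tall T waves are seen in hyperkalemia and in the hyperacute phase of acute myocardial infarction.
U wave: In some leads a U wave can be seen after the T wave; it is currently thought to be related to ventricular repolarization.
QT interval: represents the time from ventricular depolarization to repolarization. The normal QT interval is 0.44 s. Because the QT interval is affected by heart rate, the corrected QT interval (QTc) was introduced. One way to compute it is QTc = QT/√RR. Prolongation of the QT interval is often associated with malignant arrhythmias.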
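As a worked illustration of the correction formula QTc = QT/√RR quoted above (commonly known as Bazett's formula), the values below are hypothetical and are not taken from the patent: a measured QT of 0.40 s at an RR interval of 0.80 s (75 beats per minute).

```latex
\[
\mathrm{QTc} \;=\; \frac{\mathrm{QT}}{\sqrt{\mathrm{RR}}}
\;=\; \frac{0.40}{\sqrt{0.80}}
\;\approx\; \frac{0.40}{0.894}
\;\approx\; 0.447\ \text{s}
\]
```

Because RR is below 1 s here, the corrected value is larger than the measured one; at exactly RR = 1 s the two coincide.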
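ECG leads: The heart is a three-dimensional structure. To reflect the electrical activity of its different surfaces, electrodes are placed at different sites on the body to record and reflect the electrical activity of the heart. In a routine ECG examination, usually only 4 limb lead electrodes and the 6 precordial lead electrodes V1 to V6 are placed, recording a standard 12-lead ECG.
Cardiac conduction system: The cardiac conduction system consists of specialized myocardial cells within the myocardium that can generate and conduct impulses, including the sinoatrial node, the internodal tracts, the atrioventricular node, the atrioventricular bundle, the right bundle branch, the left bundle branch and the Purkinje fibers. The sinoatrial node is the pacemaker of the normal heart rhythm and lies beneath the epicardium between the entrance of the superior vena cava and the right atrial appendage. The internodal tracts are the conduction pathways between the sinoatrial node and the atrioventricular node and comprise the anterior, middle and posterior internodal tracts; the anterior tract gives off a branch to the left atrium called the interatrial bundle. The atrioventricular node lies beneath the endocardium on the right side of the interatrial septum, in the region between the coronary sinus orifice, the fossa ovalis and the upper edge of the septal leaflet of the tricuspid valve, and extends downward as the atrioventricular bundle. The atrioventricular node and the atrioventricular bundle (bundle of His) form the atrioventricular junction, which extends forward and downward to the lower end of the membranous interventricular septum and divides into the left and right bundle branches, lying beneath the endocardium on the left and right sides of the interventricular septum respectively. The left bundle branch divides near its origin on the left side of the septum into anterior and posterior fascicles; the right bundle branch runs down the right side of the septum and does not branch into Purkinje fibers until the apex. Beneath the endocardium the right bundle branch joins the Purkinje fiber network, which finally connects to the ventricular myocardium.
The function of the cardiac conduction system is to generate impulses and conduct them to all parts of the heart, so that the atrial and ventricular myocardium contract in a regular rhythm.
1) The conduction system of the heart includes the sinoatrial node, the atrioventricular node, the atrioventricular bundle, the left and right branches of the atrioventricular bundle, and the many fine branches distributed to the ventricular papillary muscles and the ventricular wall.
2) Except for the sinoatrial node, which lies deep to the epicardium of the right atrium, all other parts are distributed in the subendocardial layer.
3) The specialized myocardial fibers of the conduction system are of three types: pacemaker cells (which form part of the sinoatrial and atrioventricular nodes), transitional cells (which conduct impulses) and Purkinje fibers (which transmit impulses rapidly).
4) The Purkinje fibers at the ends of the bundle branches connect to the ventricular myocardium.
5) The function of the cardiac conduction system is to generate and conduct impulses and maintain the rhythmic beating of the heart.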
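Arrhythmia arises when the excitation of the sinoatrial node is abnormal or originates outside the sinoatrial node, or when conduction of the excitation is slowed, blocked or follows an abnormal pathway; that is, a disorder in the origin and/or conduction of cardiac activity causes an abnormal heart rate and/or rhythm. Arrhythmias are an important group of cardiovascular diseases. They may occur alone or together with other cardiovascular diseases. The prognosis depends on the cause, triggers and evolution of the arrhythmia and on whether it causes severe hemodynamic compromise; an arrhythmia may strike suddenly and cause sudden death, or burden the heart persistently and lead to heart failure.
Arrhythmias include the following five categories:
① Premature beats, also called extrasystoles. These are heartbeats triggered by an impulse formed prematurely at some site in the heart. By site of origin they are atrial, junctional or ventricular. Premature beats may cause no symptoms, and the prognosis is good in the absence of structural heart disease; some patients have palpitations, dizziness or fatigue and can be treated symptomatically, and where structural heart disease is present the underlying disease should be treated.
② Atrial flutter and atrial fibrillation. In atrial flutter the atrial rate is usually 220 to 360 beats per minute and generally cannot be conducted to the ventricles in full; physiological atrioventricular block produces 2:1 or 3:1 conduction, with occasional 1:1 atrioventricular conduction. Atrial fibrillation is a very rapid arrhythmia caused by multiple micro-reentrant foci in the atria, with a rate of 350 to 600 beats per minute and an irregular ventricular rhythm of 120 to 160 beats per minute. Atrial flutter and atrial fibrillation are common in rheumatic heart disease, hyperthyroidism, coronary heart disease, cardiomyopathy and hypertensive heart disease. In many patients with atrial fibrillation the cause is unknown.
③ Paroxysmal supraventricular tachycardia. This is a paroxysmal, rapid and regular ectopic rhythm, with a heart rate usually of 160 to 220 beats per minute, although it can be as slow as 130 or as fast as 300 beats per minute. By mechanism it is divided into 3 types: atrial, atrioventricular nodal re-entrant and atrioventricular accessory pathway re-entrant. It is common in people without structural heart disease and of unknown cause, and is also seen in rheumatic heart disease, cardiomyopathy and coronary heart disease. Clinically it starts suddenly, lasts seconds, minutes or hours, or even days, and stops suddenly; severe episodes can cause insufficient blood supply to the heart, brain and other organs, leading to a fall in blood pressure, dizziness, nausea, angina or syncope.
④ Ventricular tachycardia and ventricular fibrillation. Three or more consecutive premature ventricular contractions constitute ventricular tachycardia, which is mostly seen in patients with structural heart disease. Sustained ventricular tachycardia lasts more than 30 seconds or causes severe hemodynamic compromise within 30 seconds; non-sustained ventricular tachycardia terminates spontaneously within 30 seconds. Torsades de pointes is a special type of ventricular tachycardia mostly seen in long QT syndrome, which is either congenital or acquired. Untreated ventricular tachycardia can degenerate into ventricular fibrillation, the most serious arrhythmia, which requires immediate electrical defibrillation to restore the rhythm.
⑤ Bradycardia. An adult heart rate below 60 beats per minute is called bradycardia; it is caused by sick sinus syndrome or atrioventricular block.
The mechanisms of arrhythmia include the following two:
1) Abnormal impulse formation:
Myocardial cells in the sinoatrial node, the internodal tracts, the area around the coronary sinus orifice, the distal atrioventricular node and the His-Purkinje system have automaticity. Changes in autonomic nervous system excitability, or intrinsic disease of these tissues, can cause inappropriate impulse discharge. In addition, myocardial cells that normally lack automaticity, such as atrial and ventricular cells, can develop abnormal automaticity under pathological conditions; myocardial ischemia, drugs, electrolyte disturbances and increased catecholamines can all lead to abnormal automaticity.
2) Abnormal impulse conduction:
Re-entry is the most common mechanism of all tachyarrhythmias.
By mechanism, arrhythmias are classified into the following two major groups:
1. Abnormal impulse formation:
1) Sinus arrhythmias
(1) sinus tachycardia (2) sinus bradycardia (3) sinus arrhythmia (4) sinus arrest
2) Ectopic rhythms
① Passive ectopic rhythms: (1) escape beats (atrial, junctional, ventricular); (2) paroxysmal tachycardia (atrial, junctional, ventricular)
② Active ectopic rhythms: (1) premature contractions (atrial, junctional, ventricular); (2) paroxysmal tachycardia (atrial, junctional, atrioventricular re-entrant, ventricular); (3) atrial flutter and atrial fibrillation; (4) ventricular flutter and ventricular fibrillation
2. Abnormal impulse conduction:
1) Physiological: interference and atrioventricular dissociation
2) Pathological: (1) sinoatrial block (2) intra-atrial block (3) atrioventricular block (4) bundle branch or fascicular block (left or right bundle branch block and left fascicular block) or intraventricular block
3) Abnormal atrioventricular conduction pathway: pre-excitation syndrome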
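In current ECG analysis, manual interpretation is limited by the skill and number of available specialists, and the process is tedious, time-consuming and inefficient. To address these problems of the prior art, one aspect of the embodiments of the present invention provides an analysis and recognition method for electrocardiograms. As shown in FIG. 5, the method comprises the following steps:
constructing a concept tree from the definitions of electrocardiogram and arrhythmia knowledge, and establishing an arrhythmia electrocardiogram dataset;
performing data augmentation on the arrhythmia electrocardiogram dataset according to the concept tree to construct prior data;
constructing an ECG prior model from the prior data;
optimizing the ECG prior model to obtain a target recognition model;
analyzing and recognizing an electrocardiogram to be analyzed with the target recognition model, and determining the recognition result of the electrocardiogram to be analyzed.
Optionally, performing data augmentation on the arrhythmia electrocardiogram dataset according to the concept tree to construct prior data comprises:
locating the feature points in the electrocardiogram data;
segmenting the P-QRS-T characteristic waves and bands according to the located feature points;
determining the control points of each waveform segment to form ECG features;
clustering each segment of ECG features with the K-means method to form prior data;
splicing the prior data of the segments to obtain expanded and augmented prior data.
Optionally, the step of constructing an ECG prior model from the prior data includes a step of constructing an ECG arrhythmia dataset, which comprises:
extracting ECG data samples;
screening and filtering the ECG data samples to obtain a training dataset and a test dataset;
analyzing the data distribution and class similarity of the training dataset and the test dataset, so as to improve the discriminability of the rhythm data in the training dataset and the test dataset.
Optionally, the step of constructing an ECG prior model from the prior data further includes a step of preprocessing the prior data, which comprises:
removing EMG signals from the prior data with a Butterworth low-pass filter;
removing power frequency interference from the prior data with a 50 Hz finite impulse response notch filter using a Kaiser window function;
removing ECG baseline drift from the prior data with an infinite impulse response zero-phase shift digital filter.
Optionally, segmenting the P-QRS-T characteristic waves and bands according to the located feature points comprises:
determining the position of the R peak with an R peak detection algorithm, according to the located feature points;
iteratively searching both sides of the R peak with a 250 ms moving window, taking the minimum in the first window on the left as the position of the Q peak and the minimum in the first window on the right as the S peak;
traversing time windows starting from the QRS complex, according to the located feature points, to detect the P wave and the T wave;
decomposing the ECG signal into multiple periodic heartbeats according to the detected key points;
dividing each heartbeat cycle into 6 segments according to the positioning points: P wave, P-Q interval, QRS complex, S-T interval, T wave and T-P interval, where the P wave is defined as the signal between the P start point and the P end point.
Specifically, traversing time windows starting from the QRS complex to detect the P wave and the T wave comprises:
using a boundary detection method based on the local distance transform, which finds the start and end points of the P wave and the T wave by computing, at each point on the signal, the maximum distance between the start and end points of auxiliary line segments;
establishing a 200 ms time window before the Q start point of the QRS complex and a 400 ms time window after the S end point;
detecting the P peak and the T peak with the same detection algorithm as used for the R peak, and determining the start and end points of the P wave and the T wave with the local distance transform.
Optionally, determining the control points of each waveform segment to form ECG features comprises:
collecting the length distribution of each band separately;
computing the average length of each type of band;
determining the number of control points and the number of indices of each band from the average length;
computing the control points in each band from the average length, the number of control points and the number of indices, and determining the corresponding ECG features.
Optionally, clustering each segment of ECG features with the K-means method to form prior data comprises:
computing the distance between an instance and a cluster centroid with the Euclidean distance;
configuring the number of clusters;
clustering each type of band, represented by its control points, with the K-means method according to the distance and the number of clusters;
when clustering is complete, assigning each band a cluster number according to the cluster to which it belongs.
Optionally, the method further comprises:
classifying the spliced ECG signals with categorical naive Bayes, based on the encoded features.
Another aspect of the embodiments of the present invention provides an analysis and recognition system for electrocardiograms, comprising:
a first module, configured to construct a concept tree from the definitions of electrocardiogram and arrhythmia knowledge, and to establish an arrhythmia electrocardiogram dataset;
a second module, configured to perform data augmentation on the arrhythmia electrocardiogram dataset according to the concept tree to construct prior data;
a third module, configured to construct an ECG prior model from the prior data;
a fourth module, configured to optimize the ECG prior model to obtain a target recognition model;
a fifth module, configured to analyze and recognize an electrocardiogram to be analyzed with the target recognition model, and to determine the recognition result of the electrocardiogram to be analyzed.
Another aspect of the embodiments of the present invention provides a computer-readable storage medium storing a program which, when executed by a processor, implements the method described above.
The embodiments of the present invention also disclose a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device can read the computer instructions from the computer-readable storage medium and execute them, causing the computer device to perform the foregoing method.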
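The specific implementation of the present invention is described in detail below with reference to the accompanying drawings:
As shown in FIG. 1, the overall implementation of the present invention comprises 5 major steps:
1. ECG and arrhythmia concept tree: organize and redefine the knowledge of electrocardiograms and arrhythmias to form a concept tree.
2. Data: establish the arrhythmia ECG dataset through a novel preprocessing method.
3. Segmentation and representation:
a) locate the feature points in the ECG data with a purpose-built method, and segment the P-QRS-T characteristic waves and bands according to the feature points;
b) determine the control points of each waveform segment to form ECG features;
c) cluster the data segments with the K-means method to form prior data;
d) splice the data segments to expand and augment the prior data.
4. Prior model: build an ECG prior model based on Bayesian theory.
5. Application and deduction of the prior model: taking the ECG prior model as the basis, carry out deduction and generalization by analogy, achieving few-shot learning and the decomposition and generation of ECGs.
The implementation of each step is described in detail below:
1. ECG and arrhythmia concept tree:
The embodiment of the present invention systematically organizes the arrhythmia categories into an arrhythmia concept tree. It can be understood that the concept tree may classify arrhythmias by the site where they occur; the present invention can diagnose and classify 18 ECG categories.
2. GDPH ECG arrhythmia dataset
The data annotation, processing and representation process of the embodiment of the present invention is shown in FIG. 2. The upper half of FIG. 2 is the process of fine-grained classification and annotation of the samples, and the lower half is the process of processing and representing the ECG data. The ECG annotations and the ECG data together form the dataset used by the later ECG model.
2.1 Data sample extraction
The present invention obtains ECG data from the relevant hospital database and labels each participant's ECG data as normal or as one of 17 arrhythmias. The 17 arrhythmias are: premature ventricular contraction (PVC), intraventricular block (IV block), ventricular tachycardia (VT), ventricular escape (VE), atrial flutter (AFL), atrial tachycardia (AT), atrial fibrillation (AF), premature atrial contraction (PAC), junctional premature contraction (PJC), junctional escape (JE), junctional tachycardia (JT), junctional escape rhythm (JER), atrioventricular block (AV block), sinoatrial block (SA block), sinus tachycardia (ST), sinus bradycardia (SB) and sinus arrhythmia (SA). By site of occurrence, the 17 arrhythmias fall into 4 super-categories: sinus (SA block, ST, SB and SA), atrial (AFL, AT, AF and PAC), atrioventricular junction (PJC, JE, JT, JER and AV block) and ventricular (PVC, IV block, VT and VE).
2.2 Inclusion and exclusion criteria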
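The embodiment of the present invention excludes some ECG cases for the following reasons: (1) The ECG case is severely distorted by missing signal or excessive noise. Because the ECG records the electrical activity of the heart through electrodes placed on the skin, large movements and noisy surroundings may add noise to the signal that cannot be removed. This distortion is also more common in children's electrocardiograms. (2) The label of the ECG case is unavailable or uncertain. (3) For participants with multiple ECG tests, to avoid introducing bias among participants, our study used only the last ECG test and excluded the others. In the end, 48063 participants were excluded. The entire dataset was randomly divided at the label level into a training dataset and a test dataset at a ratio of 4:1.
In addition, the embodiment of the present invention analyzes the data distribution and class similarity of the constructed training and test datasets. As a dataset from the real world, the GDPH ECG arrhythmia dataset is extremely imbalanced. ST has the largest sample size (N=20273) and VE the smallest (N=10). The standard deviation of the sample size over all arrhythmia types is 5249.63. The mean and median of the sample size are 2670 and 106, respectively. This imbalance in sample size may prevent the model from learning all arrhythmia categories equally.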
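A random 4:1 split "at the label level" can be read as a stratified split, in which every class is divided separately so that even a 10-sample class such as VE appears on both sides. The sketch below is a minimal illustration under that reading; the function name `stratified_split`, the seed and the toy label list are assumptions and do not come from the patent.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_ratio=0.8, seed=0):
    """Split sample indices so that each label keeps roughly a 4:1 train/test ratio."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)
        cut = round(len(idxs) * train_ratio)
        # Keep at least one sample on each side when the class has two or more samples.
        cut = max(1, min(cut, len(idxs) - 1)) if len(idxs) > 1 else len(idxs)
        train.extend(idxs[:cut])
        test.extend(idxs[cut:])
    return train, test

# Toy example: an imbalanced label list (hypothetical sizes).
labels = ["ST"] * 20 + ["VE"] * 10
train_idx, test_idx = stratified_split(labels)
```

Splitting per class rather than over the pooled samples is what keeps the rare classes represented in the test set.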
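The embodiment of the present invention constructs a category-level similarity matrix in which each element is the similarity between two arrhythmia categories. The category-level similarity is the average similarity over all pairs of ECG signals drawn one from each of the two arrhythmia categories. Dynamic time warping (DTW) is used to measure the similarity between two ECG signals. The similarity matrix is normalized to the range [0,1] and sorted from minimum to maximum. Arrhythmias are unevenly distributed in the representation space. SB, JER, SA, JE, AV block, PJC, PVC, AT and AF are relatively similar to one another, and such similar arrhythmia types are more likely to fool the model. The other arrhythmia types are more dispersed and easier to distinguish.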
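The category-level comparison described above can be sketched as follows. This is a minimal, unoptimized illustration: it uses the textbook DTW recurrence with an absolute-difference local cost, which yields a distance (smaller means more similar); how the patent converts DTW output into a similarity is not stated, so the min-max normalization here is an assumption, as are all function names.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def category_distance(signals_a, signals_b):
    """Average DTW distance over all cross-category pairs of signals."""
    total = sum(dtw_distance(x, y) for x in signals_a for y in signals_b)
    return total / (len(signals_a) * len(signals_b))

def normalise(matrix):
    """Min-max scale all entries of a matrix into [0, 1]."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0
    return [[(v - lo) / span for v in row] for row in matrix]
```

The full pairwise computation is quadratic in both the number of signals and their length, so on a dataset of this size a subsample of pairs per category would likely be needed in practice.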
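2.3 ECG preprocessing
The ECG signal is a weak physiological signal that is easily disturbed during acquisition, so it needs to be preprocessed before analysis. The three most common interferences in ECG signals are electromyography (EMG) interference, power frequency interference and baseline drift. The preprocessing of the embodiment of the present invention targets these three interferences.
The EMG signal is also a physiological signal and is the main noise in the ECG signal. The frequency of the EMG signal depends on the muscle type and is generally in the range of 30-300 Hz, while the frequency of the ECG signal is mainly in the range of 5-20 Hz. EMG signals can therefore overlap with ECG signals. In our study, a Butterworth low-pass filter is used to remove the EMG signal. The Butterworth low-pass filter has the flattest passband frequency response and rolls off gradually towards zero in the stopband. In addition, the amplitude decreases monotonically with angular frequency, and the higher the filter order, the faster the amplitude decays in the stopband.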
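The properties listed above follow directly from the standard magnitude response of an analog Butterworth low-pass filter of order $n$ with cutoff angular frequency $\omega_c$. The filter order and cutoff used in the embodiment are not stated, so the symbols below are generic.

```latex
\[
\lvert H(j\omega) \rvert^{2} \;=\; \frac{1}{1 + \left(\dfrac{\omega}{\omega_c}\right)^{2n}}
\]
% Passband: the first 2n-1 derivatives of |H|^2 vanish at omega = 0 (maximally flat).
% Cutoff:   |H(j omega_c)| = 1/sqrt(2), i.e. -3 dB, for every order n.
% Stopband: for omega >> omega_c, |H| ~ (omega_c/omega)^n, a roll-off of about 20n dB per decade.
```

The expression is strictly decreasing in $\omega$, which is the monotonic behaviour mentioned in the text, and a larger $n$ steepens the stopband decay.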
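Wherever there is a power supply network, power frequency interference is ubiquitous, and the 50 Hz interference signal is the most common one. In this study, a 50 Hz finite impulse response (FIR) notch filter with a Kaiser window function is used to eliminate the power frequency signal. The FIR filter has the linear phase characteristic required for ECG signal processing and can achieve the best filtering performance with minimal waveform distortion. Considering the overall variation among samples, the Kaiser window is a near-optimal window function whose parameters can be adjusted to tune the filter for different requirements.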
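For reference, the standard definition of a length-$N$ Kaiser window is given below; the shape parameter $\beta$ is the adjustable parameter that trades main-lobe width against side-lobe level. The values of $N$ and $\beta$ used in the embodiment are not stated in the text.

```latex
\[
w[n] \;=\; \frac{I_0\!\left(\beta \sqrt{1 - \left(\dfrac{2n}{N-1} - 1\right)^{2}}\,\right)}{I_0(\beta)},
\qquad 0 \le n \le N-1
\]
% I_0 is the zeroth-order modified Bessel function of the first kind.
% beta = 0 gives a rectangular window; larger beta lowers the side lobes and widens the main lobe.
```

An FIR notch filter is then obtained by multiplying the ideal band-stop impulse response around 50 Hz by $w[n]$; the symmetric window preserves the linear phase mentioned above.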
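Finally, an infinite impulse response zero-phase shift digital filter is used to remove ECG baseline drift. As a routine preprocessing step for ECG analysis, it prevents the introduction of artifacts that may distort the true oscillation phase. After preprocessing, the main noise is removed while the key information of the ECG signal is retained.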
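Zero phase shift with an IIR filter is usually obtained by running the filter forward and then backward over the signal, so that the delays of the two passes cancel. The sketch below illustrates the idea with a deliberately simple one-pole low-pass that estimates the slow baseline, which is then subtracted. The filter type, the smoothing factor `alpha` and the function names are assumptions for illustration; the patent does not specify the filter design.

```python
def one_pole_lowpass(x, alpha):
    """First-order IIR low-pass (exponential smoothing), causal pass."""
    y, state = [], x[0]
    for v in x:
        state += alpha * (v - state)
        y.append(state)
    return y

def remove_baseline(x, alpha=0.01):
    """Estimate the slow baseline with a forward-backward pass and subtract it."""
    forward = one_pole_lowpass(x, alpha)
    # The second pass runs over the reversed signal, cancelling the phase delay.
    backward = one_pole_lowpass(forward[::-1], alpha)[::-1]
    return [v - b for v, b in zip(x, backward)]
```

Because the baseline estimate is not delayed relative to the input, subtracting it does not shift the position of the P, QRS and T waves, which matters for the fiducial-point localization that follows.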
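2.4 P-QRS-T localization and segmentation
The embodiment of the present invention locates feature points such as P-QRS-T in the ECG data and segments the bands according to the feature points. The main steps are as follows:
2.4.1 P-QRS-T localization in the ECG data
This embodiment uses an adaptive and efficient R peak detection algorithm to determine the position of the R peak. The Q peak and S peak are then found by searching on both sides of the R peak. Because multiple peaks rarely appear within one QRS complex, a 250 ms moving window is used to search both sides iteratively. The minimum in the first window on the left is the position of the Q peak, and the minimum in the first window on the right is the S peak.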
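The Q and S search around a detected R peak can be sketched as below. Only the "first window" on each side is implemented; the sampling rate `fs` is a hypothetical parameter (the patent does not state it), and the R peak index is assumed to come from a separate detector.

```python
def find_q_s(signal, r_index, fs=500, window_ms=250):
    """Return (q_index, s_index): minima in the first window left and right of R."""
    w = max(1, int(fs * window_ms / 1000))  # window length in samples
    left_start = max(0, r_index - w)
    right_end = min(len(signal), r_index + w + 1)
    left = signal[left_start:r_index]
    right = signal[r_index + 1:right_end]
    # Position of the minimum inside each window, mapped back to signal indices.
    q = left_start + min(range(len(left)), key=left.__getitem__) if left else None
    s = r_index + 1 + min(range(len(right)), key=right.__getitem__) if right else None
    return q, s

# Toy beat sampled at a hypothetical 8 Hz so that 250 ms spans 2 samples.
beat = [0, 0, -1, 5, -2, 0, 0]
q_idx, s_idx = find_q_s(beat, r_index=3, fs=8)
```

Windows are clipped at the signal boundaries, so an R peak near the start or end of a recording yields `None` for the side that has no samples.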
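Detecting the P wave and the T wave requires traversing time windows starting from the QRS complex, so an algorithm for detecting the start and end boundaries of a wave is needed. To this end, this embodiment proposes an improved boundary detection algorithm based on the local distance transform. The local distance transform finds the start and end points of a wave by computing, at each point on the signal, the maximum distance between the start and end points of auxiliary line segments. From a morphological point of view, this point is the point of greatest curvature, which agrees with the physician's subjective judgment.
A 200 ms time window is established before the Q start point and a 400 ms time window after the S end point. The P peak and T peak are detected with the same detection algorithm as the R peak. The local distance transform is again used to determine the start and end points of the P wave and the T wave.
FIG. 3 shows an example in which each heartbeat is divided into 6 bands by 11 positioning points. A total of 11 points are identified for each heartbeat: P start point, P point, P end point, Q start point, Q point, R point, S point, S end point, T start point, T point and T end point. The division of the signal into heartbeats and the division of the waves within each heartbeat are both based on the positioning of these feature points.
2.4.2 Heartbeat cycle segmentation and band segmentation within a heartbeat cycle
Based on the key points detected in the previous step, this embodiment decomposes the ECG signal into multiple periodic heartbeats. The entire dataset of this embodiment yields a total of 459818 heartbeats. The number of ECG samples and the number of heartbeat cycles in each category are shown in Table 1, which gives the number of heartbeats obtained by decomposing the ECG signals of each category.
Table 1
[Corrected under Rule 91, 15.02.2023]
Each heartbeat cycle is further divided, according to the positioning points, into 6 segments: P wave, P-Q interval, QRS complex, S-T interval, T wave and T-P interval. The P wave is defined as the signal between the P start point and the P end point, and the other segments are defined analogously. Each heartbeat corresponds to 6 segments, so the number of instances of each band type is also 459818, equal to the number of heartbeat samples.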
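The band segmentation can be sketched as slicing the signal between consecutive boundary points. The dictionary keys and the choice to close the T-P interval at the next beat's P start point are assumptions made for illustration; the patent only states that the remaining segments are defined analogously to the P wave.

```python
# (band name, start key, end key) for the five bands bounded inside one beat.
BANDS = [
    ("P", "P_on", "P_off"),
    ("PQ", "P_off", "Q_on"),
    ("QRS", "Q_on", "S_off"),
    ("ST", "S_off", "T_on"),
    ("T", "T_on", "T_off"),
]

def split_beat(signal, points, next_p_on):
    """Cut one heartbeat into its 6 bands from boundary sample indices."""
    segments = {name: signal[points[a]:points[b]] for name, a, b in BANDS}
    # Assumed convention: the T-P interval runs up to the next beat's P start point.
    segments["TP"] = signal[points["T_off"]:next_p_on]
    return segments

# Hypothetical boundary indices for one beat of a toy signal.
signal = list(range(400))
points = {"P_on": 10, "P_off": 70, "Q_on": 100, "S_off": 200, "T_on": 230, "T_off": 300}
segs = split_beat(signal, points, next_p_on=390)
```

The six slices are contiguous and non-overlapping, so together they cover the whole cycle from one P start point to the next.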
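2.5 Determining the control points of each waveform segment to form ECG features:
This embodiment extracts control points from each band and uses the extracted control points and their corresponding indices as the representation of the band.
First, we collected the length distribution of each of the 6 bands;
next, the average length of each type of band was computed.
Let $l_m$ ($m = 1, 2, \ldots, M$) denote the length of the m-th instance, where M is the total number of instances. The average length of the band is obtained by:
$$\bar{l} = \frac{1}{M}\sum_{m=1}^{M} l_m \qquad (1)$$
One control point is taken for every five points of the average length. The number of control points N extracted for this type of band is therefore:
$$N = \frac{\bar{l}}{5} \qquad (2)$$
For each instance, the control points are extracted at equal intervals. The n-th control point of the m-th instance is therefore obtained by:
$$c_m^{(n)} = \mathrm{sig}_m\big(\mathrm{idx}_m^{(n)}\big) \qquad (3)$$
where $c_m^{(n)}$ is the value of the n-th control point of the m-th instance, $\mathrm{sig}_m(*)$ is the signal value of the m-th instance at index *, and $\mathrm{idx}_m^{(n)}$ is the corresponding index value of the n-th control point.
The average length, number of control points and number of indices of each band type are shown in Table 2 below: 12 control points are extracted for each P wave, 6 for the P-Q interval, 20 for the QRS wave, 6 for the S-T interval, 14 for the T wave and 18 for the T-P interval. 82 control points are extracted for each heartbeat cycle.
Table 2
The dataset contains 459818 heartbeat cycles, each with 6 bands, giving 459818*6 instances in total. The index and its ECG signal value are computed for each instance, finally yielding 459818*82 control point pairs (index, ECG signal value).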
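Equal-interval control point extraction can be sketched as below. Each band instance, whatever its length, is reduced to a fixed number of (index, value) pairs so that instances of the same band type become vectors of equal size. The exact index rule is not given in the text, so the `int(k * step)` spacing is an assumption, as is the function name.

```python
def control_points(segment, n_points):
    """Return n_points (index, value) pairs taken at equal spacing along the segment."""
    step = len(segment) / n_points
    # Clamp to the last valid index in case of rounding at the end of the segment.
    idx = [min(len(segment) - 1, int(k * step)) for k in range(n_points)]
    return [(i, segment[i]) for i in idx]

# A hypothetical P wave of 60 samples reduced to the 12 control points quoted in the text.
p_wave = [0.01 * i for i in range(60)]
pts = control_points(p_wave, 12)
```

With a 60-sample instance and 12 control points the spacing is exactly five samples, matching the "one control point for every five points" rule; shorter or longer instances of the same band are sampled with a proportionally smaller or larger step.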
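2.6 Band clustering
Each type of band, represented by its control points, is clustered with the K-means method to form prior data.
The distance between an instance and a cluster centroid is measured with the Euclidean distance. The algorithm stops when the centroids no longer change significantly between iterations. In total, clustering is carried out 54 (= 6 bands × 9 leads) times, once for each of the 6 band types on each of the 9 leads. In this embodiment the number of clusters is one of the hyperparameters, with 18 options: 3, 5, 8, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80.
In the embodiment of the present invention the number of clusters is set to 25. Once the number of clusters is fixed and clustering is complete, each band is assigned a cluster number according to the cluster to which it belongs.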
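A minimal K-means sketch with the two ingredients named above, Euclidean distance and a stop criterion on centroid movement, is shown below. Random initialization from the data, the tolerance `tol` and the iteration cap are assumptions; a production run over 459818 instances per band would normally use an optimized library implementation instead.

```python
import math
import random

def kmeans(points, k, tol=1e-6, max_iter=100, seed=0):
    """Cluster points (lists of floats); return (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        shift = 0.0
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                continue  # leave the centroid of an empty cluster where it is
            new = [sum(col) / len(members) for col in zip(*members)]
            shift = max(shift, math.dist(new, centroids[c]))
            centroids[c] = new
        if shift < tol:  # centroids no longer change significantly
            break
    return labels, centroids

toy = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
toy_labels, toy_centroids = kmeans(toy, k=2)
```

The returned label of each instance plays the role of the cluster number that is later used to encode the band.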
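2.7 Splicing and augmentation
This subsection splices the data segments to expand the prior data.
2.7.1 Heartbeat splicing and encoding
In this embodiment, multiple original heartbeats are spliced together to form a spliced ECG signal. A spliced ECG signal can be regarded as an arrangement of the original heartbeats in which repetition is allowed. Spliced ECG signals can simulate ECG signals that do not exist in the training dataset, further improving the comprehensiveness of the training dataset.
Let $h_{ij}$ ($i = 1, \ldots, C$; $j = 1, \ldots, M_i$) denote the number of original heartbeats in the j-th ECG signal of class i, where C is the total number of ECG classes and $M_i$ is the number of ECG signals in class i. No (written $N_o$ in the equations) denotes the number of original heartbeats to be spliced. For the j-th ECG signal of class i, the number of spliced ECG signals that can be obtained is:
$$N_{ij} = h_{ij}^{\,N_o} \qquad (4)$$
The total number of spliced ECG signals N is therefore:
$$N = \sum_{i=1}^{C}\sum_{j=1}^{M_i} N_{ij} \qquad (5)$$
Substituting formula (4) into formula (5) gives:
$$N = \sum_{i=1}^{C}\sum_{j=1}^{M_i} h_{ij}^{\,N_o} \qquad (6)$$
In this embodiment No is another controlled hyperparameter, with 4 options: 2, 3, 4, 5. For these four options, the N values in the training dataset are 2869966, 3021293, 11486438 and 25508046, respectively. Analyzing such a large number of ECG signals exceeds the computing resources of this embodiment, so in the training dataset the number of spliced ECG signals in each class is capped at 1500000. In the test dataset, 20 spliced ECG signals are randomly selected for each original ECG signal, and the aggregated prediction over these 20 spliced ECG signals is used as the final prediction for the original ECG signal.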
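Because repetition is allowed, splicing No heartbeats drawn from one recording amounts to enumerating, or sampling from, the Cartesian power of its heartbeat list. The sketch below shows the count, the full enumeration for small cases, and the random draw of 20 splices used at test time. The function names and the seed are illustrative assumptions.

```python
import itertools
import random

def count_splices(n_beats, n_o):
    """Number of ordered selections of n_o beats, with repetition, from n_beats."""
    return n_beats ** n_o

def all_splices(beats, n_o):
    """Enumerate every spliced sequence (feasible only for small inputs)."""
    return list(itertools.product(beats, repeat=n_o))

def sample_splices(beats, n_o, k=20, seed=0):
    """Draw k random spliced sequences without enumerating them all."""
    rng = random.Random(seed)
    return [tuple(rng.choice(beats) for _ in range(n_o)) for _ in range(k)]

beats = ["b1", "b2", "b3"]
spliced = all_splices(beats, 3)
test_time = sample_splices(beats, 3)
```

Sampling rather than enumerating is what keeps the approach tractable once a per-class cap such as the 1500000 limit above is in force.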
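Based on the cluster numbers of the bands, a spliced ECG signal can be encoded as a vector. Each heartbeat has 6 segments on each lead. In the unsupervised clustering step, each band is assigned a cluster number. One heartbeat can therefore be encoded as a vector of size 54 (= 6 segments × 9 leads), and the vector of a spliced ECG signal has size 54 × No. FIG. 4 takes a splice length of No = 3 as an example and illustrates how 3 original heartbeats are spliced and encoded. With 3 original heartbeats and a splice length of 3, $3^3 = 27$ new heartbeat sequences can be generated; on a single lead each heartbeat has 6 bands and thus 6 cluster numbers, so each new heartbeat sequence has a vector length of No*6 = 18.
Each ECG contains 9 leads, so an ECG containing 3 heartbeats is finally encoded as 162 (= 6 bands × 9 leads × 3 heartbeats) features.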
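The encoding arithmetic above can be checked with a short sketch. A heartbeat is represented here as a 9 × 6 nested list of cluster numbers (lead by band); this layout, the ordering of the flattened vector and the function names are assumptions, since the text fixes only the resulting sizes.

```python
def encode_beat(cluster_ids):
    """Flatten cluster_ids[lead][band] into one list of 9 x 6 = 54 cluster numbers."""
    return [cid for lead in cluster_ids for cid in lead]

def encode_spliced(beats):
    """Concatenate the encodings of the spliced beats in order."""
    return [cid for beat in beats for cid in encode_beat(beat)]

# Hypothetical cluster numbers for one beat: 9 leads, 6 bands each.
beat = [[(lead * 6 + band) % 25 for band in range(6)] for lead in range(9)]
vector = encode_spliced([beat, beat, beat])  # splice length No = 3
```

Every entry of the vector is a categorical value (a cluster number between 0 and 24 for 25 clusters), which is why a categorical classifier is a natural fit for the next step.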
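3. Bayesian classification model
Based on the encoded features, categorical naive Bayes (CNB) is used to classify the spliced ECG signals. Let $x_i$ ($i = 1, 2, \ldots, n$) be the features of an ECG, n the number of features, and y the class. By the definition of conditional probability:
$$P(x_1, x_2, \ldots, x_n)\,P(y \mid x_1, x_2, \ldots, x_n) = P(y)\,P(x_1, x_2, \ldots, x_n \mid y) \qquad (7)$$
which can be rearranged as:
$$P(y \mid x_1, x_2, \ldots, x_n) = \frac{P(y)\,P(x_1, x_2, \ldots, x_n \mid y)}{P(x_1, x_2, \ldots, x_n)} \qquad (8)$$
Under the naive Bayes assumption that the features are independent:
$$P(x_i \mid y, x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n) = P(x_i \mid y) \qquad (9)$$
Formula 8 can then be derived as:
$$P(y \mid x_1, x_2, \ldots, x_n) = \frac{P(y)\prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, x_2, \ldots, x_n)} \qquad (10)$$
Because $P(x_1, x_2, \ldots, x_n)$ in formula 10 is constant for a given input, the classification rule can be written as:
$$\hat{y} = \arg\max_{y}\; P(y)\prod_{i=1}^{n} P(x_i \mid y) \qquad (11)$$
where $\hat{y}$ is the class label of the ECG signal, and $P(x_i \mid y)$ is computed as:
$$P(x_i = a \mid y = c) = \frac{N_{iac}}{N_c} \qquad (12)$$
where $N_{iac}$ is the number of spliced ECG signals of class c whose i-th feature takes the value a, and $N_c$ is the number of spliced ECG signals in class c. The predicted label of the original ECG signal is the average of the predicted labels of its spliced ECG signals.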
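The classification rule can be sketched in a few lines. The counts correspond to $N_{iac}$ and $N_c$ in the text; the additive smoothing term `alpha` is an assumption added here so that a cluster number never seen with a class does not force the product to zero, and log-probabilities are summed instead of multiplying raw probabilities to avoid underflow over 162 features.

```python
import math
from collections import Counter, defaultdict

def fit_cnb(X, y, alpha=1.0):
    """Count N_c and N_iac from rows of categorical features X and labels y."""
    class_count = Counter(y)                 # N_c
    value_count = defaultdict(int)           # (i, a, c) -> N_iac
    values = [set() for _ in range(len(X[0]))]
    for row, c in zip(X, y):
        for i, a in enumerate(row):
            value_count[(i, a, c)] += 1
            values[i].add(a)
    return class_count, value_count, values, alpha

def predict_cnb(model, row):
    """Return the class maximizing log P(y) + sum_i log P(x_i | y)."""
    class_count, value_count, values, alpha = model
    total = sum(class_count.values())
    best, best_score = None, -math.inf
    for c, n_c in class_count.items():
        score = math.log(n_c / total)
        for i, a in enumerate(row):
            n_iac = value_count.get((i, a, c), 0)
            score += math.log((n_iac + alpha) / (n_c + alpha * len(values[i])))
        if score > best_score:
            best, best_score = c, score
    return best

# Toy data: two categorical features, two classes (hypothetical).
X = [[1, 2], [1, 3], [4, 2], [4, 3]]
y = ["A", "A", "B", "B"]
model = fit_cnb(X, y)
```

The learned table of conditional probabilities is directly inspectable: for each class one can list which cluster numbers of which band carry the highest probability, which is the interpretability argument made later in the description.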
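4. Control models:
In this embodiment, the control models are built in the feature engineering + classifier pattern.
First, 114 features were extracted from the ECG signal. Of these, 10 features (such as the mean P wave peak value) were defined on each of the 9 individual lead signals, giving 90 (10×9) scalar features. The remaining 24 features relate to the recording time at which the segment key points appear (that is, their position in the time dimension).
Once the 114 features were determined, 3 classical classifiers were applied: K-nearest neighbor (KNN), random forest (RF) and extreme gradient boosting (XGBoost). The 3 classifiers used their default configurations.
In addition, two deep-learning-based models were used as control models. The first is a 1D CNN, which achieves state-of-the-art performance in multi-class classification of arrhythmia subtypes. The second is an LSTM, which was proposed for ECG signal classification.
5. Quantification and statistical analysis
In this embodiment, recall (sensitivity) is used to measure the diagnostic performance of a model for a given arrhythmia. The overall diagnostic performance of a model is measured by macro-recall, the average of the recalls of all individual classes. In macro-recall, each class has the same weight when the average is computed, rather than a weight set by the sample size of the class. Macro-recall is therefore fair at the class level and more sensitive to performance on minority classes. The 95% confidence interval of the recall is computed with a non-parametric bootstrap of 1000 iterations. The linear correlation between the recall of the 18 categories and the sample size or class similarity is measured by the correlation coefficient (CC). CC values are interpreted as: very weak (0.00-0.19), weak (0.20-0.39), moderate (0.40-0.59), strong (0.60-0.79) and very strong (0.80-1.00).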
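Macro-recall and its bootstrap confidence interval can be sketched as follows. Resampling whole (true label, prediction) pairs with replacement and taking the 2.5th and 97.5th percentiles is one common reading of "non-parametric bootstrap"; the exact resampling scheme is not described in the text, so this, the seed and the function names are assumptions.

```python
import random

def macro_recall(y_true, y_pred):
    """Unweighted mean of the per-class recalls."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def bootstrap_ci(y_true, y_pred, n_iter=1000, seed=0):
    """Percentile 95% interval of macro-recall over resampled test sets."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_iter):
        sample = [rng.randrange(n) for _ in range(n)]
        stats.append(macro_recall([y_true[i] for i in sample],
                                  [y_pred[i] for i in sample]))
    stats.sort()
    return stats[int(0.025 * n_iter)], stats[int(0.975 * n_iter) - 1]

# Toy example: 8 samples of a majority class, 2 of a minority class.
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 8 + ["a", "b"]
score = macro_recall(y_true, y_pred)
```

In this toy case plain accuracy would be 0.9, while macro-recall is 0.75 because the single missed minority sample halves that class's recall; this is the sensitivity to minority classes described above.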
因此,由于样本量和类间相似性的差异,在计算机辅助诊断中,心律失常可进一步分为强心律失常和弱心律失常。强心律失常对弱心律失常的挤兑,使弱心律失常患者可能诊断不足。为此,受心脏病专家诊断思维的启发,本发明提出了一种结合心电图片段聚类和贝叶斯理论的心律失常诊断方法。GDPH ECG心律失常数据集用于验证本发明的方法。通过超参数优化,确定了本发明的方法在心跳拼接和片段聚类中的最佳配置。
与其他方法相比,本发明的方法在强心律失常方面的性能相当,但在弱心律失常方面性能更好。此外,随着强心律失常挤兑行为的增加,本发明的方法仍然可以对弱心律失常做出精确诊断。
近年来,深度学习在心电图解释方面表现突出。它的层次结构允许获得更高级别的特征,其强大的特征提取能力有助于适应复杂的映射。基于心律失常强度的定义,深度学习模型很难在训练期间公平地诊断弱心律失常。
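This study explores possible ways to protect weak arrhythmias from being crowded out by strong arrhythmias. The morphological features of the ECG signal are the key information cardiologists use to diagnose arrhythmias. Inspired by this, the present invention uses segment clustering to distinguish ECG signals with different morphological features. Permuting ECG signals at the beat level can effectively enlarge the sample size of weak arrhythmias. In addition, compared with a single heartbeat, splicing multiple heartbeats increases the dimensionality of the ECG signal, which may increase the distance between different arrhythmias in the representation space. The method of the present invention can alleviate, to some extent, the crowding-out of weak arrhythmias by strong arrhythmias.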
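An interpretable computer-aided ECG interpretation system will be more trusted by cardiologists and therefore easier to adopt. In the method of the present invention, features are encoded by segment cluster numbers that have a clear practical meaning. Diagnostic decisions are made from the conditional probabilities of arrhythmia sub-segments, which is a simplified mathematical description of a cardiologist's prior diagnostic experience. Certain morphological features of an arrhythmia may therefore be implicit in the segment clusters with high conditional probability. Some of the features found match current diagnostic criteria for arrhythmia, and some may not. The unmatched findings, however, may conceal new diagnostic markers of arrhythmia. The joint conditional probability of multiple segments may be a feasible way to discover new diagnostic markers.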
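In summary, compared with the prior art, the present invention provides a complete system, platform and storage medium comprising an ECG processing method and an intelligent arrhythmia diagnosis algorithm. The method includes: 1. an ECG interference elimination algorithm based on signal filtering; 2. a P-QRS-T feature point localization and segment splitting algorithm based on extremum-period iteration; 3. a segment control point selection and unsupervised segment classification algorithm; 4. a cardiac cycle splicing and encoding algorithm based on cutting-splicing data augmentation. The present invention can convert and augment an original ECG into a modeling encoding. The intelligent arrhythmia diagnosis algorithm of the present invention uses a Bayesian-based dynamic programming algorithm to model the human capacity for learning and cognition, and diagnoses 17 types of arrhythmia more accurately than previous techniques. The model is also capable of one-shot or few-shot learning, can identify rare or previously unseen arrhythmias, and generalizes in a manner similar to humans. Finally, the method of the present invention can interface seamlessly with multiple hospital systems, forming a system, platform and storage medium for ECG processing and intelligent arrhythmia diagnosis that is applied in clinical ECG diagnostic practice.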
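In some alternative embodiments, the functions/operations noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending on the functions/operations involved. In addition, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to give a more complete understanding of the technology. The disclosed methods are not limited to the operations and logical flows presented herein. Alternative embodiments are contemplated in which the order of the various operations is changed and in which sub-operations described as part of a larger operation are performed independently.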
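Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless stated otherwise, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, given the attributes, functions and internal relationships of the various functional modules in the devices disclosed herein, the actual implementation of the modules will be within the routine skill of an engineer. Accordingly, those skilled in the art can, using ordinary skill, implement the present invention as set forth in the claims without undue experimentation. It will also be appreciated that the specific concepts disclosed are merely illustrative and are not intended to limit the scope of the present invention, which is to be determined by the full scope of the appended claims and their equivalents.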
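If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.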
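The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transport a program for use by, or in connection with, an instruction execution system, apparatus or device.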
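More specific examples (a non-exhaustive list) of computer-readable media include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and portable compact disc read-only memory (CDROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.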
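It should be understood that the parts of the present invention may be implemented in hardware, software, firmware or a combination thereof. In the embodiments above, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following technologies known in the art may be used: a discrete logic circuit having logic gates for implementing logical functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.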
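In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such schematic expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.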
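Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations may be made to these embodiments without departing from the principles and spirit of the present invention, the scope of which is defined by the claims and their equivalents.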
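The preferred embodiments of the present invention have been described in detail above, but the present invention is not limited to these embodiments. Those skilled in the art may make various equivalent variations or substitutions without departing from the spirit of the present invention, and all such equivalent variations or substitutions fall within the scope defined by the claims of this application.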
Claims (10)
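- An analysis and recognition method for electrocardiograms, characterized by comprising: constructing a concept tree according to definitions of electrocardiogram and arrhythmia knowledge, and establishing an arrhythmia electrocardiogram dataset; performing data augmentation on the arrhythmia electrocardiogram dataset according to the concept tree to construct prior data; constructing an ECG prior model according to the prior data; optimizing the ECG prior model to obtain a target recognition model; and analyzing and recognizing an electrocardiogram to be analyzed according to the target recognition model to determine a recognition result for the electrocardiogram to be analyzed.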
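- The analysis and recognition method for electrocardiograms according to claim 1, characterized in that performing data augmentation on the arrhythmia electrocardiogram dataset according to the concept tree to construct prior data comprises: locating feature points in the electrocardiogram data; splitting out the P-QRS-T characteristic waves and segments according to the located feature points; determining the control points of each waveform segment to form ECG features; clustering the ECG features of each segment with the K-means method to form prior data; and splicing the prior data of each segment to obtain expanded and augmented prior data.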
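- The analysis and recognition method for electrocardiograms according to claim 1, characterized in that the step of constructing an ECG prior model according to the prior data includes a step of constructing an ECG arrhythmia dataset, which comprises: extracting ECG data samples; screening and filtering the ECG data samples to obtain a training dataset and a test dataset; and performing data distribution and class similarity analysis on the training dataset and the test dataset to improve the discriminability of the cardiac rhythm data in the training dataset and the test dataset.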
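- The analysis and recognition method for electrocardiograms according to claim 1, characterized in that the step of constructing an ECG prior model according to the prior data further includes a step of preprocessing the prior data, which comprises: removing electromyographic signals from the prior data with a Butterworth low-pass filter; removing power-line interference from the prior data with a 50 Hz finite impulse response notch filter having a Kaiser window function; and removing ECG baseline drift from the prior data with an infinite impulse response zero-phase-shift digital filter.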
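- The analysis and recognition method for electrocardiograms according to claim 2, characterized in that splitting out the P-QRS-T characteristic waves and segments according to the located feature points comprises: determining the position of the R peak with an R-peak detection algorithm according to the located feature points; using a 250 ms moving window to iteratively search both sides of the R peak, taking the minimum in the first window on the left as the position of the Q peak and the minimum in the first window on the right as the S peak; performing a time-window traversal starting from the QRS complex according to the located feature points to detect the P wave and the T wave; decomposing the ECG signal into multiple periodic heartbeats according to the detected key points; and dividing each heartbeat cycle, according to the located points, into 6 kinds of segment: P wave, P-Q interval, QRS complex, S-T interval, T wave and T-P interval, wherein the P wave is defined as the signal between the P onset and the P offset; specifically, performing a time-window traversal starting from the QRS complex according to the located feature points to detect the P wave and the T wave comprises: finding the onsets and offsets of the P wave and the T wave with a boundary detection method based on a local distance transform, by computing, at each point on the signal, the maximum distance between the start point and the end point of an auxiliary line segment; establishing a 200 ms time window before the Q onset of the QRS complex and a 400 ms time window after the S offset; and detecting the P peak and the T peak with the same R-peak detection algorithm as used for R-peak detection, and using the local distance transform to determine the onsets and offsets of the P wave and the T wave.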
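- The analysis and recognition method for electrocardiograms according to claim 2, characterized in that determining the control points of each waveform segment to form ECG features comprises: compiling the distribution of the lengths of each segment separately; computing the average length of each type of segment; determining the number of control points and the number of indices for each segment according to the average length; and computing the control points in each segment according to the average length, the number of control points and the number of indices, and determining the corresponding ECG features.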
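- The analysis and recognition method for electrocardiograms according to claim 2, characterized in that clustering the ECG features of each segment with the K-means method to form prior data comprises: computing the distance between an instance and a cluster centroid using the Euclidean distance; configuring the number of clusters; clustering each type of segment, as represented by its control points, with the K-means method according to the distance and the number of clusters; and, when clustering is complete, assigning each segment a cluster number according to the cluster to which it belongs.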
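- The analysis and recognition method for electrocardiograms according to claim 2, characterized in that the method further comprises: classifying the spliced ECG signals with categorical naive Bayes based on the encoded features.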
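- An analysis and recognition system for electrocardiograms, characterized by comprising: a first module configured to construct a concept tree according to definitions of electrocardiogram and arrhythmia knowledge and to establish an arrhythmia electrocardiogram dataset; a second module configured to perform data augmentation on the arrhythmia electrocardiogram dataset according to the concept tree to construct prior data; a third module configured to construct an ECG prior model according to the prior data; a fourth module configured to optimize the ECG prior model to obtain a target recognition model; and a fifth module configured to analyze and recognize an electrocardiogram to be analyzed according to the target recognition model to determine a recognition result for the electrocardiogram to be analyzed.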
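- A computer-readable storage medium, characterized in that the storage medium stores a program which, when executed by a processor, implements the method according to any one of claims 1 to 8.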
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211390694.5 | 2022-11-07 | ||
CN202211390694.5A CN115778400A (zh) | 2022-11-07 | 2022-11-07 | Analysis and recognition method, system and storage medium for electrocardiograms |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024098553A1 (zh) | 2024-05-16 |
Family
ID=85436029
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/072195 WO2024098553A1 (zh) | 2022-11-07 | 2023-01-13 | Analysis and recognition method, system and storage medium for electrocardiograms |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115778400A (zh) |
WO (1) | WO2024098553A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118568601A (zh) * | 2024-08-01 | 2024-08-30 | 安徽大学 | 一种基于自生成异构神经元轻量型网络的ecg身份识别方法 |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI834543B (zh) * | 2023-04-21 | 2024-03-01 | 國立勤益科技大學 | 一種心電圖st區段形態自動分類方法 |
CN118690580B (zh) * | 2024-08-22 | 2024-11-05 | 天津天堰科技股份有限公司 | 心电波形动态仿真方法 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120172689A1 (en) * | 2010-06-08 | 2012-07-05 | David Albert | Wireless, ultrasonic personal health monitoring system |
CN113509186A (zh) * | 2021-06-30 | 2021-10-19 | 重庆理工大学 | 基于深度卷积神经网络的ecg分类系统与方法 |
US20220015711A1 (en) * | 2020-07-20 | 2022-01-20 | Board Of Regents, The University Of Texas System | System and method for automated analysis and detection of cardiac arrhythmias from electrocardiograms |
CN114847905A (zh) * | 2022-05-10 | 2022-08-05 | 武汉大学 | 一种心率失常数据检测识别方法及系统 |
KR20220143400A (ko) * | 2021-04-16 | 2022-10-25 | 금오공과대학교 산학협력단 | 인공 신경망을 이용하여 표준 12리드 심전도 신호로부터 심장 부정맥을 분류하는 방법 및 이를 이용한 심장 부정맥 분류장치 |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103006210B (zh) * | 2013-01-11 | 2014-10-15 | 山东师范大学 | 基于分段线性化的窦性心率震荡趋势检测方法 |
CN106503457B (zh) * | 2016-10-26 | 2018-12-11 | 清华大学 | 基于转化医学分析平台的临床数据集成技术数据导入方法 |
CN208511016U (zh) * | 2017-10-15 | 2019-02-19 | 三江学院 | 一种生命体征综合检测分析系统 |
CN107951485B (zh) * | 2017-11-27 | 2019-06-11 | 深圳市凯沃尔电子有限公司 | 基于人工智能自学习的动态心电图分析方法和装置 |
KR102008196B1 (ko) * | 2017-12-05 | 2019-08-07 | 아주대학교 산학협력단 | 심전도 데이터를 이용한 혈중 칼륨농도 예측모델 생성장치 및 그 방법 |
CN108596995B (zh) * | 2018-05-15 | 2022-02-01 | 南方医科大学 | 一种pet-mri最大后验联合重建方法 |
CN110728656A (zh) * | 2019-09-06 | 2020-01-24 | 西安电子科技大学 | 基于元学习的无参考图像质量数据处理方法、智能终端 |
CN112907456B (zh) * | 2019-12-04 | 2022-06-10 | 四川大学 | 基于全局平滑约束先验模型的深度神经网络图像去噪方法 |
CN114764750B (zh) * | 2021-01-12 | 2023-08-18 | 四川大学 | 基于自适应一致性先验深度网络的图像去噪方法 |
CN113057648A (zh) * | 2021-03-22 | 2021-07-02 | 山西三友和智慧信息技术股份有限公司 | 一种基于复合lstm结构的ecg信号分类方法 |
CN113222018B (zh) * | 2021-05-13 | 2022-06-28 | 郑州大学 | 一种图像分类方法 |
CN114847963B (zh) * | 2022-05-06 | 2023-08-01 | 广东工业大学 | 一种高精度的心电图特征点检测方法 |
- 2022-11-07: CN application CN202211390694.5A, publication CN115778400A (zh), status: active, pending
- 2023-01-13: WO application PCT/CN2023/072195, publication WO2024098553A1 (zh), status: unknown
Also Published As
Publication number | Publication date |
---|---|
CN115778400A (zh) | 2023-03-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110890155B (zh) | 一种基于导联注意力机制的多类心律失常检测方法 | |
Oster et al. | Semisupervised ECG ventricular beat classification with novelty detection based on switching Kalman filters | |
WO2024098553A1 (zh) | 一种针对心电图的分析识别方法、系统以及存储介质 | |
Zhao et al. | An explainable attention-based TCN heartbeats classification model for arrhythmia detection | |
Buscema et al. | Computer aided diagnosis for atrial fibrillation based on new artificial adaptive systems | |
Rohmantri et al. | Arrhythmia classification using 2D convolutional neural network | |
Zhou et al. | Arrhythmia recognition and classification through deep learning-based approach | |
Jones et al. | Improving ECG classification interpretability using saliency maps | |
CN110638430A (zh) | 多任务级联神经网络ecg信号心律失常疾病分类模型和方法 | |
Wu et al. | Personalizing a generic ECG heartbeat classification for arrhythmia detection: a deep learning approach | |
Prabhakararao et al. | Congestive heart failure detection from ECG signals using deep residual neural network | |
Ng et al. | The role of artificial intelligence and machine learning in clinical cardiac electrophysiology | |
Hassan et al. | Performance comparison of CNN and LSTM algorithms for arrhythmia classification | |
Wang et al. | Multiscale residual network based on channel spatial attention mechanism for multilabel ECG classification | |
EPMoghaddam et al. | A graph-based cardiac arrhythmia classification methodology using one-lead ECG recordings | |
CN114224355A (zh) | 心电信号分类训练方法、分类方法、装置及存储介质 | |
Chen et al. | A meta-transfer learning approach to ECG arrhythmia detection | |
Sraitih et al. | An overview on intra-and inter-patient paradigm for ECG heartbeat arrhythmia classification | |
Bhukya et al. | Detection and classification of cardiac arrhythmia using artificial intelligence | |
Prabhakararao et al. | Multi-label ECG classification using temporal convolutional neural network | |
Tao et al. | Automated detection of atrial fibrillation based on DenseNet using ECG signals | |
Qu et al. | ECG heartbeat classification detection based on WaveNet-LSTM | |
Nandanwar et al. | Ecg signals-early detection of arrhythmia using machine learning approaches | |
Bozyigit et al. | Classification of electrocardiogram (ECG) data using deep learning methods | |
Tao et al. | A cascaded step-temporal attention network for ECG arrhythmia classification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23887264; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |