
WO2022198898A1 - Image classification method, apparatus and device (图像分类方法和装置及设备) - Google Patents

Image classification method, apparatus and device

Info

Publication number
WO2022198898A1
WO2022198898A1 PCT/CN2021/112932 CN2021112932W WO2022198898A1 WO 2022198898 A1 WO2022198898 A1 WO 2022198898A1 CN 2021112932 W CN2021112932 W CN 2021112932W WO 2022198898 A1 WO2022198898 A1 WO 2022198898A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
classified
signal waveform
array
pixel
Prior art date
Application number
PCT/CN2021/112932
Other languages
English (en)
French (fr)
Inventor
张冬冬
Original Assignee
北京至真互联网技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京至真互联网技术有限公司 filed Critical 北京至真互联网技术有限公司
Priority to US17/642,568 priority Critical patent/US12293576B2/en
Priority to JP2022527909A priority patent/JP7305046B2/ja
Publication of WO2022198898A1 publication Critical patent/WO2022198898A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/86Arrangements for image or video recognition or understanding using pattern recognition or machine learning using syntactic or structural representations of the image or video pattern, e.g. symbolic string recognition; using graph matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/225Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on a marking or identifier characterising the area
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/34Smoothing or thinning of the pattern; Morphological operations; Skeletonisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/42Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12Classification; Matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03Recognition of patterns in medical or anatomical images

Definitions

  • the present invention relates to the technical field of image classification and recognition, and in particular, to an image classification method, device and equipment.
  • an image recognition model or a currently popular deep learning network model CNN can be used to implement image classification.
  • when an image recognition model is used, the recognition model needs to be built in advance. Building a recognition model involves several stages such as feature extraction, feature encoding, spatial constraints, classifier design and model fusion, so the research and development cycle is long and high demands are placed on the algorithm designer's professional skills and on algorithm deployment.
  • using the deep learning network model CNN for image classification can free up manpower and achieve fast, efficient automatic sorting, but building the neural network model, labelling training data sets, training the model parameters and tuning in the early stage all take a great deal of time.
  • the required technical threshold and hardware configuration are likewise high, which makes the reproducibility and generalization ability of traditional image classification methods weak and makes it difficult for them to be used out of the box, replicated rapidly, and generalized to the various complex and changeable clinical medical scenarios.
  • the present invention provides the following scheme:
  • an image classification method comprising:
  • the image data set includes images to be classified
  • the category of the image to be classified is determined based on the signal waveform diagram.
  • selecting any single-channel pixel value array in the standard image and drawing a corresponding signal waveform diagram based on the pixel value array includes:
  • the corresponding signal waveform diagram is drawn according to the row pixel compression array and the column pixel compression array after the smoothing process.
  • axis represents the dimension of the summation of pixel values
  • the method further includes an operation of taking the logarithm of the smoothed row pixel compression array and column pixel compression array.
  • determining the category of the image to be classified based on the signal waveform diagram includes:
  • the category of the image to be classified is determined.
  • when drawing the corresponding signal waveform diagram based on the pixel value array, the method further includes:
  • processing the signal waveform diagram according to the prominence of each peak, which includes an operation of removing peaks whose prominence is less than a preset minimum prominence threshold.
  • it also includes:
  • the image to be classified is preprocessed, the marker area in the image to be classified is located, and the category of the image to be classified is determined according to the shape of the located marker area.
  • an image classification apparatus including: a data input module, an image interception module, a waveform diagram drawing module, and a first category determination module;
  • the data input module is configured to input an image data set to be classified; wherein, the image data set includes images to be classified;
  • the image interception module is configured to take the center of the image to be classified as a reference center point, and intercept an image of a preset size including a specific area as a standard image;
  • the waveform graph drawing module is configured to select any single-channel pixel value array in the standard image, and draw a corresponding signal waveform graph based on the pixel value array;
  • the first category determination module is configured to determine the category of the image to be classified based on the signal waveform diagram.
  • an image classification device comprising:
  • memory for storing processor-executable instructions
  • the processor is configured to implement any of the foregoing methods when executing the executable instructions.
  • after a specific area of the image to be classified is intercepted, any single-channel pixel value array is selected from the intercepted standard image, and the signal waveform diagram is then drawn from the selected single-channel pixel value array, so that the image to be classified is assigned to a category on the basis of the drawn signal waveform diagram.
  • compared with the image recognition model and the deep learning network model CNN of the related art, this not only completes the classification of various types of images quickly, accurately and efficiently, but also only requires the above processing of the images to be classified to generate the corresponding signal waveform diagrams; there is no need to collect and label a large amount of sample data, nor to train a recognition model.
  • the method can therefore be used out of the box, does not depend on sample data, is better suited to various image classification application scenarios, and finally effectively improves the reproducibility and generalization ability of the image classification method.
  • FIG. 1 is a flowchart of an image classification method according to an embodiment of the application
  • FIG. 3 is a distribution diagram of hue thresholds of different colors in the HSV color space on which images to be classified in an image data set are filtered according to image colors in an image classification method according to an embodiment of the application;
  • FIG. 4 is a schematic diagram of the prominence detection of each peak in a drawn signal waveform diagram in an image classification method according to an embodiment of the present application
  • Fig. 5a and Fig. 5b are respectively the signal waveform diagram corresponding to the fundus diagram and the signal waveform diagram corresponding to the outer eye diagram in the image classification method finally drawn in an embodiment of the application;
  • FIG. 6 is an effect diagram of an image classification method according to an embodiment of the present application, when an image to be classified is detected as an outer eye diagram according to the shape of the located pupil region;
  • FIG. 7 is a structural block diagram of an image classification apparatus according to an embodiment of the application.
  • FIG. 8 is a structural block diagram of an image classification apparatus according to an embodiment of the present application.
  • FIG. 1 shows a flowchart of an image classification method according to an embodiment of the present application.
  • the method includes: step S100 , inputting an image data set to be classified.
  • the image data set may include various images such as OCT, eye B-ultrasound, graphic report, FFA image, fundus image collected by fundus camera, and outer eye image.
  • Step S200 taking the center of the image to be classified as the reference center point, and intercepting an image of a preset size including a specific area as a standard image.
  • step S300 a pixel value array of any single channel in the standard image is selected, and a corresponding signal waveform diagram is drawn based on the pixel value array.
  • step S400 the category of the image to be classified is determined based on the obtained signal waveform diagram.
  • any single-channel pixel value array is selected from the standard image obtained by the interception, and a signal waveform diagram is then drawn from the selected single-channel pixel value array, so that when the image to be classified is assigned to a category, the division is performed based on the drawn signal waveform diagram.
  • compared with the image recognition model and the deep learning network model CNN of the related art, this not only completes the classification of various types of images quickly, accurately and efficiently, but also only requires the above processing of the images to be classified to generate the corresponding signal waveform diagrams; there is no need to collect and label a large amount of sample data, nor to train a recognition model.
  • the method can therefore be used out of the box, does not depend on sample data, is better suited to various image classification application scenarios, and finally effectively improves the reproducibility and generalization ability of the image classification method.
  • the intercepted specific area is associated with the broad category to which the images to be classified contained in the image data set belong.
  • although the image data set contains multiple images to be classified and the categories of the different images differ, they all belong to the same broad category.
  • the images to be classified included in the image data set all belong to the category of eye detection data, but different images to be classified correspond to different image data under the eye detection images.
  • the image to be classified may be a fundus image collected by a fundus camera, or an external eye image, and the image to be classified may also be an eye B ultrasound, OCT, FFA, and the like.
  • the images to be classified included in the image data set all belong to the category of chest detection data
  • the images to be classified included in the image data set may be data such as chest CT and chest B-ultrasound.
  • the images to be classified in the image data set may also be image data collected in other application fields, which will not be described one by one here.
  • the image data set may be different image data collected in different application scenarios.
  • the images to be classified in the image data set should all belong to different forms or different categories of image data obtained in the same application scenario.
  • the specific area to be intercepted can be determined based on the specific application scenario to which each image to be classified in the image data set belongs. Also taking the eye detection data in clinical medical images as an example, the intercepted specific area is to envelop the entire pupil or macular area.
  • the preset size can also be set according to the specific application scenario to which the image to be classified in the image data set belongs. The preset sizes set in different application scenarios are different.
  • an operation of filtering the images to be classified in the image data set according to the image size and/or image color may be further included.
  • when the images to be classified in the image data set are filtered according to both image size and image color, the order of the two filtering steps can be set flexibly according to the actual situation.
  • step S021 may be used first. Filter the images to be classified in the image data set according to the image size, and filter out small-sized image data such as OCT, eye B-ultrasound, and graphic reports in the image data set. Then, through step S022, the remaining images to be classified in the filtered image data set are filtered again according to the image color, and data such as FFA images are identified and filtered.
  • the color recognition principle used can be based on the hue threshold distribution of different colors in the HSV color space (see FIG. 3).
  • the FFA image is a grayscale image
  • the fundus image and the outer eye image are all RGB three-channel color images.
  • Using color recognition can further subdivide the three types of images and filter out the FFA images.
  • the chromaticity values of the fundus map and the outer eye map are located in the hue interval of ['red2', 'red', 'orange'].
  • step S200 can be executed, taking the center of the image to be classified as the reference center point, and intercepting an image of a preset size including a specific area as a standard image.
  • the selection of the specific area and the setting of the preset size may be performed according to the specific application scenario to which the image data set belongs.
  • the intercepted specific area is one that envelops the entire pupil or macular area.
  • the default size can be set to a side length of 700px.
  • step S300 can be executed to select any single-channel pixel value array in the standard image, and draw a corresponding signal waveform diagram based on the pixel value array.
  • the selection can be made according to the background color of the image to be classified currently being identified and divided. That is, in the image classification method of the embodiment of the present application, the selected single channel should be a channel that is close to the background color of the image to be classified that is currently being identified and divided.
  • the pixel value array of the single channel R in the standard image can be selected, and then the signal waveform diagram can be generated based on the pixel value array of the selected R channel.
  • selecting any single-channel pixel value array in the standard image, and drawing a corresponding signal waveform diagram based on the pixel value array may be implemented in the following manner.
  • first, select any single-channel pixel value array in the standard image. Then, perform row compression and column compression on the row pixel array and the column pixel array in the pixel value array, respectively, to obtain the row pixel compression array and the column pixel compression array. Next, curve smoothing is performed on the row pixel compression array and the column pixel compression array to remove the noise points in them. Finally, the corresponding signal waveform diagram is drawn according to the smoothed row pixel compression array and column pixel compression array.
  • axis represents the dimension of the summation of pixel values
  • curve smoothing can be performed on the row pixel compression array and the column pixel compression array.
  • smoothing can be performed by calling savgol_filter in the scipy.signal library to remove noise points in the row pixel compression array and the column pixel compression array.
  • calculation formula for performing curve smoothing filtering on the row pixel compression array and the column pixel compression array is as follows:
  • h_i is the smoothing coefficient
  • the operation of performing logarithmic operation processing on the smoothed row pixel compression array and the column pixel compression array is also included.
  • taking the logarithm of the smoothed row pixel compression array and column pixel compression array not only reduces the absolute magnitude of the data, which simplifies calculation, but also compresses the variable scale without changing the nature of the data or the relationships within it, thereby weakening collinearity, heteroscedasticity and the like in the model.
  • the plt.plot function in the matplotlib.pyplot library can be directly called when drawing the signal waveform of the compressed pixel value.
  • step S400 may be executed to determine the category of the image to be classified based on the signal waveform diagram.
  • the percentage change rate of pixel values in the signal waveform curve, the prominence of each peak in the signal waveform curve, and the percentage of column amplitude relative to row amplitude in the curve can be obtained from the drawn signal waveform diagram.
  • the category of the image to be classified is then identified and determined according to at least one of the percentage change rate of pixel values in the signal waveform curve, the prominence of each peak in the signal waveform diagram, and the percentage of column amplitude relative to row amplitude in the signal waveform curve.
  • the percentage change rate of the pixel value in the signal waveform curve can be calculated based on the drawn signal waveform graph. Its calculation formula is:
  • delta is the percentage change of the pixel value in the signal waveform curve
  • max(y_smooth) is the maximum pixel value after Savitzky-Golay smoothing filtering
  • min(y_smooth) is the minimum pixel value after Savitzky-Golay smoothing filtering.
  • the prominence of each peak in the signal waveform curve can be obtained in the following manner.
  • each peak in the signal waveform diagram is detected based on the peak properties.
  • the peak detection on the signal waveform graph can be realized by directly calling the find_peaks method in the scipy.signal library.
  • the positional spacing between the identified peak signals within the same cycle is controlled by the parameter distance, and the minimum threshold peak_min that a peak signal must satisfy is calculated using the following formula:
  • the prominence of each of the detected peaks is then calculated.
  • the peak_prominences method in the scipy.signal library can be called to calculate and detect the prominence of each peak in the signal waveform graph.
  • the image classification method according to an embodiment of the present application further includes an operation of removing peaks with small prominence so as to post-process the signal waveform diagram and prevent peaks with small prominence from interfering with subsequent waveform recognition. That is, the peaks in the signal waveform curve whose prominence is smaller than the preset minimum prominence threshold are removed, so as to obtain the final signal waveform diagram.
  • the preset minimum prominence threshold may take the value min(y_smooth) + 0.35 × (max(y_smooth) − min(y_smooth)).
  • the calculation of the percentage of the column amplitude to the row amplitude in the signal waveform curve is performed.
  • the percentage of col amplitude to row amplitude percent_col_in_row is used as one of the criteria for judging the fundus image and the outer eye image, and the detected obvious peak signal is displayed on the signal waveform.
  • the corresponding signal waveform diagrams when the image to be classified is a fundus diagram and the corresponding signal waveform diagram when the image to be classified is an external eye diagram are respectively shown.
  • the detected significant peak signals are marked and displayed in the waveform diagrams.
  • the category of the image to be classified can be determined according to at least one of the percentage change rate of pixel values in the signal waveform curve obtained from the signal waveform diagram, the prominence of each peak in the signal waveform diagram, and the percentage of column amplitude relative to row amplitude in the signal waveform curve.
  • the identification and determination are performed according to at least one of the preceding three items of information.
  • it may first be determined whether the signal waveform curve in the signal waveform diagram is monotonically increasing or monotonically decreasing.
  • the monotonicity of the signal waveform curve can be judged by calculating whether its first derivative is always greater than or equal to 0 (i.e., ≥ 0) or always less than or equal to 0 (i.e., ≤ 0).
  • when the first derivative of the signal waveform curve is calculated to be always greater than or equal to 0 or always less than or equal to 0, the curve is monotonically increasing or monotonically decreasing, so it can be directly determined that the image to be classified corresponding to the signal waveform diagram is a fundus image.
  • the to-be-classified image corresponding to the signal waveform image is the fundus image.
  • the values of the first preset value and the second preset value can be flexibly set according to actual conditions. That is, the values of the first preset value and the second preset value can be set according to factors such as the image category to be recognized currently, specific application scenarios, and application requirements. In a possible implementation manner, when the currently identified image category is a fundus image or an external eye image, the value of the first preset value may be 6%, and the value of the second preset value may be 0.02.
  • the image to be classified corresponding to the signal waveform diagram is an outer eye diagram.
  • the values of the third preset value and the fourth preset value can also be set according to factors such as the image category currently to be recognized, specific application scenarios, and application requirements.
  • the value of the third preset value may be 40%
  • the value of the fourth preset value may be 6% .
  • the value of the fifth preset value may be 30%.
  • the value of the fifth preset value may also be selected by testing according to factors such as the image category to be recognized in the actual situation, specific application scenarios, and application requirements, which are not specifically limited here.
  • the method further includes: step S500, preprocessing the image to be classified, locating the marker area in the image to be classified, and determining the category of the image to be classified according to the shape of the located marker area.
  • the marker area is a marker position used to characterize the attribute of the image to be classified.
  • the attribute of the image to be classified refers to the category to which the image belongs.
  • the marker area refers to the pupil area.
  • the marker area is a representative location that can characterize the category to which the image belongs. No further examples are provided here.
  • the marker area in the image to be classified is located, and the category of the image to be classified is determined according to the shape of the located marker area, it can be specifically implemented in the following manner.
  • the preprocessing of the to-be-classified image includes: cropping the to-be-classified image, and cropping the to-be-classified image into a standard image.
  • the cropping method can directly adopt the above-mentioned intercepting method, whereby the standard image can be directly read by taking the center of the image to be classified as the reference center point and intercepting an image of a preset size including a specific area.
  • the standard image is preprocessed to obtain a black and white binary image.
  • the preprocessing of the standard image may include:
  • Gaussian filtering is performed on the standard image to remove part of the noise; among them, cv2.GaussianBlur can be used for Gaussian filtering, and the Gaussian kernel size is selected (5, 5).
  • grayscale conversion can be performed using cv2.cvtColor.
  • the following steps are performed: detecting each connected region in the binary image.
  • the preprocessed binarized image is subjected to closing operation and opening operation in turn, and isolated noise points are filtered, and then the connected area formed by dense pixel points in the binarized image is identified.
  • cv2.morphologyEx can be used to achieve accurate identification of connected regions formed by dense pixels in a binary image.
  • the connected region with the largest area is selected from the detected connected regions so as to determine the optimal adaptive cropping size of the image, and a rectangle of the corresponding size is then drawn on the binarized image according to the determined optimal adaptive cropping size.
  • This process can use cv2.contourArea to calculate the area of the connected area.
  • the area of the selected connected area should be greater than 20000, and the tolerance area between the connected area and the edge contour of the binarized image should be greater than 2000 pixels.
  • the operation of removing noise interference and locating the pupil position is performed.
  • the cv2.getStructuringElement method can be used to remove the noise interference around the binary image, and the localization of the pupil position can be completed with cv2.morphologyEx.
  • the position of the center of the pupil circle may be set within an interval range of [200px, 660px].
  • the marker area of the image to be classified that is currently being identified and divided can be located.
  • the category of the image to be classified can be determined according to the shape of the located marker area.
  • ellipse detection can be performed through cv2.fitEllipse to complete the classification of fundus and outer eye images.
  • the image is determined to be an outer eye diagram.
  • the predetermined interval range of the short-axis radius of the pupil may be: [82px, 700px].
  • by combining the shape of the marker area in the image with the signal-waveform-based identification and determination of the images to be classified in the image data set, 93.7% of the images in the data set can be distinguished, which greatly improves the accuracy of image classification; images collected under different eye positions, different eye shapes, different lesions, different shooting angles, and different exposure and saturation can all be accurately distinguished and recognized, which effectively improves the flexibility and robustness of the image classification method.
  • the present application further provides an image classification apparatus. Since the working principle of the image classification device provided in the present application is the same as or similar to the principle of the image classification method of the present application, the repeated places will not be repeated.
  • the image classification apparatus 100 includes: a data input module 110 , an image interception module 120 , a waveform diagram drawing module 130 and a first category determination module 140 .
  • the data input module 110 is configured to input an image data set to be classified; wherein, the image data set includes images to be classified.
  • the image intercepting module 120 is configured to take the center of the image to be classified as a reference center point, and intercept an image of a preset size including a specific area as a standard image.
  • the waveform graph drawing module 130 is configured to select any single-channel pixel value array in the standard image, and draw a corresponding signal waveform graph based on the pixel value array.
  • the first category determination module 140 is configured to determine the category of the image to be classified based on the signal waveform diagram.
  • a second category determination module (not shown in the figure) is also included.
  • the second category determination module is configured to preprocess the image to be classified, locate the marker area in the image to be classified, and determine the category of the image to be classified according to the shape of the located marker area.
  • an image classification device 200 is also provided.
  • the image classification apparatus 200 includes a processor 210 and a memory 220 for storing instructions executable by the processor 210 .
  • the processor 210 is configured to implement any of the aforementioned image classification methods when executing the executable instructions.
  • the number of processors 210 may be one or more.
  • the image classification apparatus 200 in this embodiment of the present application may further include an input device 230 and an output device 240 .
  • the processor 210, the memory 220, the input device 230, and the output device 240 may be connected through a bus, or may be connected in other ways, which are not specifically limited here.
  • the memory 220 can be used to store software programs, computer-executable programs, and various modules, such as programs or modules corresponding to the image classification method in the embodiments of the present application.
  • the processor 210 executes various functional applications and data processing of the image classification apparatus 200 by running software programs or modules stored in the memory 220 .
  • the input device 230 may be used to receive input numbers or signals. Wherein, the signal may be the generation of a key signal related to user setting and function control of the device/terminal/server.
  • the output device 240 may include a display device such as a display screen.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Ophthalmology & Optometry (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Eye Examination Apparatus (AREA)

Abstract

An image classification method, apparatus and device, relating to the technical field of image classification and recognition. The method comprises: inputting an image data set to be classified (S100), wherein the image data set includes images to be classified; taking the center of an image to be classified as a reference center point and intercepting an image of a preset size that includes a specific area as a standard image (S200); selecting a pixel value array of any single channel in the standard image and drawing a corresponding signal waveform diagram based on the pixel value array (S300); and determining the category of the image to be classified based on the signal waveform diagram (S400). After a specific area of the image to be classified is intercepted, a pixel value array of any single channel is selected from the intercepted standard image, and a signal waveform diagram is then drawn from the selected single-channel pixel value array, so that when the image to be classified is assigned to a category, the division is performed on the basis of the drawn signal waveform diagram. The method can not only distinguish various types of images accurately and efficiently, but is also applicable to a wide range of image classification scenarios.

Description

Image classification method, apparatus and device
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on March 26, 2021, with application number 202110322703.6 and entitled "Image classification method, apparatus and device", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the technical field of image classification and recognition, and in particular to an image classification method, apparatus and device.
Background Art
Generally, the storage and transmission of medical images, as well as physicians' diagnoses and the tracking of disease progression, are organized by patient. When numerous images of different kinds are mixed together, the subsequent sorting, archiving and study of images of different categories inevitably becomes very difficult. In the related art, an image recognition model or the currently popular deep learning network model CNN can be used to implement image classification. However, when an image recognition model is used, the recognition model needs to be built in advance; building it involves several stages such as feature extraction, feature encoding, spatial constraints, classifier design and model fusion, so the development cycle is long and high demands are placed on the algorithm designer's professional skills and on algorithm deployment. Using the deep learning network model CNN for image classification can free up manpower and achieve fast, efficient automatic sorting, but building the neural network model, labelling training data sets, training the model parameters and tuning in the early stage all take a great deal of time, and likewise require a high technical threshold and hardware configuration. As a result, the reproducibility and generalization ability of traditional image classification approaches are weak, and it is difficult for them to be used out of the box, replicated rapidly, and generalized to the various complex and changeable clinical medical scenarios.
Summary of the Invention
On this basis, it is necessary to provide an image classification method, apparatus and device that can distinguish images quickly and accurately, can be used out of the box and replicated rapidly, and are applicable to a variety of different application scenarios.
To achieve the above object, the present invention provides the following scheme:
According to one aspect of the present application, an image classification method is provided, comprising:
inputting an image data set to be classified, wherein the image data set includes images to be classified;
taking the center of an image to be classified as a reference center point, and intercepting an image of a preset size that includes a specific area as a standard image;
selecting a pixel value array of any single channel in the standard image, and drawing a corresponding signal waveform diagram based on the pixel value array;
determining the category of the image to be classified based on the signal waveform diagram.
In a possible implementation, after the image data set to be classified is input, the method further comprises:
an operation of filtering the images to be classified in the image data set according to image size and/or image color.
In a possible implementation, selecting a pixel value array of any single channel in the standard image and drawing a corresponding signal waveform diagram based on the pixel value array comprises:
selecting a pixel value array of any single channel in the standard image;
performing row compression and column compression respectively on the row pixel array and the column pixel array in the pixel value array, to obtain a row pixel compression array and a column pixel compression array;
performing curve smoothing on the row pixel compression array and the column pixel compression array;
drawing the corresponding signal waveform diagram according to the smoothed row pixel compression array and column pixel compression array.
In a possible implementation, the row compression and column compression of the row pixel array and the column pixel array in the pixel value array are performed using the following formulas:
y_row=img_r.sum(axis=0)  # pixel-value row compression;
y_col=img_r.sum(axis=1)  # pixel-value column compression;
where axis denotes the dimension along which pixel values are summed: axis=0 indicates summing the pixel values of each row, and axis=1 indicates summing the pixel values of each column.
In a possible implementation, after the curve smoothing is performed on the row pixel compression array and the column pixel compression array, the method further comprises: an operation of performing a logarithm operation on the smoothed row pixel compression array and column pixel compression array.
In a possible implementation, determining the category of the image to be classified based on the signal waveform diagram comprises:
determining the category of the image to be classified according to at least one of the percentage change rate of pixel values in the signal waveform curve of the signal waveform diagram, the prominence of each peak in the signal waveform diagram, and the percentage of column amplitude relative to row amplitude in the signal waveform curve.
In a possible implementation, when the corresponding signal waveform diagram is drawn based on the pixel value array, the method further comprises:
detecting each peak in the signal waveform diagram according to peak properties, calculating the prominence of each peak in the signal waveform diagram, and processing the signal waveform diagram according to the prominence of each peak;
wherein processing the signal waveform diagram according to the prominence of each peak comprises: an operation of removing peaks whose prominence is less than a preset minimum prominence threshold.
In a possible implementation, the method further comprises:
preprocessing the image to be classified, locating a marker area in the image to be classified, and determining the category of the image to be classified according to the shape of the located marker area.
According to one aspect of the present application, an image classification apparatus is also provided, comprising: a data input module, an image interception module, a waveform diagram drawing module and a first category determination module;
the data input module is configured to input an image data set to be classified, wherein the image data set includes images to be classified;
the image interception module is configured to take the center of an image to be classified as a reference center point and intercept an image of a preset size that includes a specific area as a standard image;
the waveform diagram drawing module is configured to select a pixel value array of any single channel in the standard image and draw a corresponding signal waveform diagram based on the pixel value array;
the first category determination module is configured to determine the category of the image to be classified based on the signal waveform diagram.
According to another aspect of the present application, an image classification device is also provided, comprising:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to implement any one of the foregoing methods when executing the executable instructions.
After a specific area of the image to be classified is intercepted, a pixel value array of any single channel is selected from the intercepted standard image, and a signal waveform diagram is then drawn from the selected single-channel pixel value array, so that when the image to be classified is assigned to a category, the division is performed on the basis of the drawn signal waveform diagram. Compared with the related art that uses an image recognition model or a deep learning network model such as a CNN, this not only completes the classification of various kinds of images quickly, accurately and efficiently, but also only requires the above processing of the images to be classified to generate the corresponding signal waveform diagrams; there is no need to collect and label a large amount of sample data, nor to train a recognition model. This allows the image classification method of the embodiments of the present application to be used out of the box; it does not depend on sample data, is therefore better suited to a variety of image classification application scenarios, and ultimately effectively improves the reproducibility and generalization ability of the image classification method.
Brief Description of the Drawings
In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below are merely some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of an image classification method according to an embodiment of the present application;
FIG. 2 is another flowchart of an image classification method according to an embodiment of the present application;
FIG. 3 is a distribution diagram of the hue thresholds of different colors in the HSV color space on which the filtering of the images to be classified in the image data set by image color is based, in an image classification method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the prominence detection of each peak in a drawn signal waveform diagram in an image classification method according to an embodiment of the present application;
FIG. 5a and FIG. 5b are, respectively, the signal waveform diagram corresponding to a fundus image and the signal waveform diagram corresponding to an external eye image as finally drawn in an image classification method according to an embodiment of the present application;
FIG. 6 is an effect diagram of an image classification method according to an embodiment of the present application when an image to be classified is detected as an external eye image according to the shape of the located pupil region;
FIG. 7 is a structural block diagram of an image classification apparatus according to an embodiment of the present application;
FIG. 8 is a structural block diagram of an image classification device according to an embodiment of the present application.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
In order to make the above objects, features and advantages of the present invention more obvious and easier to understand, the present invention is described in further detail below with reference to the drawings and specific implementations.
FIG. 1 shows a flowchart of an image classification method according to an embodiment of the present application. As shown in FIG. 1, the method includes: step S100, inputting an image data set to be classified. It should be noted that the image data set contains multiple images to be classified, and the categories of these images differ from one another. Specifically, for ocular image data, the image data set may include various images such as OCT, ocular B-ultrasound, graphic reports, FFA images, fundus images collected by a fundus camera, and external eye images. Step S200: taking the center of an image to be classified as a reference center point, and intercepting an image of a preset size that includes a specific area as a standard image. Then, in step S300, a pixel value array of any single channel in the standard image is selected, and a corresponding signal waveform diagram is drawn based on the pixel value array. Finally, in step S400, the category of the image to be classified is determined based on the drawn signal waveform diagram.
Thus, in the image classification method of the embodiment of the present application, after a specific area of the image to be classified is intercepted, a pixel value array of any single channel is selected from the intercepted standard image, and a signal waveform diagram is drawn from the selected single-channel pixel value array, so that when the image to be classified is assigned to a category, the division is performed on the basis of the drawn signal waveform diagram. Compared with the related art that uses an image recognition model or a deep learning network model such as a CNN, this not only completes the classification of various kinds of images quickly, accurately and efficiently, but also only requires the above processing of the images to be classified to generate the corresponding signal waveform diagrams; there is no need to collect and label a large amount of sample data, nor to train a recognition model. This allows the image classification method of the embodiments of the present application to be used out of the box; it does not depend on sample data, is therefore better suited to a variety of image classification application scenarios, and ultimately effectively improves the reproducibility and generalization ability of the image classification method.
It should be noted that, when the center of the image to be classified is taken as the reference center point and an image of a preset size that includes a specific area is intercepted as the standard image, the intercepted specific area is associated with the broad category to which the images to be classified contained in the image data set belong.
That is, although the image data set contains multiple images to be classified and different images have different categories, they all belong to the same broad category. For example, in the case of clinical medical images, the images to be classified in the image data set all belong to the broad category of ocular examination data, but different images to be classified correspond to different kinds of image data under ocular examination: an image to be classified may be a fundus image collected by a fundus camera or an external eye image, and it may also be ocular B-ultrasound, OCT, FFA, and the like. As another example, if the images to be classified in the image data set all belong to the broad category of chest examination data, the images to be classified may be data such as chest CT and chest B-ultrasound. In addition, the images to be classified in the image data set may also be image data collected in other application fields, which are not enumerated one by one here.
In other words, the image data set may consist of different image data collected in different application scenarios. It should be pointed out, however, that the images to be classified in an image data set should all be image data of different forms or different categories obtained in the same application scenario.
Meanwhile, when the image to be classified is intercepted, the specific area to be intercepted can be determined according to the specific application scenario to which each image to be classified in the image data set belongs. Again taking the ocular examination data in clinical medical images as an example, the intercepted specific area is one that envelops the entire pupil or macular area. The preset size can likewise be set according to the specific application scenario to which the images to be classified in the image data set belong; different preset sizes are set for different application scenarios.
Further, in the image classification method of the embodiment of the present application, after the image data set to be classified is input, an operation of filtering the images to be classified in the image data set according to image size and/or image color may also be included. It should be noted that, when the images to be classified in the image data set are filtered according to both image size and image color, the order of the two filtering steps can be set flexibly according to the actual situation.
For example, referring to FIG. 2, in a possible implementation, when the image data set consists of clinical medical ocular data, after the image data set to be classified is input in step S100, step S021 may first be performed to filter the images to be classified in the image data set according to image size, filtering out small-size image data such as OCT, ocular B-ultrasound and graphic reports. Then, in step S022, the remaining images to be classified in the filtered image data set are filtered again according to image color, identifying and filtering out data such as FFA images.
More specifically, when the remaining images to be classified in the image data set are filtered again according to image color, the color recognition principle used can be based on the hue threshold distribution of the different colors in the HSV color space (see FIG. 3).
This is because an FFA image is a grayscale image, whereas fundus images, external eye images and the like are all RGB three-channel color images; color recognition can therefore further subdivide the three types of images and filter out the FFA images. The chromaticity values of fundus images and external eye images all lie in the hue interval of ['red2', 'red', 'orange'].
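As an illustrative sketch (not part of the original application), the size and color filtering of steps S021 and S022 might be implemented as follows, assuming images are loaded with OpenCV in BGR order; the 1000px size threshold, the saturation cut-off and the exact hue ranges are assumptions, since the application only names the hue interval ['red2', 'red', 'orange'] and refers to FIG. 3 for the actual thresholds:

    import cv2
    import numpy as np

    def is_small_image(img, min_side=1000):
        # Step S021: filter out small-size data such as OCT, ocular B-ultrasound
        # and graphic reports; the 1000px threshold is an assumed value.
        h, w = img.shape[:2]
        return min(h, w) < min_side

    def is_fundus_or_external_eye_color(img):
        # Step S022: FFA images are grayscale, while fundus and external eye
        # images are RGB color images whose hue lies in the red/orange range.
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
        hue, sat, _ = cv2.split(hsv)
        # Assumed hue intervals on OpenCV's 0-180 hue scale: 'red' and 'orange'
        # near 0-25, 'red2' near 156-180; FIG. 3 gives the actual thresholds.
        red_or_orange = (hue <= 25) | (hue >= 156)
        saturated = sat > 43  # low saturation suggests a grayscale-like FFA image
        return float(np.mean(red_or_orange & saturated)) > 0.5  # assumed majority vote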
After the images to be classified in the image data set have been preliminarily filtered and identified in the above manner, step S200 can be executed: taking the center of the image to be classified as the reference center point, and intercepting an image of a preset size that includes a specific area as the standard image. As described above, the selection of the specific area and the setting of the preset size can be made according to the specific application scenario to which the image data set belongs. For clinical medical ocular image data, the intercepted specific area is one that envelops the entire pupil or macular area, and the preset size can be set to a side length of 700px. Intercepting the image to be classified removes the interference of border noise and facilitates the subsequent operations.
After the image to be classified has been intercepted into a standard image, step S300 can be executed: selecting a pixel value array of any single channel in the standard image, and drawing a corresponding signal waveform diagram based on the pixel value array. It should be noted that, when a pixel value array of any single channel of the image is selected to draw the corresponding signal waveform diagram, the selection can be made according to the background color of the image to be classified that is currently being identified and divided. That is, in the image classification method of the embodiment of the present application, the selected single channel should be the channel that is closest to the background color of the image to be classified currently being identified and divided.
Again taking fundus image data as an example, since the background color of a fundus image is close to yellow, the pixel value array of the single channel R in the standard image can be selected, and the signal waveform diagram is then drawn from the selected pixel value array of the R channel.
In a possible implementation, selecting a pixel value array of any single channel in the standard image and drawing a corresponding signal waveform diagram based on the pixel value array can be achieved in the following manner.
That is, first, a pixel value array of any single channel in the standard image is selected. Then, row compression and column compression are performed respectively on the row pixel array and the column pixel array in the pixel value array, to obtain a row pixel compression array and a column pixel compression array. Next, curve smoothing is performed on the row pixel compression array and the column pixel compression array to remove the noise points in them. Finally, the corresponding signal waveform diagram is drawn according to the smoothed row pixel compression array and column pixel compression array.
When row compression and column compression are performed on the pixel value array of a selected single channel of the standard image, they can be computed with the following formulas:
y_row=img_r.sum(axis=0)  # pixel-value row compression.
y_col=img_r.sum(axis=1)  # pixel-value column compression.
where axis denotes the dimension along which pixel values are summed: axis=0 indicates summing the pixel values of each row, and axis=1 indicates summing the pixel values of each column.
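A minimal sketch of steps S200 and S300 up to this point, assuming a BGR image loaded with OpenCV (so the R channel is index 2); the helper names are illustrative and the 700px side length follows the description above:

    import numpy as np

    def crop_standard_image(img, side=700):
        # Step S200: take the image center as the reference center point and
        # intercept a square of preset size (side length 700px) as the standard image.
        h, w = img.shape[:2]
        cy, cx = h // 2, w // 2
        half = side // 2
        return img[max(cy - half, 0):cy + half, max(cx - half, 0):cx + half]

    def compress_single_channel(std_img, channel=2):
        # Step S300: select the single channel closest to the image background
        # color (the R channel for fundus images) and compress its pixel values.
        img_r = std_img[:, :, channel].astype(np.float64)
        y_row = img_r.sum(axis=0)  # pixel-value row compression
        y_col = img_r.sum(axis=1)  # pixel-value column compression
        return y_row, y_col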
After the row compression and column compression of the selected single-channel pixel value array are completed, curve smoothing can be performed on the row pixel compression array and the column pixel compression array. In a possible implementation, the smoothing can be performed by calling savgol_filter in the scipy.signal library to remove the noise points in the row pixel compression array and the column pixel compression array.
More specifically, the calculation formula for the curve smoothing filtering of the row pixel compression array and the column pixel compression array is as follows:
y*_k = (1/H) × Σ_{i=−w..+w} h_i · y_(k+i)   (the formula is given as an image in the original publication; the standard Savitzky-Golay weighted-sum form is reproduced here)
where h_i is the smoothing coefficient, H = 2w + 1 denotes the width of the filter window (i.e., the total number of measurement points), and the measurement points are x = (−w, −w+1, ..., 0, 1, ..., w−1, w).
Further, after both the row pixel compression array and the column pixel compression array have been smoothed, an operation of taking the logarithm of the smoothed row pixel compression array and column pixel compression array is also included. Taking the logarithm of the smoothed row and column pixel compression arrays not only reduces the absolute magnitude of the data, which simplifies calculation, but also compresses the variable scale without changing the nature of the data or the relationships within it, thereby weakening collinearity, heteroscedasticity and the like in the model.
After the smoothed row pixel compression array and column pixel compression array have been obtained as above, the corresponding signal waveform diagram can be drawn from them.
In a possible implementation, the signal waveform diagram of the compressed pixel values can be drawn by directly calling the plt.plot function in the matplotlib.pyplot library.
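A sketch of the smoothing, logarithm and plotting steps, assuming scipy and matplotlib are available; the window length and polynomial order of the Savitzky-Golay filter are assumed values, since the application does not state them:

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.signal import savgol_filter

    def smooth_log_and_plot(y_row, y_col, window=51, polyorder=3):
        # Curve smoothing with the Savitzky-Golay filter to remove noise points;
        # window (H = 2w + 1) and polyorder are assumed values.
        y_row_smooth = savgol_filter(y_row, window, polyorder)
        y_col_smooth = savgol_filter(y_col, window, polyorder)

        # Logarithm of the smoothed compression arrays to shrink the absolute
        # values and compress the variable scale.
        y_row_smooth = np.log(np.clip(y_row_smooth, 1e-9, None))
        y_col_smooth = np.log(np.clip(y_col_smooth, 1e-9, None))

        # Draw the signal waveform diagram of the compressed pixel values.
        plt.plot(y_row_smooth, label='row compression')
        plt.plot(y_col_smooth, label='column compression')
        plt.legend()
        plt.show()
        return y_row_smooth, y_col_smooth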
After the signal waveform diagram of the selected single-channel pixel values has been drawn, step S400 can be executed: determining the category of the image to be classified based on the signal waveform diagram.
Specifically, when the category of the image to be classified is determined based on the signal waveform diagram, the percentage change rate of pixel values in the signal waveform curve, the prominence of each peak in the signal waveform curve, and the percentage of column amplitude relative to row amplitude in the signal waveform curve can be obtained from the drawn signal waveform diagram; the category of the image to be classified is then identified and determined according to at least one of the percentage change rate of pixel values in the signal waveform curve, the prominence of each peak in the signal waveform diagram, and the percentage of column amplitude relative to row amplitude in the signal waveform curve.
More specifically, in the image classification method of an embodiment of the present application, the percentage change rate of pixel values in the signal waveform curve can be calculated from the drawn signal waveform diagram. Its calculation formula is:
(the formula is given as an image in the original publication and is not reproduced here)
where delta is the percentage change rate of the pixel values in the signal waveform curve, max(y_smooth) is the maximum pixel value after Savitzky-Golay smoothing filtering, and min(y_smooth) is the minimum pixel value after Savitzky-Golay smoothing filtering.
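Because the formula itself appears only as an image in the published application, the exact normalization is not visible in the text; the sketch below assumes the change rate is the smoothed range divided by the smoothed maximum, which should be treated as an assumption rather than the application's definition:

    import numpy as np

    def change_rate_percentage(y_smooth):
        # Percentage change rate of pixel values in the signal waveform curve.
        # Assumed form: (max - min) / max of the Savitzky-Golay-smoothed values.
        y_max = float(np.max(y_smooth))
        y_min = float(np.min(y_smooth))
        return (y_max - y_min) / y_max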
Further, in a possible implementation, the prominence of each peak in the signal waveform curve can be obtained in the following manner.
That is, first, each peak in the signal waveform diagram is detected according to the peak properties. Specifically, peak detection on the signal waveform diagram can be implemented by directly calling the find_peaks method in the scipy.signal library; the positional spacing between the identified peak signals within the same cycle is controlled by the parameter distance, and the minimum threshold peak_min that a peak signal must satisfy is calculated with the following formula:
peak_min = min(y_smooth) + 0.35 × (max(y_smooth) − min(y_smooth)).
Then, the prominence of each detected peak is calculated. In a possible implementation, the peak_prominences method in the scipy.signal library can be called to calculate the prominence of each peak in the signal waveform diagram. After the prominence of each peak has been calculated, the image classification method of an embodiment of the present application further includes an operation of removing peaks with small prominence so as to post-process the signal waveform diagram, preventing peaks with small prominence from interfering with subsequent waveform recognition. That is, the peaks in the signal waveform curve whose prominence is less than the preset minimum prominence threshold are removed, so as to obtain the final signal waveform diagram. It should be noted that the preset minimum prominence threshold may take the value min(y_smooth) + 0.35 × (max(y_smooth) − min(y_smooth)).
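A sketch of the peak detection and prominence-based post-processing using scipy.signal; the distance value is an assumption, while peak_min and the prominence threshold follow the expressions given above:

    import numpy as np
    from scipy.signal import find_peaks, peak_prominences

    def detect_significant_peaks(y_smooth, distance=100):
        # Minimum threshold a peak signal must satisfy, as given above.
        peak_min = np.min(y_smooth) + 0.35 * (np.max(y_smooth) - np.min(y_smooth))

        # Peak detection; `distance` controls the positional spacing between
        # peak signals within the same cycle (the value 100 is an assumption).
        peaks, _ = find_peaks(y_smooth, height=peak_min, distance=distance)

        # Prominence of each detected peak.
        prominences = peak_prominences(y_smooth, peaks)[0]

        # Post-processing: remove peaks whose prominence is below the preset
        # minimum prominence threshold (the description gives the same
        # expression as peak_min for this threshold).
        keep = prominences >= peak_min
        return peaks[keep], prominences[keep]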
The basic principle of the prominence can be seen in FIG. 4. The vertical arrows shown in FIG. 4 indicate the prominences of three peaks, calculated as follows:
prominence = min(peak − left_base, peak − right_base).
Next, based on the processed signal waveform diagram, the percentage of column amplitude relative to row amplitude in the signal waveform curve is calculated. The percentage of col amplitude relative to row amplitude, percent_col_in_row, serves as one of the criteria for distinguishing fundus images from external eye images, and the detected significant peak signals are displayed on the signal waveform diagram.
Referring to FIG. 5a and FIG. 5b, they respectively show the signal waveform diagram corresponding to an image to be classified that is a fundus image and the signal waveform diagram corresponding to an image to be classified that is an external eye image. In both signal waveform diagrams, the detected significant peak signals are marked and displayed.
After the corresponding signal waveform diagram has finally been drawn in the above manner, the category of the image to be classified can be determined according to at least one of the percentage change rate of pixel values in the signal waveform curve obtained from the signal waveform diagram, the prominence of each peak in the signal waveform diagram, and the percentage of column amplitude relative to row amplitude in the signal waveform curve.
When the category of the image to be classified is identified and determined in the above manner, the monotonicity of the signal waveform curve may also first be used to identify whether the image to be classified is a fundus image. When the monotonicity of the signal waveform curve shows that the image to be classified does not match a fundus image, the identification and determination are carried out according to at least one of the preceding three items of information.
That is, in a possible implementation, it may first be determined whether the signal waveform curve in the signal waveform diagram is monotonically increasing or monotonically decreasing. When the signal waveform curve is determined to be monotonically increasing or monotonically decreasing, it can be directly determined that the image to be classified is a fundus image. It should be pointed out that the monotonicity of the signal waveform curve can be judged by calculating whether the first derivative of the signal waveform curve is always greater than or equal to 0 (i.e., ≥ 0) or always less than or equal to 0 (i.e., ≤ 0). When the first derivative is calculated to be always greater than or equal to 0 or always less than or equal to 0, the signal waveform curve is monotonically increasing or monotonically decreasing, so it can be directly determined that the image to be classified corresponding to the signal waveform diagram is a fundus image.
When the signal waveform curve is determined not to be monotonic, i.e., it is neither monotonically increasing nor monotonically decreasing, it is judged (a) whether the percentage change rate delta of pixel values in both the row direction and the column direction of the signal waveform curve is less than a first preset value, and (b) whether there is no significant peak with a prominence greater than a second preset value in either the row direction or the column direction of the signal waveform curve.
When it is determined that the percentage change rate delta of pixel values in both the row direction and the column direction of the signal waveform curve is less than the first preset value, and that there is no significant peak with a prominence greater than the second preset value in either the row direction or the column direction, it can be determined that the image to be classified corresponding to the signal waveform diagram is a fundus image.
The values of the first preset value and the second preset value can both be set flexibly according to the actual situation; that is, they can be set according to factors such as the image category currently to be recognized, the specific application scenario and the application requirements. In a possible implementation, when the image categories currently to be recognized and determined are the fundus image and the external eye image, the first preset value may be 6% and the second preset value may be 0.02.
If the drawn signal waveform diagram satisfies none of the above conditions, it is then judged whether the percentage percent_col_in_row of col amplitude relative to row amplitude in the signal waveform curve exceeds a third preset value, and whether at least one of the percentage change rates delta of pixel values along the row direction and the column direction of the signal waveform curve is less than a fourth preset value.
If it is determined that the percentage percent_col_in_row of col amplitude relative to row amplitude in the signal waveform curve exceeds the third preset value and at least one of the percentage change rates delta of pixel values along the row direction and the column direction is less than the fourth preset value, it can be determined that the image to be classified corresponding to the signal waveform diagram is an external eye image.
The values of the third preset value and the fourth preset value can likewise be set according to factors such as the image category currently to be recognized, the specific application scenario and the application requirements. In a possible implementation, when the image categories currently to be recognized and determined are the fundus image and the external eye image, the third preset value may be 40% and the fourth preset value may be 6%.
In addition, in a possible implementation, when it is determined that the percentage percent_col_in_row of col amplitude relative to row amplitude in the signal waveform curve is less than a fifth preset value, it can be directly determined that the image to be classified corresponding to the signal waveform diagram is a fundus image. The fifth preset value may be 30%; its value may also be chosen through testing according to factors such as the image category to be recognized in the actual situation, the specific application scenario and the application requirements, which is not specifically limited here.
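A sketch of how the above decision rules might be combined into a single classifier; the ordering of the checks, the definition of the amplitudes as smoothed max-minus-min ranges, and the inlined change-rate formula are assumptions, while the preset values use the example figures given in the description (6%, 0.02, 40%, 6%, 30%):

    import numpy as np

    def classify_from_waveform(y_row_smooth, y_col_smooth, row_proms, col_proms,
                               first=0.06, second=0.02, third=0.40,
                               fourth=0.06, fifth=0.30):
        # Monotonicity check: first derivative always >= 0 or always <= 0.
        for y in (y_row_smooth, y_col_smooth):
            d = np.diff(y)
            if np.all(d >= 0) or np.all(d <= 0):
                return 'fundus'

        # Change-rate percentage in the row and column directions (assumed form).
        def delta(y):
            return (np.max(y) - np.min(y)) / np.max(y)
        delta_row, delta_col = delta(y_row_smooth), delta(y_col_smooth)

        # (a) both change rates below the first preset value and (b) no peak
        # with prominence above the second preset value -> fundus image.
        if (delta_row < first and delta_col < first
                and not np.any(row_proms > second) and not np.any(col_proms > second)):
            return 'fundus'

        # Column amplitude as a percentage of row amplitude (assumed to be the
        # ratio of the smoothed max-min ranges).
        percent_col_in_row = ((np.max(y_col_smooth) - np.min(y_col_smooth))
                              / (np.max(y_row_smooth) - np.min(y_row_smooth)))
        if percent_col_in_row < fifth:
            return 'fundus'
        if percent_col_in_row > third and (delta_row < fourth or delta_col < fourth):
            return 'external_eye'
        return 'undetermined'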
Furthermore, since an image data set usually contains multiple images to be classified, if, after the category of each image to be classified has been identified and determined in any of the foregoing ways, there are still images to be classified that cannot be directly identified, the image classification method of the embodiment of the present application, referring to FIG. 2, further includes: step S500, preprocessing the image to be classified, locating a marker area in the image to be classified, and determining the category of the image to be classified according to the shape of the located marker area. It should be pointed out that the marker area is a marker position used to characterize the attribute of the image to be classified. A person skilled in the art can understand that the attribute of the image to be classified refers to the category to which the image belongs.
For example, when the image data set consists of clinical medical ocular images, the marker area refers to the pupil area. When the image data set consists of other images, the marker area is a representative location that can characterize the category to which the image belongs. Further examples are not given here.
Preprocessing the image to be classified, locating the marker area in the image to be classified, and determining the category of the image to be classified according to the shape of the located marker area can specifically be implemented in the following manner.
Preprocessing the image to be classified includes: cropping the image to be classified into a standard image. The cropping can directly adopt the interception method described above, whereby the standard image obtained by taking the center of the image to be classified as the reference center point and intercepting an image of a preset size that includes a specific area can be read directly.
Then, the standard image is preprocessed to obtain a black-and-white binary image. Specifically, in a possible implementation, the preprocessing of the standard image may include:
filtering, grayscale conversion, binarization and the like. That is, the preprocessing operations are specifically: performing Gaussian filtering on the standard image to remove part of the noise, where cv2.GaussianBlur can be used for the Gaussian filtering with a Gaussian kernel size of (5, 5);
converting the filtered original standard image into a grayscale image, where cv2.cvtColor can be used for the grayscale conversion;
binarizing the resulting grayscale image, where, for example, cv2.threshold can be used to binarize the grayscale image.
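A minimal sketch of this preprocessing chain using OpenCV; the Gaussian kernel of (5, 5) follows the description, while Otsu thresholding is an assumed choice since the description only states that cv2.threshold can be used:

    import cv2

    def preprocess_to_binary(std_img):
        # Gaussian filtering to remove part of the noise, kernel size (5, 5).
        blurred = cv2.GaussianBlur(std_img, (5, 5), 0)
        # Convert the filtered standard image to a grayscale image.
        gray = cv2.cvtColor(blurred, cv2.COLOR_BGR2GRAY)
        # Binarize the grayscale image (Otsu's threshold is an assumed choice).
        _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        return binary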
After the standard image has been preprocessed in any of the above ways, the following step is performed: detecting the connected regions in the binary image. In a possible implementation, a closing operation and an opening operation are first applied in turn to the preprocessed binarized image to filter out isolated noise points, after which the connected regions formed by dense pixels in the binarized image are identified. In a possible implementation, cv2.morphologyEx can be used to accurately identify the connected regions formed by dense pixels in the binary image.
Next, the connected region with the largest area is selected from the detected connected regions so as to determine the optimal adaptive cropping size of the image, and a rectangle of the corresponding size is then drawn on the binarized image according to the determined optimal adaptive cropping size. This process can use cv2.contourArea to calculate the area of the connected regions. The area of the selected connected region should be greater than 20000, and the tolerance area between the connected region and the edge contour of the binarized image should be greater than 2000 pixels.
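A sketch of the connected-region step, assuming OpenCV 4.x (cv2.findContours returning two values); the description identifies the regions with cv2.morphologyEx, while the contour extraction and bounding rectangle below are stand-ins used for illustration, and the kernel size is an assumption:

    import cv2

    def largest_connected_region(binary):
        # Closing then opening to filter out isolated noise points.
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))  # assumed size
        cleaned = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
        cleaned = cv2.morphologyEx(cleaned, cv2.MORPH_OPEN, kernel)

        # Identify connected regions formed by dense pixels and keep only those
        # whose area exceeds 20000, as required by the description.
        contours, _ = cv2.findContours(cleaned, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        contours = [c for c in contours if cv2.contourArea(c) > 20000]
        if not contours:
            return cleaned, None

        # The largest connected region determines the optimal adaptive cropping
        # size; draw a rectangle of the corresponding size on the binarized image.
        largest = max(contours, key=cv2.contourArea)
        x, y, w, h = cv2.boundingRect(largest)
        cv2.rectangle(cleaned, (x, y), (x + w, y + h), 255, 2)
        return cleaned, (x, y, w, h)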
Then, the operations of removing noise interference and locating the pupil position are performed. When removing the noise interference, the cv2.getStructuringElement method can be used to remove the noise interference around the binary image, and the localization of the pupil position can be completed with cv2.morphologyEx. It should be pointed out that, in the image classification method of an embodiment of the present application, the position of the pupil center may be set within the interval [200px, 660px].
The marker area of the image to be classified that is currently being identified and divided can be located in the above manner. After the marker area has been located, the category of the image to be classified can be determined according to the shape of the located marker area.
Specifically, ellipse detection can be performed with cv2.fitEllipse to complete the classification of fundus and external eye images. Referring to FIG. 6, if an elliptical structure whose minor-axis radius lies within a specified interval is detected in the standard image, the image is determined to be an external eye image. In a possible implementation, for external eye images, the specified interval for the minor-axis radius of the pupil may be [82px, 700px].
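A sketch of the pupil localization and ellipse check, again assuming OpenCV 4.x; the center interval [200px, 660px] and the minor-axis radius interval [82px, 700px] follow the description, while the structuring-element size and the use of contours before cv2.fitEllipse are assumptions:

    import cv2

    def is_external_eye(binary, center_range=(200, 660), minor_radius_range=(82, 700)):
        # Remove noise interference around the binary image before locating the pupil.
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (25, 25))  # assumed size
        mask = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)

        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        for contour in contours:
            if len(contour) < 5:  # cv2.fitEllipse needs at least 5 points
                continue
            (cx, cy), (axis_a, axis_b), _ = cv2.fitEllipse(contour)
            minor_radius = min(axis_a, axis_b) / 2.0
            if (center_range[0] <= cx <= center_range[1]
                    and center_range[0] <= cy <= center_range[1]
                    and minor_radius_range[0] <= minor_radius <= minor_radius_range[1]):
                return True  # an elliptical pupil was detected: external eye image
        return False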
Thus, in the image classification method of the embodiment of the present application, by also taking the shape of the marker area in the image into account during the signal-waveform-based identification and determination of the images to be classified in the image data set, 93.7% of the images in the image data set can be distinguished, which greatly improves the accuracy of image classification. Images collected in application scenarios with different eye positions, different eye shapes, different lesions, different shooting angles, and different exposure and saturation can all be accurately distinguished and recognized, which effectively improves the flexibility and robustness of the image classification method.
Correspondingly, based on any of the image classification methods described above, the present application further provides an image classification apparatus. Since the working principle of the image classification apparatus provided by the present application is the same as or similar to that of the image classification method of the present application, repeated content is not described again.
Referring to FIG. 7, the image classification apparatus 100 provided by the present application includes: a data input module 110, an image interception module 120, a waveform diagram drawing module 130 and a first category determination module 140. The data input module 110 is configured to input an image data set to be classified, wherein the image data set includes images to be classified. The image interception module 120 is configured to take the center of an image to be classified as a reference center point and intercept an image of a preset size that includes a specific area as a standard image. The waveform diagram drawing module 130 is configured to select a pixel value array of any single channel in the standard image and draw a corresponding signal waveform diagram based on the pixel value array. The first category determination module 140 is configured to determine the category of the image to be classified based on the signal waveform diagram.
In a possible implementation, a second category determination module (not shown in the figure) is also included. The second category determination module is configured to preprocess the image to be classified, locate the marker area in the image to be classified, and determine the category of the image to be classified according to the shape of the located marker area.
Furthermore, according to another aspect of the present application, an image classification device 200 is also provided. Referring to FIG. 8, the image classification device 200 of the embodiment of the present application includes a processor 210 and a memory 220 for storing instructions executable by the processor 210. The processor 210 is configured to implement any of the image classification methods described above when executing the executable instructions.
It should be pointed out that the number of processors 210 may be one or more. Meanwhile, the image classification device 200 of the embodiment of the present application may further include an input apparatus 230 and an output apparatus 240. The processor 210, the memory 220, the input apparatus 230 and the output apparatus 240 may be connected by a bus, or may be connected in other ways, which is not specifically limited here.
As a computer-readable storage medium, the memory 220 can be used to store software programs, computer-executable programs and various modules, such as the programs or modules corresponding to the image classification method of the embodiments of the present application. The processor 210 executes the various functional applications and data processing of the image classification device 200 by running the software programs or modules stored in the memory 220.
The input apparatus 230 can be used to receive input numbers or signals, where the signals may be key signals generated in relation to the user settings and function control of the device/terminal/server. The output apparatus 240 may include a display device such as a display screen.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments can be referred to one another. Since the system disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief, and the relevant parts can be found in the description of the method.
Specific examples have been used herein to explain the principles and implementations of the present invention; the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. Meanwhile, for a person of ordinary skill in the art, there will be changes in the specific implementations and the scope of application in accordance with the idea of the present invention. In summary, the contents of this specification should not be construed as limiting the present invention.

Claims (10)

  1. An image classification method, characterized by comprising:
    inputting an image data set to be classified, wherein the image data set includes images to be classified;
    taking the center of an image to be classified as a reference center point, and intercepting an image of a preset size that includes a specific area as a standard image;
    selecting a pixel value array of any single channel in the standard image, and drawing a corresponding signal waveform diagram based on the pixel value array;
    determining the category of the image to be classified based on the signal waveform diagram.
  2. The method according to claim 1, characterized in that, after the image data set to be classified is input, the method further comprises:
    an operation of filtering the images to be classified in the image data set according to image size and/or image color.
  3. The method according to claim 1, characterized in that selecting a pixel value array of any single channel in the standard image and drawing a corresponding signal waveform diagram based on the pixel value array comprises:
    selecting a pixel value array of any single channel in the standard image;
    performing row compression and column compression respectively on the row pixel array and the column pixel array in the pixel value array, to obtain a row pixel compression array and a column pixel compression array;
    performing curve smoothing on the row pixel compression array and the column pixel compression array;
    drawing the corresponding signal waveform diagram according to the smoothed row pixel compression array and column pixel compression array.
  4. The method according to claim 3, characterized in that the row compression and column compression of the row pixel array and the column pixel array in the pixel value array are performed using the following formulas:
    y_row=img_r.sum(axis=0)  # pixel-value row compression;
    y_col=img_r.sum(axis=1)  # pixel-value column compression;
    where axis denotes the dimension along which pixel values are summed: axis=0 indicates summing the pixel values of each row, and axis=1 indicates summing the pixel values of each column.
  5. The method according to claim 3, characterized in that, after the curve smoothing is performed on the row pixel compression array and the column pixel compression array, the method further comprises: an operation of performing a logarithm operation on the smoothed row pixel compression array and column pixel compression array.
  6. The method according to any one of claims 1 to 5, characterized in that determining the category of the image to be classified based on the signal waveform diagram comprises:
    determining the category of the image to be classified according to at least one of the percentage change rate of pixel values in the signal waveform curve of the signal waveform diagram, the prominence of each peak in the signal waveform diagram, and the percentage of column amplitude relative to row amplitude in the signal waveform curve.
  7. The method according to claim 1 or 3, characterized in that, when the corresponding signal waveform diagram is drawn based on the pixel value array, the method further comprises:
    detecting each peak in the signal waveform diagram according to peak properties, calculating the prominence of each peak in the signal waveform diagram, and processing the signal waveform diagram according to the prominence of each peak;
    wherein processing the signal waveform diagram according to the prominence of each peak comprises: an operation of removing peaks whose prominence is less than a preset minimum prominence threshold.
  8. The method according to any one of claims 1 to 5, characterized by further comprising:
    preprocessing the image to be classified, locating a marker area in the image to be classified, and determining the category of the image to be classified according to the shape of the located marker area.
  9. An image classification apparatus, characterized by comprising: a data input module, an image interception module, a waveform diagram drawing module and a first category determination module;
    the data input module is configured to input an image data set to be classified, wherein the image data set includes images to be classified;
    the image interception module is configured to take the center of an image to be classified as a reference center point and intercept an image of a preset size that includes a specific area as a standard image;
    the waveform diagram drawing module is configured to select a pixel value array of any single channel in the standard image and draw a corresponding signal waveform diagram based on the pixel value array;
    the first category determination module is configured to determine the category of the image to be classified based on the signal waveform diagram.
  10. An image classification device, characterized by comprising:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to implement the method according to any one of claims 1 to 8 when executing the executable instructions.
PCT/CN2021/112932 2021-03-26 2021-08-17 图像分类方法和装置及设备 WO2022198898A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US17/642,568 US12293576B2 (en) 2021-03-26 2021-08-17 Determining type of to-be-classified image based on signal waveform graph
JP2022527909A JP7305046B2 (ja) 2021-03-26 2021-08-17 画像分類方法及び装置並びに機器

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110322703.6A CN112801049B (zh) 2021-03-26 2021-03-26 图像分类方法和装置及设备
CN202110322703.6 2021-03-26

Publications (1)

Publication Number Publication Date
WO2022198898A1 true WO2022198898A1 (zh) 2022-09-29

Family

ID=75815763

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/112932 WO2022198898A1 (zh) 2021-03-26 2021-08-17 图像分类方法和装置及设备

Country Status (4)

Country Link
US (1) US12293576B2 (zh)
JP (1) JP7305046B2 (zh)
CN (1) CN112801049B (zh)
WO (1) WO2022198898A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116071725A (zh) * 2023-03-06 2023-05-05 四川蜀道新能源科技发展有限公司 一种路面标线识别方法及系统
CN116320459A (zh) * 2023-01-08 2023-06-23 南阳理工学院 一种基于人工智能的计算机网络通信数据处理方法及系统

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112801049B (zh) * 2021-03-26 2021-07-23 北京至真互联网技术有限公司 图像分类方法和装置及设备
US12100152B1 (en) * 2023-01-30 2024-09-24 BelleTorus Corporation Compute system with acne diagnostic mechanism and method of operation thereof
CN116979991B (zh) * 2023-07-24 2025-07-22 电子科技大学 一种多跳速情况下跳频信号预分选方法

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080085048A1 (en) * 2006-10-05 2008-04-10 Department Of The Navy Robotic gesture recognition system
CN106446936A (zh) * 2016-09-06 2017-02-22 哈尔滨工业大学 基于卷积神经网络的空谱联合数据转波形图的高光谱数据分类方法
CN108596169A (zh) * 2018-03-12 2018-09-28 北京建筑大学 基于视频流图像的分块信号转换与目标检测方法及装置
CN108921076A (zh) * 2018-09-21 2018-11-30 南京信息工程大学 基于图像的道面裂缝病害自适应恒虚警检测方法
CN109359693A (zh) * 2018-10-24 2019-02-19 国网上海市电力公司 一种电能质量扰动分类方法
CN112801049A (zh) * 2021-03-26 2021-05-14 北京至真互联网技术有限公司 图像分类方法和装置及设备

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5117353B2 (ja) * 2008-11-07 2013-01-16 オリンパス株式会社 画像処理装置、画像処理プログラムおよび画像処理方法
CN111814564B (zh) * 2020-06-09 2024-07-12 广州视源电子科技股份有限公司 基于多光谱图像的活体检测方法、装置、设备和存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080085048A1 (en) * 2006-10-05 2008-04-10 Department Of The Navy Robotic gesture recognition system
CN106446936A (zh) * 2016-09-06 2017-02-22 哈尔滨工业大学 基于卷积神经网络的空谱联合数据转波形图的高光谱数据分类方法
CN108596169A (zh) * 2018-03-12 2018-09-28 北京建筑大学 基于视频流图像的分块信号转换与目标检测方法及装置
CN108921076A (zh) * 2018-09-21 2018-11-30 南京信息工程大学 基于图像的道面裂缝病害自适应恒虚警检测方法
CN109359693A (zh) * 2018-10-24 2019-02-19 国网上海市电力公司 一种电能质量扰动分类方法
CN112801049A (zh) * 2021-03-26 2021-05-14 北京至真互联网技术有限公司 图像分类方法和装置及设备

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116320459A (zh) * 2023-01-08 2023-06-23 南阳理工学院 一种基于人工智能的计算机网络通信数据处理方法及系统
CN116320459B (zh) * 2023-01-08 2024-01-23 南阳理工学院 一种基于人工智能的计算机网络通信数据处理方法及系统
CN116071725A (zh) * 2023-03-06 2023-05-05 四川蜀道新能源科技发展有限公司 一种路面标线识别方法及系统
CN116071725B (zh) * 2023-03-06 2023-08-08 四川蜀道新能源科技发展有限公司 一种路面标线识别方法及系统

Also Published As

Publication number Publication date
US12293576B2 (en) 2025-05-06
JP2023522511A (ja) 2023-05-31
CN112801049B (zh) 2021-07-23
US20240046632A1 (en) 2024-02-08
JP7305046B2 (ja) 2023-07-07
CN112801049A (zh) 2021-05-14

Similar Documents

Publication Publication Date Title
WO2022198898A1 (zh) 图像分类方法和装置及设备
US11681418B2 (en) Multi-sample whole slide image processing in digital pathology via multi-resolution registration and machine learning
CN106023151B (zh) 一种开放环境下中医舌象目标检测方法
CN107038704B (zh) 视网膜图像渗出区域分割方法、装置和计算设备
CN102096802A (zh) 人脸检测方法及装置
KR102826732B1 (ko) 안저 영상 식별 방법 및 장치와 기기
CN106355599A (zh) 基于非荧光眼底图像的视网膜血管自动分割方法
CN106355584A (zh) 基于局部熵确定阈值的眼底图像微动脉瘤自动检测方法
CN118172608B (zh) 一种幽门杆菌的细菌培养图像分析方法及系统
CN118097141A (zh) 一种基于产科影像的图像分割方法及系统
CN117576121A (zh) 一种显微镜扫描区域自动分割方法、系统、设备及介质
CN113989588A (zh) 一种基于自学习的五边形绘图测试智能评价系统及方法
AU2020103713A4 (en) Digital imaging methods and system for processing agar plate images for automated diagnostics
CN118898775A (zh) 一种水下图像结构病害全自动标注方法及系统
CN113052234A (zh) 一种基于图像特征和深度学习技术的玉石分类方法
CN118135620A (zh) 基于病理切片图像的肝癌微血管侵犯区域的识别方法及系统
JP2008084109A (ja) 目開閉判定装置及び目開閉判定方法
CN108921171B (zh) 一种骨关节x线片自动识别分级方法
CN113706515B (zh) 舌像异常确定方法、装置、计算机设备和存储介质
CN107358224B (zh) 一种白内障手术中虹膜外轮廓检测的方法
Cloppet et al. Adaptive fuzzy model for blur estimation on document images
Jalil et al. Iris localization using colour segmentation and circular Hough transform
CN110543802A (zh) 眼底图像中左右眼识别方法与装置
CN113139552B (zh) 一种小麦抽穗期识别方法、装置、电子设备及存储介质
CN114445364B (zh) 眼底图像微动脉瘤区域检测方法及其成像方法

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 17642568

Country of ref document: US

ENP Entry into the national phase

Ref document number: 2022527909

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21932510

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21932510

Country of ref document: EP

Kind code of ref document: A1

WWG Wipo information: grant in national office

Ref document number: 17642568

Country of ref document: US