WO2024231061A1 - Method for identifying abnormalities in cells of interest in a biological sample
- Publication number
- WO2024231061A1 (PCT/EP2024/060418)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- interest
- cells
- digital
- cytology
- biological sample
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/69—Microscopic objects, e.g. biological cells or cellular parts
- G06V20/695—Preprocessing, e.g. image segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/69—Microscopic objects, e.g. biological cells or cellular parts
- G06V20/698—Matching; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10056—Microscopic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30024—Cell structures in vitro; Tissue sections in vitro
Definitions
- the present invention relates to the field of Whole Slide Image (WSI) analysis, and more particularly to the detection of abnormalities in biological samples. In particular, the present invention relates to detecting cell abnormalities in a biological sample, preferably a urine sample. More specifically, the invention relates to a method for predicting histology outcomes from cytological slides.
BACKGROUND OF INVENTION
- Whole slide image (WSI) analysis is a digital pathology technique that involves scanning glass slides containing tissue samples and converting them into high-resolution digital images.
- WSI enables pathologists and researchers to examine large tissue samples in detail, without the limitations of traditional microscopy. It allows for quantitative analysis of tissue features and can aid in the diagnosis, prognosis, and treatment of various diseases, including cancer.
- cytopathology of urine sediment is a promising approach for diagnosing bladder-related disorders or diseases.
- Low-grade bladder cancer is a type of non-invasive bladder cancer that grows slowly and is less likely to spread to other parts of the body. It is often less aggressive and may recur but is less likely to progress to muscle-invasive disease. Low-grade bladder cancer is usually treated by removing the tumors through transurethral resection, followed by close monitoring and surveillance to detect any recurrence.
- High-grade bladder cancer is a more aggressive and invasive form of bladder cancer that grows quickly and has a higher risk of spreading to other parts of the body, particularly if left untreated. It is more likely to recur and progress to muscle-invasive disease, which can be life-threatening if not treated promptly.
- the grading of bladder cancer is based on the appearance of cancer cells under a microscope, which are categorized as low-grade or high-grade based on their level of abnormality, with high-grade cells being more abnormal and aggressive. Determining the grade of bladder cancer is important for the medical team to design a therapeutic strategy for the patient.
- This invention aims at giving the medical team prompt and accurate information on the next step to be taken.
- This invention helps predict the outcome of an endoscopy of the patient’s bladder, and helps monitor the patient in an optimal way.
- a device for analysis of a digital cytology slide of a biological sample, said biological sample having been previously collected from a subject suspected to be suffering from bladder cancer, said device comprising:
  - at least one input configured to receive at least one digital cytology slide obtained from a digitalization of at least one Whole Slide Image (WSI) of said biological sample;
  - at least one processor configured to:
    - detect cells of interest from said at least one digital cytology image;
    - for each cell of interest, compute a feature vector comprising at least one feature calculated on each cell of interest;
    - define a bag of k instances for each digital cytology slide, wherein the k instances are selected, based on at least one feature of said feature vectors, as the cells of interest being the highest atypia cells among the cells of interest detected for each digital cytology slide;
    - calculate a global prediction score representative of a probability of presence of bladder cancer and/or a stage of bladder cancer for said subject, said global prediction score being obtained from a combination of single prediction scores obtained from at least two multi-instance learning methods configured to receive as input said at least one defined bag;
  - at least one output configured to provide said global prediction score.
- WSI Whole Slide Image
- the device of the present invention relies on a-priori selection of the most atypical cells and an ensembling of Multiple Instance Learners to predict diagnosis. Besides the computational benefit of selecting the most atypical cells of interest, this approach is advantageous because, in both positive and negative slides, sufficient information is contained in the “most positive” instances. For positive slides, healthy cells tend to be removed from the analysis, increasing attention on atypical cells of interest. For negative slides, on the other hand, more importance is given to slightly atypical but healthy cells of interest, acting as hard-mining, which could help reduce false positives.
- According to other advantageous aspects of the invention, the device comprises one or more of the features described in the following embodiments, taken alone or in any possible combination.
- According to one embodiment, the at least one digitalized cytology image is obtained from at least one WSI stained with Papanicolaou stain. This embodiment …
- the detection of the at least one cell of interest is performed using a trained ResNet model configured to receive as input at least one pre-processed portion of the WSI, and provide as output an identification and a classification of objects present in said at least one pre-processed portion of the WSI as belonging or not to at least one class of cells of interest.
- the ResNet model is a ResNet-18 model which is trained to classify all foreground object crops into 5 classes: basal urothelial cells (BUCs), superficial urothelial cells, conglomerates (at least two touching objects), polynuclear neutrophils and others, based on more than 25,000 annotated crops.
- the cells of interest are the basal urothelial cells (BUCs), and objects assigned to the other classes may be discarded.
- the at least one feature comprised in the feature vector is selected among: Nuclear-Cytoplasm Ratio (NCR), Nucleus Intensity, Nucleus Intensity standard deviation, Haralick’s Energy, Entropy, Homogeneity, convex-hull ratio and/or nucleus circularity.
- the at least one feature is the Nuclear-Cytoplasm Ratio (NCR), and the selection of k cells of interest comprises:
  - sorting the cells of interest by decreasing Nuclear-Cytoplasm Ratio values; and
  - selecting the first k cells of interest, representative of high atypia basal urothelial cells.
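The NCR-based selection described above can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the names `select_top_k_by_ncr`, `id` and `ncr` are hypothetical and do not come from the patent.

```python
def select_top_k_by_ncr(cells, k):
    """Return the k cells with the highest NCR, sorted by decreasing NCR.

    `cells` is a list of dicts with an "ncr" key (hypothetical layout).
    If fewer than k cells are available, all of them are returned.
    """
    ranked = sorted(cells, key=lambda cell: cell["ncr"], reverse=True)
    return ranked[:k]


# Toy example with made-up NCR values.
cells = [
    {"id": "a", "ncr": 0.31},
    {"id": "b", "ncr": 0.72},
    {"id": "c", "ncr": 0.55},
    {"id": "d", "ncr": 0.48},
]
bag = select_top_k_by_ncr(cells, k=2)
print([cell["id"] for cell in bag])
```

With a fixed k, every slide yields a bag of the same size, provided at least k cells of interest were detected.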
- the computation of the feature vector comprises a preliminary segmentation of the detected at least one cell of interest and its/their nuclei using a convolutional machine learning model, for example a U-Net model.
- the convolutional machine learning model has been previously trained using random data augmentation so that the convolutional machine learning model is robust to potential staining variations.
- k is a predefined and constant natural number, which keeps the bag dimension consistent.
- the at least two multi-instance learning models are embedding-based multi-instance learning models using a pooling operator being a weighted-mean operator.
- the at least two multi-instance learning models are trained using bootstrap aggregation.
- the bladder cancer is a urothelial carcinoma.
- the biological sample is a urine sample.
- the cells of interest are basal urothelial cells.
- the device comprises one or more of the features described in the following embodiment, taken alone or in any possible combination.
- the disclosure relates to a computer program comprising software code adapted to perform a method for analysis of a digital cytology slide of a biological sample or a method for training compliant with any of the above execution modes when the program is executed by a processor.
- the present disclosure further pertains to a non-transitory program storage device, readable by a computer, tangibly embodying a program of instructions executable by the computer to perform a method for analysis of a digital cytology slide of a biological sample or a method for training, compliant with the present disclosure.
- a non-transitory program storage device can be, without limitation, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor device, or any …
- processor should not be construed to be restricted to hardware capable of executing software, and refers in a general way to a processing device, which can for example include a computer, a microprocessor, an integrated circuit, or a programmable logic device (PLD).
- the processor may also encompass one or more Graphics Processing Units (GPU), whether exploited for computer graphics and image processing or other functions.
- GPU Graphics Processing Unit
- the instructions and/or data enabling the associated and/or resulting functionalities to be performed may be stored on any processor-readable medium such as, e.g., an integrated circuit, a hard disk, a CD (Compact Disc), an optical disc such as a DVD (Digital Versatile Disc), a RAM (Random-Access Memory) or a ROM (Read-Only Memory). Instructions may be notably stored in hardware, software, firmware or in any combination thereof.
- when provided by a processor, the functions may be provided by a single dedicated processor, a single shared processor, or a plurality of individual processors, some of which may be shared.
- the elements shown in the figures may be implemented in various forms of hardware, software or combinations thereof. Preferably, these elements are implemented in a combination of hardware and software on one or …
- MI learning, also called multi-instance learning, refers to methods in which learning examples are represented by a bag (i.e., a multiset) of instances instead of a single feature vector.
- In typical machine learning problems such as image classification, it is assumed that an image clearly represents a category (a class). However, in many real-life applications multiple instances are observed and only a general statement of the category is given. This scenario is called multiple instance learning (MIL), or learning from weakly annotated data.
- MIL multiple instance learning
- MI learning methods include, without limitation: instance-based MIL such as MIWrapper; embedding-based MIL such as ABMIL and TransMIL; wrapper algorithms such as MILES and SimpleMI; and Dual-Stream MIL, also referred to as DSMIL.
- Instance-based MIL methods seek to predict first the label yi for each instance i, then deduce the global label from the collection of yi
- MIWrapper performs propositionalization by applying bag-level class labels to instances, and weighting the instances so that each bag has the same total weight. A single-instance model is built on the resulting dataset, and bag-level predictions are made by averaging the predicted probabilities of instances in a bag.
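The MIWrapper description above (bag labels copied to instances, equal total weight per bag, bag prediction as the mean of instance probabilities) can be illustrated as follows. This is a minimal sketch and not the reference MIWrapper implementation; `propositionalize`, `predict_bag` and `predict_proba` are hypothetical names, and the single-instance model is left as a caller-supplied function.

```python
def propositionalize(bags):
    """Turn bags into weighted single-instance rows.

    `bags` is a list of (instances, bag_label) pairs. Each instance inherits
    the label of its bag and a weight of 1 / bag size, so that every bag
    contributes the same total weight (here 1.0).
    """
    rows = []
    for instances, label in bags:
        weight = 1.0 / len(instances)
        for instance in instances:
            rows.append((instance, label, weight))
    return rows


def predict_bag(instances, predict_proba):
    """Bag-level prediction: mean of the instance-level probabilities."""
    probabilities = [predict_proba(instance) for instance in instances]
    return sum(probabilities) / len(probabilities)


rows = propositionalize([([1, 2], 1), ([3, 4, 5, 6], 0)])
print(rows[0], rows[-1])
```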
- Embedding-based methods first construct a representation (embedding) of the bag from which the global label is predicted. Each MIL method is composed of a feature …
- TransMIL is an embedding-based method relying on self-attention layers from Transformers to compute instances’ attention scores with reference to all the other instances in the bag. Unlike previous methods, an instance importance is then judged using pairwise comparison with all other instances in the bag.
- MILES Multiple-instance Learning via Embedded Instance Selection
- SVM support vector machine
- MILES embeds bags into a single-instance feature space. MILES uses a symmetric assumption, where multiple target points are allowed, each of which may be related to either positive or negative bags.
- SimpleMI performs propositionalization by averaging the attribute values of the instances in each bag, and appending the bag’s class label to the resulting feature
- the SimpleMI method generates only one instance for each training bag, without increasing the dimensionality of the feature space. By comparison, MIWrapper generates one instance for every instance in every bag, also leaving the dimensionality of the feature space unchanged. In contrast, although MILES also generates one instance per training bag, the dimensionality of its feature space is almost always much higher, as the number of attributes is equal to the total number of instances in the training bags.
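As an illustration of the SimpleMI idea (one row per bag, same number of attributes as a single instance), the sketch below averages attribute values over a bag and appends the bag label. The function name `simple_mi_row` is hypothetical, and instances are assumed to be equal-length lists of numbers.

```python
def simple_mi_row(instances, bag_label):
    """Summarize a bag as one row: per-attribute means plus the bag label."""
    n_attributes = len(instances[0])
    means = [
        sum(instance[j] for instance in instances) / len(instances)
        for j in range(n_attributes)
    ]
    return means + [bag_label]


# A bag of two instances with two attributes each.
print(simple_mi_row([[1.0, 4.0], [3.0, 8.0]], bag_label=1))
```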
- the Dual-Stream MIL method combines both instance-based and embedding-based approaches.
- An instance-based max-pooling branch is used to predict a malignancy score for each instance; the instance with the highest score is kept as the critical instance.
- a second embedding-based branch uses the weighted-mean operator as the pooling operator, where attention scores ai are computed as distances to the critical instance.
- the prediction is the mean of each branch output.
- all methods used in this invention include instance selection and an ensembling or bootstrap aggregating approach.
- the method of the invention may implement ensembling: because MIL methods are weakly-supervised methods, obtained predictions can be variable and are usually evaluated with multiple repetitions per experiment. To mitigate this variability, ensembling is used during inference: N individual MIL models are trained on N independent subsets of the training data, and the raw predictions are averaged.
- the method of the invention may implement bootstrap aggregating: bootstrap aggregating, also known as bagging, is an ensemble machine learning technique used to improve the stability and accuracy of machine learning models. It works by combining multiple models built on different samples of the training data to produce a single, more robust model. The idea behind bagging is to reduce the variance of a model, which can be high when the model overfits the training data. By training multiple models on different samples of the training data, bagging helps capture the underlying structure of the data more accurately.
- In bootstrap aggregating, the training data is randomly sampled with replacement to create multiple training sets. A separate model is then trained on each of these training sets. The final prediction is made by combining the predictions from each of the models.
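The sampling-with-replacement step and the combination step can be sketched as follows. This is a generic illustration of bagging, not the patent's training code: `bootstrap_samples` and `bagged_predict` are hypothetical names, models are represented as plain callables, and averaging is assumed as the way of combining predictions.

```python
import random


def bootstrap_samples(data, n_models, seed=0):
    """Draw n_models training sets, each sampled with replacement from data."""
    rng = random.Random(seed)
    return [rng.choices(data, k=len(data)) for _ in range(n_models)]


def bagged_predict(models, x):
    """Combine the models by averaging their predictions (an assumption)."""
    predictions = [model(x) for model in models]
    return sum(predictions) / len(predictions)


training_sets = bootstrap_samples(list(range(10)), n_models=5)
# Each training set has the size of the original data but may repeat items.
print([len(training_set) for training_set in training_sets])
```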
- Figure 1 is a general scheme giving an overview of one embodiment of the invention.
- Figures 2a and 2b show the architecture of an example of feature extractor f and classifier g used.
- Figures 3a and 3b show averaged attention scores over, respectively, all positive and all negative Test slides, for instances ordered by increasing NCR. In both cases, cells with higher NCR tend to have higher attention scores.
- Figure 4 illustrates ABMIL performances evaluated in cross-validation with respect to the number of instances k selected per bag.
EXAMPLE
- The present invention is further illustrated by the following example.
- Slides are digitized with a Hamamatsu NanoZoomer® S360, along 3 to 5 focal planes (×40, 0.23 µm/pixel). 731 patients’ slides (268 Negative, 233 LGUC, 230 HGUC) are used for training; the remaining patients (141 Negative, 76 LGUC, 75 HGUC) are kept for the Test set.
- 1. Detection of cells of interest in WSI: the considered thin-layered urine cytology WSI are stained with Papanicolaou staining, making isolated stained foreground objects easily identifiable from the background. Foreground objects are defined as connected components detected by Otsu’s automated threshold with an area in the range 1,200 to 90,000 pixels at full resolution. Foreground objects are detected from the central focal plane of the slide, then a crop containing each foreground object is extracted from the …
- a ResNet-18 model is trained to classify all foreground object crops into 5 classes: basal urothelial cells (BUCs), superficial urothelial cells, conglomerates (at least two touching objects), polynuclear neutrophils and others, based on more than 25,000 annotated crops.
- BUCs basal urothelial cells
- NCR Nuclear-Cytoplasm Ratio
- features: the Nuclear-Cytoplasm Ratio (NCR); the Nucleus Intensity; nucleus texture features (the Nucleus Intensity standard deviation and Haralick’s Energy, Entropy and Homogeneity); and nucleus morphology features (the convex-hull ratio and the nucleus circularity).
- MIL Multiple Instance Learning
- MIL is a weakly-supervised learning method designed to predict a label from a bag of instances, and is well adapted to the problem of predicting a diagnosis score from a collection of BUCs.
- the label yi of each instance i ∈ [1, k] is not available.
- The MIL framework hypothesis is that a bag is positive if it contains at least one positive instance.
- Bag creation: while MIL approaches can handle bags with a variable number of instances, in this work the number k of instances in a bag is fixed for bag dimension consistency. We propose to select the k most atypical BUCs in each digital slide to build the bags. According to the Paris System, the NCR is the most important criterion for BUC atypia, hence all BUCs in a WSI are sorted by decreasing NCR and the first k BUCs are included …
- MIL method: the present example relies on the attention-based MIL (ABMIL) method, which is also embedding-based and for which the pooling operator is a weighted mean of the instance embeddings, $\sum_{i=1}^{k} a_i\, f(x_i)$, where $x_i$ denotes instance $i$ of the bag and $a_i$ is the attention score associated with instance $i$, learnt by the model from $f(x_i)$.
- the feature extractor f and classifier g are shown in Fig. 2.
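The weighted-mean pooling with learnt attention scores can be sketched as below. This is a simplified illustration under stated assumptions: raw scores are normalized with a softmax so that the weights sum to 1 (as in the published attention-based MIL formulation), and `score_fn` is a hypothetical stand-in for the small attention network that the model would learn.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtracting the max improves numerical stability
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def attention_pool(embeddings, score_fn):
    """Weighted mean of instance embeddings using attention weights.

    `embeddings` is a list of equal-length lists (one per instance);
    `score_fn` maps an embedding to a raw attention score (hypothetical).
    """
    weights = softmax([score_fn(h) for h in embeddings])
    dim = len(embeddings[0])
    pooled = [
        sum(w * h[j] for w, h in zip(weights, embeddings)) for j in range(dim)
    ]
    return pooled, weights


# With identical scores, attention pooling reduces to a plain mean.
pooled, weights = attention_pool([[0.0, 2.0], [2.0, 4.0]], lambda h: 0.0)
print(pooled, weights)
```

The bag-level embedding returned by `attention_pool` would then be passed to the classifier g.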
- Ensemble MIL: because MIL methods are weakly-supervised methods, the predictions obtained can be variable and are usually evaluated with multiple repetitions per experiment. To mitigate this variability, ensembling is used during inference: N individual MIL models are trained on N independent subsets of the training data, and the raw predictions are averaged.
- Each subset is composed of (N − 1)/N of the whole training dataset and is computed similarly to an N-fold cross-validation, meaning each sample is in N − 1 out of the N training subsets.
- Let $y_n$ be the prediction of a single MIL model $n \in [1, N]$; the ensembled prediction $y_{pred}$ is obtained as $y_{pred} = \frac{1}{N} \sum_{n=1}^{N} y_n$.
- N = 6.
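The construction of the N training subsets, in the manner of an N-fold cross-validation, and the averaging of the raw predictions can be sketched as follows. The function names and the round-robin fold assignment are illustrative assumptions; the patent does not specify how samples are assigned to folds.

```python
def ensemble_subsets(samples, n):
    """Build n training subsets, each leaving out one of n folds.

    Every sample ends up in exactly n - 1 of the n subsets.
    Folds are assigned round-robin here (an assumption).
    """
    folds = [samples[i::n] for i in range(n)]
    subsets = []
    for held_out in range(n):
        subset = [
            sample
            for index, fold in enumerate(folds)
            if index != held_out
            for sample in fold
        ]
        subsets.append(subset)
    return subsets


def ensemble_predict(predictions):
    """Average the raw predictions of the individual models."""
    return sum(predictions) / len(predictions)


subsets = ensemble_subsets(list(range(12)), n=6)
print([len(subset) for subset in subsets])
```

With 12 samples and N = 6, each subset holds 10 samples, that is (N − 1)/N of the data.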
- Diagnostic performances are evaluated on Test set patients using accuracy, sensitivity, specificity and ROC-AUC, averaged over 5 repetitions of the experiment, and compared against the experts’ diagnosis. Following the Paris System, slides labeled C2 by the experts are negative, and slides labeled C4, C5 or C6 are considered positive.
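For reference, accuracy, sensitivity and specificity can be computed from binary labels as sketched below, together with the mapping of Paris System labels to negative and positive stated above. The names `PARIS_TO_BINARY` and `diagnostic_metrics` are hypothetical, and the sketch assumes both classes are present in `y_true` (otherwise a division by zero occurs).

```python
# Mapping from expert Paris System labels to binary labels, as stated in the
# text: C2 is negative; C4, C5 and C6 are positive.
PARIS_TO_BINARY = {"C2": 0, "C4": 1, "C5": 1, "C6": 1}


def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


expert_labels = [PARIS_TO_BINARY[label] for label in ["C5", "C2", "C2", "C4"]]
print(diagnostic_metrics([1, 1, 0, 0], expert_labels))
```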
- If a slide does not contain enough basal urothelial cells to establish a diagnosis, the slide is rejected and labeled C1.
- the Paris System contains a C3 class corresponding to atypical slides, containing some atypical cells but not enough to be classified as suspicious for HGUC. This C3 label can then be considered as the uncertain class.
- slides are automatically rejected if fewer than 10 basal urothelial cells are detected, and a slide is labeled uncertain when the diagnosis prediction lies between 0.45 and 0.55. Doing so, the proportion of rejected and uncertain slides returned by our method matches the proportion of C1 and C3 slides labeled by the experts. Performances computed on slides neither rejected nor uncertain are reported in Table 1.
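The rejection and uncertainty rules can be expressed as a small decision function. This is a sketch: the function name `triage` and the inclusive handling of the 0.45 and 0.55 bounds are assumptions, since the text only says the prediction "lies between" these values.

```python
def triage(n_bucs, score, min_cells=10, low=0.45, high=0.55):
    """Classify a slide as rejected, uncertain, positive or negative.

    n_bucs: number of basal urothelial cells detected on the slide.
    score:  diagnosis prediction in [0, 1].
    Bounds are treated as inclusive (an assumption).
    """
    if n_bucs < min_cells:
        return "rejected"
    if low <= score <= high:
        return "uncertain"
    return "positive" if score > high else "negative"


for n_bucs, score in [(5, 0.90), (200, 0.50), (200, 0.80), (200, 0.10)]:
    print(n_bucs, score, triage(n_bucs, score))
```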
- Table 1: Diagnostic performances of the proposed method and of experts, for the Test set.
Method | Accuracy | Sensitivity | Specificity | % uncertain | % rejected |
---|---|---|---|---|---|
Experts | 0.758 | 0.515 | 1.0 | 0.064 | 0.048 |
Proposed | 0.775 | 0.758 | 0.791 | 0.043 | 0.034 |
- This experiment shows that the proposed approach is able to diagnose bladder cancer with higher accuracy and sensitivity, but lower specificity, than experts. It should however be noted that in the VisioCyt® clinical trial, negative patients cannot have positive cytology (C4, C5 or C6): the experts’ specificity is 100% by construction and is surely overestimated.
- the model predictions are also analyzed for the different patient sub-groups (Negative, LGUC and HGUC) and reported in Table 2.
- Table 3: MIL model performances evaluated on the Test set and averaged over 5 repetitions of the experiment. The ABMIL model is found to perform best.
Model | Accuracy | Sensitivity | Specificity | AUC |
---|---|---|---|---|
Maxpool (inst.) | 0.702 ± 0.008 | 0.666 ± 0.025 | 0.737 ± 0.022 | 0.770 ± 0.011 |
Avgpool (inst.) | 0.744 ± 0.000 | 0.655 ± 0.000 | 0.832 ± 0.000 | 0.796 ± 0.000 |
Maxpool (emb.) | 0.738 ± 0.002 | 0.712 ± 0.003 | 0.765 ± 0.007 | 0.811 ± 0.002 |
Avgpool (emb.) | 0.745 ± 0.003 | 0.695 ± 0.005 | 0.796 ± 0.005 | 0.800 ± 0.000 |
DSMIL | 0.699 ± 0.009 | 0.676 ± 0.008 | 0.721 ± 0.014 | 0.771 ± 0.006 |
TransMIL | 0.757 ± 0.007 | 0.719 ± 0.010 | 0.796 ± 0.018 | 0.814 ± 0.004 |
ABMIL | 0.771 ± 0.004 | 0.756 ± 0.007 | 0.785 ± 0.010 | 0.8… |
- MIL methods: the ABMIL method is compared against common baselines and two recent MIL methods with interesting properties.
- Baselines: the baseline methods considered rely on average or maximum pooling operators. They come in two variants: instance-based, when the pooling operator is applied after the classifier g (on per-instance predictions), and embedding-based, when it is applied before the classifier g (on instance embeddings).
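The difference between the instance-based and embedding-based baselines comes down to where the pooling operator sits relative to the classifier g. The sketch below makes this explicit; the toy `f` and `g` are hypothetical placeholders, not the networks of Fig. 2, and `op` is either the built-in `max` or the `mean` helper defined here.

```python
def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)


def pool_embeddings(embeddings, op):
    """Apply a pooling operator (e.g. max or mean) to each embedding dimension."""
    return [op(column) for column in zip(*embeddings)]


def instance_based_predict(bag, f, g, op):
    """Classify every instance with g, then pool the per-instance scores."""
    return op([g(f(x)) for x in bag])


def embedding_based_predict(bag, f, g, op):
    """Pool the instance embeddings first, then classify the bag embedding."""
    return g(pool_embeddings([f(x) for x in bag], op))


# Toy feature extractor and classifier, for illustration only.
f = lambda x: [x, x * x]
g = lambda h: 1.0 if h[0] + h[1] > 2 else 0.0
bag = [0.5, 2.0]
print(instance_based_predict(bag, f, g, mean))
print(embedding_based_predict(bag, f, g, mean))
```

On this toy bag the two variants disagree under mean pooling, which shows that the placement of the pooling operator is not a cosmetic choice.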
- DSMIL: the Dual-Stream MIL method combines both instance-based and embedding-based approaches.
- An instance-based max-pooling branch is used to predict a malignancy score for each instance; the instance with the highest score is kept as the critical instance.
- a second embedding-based branch uses the weighted-mean operator as the pooling operator, where attention scores ai are computed as distances to the critical instance. The prediction is the mean of each branch output.
- TransMIL: the TransMIL method is an embedding-based method relying on self-attention layers from Transformers to compute instances’ attention scores with respect to all the other instances in the bag.
Bag creation | Accuracy | Sensitivity | Specificity | AUC |
---|---|---|---|---|
Random-300 | 0.739 ± 0.006 | 0.728 ± 0.007 | 0.749 ± 0.006 | 0.812 ± 0.003 |
Top-300 | 0.771 ± 0.004 | 0.756 ± 0.007 | 0.785 ± … | 0.824 ± 0.002 |
- The impact of the number of instances k selected in bags on ABMIL model performances, evaluated in cross-validation, is reported in Fig. 4 for k ∈ {100, ..., 500}. The number of instances k is found to have a low impact on diagnosis performances.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Quality & Reliability (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Abstract
The present invention thus relates to a computer-implemented device and a device for analysis of a digital cytology slide of a biological sample, said biological sample having been previously collected from a subject suspected to be suffering from bladder cancer, said device comprising: at least one input configured to receive at least one digital cytology slide obtained from said biological sample; at least one processor configured to: detect cells of interest from said at least one digital cytology image; for each cell of interest, compute a feature vector comprising at least one feature calculated on each cell of interest; define a bag of k instances for each digital cytology slide; calculate a global prediction score representative of a probability of presence of bladder cancer and/or a stage of bladder cancer for said subject; at least one output configured to provide said global prediction score.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP23172455.0A EP4462380A1 (fr) | 2023-05-09 | 2023-05-09 | Method for identifying abnormalities in cells of interest in a biological sample |
US18/314,482 | 2023-05-09 | ||
US18/314,482 US20240378720A1 (en) | 2023-05-09 | 2023-05-09 | Method for identifying abnormalities in cells of interest in a biological sample |
EP23172455.0 | 2023-05-09 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024231061A1 (fr) | 2024-11-14 |
Family
ID=90826460
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2024/060418 WO2024231061A1 (fr) | Method for identifying abnormalities in cells of interest in a biological sample | 2023-05-09 | 2024-04-17 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024231061A1 (fr) |
- 2024-04-17 WO PCT/EP2024/060418 (WO2024231061A1, fr): status unknown
Non-Patent Citations (1)
Title |
---|
BOUYSSOUX ALEXANDRE ET AL: "Ensemble Multiple Instance Learning for Bladder Cancer Diagnosis", 2023 IEEE 20TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI), IEEE, 18 April 2023 (2023-04-18), pages 1 - 5, XP034413578, DOI: 10.1109/ISBI53787.2023.10230445 * |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Roy et al. | Patch-based system for classification of breast histology images using deep learning | |
Khameneh et al. | Automated segmentation of cell membranes to evaluate HER2 status in whole slide images using a modified deep learning network | |
Barzekar et al. | C-Net: A reliable convolutional neural network for biomedical image classification | |
Loukas et al. | Breast cancer characterization based on image classification of tissue sections visualized under low magnification | |
Dov et al. | Thyroid cancer malignancy prediction from whole slide cytopathology images | |
Duc et al. | An ensemble deep learning for automatic prediction of papillary thyroid carcinoma using fine needle aspiration cytology | |
Kowal et al. | Breast cancer nuclei segmentation and classification based on a deep learning approach | |
Abdulaal et al. | A self-learning deep neural network for classification of breast histopathological images | |
Madhukar et al. | Classification of breast cancer using ensemble filter feature selection with triplet attention based efficient net classifier. | |
Aktas et al. | Deep convolutional neural networks for detection of abnormalities in chest X-rays trained on the very large dataset | |
Sreelekshmi et al. | SwinCNN: an integrated Swin transformer and CNN for improved breast Cancer grade classification | |
Sunny et al. | Oral epithelial cell segmentation from fluorescent multichannel cytology images using deep learning | |
Mantha et al. | A transfer learning method for brain tumor classification using efficientnet-b3 model | |
Salvi et al. | Deep learning approach for accurate prostate cancer identification and stratification using combined immunostaining of cytokeratin, p63, and racemase | |
Ahmad et al. | Brain tumor detection using convolutional neural network | |
Kalbhor et al. | DeepCerviCancer-deep learning-based cervical image classification using colposcopy and cytology images | |
Fouad et al. | Human papilloma virus detection in oropharyngeal carcinomas with in situ hybridisation using hand crafted morphological features and deep central attention residual networks | |
Sajiv et al. | Predicting breast cancer risk from histopathology images using hybrid deep learning classifier | |
Almaslukh | A reliable breast cancer diagnosis approach using an optimized deep learning and conformal prediction | |
Koriakina et al. | Oral cancer detection and interpretation: Deep multiple instance learning versus conventional deep single instance learning | |
Shirazi et al. | Automated pathology image analysis | |
Tosun et al. | Histological detection of high-risk benign breast lesions from whole slide images | |
Dash et al. | Deep learning based Framework for breast Cancer mammography classification using Resnet50 | |
Bandaru et al. | A review on advanced methodologies to identify the breast cancer classification using the deep learning techniques | |
Tarquino et al. | Engineered feature embeddings meet deep learning: a novel strategy to improve bone marrow cell classification and model transparency |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24720783; Country of ref document: EP; Kind code of ref document: A1 |