GB2633100A - Method for reducing image variability under global and local illumination changes - Google Patents
Method for reducing image variability under global and local illumination changes
- Publication number
- GB2633100A (application GB2313382.0A / GB202313382A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- image
- colour
- regions
- shadow
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/90—Dynamic range modification of images or parts thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/70—Circuitry for compensating brightness variation in the scene
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/80—Camera processing pipelines; Components thereof
- H04N23/84—Camera processing pipelines; Components thereof for processing colour signals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
A method for removing illumination from an image to improve colour constancy comprises converting 153 the image to a second colour space and transforming the image using chromatic adaptation 155. A first normalization 157 is performed on the transformed image where the value of a normalized pixel may depend on the number of pixels in the transformed image that initially had the same or a lower value. A set of filters is applied 159 to the normalised image where the filters may comprise a centre-surround structure of receptive fields and may represent the effect of double opponent cells. An illumination estimation for the image is obtained 161 from the filtered data and is used to output an image which compensates for the effect of illumination to ensure colour invariance. Also claimed is a method for removing local illumination by extracting colour information from edges in a plurality of image regions and a method for removing shadows from an image.
Description
METHOD FOR REDUCING IMAGE VARIABILITY UNDER GLOBAL AND LOCAL
ILLUMINATION CHANGES
Technical Field
[0001] The present application relates to a system, apparatus, and method(s) of estimating the local and global illumination of images and removing that illumination via colour correction to maintain colour constancy and minimize shadow effects in the images.
Background
[0002] Colour constancy is the ability to perceive the colour of objects regardless of changes in the source of light. This ability is reported to be inherent in species such as humans, fish, and bees. Colour constancy evolved in these species to aid object recognition by expending less memory while maintaining the same or better accuracy and more stable discrimination. Without colour constancy, the colour of objects under varying light would be an unreliable cue that diminishes a species' ability to identify objects accurately.
[0003] Several behavioural and neurobiological studies have explored the mechanisms underlying colour constancy. Two possible neural mechanisms have been suggested for colour constancy: Chromatic adaptation and Opponent-process theory. Chromatic adaptation occurs at the level of individual photoreceptors, which adjust their activity according to the relative light within the local region. Higher-level neural processing, such as double-opponent cells in the early visual system (areas V1 and V4), may also contribute to colour constancy. Opponent-process theory, on the other hand, proposes that one member of a colour pair suppresses the other colour.
[0004] One vital requirement in computer vision and robotics, especially for robust colour-based object recognition and tracking, is to record and memorise reliable colour cues that are invariant to any external light changes. This demand is challenging when the source of illumination is unknown and when there are different lighting conditions across the scene simultaneously. Computational colour constancy has proposed different solutions for robust colour coding, which is used to correct colour-biased images to get the canonical images under a white light source.
[0005] On this front, many biologically plausible and ad-hoc solutions have been proposed for computational colour constancy, from simple algorithms based on statistics of low or medium levels of image features (such as grey-world model, white patch, max-Red-Green-Blue (RGB), shade of grey, grey-edge, etc.) to sophisticated statistical and machine learning based algorithms (including gamut mapping, Bayesian approaches and neural network based). Despite their high computational processing, no solution has emerged that can achieve human-level colour constancy, independent of illumination and the type of photo-sensor. Currently, no solution can perfectly satisfy the problem in realistic and natural situations, especially utilising both aspects of Chromatic adaptation and Opponent-process theory. In addition, there is no solution that employs the process of neural normalisation by the responses of neighbouring photo receptors, inspired by the insect eye, to estimate the illumination of the images.
[0006] For the above reasons, there is an unmet need in the field of image processing for computationally maintaining human-level colour constancy without the need to adapt the illumination or use a specialised photo-sensor. The present invention provides a novel bio-inspired solution to this colour constancy challenge, applying an algorithm inspired by human and insect visual systems to provide stable colour appearance and high colour discrimination when images are processed.
[0007] Furthermore, it is recognized that the success of robotic navigation hinges on the accuracy of visual place recognition and localization. However, the presence of shadows resulting from changing lighting conditions poses a substantial challenge to these processes. Shadows, stemming from diverse light sources, introduce discrepancies in colour, texture, and shape, leading to inconsistencies in conventional visual features across different scenarios. These inconsistencies hinder the precise matching and recognition of locations, thereby compromising overall navigation effectiveness. It should be acknowledged that improved accuracy in recognition in this case does not require true colours, only consistent colours.
[0008] To address this concern, a process to detect and remove shadows is also needed in the context of colour constancy in order to mitigate the adverse effects of shadows on visual-based navigation systems. Shadows, cast by objects and structures under varying lighting angles and intensities, profoundly alter scene and object appearances. This alteration introduces disparities in the visual data captured by a robot's sensors, making it difficult to accurately match features across diverse lighting conditions. Consequently, the reliability and precision of visual place recognition and localization are compromised, undermining the navigational prowess of the robot.
[0009] It is thus recognized that the herein described processes for shadow detection and shadow removal hold pivotal significance in the domains of object recognition and visual place recognition, especially for providing a stable colour appearance and high colour discrimination when images are processed. These processes substantially elevate the capabilities of visual place recognition in the fields of robotics and computer vision by: 1) ensuring heightened feature consistency; 2) enhancing the assessment of scene similarity; 3) bolstering resilience against changes in lighting conditions; and 4) ultimately culminating in superior localization accuracy.
[0010] The embodiments and aspects described below are not limited to implementations which solve any or all of the disadvantages of the known approaches described above.
Summary
[0011] This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to determine the scope of the claimed subject matter; variants and alternative features which facilitate the working of the invention and/or serve to achieve a substantially similar technical effect should be considered as falling into the scope of the invention disclosed herein.
[0012] As part of the Opteran Vision framework, the present disclosure provides a bio-inspired solution to the colour constancy challenge, for example, in the use case of matching the overlap between two cameras in the Opteran Development Kit (ODK) by adjusting the colour distribution of these cameras. The solution encompasses, amongst other features described herein, an algorithm/model for stable colour appearance and high colour discrimination that is inspired by the visual systems of humans and mainly insects and is tailored to the functional features of the ODK camera system. Specifically, the model underlying the solution integrates three mechanisms: the Chromatic adaptation model, the LMC model (a model of lamina monopolar cells in the insect visual lobe), and Colour opponent coding, which are further described in the following sections.
[0013] Moreover, the bio-inspired solution to the colour constancy challenge also comprises a process to estimate the local illumination within the regions of a partitioned input image. Inspired by eye movement and selective attention research in vision science, the approach relies on three key components: a selective attention mechanism, the Gray-edge hypothesis, and colour normalisation. These three components are integrated into a single algorithm/model to reduce local illumination.
[0014] Furthermore, the present disclosure provides, in conjunction with illumination correction as described herein, another solution for image processing that is configured to detect and accurately remove shadows from the input image that arise from varying lighting conditions. These shadows can introduce inconsistencies in image features and significantly impact accurate perception. Shadow detection and removal thus play a critical role in enhancing the process of visual place recognition and localization for applications such as robotic navigation. Specifically, the proposed solution integrates two mechanisms, Shadow detection and Shadow removal, as described herein, and is purposed to recover the true colours and textures obscured by shadows inherent in the input image, making it a valuable addition to the Opteran Vision framework.
[0015] In a first aspect, the present disclosure provides a method or a computer-implemented method for processing images based on colour constancy removing illumination from the images, the method comprising: obtaining an image in a first colour space; converting the image to data in a second colour space; transforming the data using chromatic adaptation; performing a first normalisation on the transformed data, wherein the first normalisation comprises applying a dynamic spatial filtering technique to adjust the transformed data based on light intensity; applying a set of filters to the normalised data, wherein the set of filters is convoluted based on the normalised data in relation to the image; performing a second normalisation on the filtered data to obtain an illumination estimation of the image in relation to the filtered data; and outputting the normalised data from the second normalisation, wherein the normalised data maintains colour constancy based on the illumination estimation, removing the illumination from the normalised data.
[0016] In a second aspect, the present disclosure provides a method or computer-implemented method for reducing effect of illumination on images, the method comprising: receiving an input image; partitioning the input image into a plurality of regions; analysing the plurality of regions based on colour information and spatial position of pixels in each region; selecting from the plurality of regions a subset of regions that are influenced by an illuminant based on the analysis; identifying coloured edges for at least said subset of regions; extracting the colour information from the coloured edges; decomposing reflectance and illumination components of the input image using the extracted colour information; correcting illumination of the input image based on the decomposed reflectance and illumination components; outputting an image with illumination corrected.
[0017] In a third aspect, the present disclosure provides a method or computer-implemented method for providing a shadow-free image, the method comprising: receiving an input image in the first colour space, wherein the input image comprises at least one shadow region; generating shadow region masks for said at least one shadow region; removing shadow from shadow regions of the input image based on the shadow region masks; and outputting a shadow-free image.
[0018] In a fourth aspect, the present disclosure provides an apparatus for processing images to maintain colour constancy of the images, the apparatus comprising: at least one model configured to perform steps according to the first, second, and/or third aspect as well as any of the aspects described herein.
[0019] In a fifth aspect, the present disclosure provides a system for processing images to establish colour constancy for an image by removing illumination from the image, the system comprising: at least one processor; and a memory storing instructions that, when executed by the at least one processor, cause the system to perform the first, second, and/or third aspect as well as any of the aspects described herein.
[0020] The methods described herein may be performed by software in machine-readable form on a tangible storage medium e.g. in the form of a computer program comprising computer program code means adapted to perform all the steps of any of the methods described herein when the program is run on a computer and where the computer program may be embodied on a computer-readable medium. Examples of tangible (or non-transitory) storage media include disks, thumb drives, memory cards etc. and do not include propagated signals. The software can be suitable for execution on a parallel processor or a serial processor such that the method steps may be carried out in any suitable order, or simultaneously.
[0021] The methods or computer-implemented methods herein may be described in terms of one or more models. It is thus appreciated that the term "method" may be interchangeably recited as "model" throughout the disclosure where appropriate, though in some cases, a model may not necessarily correspond to a single method but may instead be incorporated as part of said method.
[0022] This application acknowledges that firmware and software can be valuable, separately tradable commodities. It is intended to encompass software, which runs on or controls "dumb" or standard hardware, to carry out the desired functions. It is also intended to encompass software which "describes" or defines the configuration of hardware, such as HDL (hardware description language) software, as is used for designing silicon chips, or for configuring universal programmable chips, to carry out desired functions.
[0023] The optional features or options described herein may be combined as appropriate, as would be apparent to a skilled person, and may be combined with any of the aspects of the invention.
Brief Description of the Drawings
[0024] Embodiments of the invention will be described, by way of example, with reference to the following drawings, in which:
[0025] Figure 1a is a flow diagram of a model architecture for image processing according to aspects of the disclosure;
[0026] Figure 1b is a flow diagram of the image processing based on the model architecture according to aspects of the disclosure;
[0027] Figure 2 is a pictorial diagram of a model application to the Spyder Checkr (Macbeth ColourChecker chart) under different illumination according to aspects of the disclosure;
[0028] Figure 3 is a pictorial diagram of a comparison of distributions of pixel intensity between the original and corrected images according to aspects of the disclosure;
[0029] Figure 4 is a pictorial diagram of the Fano factor (relative variance) of the selected patches according to aspects of the disclosure;
[0030] Figure 5 is a pictorial diagram of colour distance between colour boards in an environment dataset (set 1 in relation to Figure 9) according to aspects of the disclosure;
[0031] Figure 6 is a pictorial diagram of colour matching between two ODK cameras capturing the raw image and the image corrected by the model according to aspects of the disclosure;
[0032] Figure 7 is a pictorial diagram of the Spyder Checkr (Macbeth ColorChecker chart) according to aspects of the disclosure;
[0033] Figure 8 is a pictorial diagram of an exemplary data collection setup according to aspects of the disclosure;
[0034] Figure 9 is a pictorial diagram of example images from an environment dataset (set 1) according to aspects of the disclosure;
[0035] Figure 10 is a pictorial diagram of example images from another environment dataset (set 2) according to aspects of the disclosure;
[0036] Figure 11 is a pictorial diagram of a comparison of distributions of pixel intensity between the original and corrected images, where the original image is captured by a Pi Camera, according to aspects of the disclosure;
[0037] Figure 12 is a pictorial diagram of colour distance between colour boards in an environment dataset (set 2) according to aspects of the disclosure;
[0038] Figure 13 is a block diagram of a computing device or apparatus suitable for implementing aspects of the disclosure;
[0039] Figure 14 is a pictorial diagram of model applications for an environment under different illuminations according to aspects of the disclosure;
[0040] Figure 15a is a flow diagram of another model architecture for image processing according to aspects of the disclosure;
[0041] Figure 15b is a flow diagram of the image processing based on said another model architecture according to aspects of the disclosure;
[0042] Figure 16 is a pictorial diagram of exemplary output of the image processing based on said another model architecture according to aspects of the disclosure;
[0043] Figure 17 is a pictorial diagram of violin plots showing colour distance (Delta E) between the images in each dataset for image processing using said another model architecture according to aspects of the disclosure;
[0044] Figure 18 is a pictorial diagram of said another model's applications for an environment under different illuminations according to aspects of the disclosure;
[0045] Figure 19a is a flow diagram of another model architecture 1900 for image processing according to aspects of the disclosure;
[0046] Figure 19b is a flow diagram of the image processing based on said another model architecture according to aspects of the disclosure;
[0047] Figure 20 is a pictorial diagram of exemplary input and output of the image processing based on said another model architecture according to aspects of the disclosure; and
[0048] Figure 21 is a pictorial diagram of exemplary input and output of the image processing for a different input image.
[0049] Common reference numerals are used throughout the figures to indicate similar features.
Detailed Description
[0050] Embodiments of the present invention are described below by way of example only. These examples represent the suitable modes of putting the invention into practice that are currently known to the Applicant although they are not the only ways in which this could be achieved. The description sets forth the functions of the example and the sequence of steps for constructing and operating the example. However, the same or equivalent functions and sequences may be accomplished by different examples.
[0051] Colour constancy is an important feature of colour perception. It refers to the ability to perceive the colour of objects, irrespective of varied light sources. This ability is reported in different species such as humans, fish, and bees. Colour constancy allows different species to perceive the colour of objects relatively constantly under varying illumination conditions or identify objects irrespective of illumination conditions. For example, an object will appear green to us during midday, when the main illumination is white sunlight, as well as at sunset, when the main illumination is red.
[0052] The mechanisms underlying the concept of colour constancy can thus be assumed to be important in domains such as computer vision and robotics. It is known that colour constancy mechanisms can be implemented, borrowing this concept, to effectively record and store reliable colour cues which are stable under diverse light conditions. Although several computational models have been proposed for colour constancy, challenges still remain in providing a more robust colour constancy algorithm/model with high efficiency akin to human and animal-level vision.
[0053] The present invention is purposed to tackle and overcome at least some of the challenges of existing computational models. It takes advantage of the neural mechanisms of colour constancy in primates and insects to design a novel algorithm that can maintain a stable colour appearance across changing light sources. The evaluation of the present invention, using different datasets, revealed that the model(s) proposed herein reduced the intensity variation of colour patches illuminated (both locally and globally) by different natural and artificial light sources.
[0054] Importantly, the present invention enhances the camera's capacity in terms of colour discriminability by increasing the colour distance between objects. It is useful for the challenge of matching the overlap between two cameras in ODK by adjusting the colour distribution of the cameras as shown according to the figures. In brief, the present invention exhibits some potential advantages for the Opteran Vision Framework by upgrading the colour coding in the context of image processing.
[0055] One aspect of the present invention is a biologically plausible algorithm to estimate light illumination and suggest a stable colour encoding mechanism for visual object recognition in autonomous robots. It takes the inspiration for an accurate and fast colour encoding algorithm from the individual neuron and neural network level of human and insect visual systems. Herein, a multi-layer neural network is proposed incorporating three of the visual mechanisms suggested for colour constancy: retinal photoreceptor adaptation, lateral normalisation between photoreceptors, and centre-surround spectral opponency in the early visual system (namely Figures 1a and 1b). An overview of the proposed model follows, serving as an exemplary implementation or aspect of the present invention. It is understood that this aspect may be readily combined with other aspects of the model herein described.
[0056] It is understood that the method, algorithms, and/or model described herein may comprise one or more steps for partitioning an input image, raw image, or training data/dataset into a plurality of partitions (this process is also referred to as image segmentation) before it is processed further as described in the present disclosure. The method for image segmentation or partitioning an image may include but is not limited to thresholding, region growing, edge-based segmentation, clustering, histogram-based bundling, k-means clustering, watershed, active contours, ML-based segmentation using Convolutional Neural Networks, graph-based segmentation, and superpixel-based segmentation. It is appreciated that any appropriate image segmentation method, as well as certain novel methods from the Opteran Vision framework, may be used with respect to Global Illumination, Local Illumination, and Shadow Detection and Removal (obtaining a shadow-free image) as described in the following sections.
Global Illumination
[0057] In this disclosure, colour space refers to an abstract space with a specific organization of reproducible representations of colour. It is understood that a colour space may be arbitrary, i.e., with physically realised colours assigned to a set of physical colour swatches with corresponding assigned colour names, or structured with mathematical rigor.
[0058] In the present invention, an image is converted to a Long-Medium-Short (LMS) colour space in order to simulate the L, M, S cones in the human eye. Initially, the gamma correction applied to the colours (Equation 1) is removed to generate the linear RGB of the image. Gamma correction herein refers to a nonlinear operation used to adjust the brightness and contrast of an image. It involves applying a non-linear mapping to the pixel values of an image to compensate for the inherent non-linear response of display devices, such as monitors and televisions. Gamma correction can be used to control the brightness of the image. It helps ensure that the perceived brightness and contrast of the image remain consistent across different devices and viewing environments. Here, we assume the raw image includes gamma correction. The gamma correction of the raw image can be removed, resulting in an image with each RGB value of a processing colour transformed into the equivalent linear RGB colour space. Gamma correction transforms colour intensities from the physical world into a more uniform arrangement for humans. Using the matrix transformation M_RGB→LMS, we convert the linear RGB image into LMS cone inputs.
[0059] In the next stage, we implement the processing of LMC cells by applying a single transformation, T, to the photoreceptors' responses (Equation 5). This transformation normalises the photoreceptor output with respect to the population activity of all photoreceptors in the same colour channel. We finally propose a model of opponent coding according to the single-opponent and double-opponent cells in the human and insect visual system (see Double Opponent model). By combining these stages of colour processing, we can estimate the global illumination of the input image illuminated by external light sources. This allows us to correct a large range of colours in images with different light conditions due to varying natural light sources and changing artificial light. For display purposes, the corrected image in LMS colour space is transformed into sRGB space by applying the inverse transformation M_RGB→LMS^-1 and Equation 2. In the sections below, we briefly describe the colour constancy models and the idea behind each component of the present invention: a) Chromatic adaptation, b) LMC model, and c) Double Opponent model. More details are discussed later in the Implementation Details.
[0060] a) Chromatic adaptation in the context of colour constancy is defined as the ability of animal colour perception to adjust retina sensitivity in response to changes in light sources. Chromatic adaptation is a technique for explaining colour constancy based on the animals' ability, including humans, to adjust the sensitivity of their photoreceptors to changes in light sources. Chromatic adaptation is closely related to the adjustment of cone sensitivity which happens in the retina when the illumination changes.
[0061] One example of a chromatic adaptation technique applies an adapted version of Von Kries' chromatic adaptation, which is based on the LMS cones' sensitivity response function (LMS colour space). Document describing the general approach of Von Kries' chromatic adaptation (Luo, M. Ronnier. "A review of chromatic adaptation transforms." Review of Progress in Coloration and Related Topics 30 (2000): 77-92.) is hereby incorporated herein by reference.
[0062] For the present adaptation, each cone increases or decreases responsiveness based on the illuminant spectral energy. Each cone cell increases or decreases spectral sensitivity to adapt to new illumination from the previous illumination to perceive objects as constant colours. Based on LMS cone gain, Von Kries' adaptation transforms one illuminant to another to maintain white colour consistency in both systems. Further extensions of the Von Kries' adaptation have also been introduced. For example, multiple CAT transformation and the Bradford model are popular. Here, we implement some of these models by application of the matrix transformation shown in Implementation of Chromatic adaptation transform.
[0063] b) A model of lamina monopolar cells (LMC model): It has been hypothesized that the dendrites of insect lamina monopolar cells integrate visual information from neighbouring photo receptors. Therefore, they contribute to the spatial summation of visual information and normalise the responses of photoreceptors by using temporal coding. This process also enhances a neuron's information capacity. Insects use this mechanism to improve their visual sensitivity at night. In this work, we show that this normalisation process can also be beneficial in the exclusion of variation of illumination and improvement in colour constancy, by utilising simple transformation of the photoreceptors' activity.
[0064] c) A model of Single and Double Opponent cells (S-DO Opponent model): Opponent-process theory states that colour perception is controlled by the activity of three opponent colour systems: red-green, blue-yellow, and white-black, despite the fact that more components have recently been reported. This theory proposes that one member of the colour pair suppresses the other colour. For example, we see yellowish-greens and reddish-yellows, but we never see reddish-green or yellowish-blue colour hues. Documents (Shapley, Robert, and Michael J. Hawken. "Color in the cortex: single-and double-opponent cells." Vision research 51.7 (2011): 701-717; Hurvich, Leo M., and Dorothea Jameson. "An opponent-process theory of color vision." Psychological review 64.6p1 (1957): 384; and Solomon, Samuel G., and Peter Lennie. "The machinery of colour vision." Nature Reviews Neuroscience 8.4 (2007): 276-286) are hereby incorporated herein by reference.
[0065] The outputs of photo receptors are propagated in the way of colour opponency via single-opponent cells in retinal ganglion layers and LGN and double-opponent cells in V1 of the visual cortex. The single-opponent cells process the colour information through the centre-surround structure of their receptive fields (RF). There are different types of single opponent cells that encode the colour contrast of red-green, blue-yellow and black-white opponency. For instance, the RF structure of single-opponent cell type II is shaped as two Gaussian functions with red-on in the centre and green-off in the surround. In addition, several studies have discovered double-opponent cells in V1. Detecting local colour contrasts is the important role of these cells. Double-opponent cells compute both colour opponency and spatial opponency. It has been suggested that double-opponent cells are potentially the basis of illumination encoding. Interestingly, double-opponent cells were also observed in goldfish and honeybees. In the present disclosure we suggest a simple form of the S-DO Opponent model for colour constancy that improves the estimation of global illumination when combined with the two other mechanisms of colour constancy described above. Documents (Conway, Bevil R., et al. "Advances in color science: from retina to behavior." Journal of Neuroscience 30.45 (2010): 14955-14963; Shapley, cited above; and Conway, Bevil R., David H. Hubel, and Margaret S. Livingstone. "Color contrast in macaque V1." Cerebral Cortex 12.9 (2002): 915-925) are hereby incorporated herein by reference.
[0066] A method to evaluate colour constancy models is to use an angular error that measures the angular distance between the estimated illumination and the ground-truth illumination. Since it is difficult to collect data with corresponding ground-truth illuminants, we decided to use a different approach to quantitatively analyse the model's performance. In this way, pre-selected patches of pixels located within sections on the ColourChecker illuminated under varying light conditions are analysed before and after applying our model. The model was evaluated with three different datasets of the colour constancy task: the Spyder Checkr 24 dataset, the Environment dataset, and the Lab dataset. The Spyder Checkr 24 from Datacolour is a standard for colour calibration. This dataset contains 34 images of the colour chart illuminated by different natural and artificial lights. The Environment dataset has raw-RGB pair images which were captured under varying lighting conditions (daily natural lights and lamp artificial lights) (see Data Collection). The Lab dataset includes RGB images of the lab space also captured under varying light conditions and sources, as shown according to the figures.
[0067] To analyse the colour variability over varied illuminations, the Fano factor was calculated for each channel of the selected patches. It measures the relative variance of the colour intensity and shows the extent of variability in relation to the mean of the population. The Fano factor is defined as F_i = σ_i² / μ_i, where σ_i² and μ_i are the variance and mean of the selected patches for each colour channel i = R, G or B, respectively.
[0068] The distance between two colours allows for a quantified analysis of how far apart two colours are from one another. The Delta E (ΔE) metric is used to measure the degree of colour change over different illuminations and to verify the improvement in the colour discrimination of our model/algorithm. It evaluates the distance in the CIELab colour space and represents the relative perceived magnitude of colour difference. The larger ΔE, the greater the distance between the colours.
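By way of illustration only, the two evaluation metrics can be computed as in the following minimal Python/NumPy sketch. The function names are illustrative, patches are assumed to be H x W x 3 arrays, and the colours passed to the ΔE function are assumed to have already been converted to CIELab (for example via skimage.color.rgb2lab).

```python
import numpy as np

def fano_factor(patch):
    """Fano factor F_i = variance / mean for each colour channel of a selected patch."""
    channels = patch.reshape(-1, 3).astype(np.float64)
    return channels.var(axis=0) / channels.mean(axis=0)  # one value per R, G, B

def delta_e_76(lab1, lab2):
    """CIE76 colour difference: Euclidean distance between two CIELab colours."""
    lab1 = np.asarray(lab1, dtype=np.float64)
    lab2 = np.asarray(lab2, dtype=np.float64)
    return float(np.linalg.norm(lab1 - lab2))
```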
Implementation Details
[0069] With respect to the present disclosure, it is appreciated that further documents (Stokes, Michael. "A standard default color space for the internet - sRGB." http://www.w3.org/Graphics/Color/sRGB.html (1996); and Anderson, Matthew, et al. "Proposal for a standard default color space for the internet - sRGB." Color and Imaging Conference. Vol. 1996, No. 1. Society for Imaging Science and Technology, 1996.) are hereby incorporated herein by reference.
[0070] Linear RGB: For each RGB value of the processing colours in the input image, we transform the colours into what is called linear RGB space by removing the gamma correction with this formula:

c_linear = c_srgb / 12.92, if c_srgb <= 0.04045
c_linear = ((c_srgb + 0.055) / 1.055)^2.4, if c_srgb > 0.04045   (Equation 1)

where c_srgb is the (R, G, B) value of each pixel in the image. The gamma correction can later be reapplied with the inverse formula:

c_srgb = 12.92 * c_linear, if c_linear <= 0.0031308
c_srgb = 1.055 * c_linear^(1/2.4) - 0.055, if c_linear > 0.0031308   (Equation 2)

[0071] LMS colour space: Since the spectral sensitivities of digital cameras are different to the photoreceptor spectral sensitivities in human and insect vision, we transform the input images from linear RGB space to LMS cone inputs (L-, M- and S-cone), based on previous computational and biological experiments:

(l, m, s)^T = M_RGB→LMS (r, g, b)^T   (Equation 3)

where

M_RGB→LMS = [0.3192  0.6098  0.0447]
            [0.2647  0.7638  0.0870]
            [0.0202  0.1296  0.9391]
The matrix M_RGB→LMS maps the pixels of each channel R(x,y), G(x,y), B(x,y) of the input image I(x,y) into l(x,y), m(x,y) and s(x,y) in the LMS space. The yellow component y(x,y) = m(x,y) + s(x,y) and the illuminance component u(x,y) = l(x,y) + m(x,y) + s(x,y) can be simply computed. Similarly, we can transform the colour information in LMS space back into the RGB space using the inverse matrix M_RGB→LMS^-1.
[0072] Implementation of Chromatic adaptation transform: we implemented chromatic adaptation for converting source images to a destination illuminant by applying a chromatic adaptation transformation. The chromatic adaptation is performed using a reference white point. The transformed images contain improved white colour ranges. Other colours are transformed based on an illumination transform obtained by a white point transform. Chromatic adaptation for an image is implemented in two steps: Step 1) Estimate the global illuminant of the image; and Step 2) Convert the image to the destination illuminant using the illuminant obtained in Step 1.
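By way of illustration, Equations 1 to 3 can be implemented as in the following Python/NumPy sketch. The function names are illustrative and pixel values are assumed to be normalised to [0, 1].

```python
import numpy as np

M_RGB_TO_LMS = np.array([[0.3192, 0.6098, 0.0447],
                         [0.2647, 0.7638, 0.0870],
                         [0.0202, 0.1296, 0.9391]])

def srgb_to_linear(img):
    """Remove gamma correction (Equation 1); img holds sRGB values in [0, 1]."""
    return np.where(img <= 0.04045, img / 12.92, ((img + 0.055) / 1.055) ** 2.4)

def linear_to_srgb(img):
    """Reapply gamma correction (Equation 2)."""
    return np.where(img <= 0.0031308, 12.92 * img,
                    1.055 * np.power(img, 1.0 / 2.4) - 0.055)

def rgb_to_lms(linear_rgb):
    """Map linear RGB pixels of an (H, W, 3) image into LMS cone inputs (Equation 3)."""
    return linear_rgb @ M_RGB_TO_LMS.T

def lms_to_rgb(lms):
    """Inverse mapping from LMS back to linear RGB."""
    return lms @ np.linalg.inv(M_RGB_TO_LMS).T
```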
[0073] Step 1) Grey world model: To estimate global illumination, we used the grey world method. The grey world assumption is a simple method which assumes an image contains objects with different reflective colours that are uniform from minimum to maximum intensities, and therefore averaging all pixels gives a grey colour. Illumination estimation (a calculated or derived data representation of the illumination present in an image) is performed by averaging all pixel values for each channel; the resulting average colour provides an approximation of the illuminant colour. For an image with equal representation of colours, illumination estimation gives an average grey colour. The purpose of illumination estimation is to separate the effects of lighting from the intrinsic colours of objects in the image:

Illumination = (R_avg, G_avg, B_avg)   (Equation 4)

where R_avg, G_avg and B_avg are the means of each image channel (R, G, B).
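For illustration, a minimal Python/NumPy sketch of the grey world estimate (Equation 4) is given below, together with a simple gain-based correction that is a common variant of the grey world method (the present method instead converts the image via the chromatic adaptation transform of Step 2). An (H, W, 3) image with values in [0, 1] is assumed.

```python
import numpy as np

def grey_world_illuminant(img):
    """Equation 4: per-channel mean (R_avg, G_avg, B_avg) as the illuminant estimate."""
    return img.reshape(-1, 3).mean(axis=0)

def grey_world_correct(img, eps=1e-6):
    """Simple correction variant: scale each channel so the estimated illuminant becomes grey."""
    illum = grey_world_illuminant(img)
    gains = illum.mean() / (illum + eps)
    return np.clip(img * gains, 0.0, 1.0)
```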
[0074] Step 2) Convert the image to the destination illuminant: Firstly, we convert the sRGB values of the image into LMS cone space (see LMS colour space). Then, we transform the source image to the destination illuminant by multiplying the LMS values with one of the chromatic adaptation transformations listed below. One popular matrix transformation for chromatic adaptation is the standard von Kries method:

M_vonKries = [ 0.40024  0.70760  -0.08081]
             [-0.22630  1.16532   0.04570]
             [ 0.00000  0.00000   0.91822]
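By way of illustration, a sketch of a von Kries style adaptation (Step 2) in Python/NumPy is given below; it applies diagonal gains in the space defined by M_vonKries, using the Step 1 grey-world estimate as the source white and a chosen destination white (equal energy by default). The function name, the default destination white and the exact form of application are illustrative assumptions rather than a definitive implementation of the disclosed transform.

```python
import numpy as np

M_VON_KRIES = np.array([[ 0.40024, 0.70760, -0.08081],
                        [-0.22630, 1.16532,  0.04570],
                        [ 0.00000, 0.00000,  0.91822]])

def von_kries_adapt(img, src_white, dst_white=(1.0, 1.0, 1.0)):
    """Adapt an (H, W, 3) image from a source illuminant to a destination illuminant.

    src_white: the Step 1 illuminant estimate (e.g. grey-world output).
    dst_white: the destination (reference) white; equal energy by default.
    """
    lms_src = M_VON_KRIES @ np.asarray(src_white, dtype=np.float64)
    lms_dst = M_VON_KRIES @ np.asarray(dst_white, dtype=np.float64)
    gains = lms_dst / lms_src                      # per-channel von Kries gains
    lms_img = img @ M_VON_KRIES.T                  # into the adaptation space
    adapted = (lms_img * gains) @ np.linalg.inv(M_VON_KRIES).T
    return adapted
```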
[0075] LMC model: It has been proposed that neighbouring photoreceptors in flies are laterally connected by large monopolar cells (LMC), which improves contrast encoding and contributes to the spatial summation of visual information. In fact, LMC cells cause a uniform distribution of the photoreceptors' responses to the visual input by sending feedback to the photoreceptors and changing their temporal responses. It is therefore appreciated that documents (Laughlin, Simon. "A simple coding procedure enhances a neuron's information capacity." Zeitschrift für Naturforschung C 36.9-10 (1981): 910-912; and Stöckl, Anna Lisa, David Charles O'Carroll, and Eric James Warrant. "Hawkmoth lamina monopolar cells act as dynamic spatial filters to optimize vision at different light levels." Science Advances 6.16 (2020): eaaz8645) are hereby incorporated herein by reference.
[0076] Accordingly, following this normalisation mechanism, we implemented LMC cells in our model/algorithm by modifying the photoreceptors' responses using a linear transformation, such that all responses, from low to high, are uniformly distributed in the range [0, Rmax].
[0077] To implement a neural network version of LMC normalisation, we can consider a simple form of temporal coding of photoreceptors by defining the temporal response matrix T as follows. For each pixel (x,y) in the image I(x,y) of size m x n at one colour channel R, G or B: T((y-1)*m + x, I(x,y)) = 1, and T(i,j) = 0 for the rest of the matrix's elements (i,j). Each row of the matrix T represents the activity of each photoreceptor, such that the peak of activity moves to the left or right based on the intensity of the pixel. To simplify, we only consider a value 1 at the peak of activity and the value 0 in the rest of the time response range.
[0078] We assume that the LMC neuron computes the integration of all activity at each bin j by LMC(j) = U * T(:,j), where U = 1 is the unit vector of dimension 1 x M. The LMC then sends graduated accumulating signals R_normalised = LMC * Q (where Q is the upper-triangular matrix of ones of dimension N x N) to the next layer as a normalised signal. Overall, the normalised image is obtained by:

normalised = T [U * T * Q]^T   (Equation 5)

[0079] This form of normalisation works as a dynamic spatial filter that creates lateral inhibition between receptors exposed to high light intensity and spatial summation between receptors exposed to dim light intensity. Since the matrix T is sparse, we can take advantage of the sparsity to improve the cost of computation.
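Because T is sparse, Equation 5 is equivalent to a cumulative-histogram mapping: each pixel is replaced by the number of pixels in the same channel whose intensity is less than or equal to its own. A minimal Python/NumPy sketch of this equivalent form is given below; the rescaling to [0, Rmax] follows paragraph [0076], and integer intensities are assumed.

```python
import numpy as np

def lmc_normalise(channel, levels=256, r_max=255.0):
    """Dense equivalent of Equation 5 for one colour channel of integer intensities."""
    hist = np.bincount(channel.ravel(), minlength=levels)  # U * T  (histogram)
    cumulative = np.cumsum(hist)                           # (U * T) * Q
    normalised = cumulative[channel]                       # T [U * T * Q]
    return normalised * (r_max / channel.size)             # rescale to [0, r_max]
```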
[0080] Dynamic spatial filtering refers to a form of normalisation of data in a colour space, where the data is representative of the receptors' representation of the visual environment. Dynamic spatial filtering effectively simulates and modifies the response of the receptors that are exposed to high light intensity and dim light intensity, where these receptors respectively are influenced by the lateral inhibition and spatial summation of neighboring receptors.
[0081] Double Opponent model: the first stage of colour-sensitive cells are the single-opponent cells in the LGN, which encode colour information within their centre-surround receptive fields (RFs) in the way of red-green, blue-yellow and black-white opponency. If we think of the L channel as red, the M as green and the S as blue, we can create opponent colour models using three components in LMS space as: 1) achromatic = l + m + s; 2) yellow-blue = l + m - s; and 3) red-green = l - m. For simplicity, we construct the RFs in a square shape, denoted RF_i(x,y), i = 1...6, presented in Figure 1a and corresponding to Figure 1b, instead of using Gaussian functions. Hence, the response of single-opponent cells to the image I(x,y) is calculated as r_i^SO(x,y) = I(x,y) ⊗ RF_i(x,y), where ⊗ denotes convolution.
rg."_,ris:,,,"Thrts:",+, represent the red-green, blue-yellow and achromatic single opponent cells, respectively. In these expressions, the sign '+' and '-' denote the excitation and inhibition. Following the physiological connection in V1 of early visual cortex, we can construct the RF of the V1 double-opponent cells, re° , using the outputs from two single-opponent cells with different scales,K. Thus the response of double-opponent cells can be computed as: uo -rg"_(x,y)+ r:.,+t_(x,y) 1f°(x,y) = K * rst_ (x,y) ra°°(x,y) = 11N(x,y)+ K * 1;:= (x,y) [0082] where K controls the relative contribution of RF surround. We assume that the next layer (V4) in the visual cortex encodes a single global illumination vector E = (e.,,e2,e3) based on the global colour status of each colour component L, M and S. Hence, we transform the output of double-opponent cells to LMS space as: (II 1 -1 LMS(x,Y) [7 7 0.D0 (x, 1 1 -z I cCiTiMS (X' Y) = 77. 1 y) , (Equation 6) (x y) ti rr (x 143..17 v1:7 The vector estimation E is estimated by pooling function f(.) as: f(ty)(riLM5(t.Y)) (Equation 7) Y/ 10-41)(rfrms(tY1) [0083] here, A.) implies the canonical neural computation of max or sum over the whole image, separately for each colour channel. Finally, the input image is corrected by dividing it the illumination vector E. For the purpose of image representation, the corrected image is then transformed to the sRGB space, using the inverse matrix M,7,1",,," and Equation 2.
[0084] Data Collection: Different image datasets are collected using an all-in-one Raspberry Pi Camera (Pi Camera) and the ODK camera as described in the following sections. The collected datasets are used to train and evaluate the model's performance at estimating the global illumination of natural light and of different sources of artificial light.
[0085] Datacolour dataset: The small colour chart data was collected on a Raspberry Pi Camera V2 with the same 8-megapixel Sony IMX219 image sensor as the ODK. This single camera setup has a fixed-focus (not fish-eye) lens. It was positioned in front of the Datacolour SpyderCheckr 24 (24 Colour Patch and Grey Card for camera calibration) against the window, for natural illumination. Some images had the addition of a yellow lamp or white LED light to increase the illumination variation.
[0086] Pi camera: A repeat of this image collection was taken in consistent light conditions with the Pi camera. In this experiment the 10 preset Auto-White-Balance (AWB) options (off, auto, sunlight, cloudy, shade, tungsten, fluorescent, incandescent, flash and horizon) and 13 preset exposure settings (off, auto, night, night preview, back light, spotlight, sports, snow, beach, very long, fixed fps, anti-shake and fireworks) were changed sequentially to ensure each combination was achieved. This was automated with a Python script to control the Pi camera module, change the settings, and take and save the images.
[0087] Environment dataset: Using an ODK with two IMX219 cameras built in, an experimental setup was configured as shown in the figures. Two boxes were modified to have a colour chart on the visible side, to use for the colour consistency corrections. The colours included were: black, white, grey, red, yellow, green, blue and pink (visible according to at least one of the figures). The ODK was configured to save the raw camera images (not 4pi etc.) to a rosbag every three minutes. To collect a variation of lighting conditions, the experiment took place next to a window, allowing natural light to illuminate the scene. Additionally, the lights in the office were switched on/off at random intervals. To add structured light and shadows to the scene, three lamps were placed in the setup. The lamps were smart bulbs, controlled through the app to turn on and off every 3/7/11 minutes, respectively. Two lamps were set to yellow light and one to white light to add variation to the scene. The lamps were positioned to create shadows, bright spots and patches of illumination, creating a difficult scene. Due to the angles of the fish-eye lenses, it was possible to place a blank section of white paper in view of both cameras simultaneously for future analysis of the camera correction. The ODK images were collected at three-minute intervals by throttling the image collection speed to that rate, which creates a ros topic which can be subscribed to and recorded. Once recorded, the images were replayed in rqt for checking and saved as PNG images. The raw CSV values were also retained.
[0088] It is appreciated that images contain a standard data colour chart (Fig. 7) or several coloured boards placed on the wall (Fig. 9). Data colour's Spyder Checkr 24 is a colour chart containing multiple coloured squares in a grid (Fig. 7). The data colour chart shown in the figure contains a range of spectral reflectance colour patches that represents most possible intensity ranges that are suitable for many uniform illumination conditions. Since the colour checker chart contains uniform colour ranges, we can use this chart to estimate the illumination of an image used at least in part by the present invention.
[0089] Another aspect of the present invention is a method that functionally utilises the Gray-edge hypothesis, which posits that the average edge difference within a scene window is achromatic, to estimate the local illumination within the regions of a partitioned input image. The resultant estimation of the local illumination can be used to correct the illumination of the input image using retinex-based correction, effectively reducing local illumination, which leads to precise colour correction for the input image.
[0090] The method tackles the challenges posed by local illumination resulting from multiple light sources. By combining selective attention mechanisms and the Gray-Edge hypothesis, our proposed solution demonstrates promising results in mitigating the impact of local illumination and improving colour constancy across various scenarios as can be seen based on the results shown in the figures. Importantly, this method can be combined with any of the other models/approaches described herein. In combination, these methods/models/approaches of the present invention are empowered to address a wide range of application requirements and effectively handle the complexities of lighting conditions encountered in real-world environments.
[0091] In sum, this other aspect of the present invention (as well as in combination with other aspects) contributes valuable insights into the field of colour constancy and provides a robust methodology for enhancing colour accuracy in images affected by local illumination. The present method can be further refined and may be extended to handle additional lighting challenges and to broaden its applicability across various domains in computer vision and robotics.
Local Illumination
[0092] Further improving the colour constancy of images, the present disclosure provides another method of image processing that may be used in conjunction with any and all of the herein described methods. This method addresses at least some of the aforementioned challenges, such as those posed by multiple light sources, by removing local (colour) illumination from an input image.
[0093] In brief, this approach is proposed to address the local illumination of an input image and may be used in conjunction with any of the approaches described herein, further improving colour constancy. For example, the output of the local illumination approach can be used as the image in the first colour space used for removing global illuminations from a raw image.
[0094] Inspired by eye movement and selective attention research in vision science, the approach relies on three key components: I) Selective attention mechanism, II) Gray-edge hypothesis, and III) colour normalisation. These three components are integrated into a single algorithm/model to reduce local illumination. The combined algorithm represents a significant reform of previous research, offering a new and highly efficient process that yields improved results as shown according to the figures.
[0095] Through experimentation with results as shown, the present invention demonstrates its ability to rival state-of-the-art methods, boasting exceptional computational efficiency as an added advantage. Most impressively, our model/algorithm showcases significant advancements in mitigating local illumination effects, resulting in higher accuracy when estimating colour constancy compared to existing methods. In effect, this represents a remarkable leap forward in the realm of colour constancy and sets the stage for exciting new possibilities in a field that has remained largely unexplored until now.
Implementation Details
[0096] With respect to the present disclosure, it is appreciated that further documents (Brainard, David H., and Brian A. Wandell. "Analysis of the retinex theory of color vision." JOSA A 3.10 (1986): 1651-1661; and Land, Edwin H., and John J. McCann. "Lightness and retinex theory." JOSA 61.1 (1971): 1-11) are hereby incorporated herein by reference.
[0097] Our proposed algorithm/model, incorporating a form of selective attention and the Gray-Edge hypothesis, was applied to each pre-processed image. The model analyses the edge information within the subset of regions (patches) of interest. We assume the average of the reflectance differences in a patch is achromatic. Hence, within these regions, edge information was extracted to estimate colour constancy following the Retinex theory (see Figure 15a). The following components are integrated to produce the resultant image.
[0098] 1) Selective Attention: in this step, we consider a simple form of selective attention, "region-based analysis", that involves dividing the image into a plurality of regions (i.e. patches) and analysing each region separately. By identifying regions that are likely to be influenced by a single illuminant, colour constancy algorithms can focus on those regions individually. This approach assumes that each region is illuminated with a single colour. To detect the regions (or a subset of regions therefrom), we used a k-means clustering algorithm to segment the input image. k-means is an unsupervised machine learning algorithm that aims to partition a dataset into a predetermined number (k) of distinct, non-overlapping clusters. In the context of image segmentation, k-means is used here to group together similar pixels, which are more likely to share the same illumination, based on their colour or intensity values. In this process, a subset of regions is selected by considering both the colour information (represented as numerical values that describe the intensity or magnitude of the different colour channels) and the spatial position of pixels (their location within the image or a coordinate system), which achieves better segmentation results. The algorithm assigns each pixel to one of the k clusters based on the similarity of their colour values. The number of patches is the only free parameter of the model and can be adjusted based on the size of the input image. However, this mechanism can be improved by a selective attention mechanism such as a saliency map, as opposed to using a clustering algorithm such as k-means.
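For illustration, the region-based analysis can be sketched as follows in Python (NumPy and scikit-learn); the spatial weighting factor and the number of clusters are illustrative free parameters, and the function name is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

def segment_regions(img, k=8, spatial_weight=0.5):
    """Cluster pixels on colour plus normalised (x, y) position so that each
    region is likely to be lit by a single illuminant; img is (H, W, 3) in [0, 1]."""
    h, w, _ = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    position = np.stack([ys / h, xs / w], axis=-1) * spatial_weight
    features = np.concatenate([img.reshape(-1, 3), position.reshape(-1, 2)], axis=1)
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(features)
    return labels.reshape(h, w)   # one region label per pixel
```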
[0099] Integration of saliency-based selective attention mechanisms can be used to refine and improve model performance. Rather than analysing the Gray-Edge Hypothesis in all regions sequentially, the use of saliency maps can guide our attention to focus on the most informative and visually salient regions (here referred to as salient regions) of the image. By leveraging saliency information, we can prioritize regions that are more likely to contain reliable colour information and are less affected by local illumination variations induced by multiple light sources. Using the saliency map allows for more efficient and targeted analysis, potentially improving the accuracy, speed, and efficiency of the present invention.
[00100] 2) Edge Detection: To extract the colour of edges (herein referred to as coloured edges or edges), we can employ a Canny edge detection algorithm, which is a widely used technique in image processing for edge detection. This algorithm comprises several steps to identify colour changes within an image. First, the image is smoothed to reduce noise. Then, the gradient magnitude and orientation are calculated. Non-maximum suppression is applied to thin the edges, and the most robust edges are selected based on thresholding. However, for testing purposes, we can explore an alternative approach using simple gradient operators to estimate image gradients. One straightforward option is to utilise the Sobel operators as follows:

S_x = [-1 0 1; -2 0 2; -1 0 1],   S_y = [-1 -2 -1; 0 0 0; 1 2 1]

[00101] By convolving both operators with the grey image I_g obtained from the input image I, we get the image gradients G_x and G_y in the x and y directions as: G_x = S_x ⊗ I_g and G_y = S_y ⊗ I_g. Then, the gradient magnitude and orientation are computed as: ||G|| = sqrt(G_x² + G_y²) and θ = arctan(G_y / G_x). Here, G_x and G_y represent the gradient values in the x and y directions, respectively, obtained from the Sobel operators. By obtaining these estimates of first-order image derivatives, we can compute the gradient magnitude to capture the strength of the colour changes at edges, using the product between the image and the gradient magnitude of the image as follows: I_edge = I ⊙ ||G||.
[00102] As an example, see the middle panel of Figure 16, which exhibits the edge information from the image. This information is crucial for subsequent steps in our model/algorithm and facilitates the accurate estimation and correction of colour variations induced by multiple light sources.
[00103] 3) Retinex-based Correction: To correct the colour in the attended regions, a modified version of the Retinex-based approach was employed, utilizing the colour information extracted from the edges. By considering the reflectance and illumination components of an image, and by focusing on the edges, we aimed to reduce the influence of local illumination variations induced by single light sources within the region.
[00104] Let r_e(x,y), g_e(x,y) and b_e(x,y) be the three colour channels of the edges within the patch I_edge(x,y). The corrected colours r̂(x,y), ĝ(x,y) and b̂(x,y) are obtained by the following equations:

$$\hat{r}(x,y) = \min\!\left(r(x,y)\cdot\frac{\mathrm{Max}}{\mathrm{Max}_r},\ 255\right), \qquad
\hat{g}(x,y) = \min\!\left(g(x,y)\cdot\frac{\mathrm{Max}}{\mathrm{Max}_g},\ 255\right), \qquad
\hat{b}(x,y) = \min\!\left(b(x,y)\cdot\frac{\mathrm{Max}}{\mathrm{Max}_b},\ 255\right)$$

where Max_r = max(r_e(x,y)), Max_g = max(g_e(x,y)) and Max_b = max(b_e(x,y)) are the maxima of the edge colour channels within the patch.
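A hedged sketch of this per-patch correction follows; it assumes Max = 255, i.e. a white-patch-style rescaling by the maxima of the edge channels, which is one plausible reading of the equations above rather than the definitive implementation, and the helper names are ours.

import numpy as np

def correct_patch(patch: np.ndarray, edge_mask: np.ndarray) -> np.ndarray:
    """Rescale each colour channel of an RGB patch using the maxima of its edge pixels.

    patch:     HxWx3 uint8 RGB patch.
    edge_mask: HxW boolean mask of edge pixels within the patch.
    """
    corrected = patch.astype(float)
    if not edge_mask.any():
        # This sketch leaves edge-free patches unchanged; the described algorithm greys them out.
        return patch
    for c in range(3):
        max_c = corrected[..., c][edge_mask].max()
        if max_c > 0:
            corrected[..., c] = np.minimum(corrected[..., c] * 255.0 / max_c, 255.0)
    return corrected.astype(np.uint8)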
[00105] It is understood that the current algorithm changes the colour of patches without edges to grey, which is not ideal if we plan to recover the true colour of objects. However, this limitation does not pose a major issue for the reduction of light variability. Further work is required to improve this, for example by defining scalable patches in the attention mechanism so that each patch carries maximum information.
[00106] Another aspect of the present invention is a method for removing shadows from an input image. The method comprises two algorithms: shadow detection algorithm and shadow removal algorithm. The two algorithms work in tandem, where the first algorithm detects shadows and creates a shadow mask, and the second algorithm eliminates the detected shadows based on the shadow mask. The final output is a shadow-free image.
[00107] It is appreciated that shadows in the input image can lead to reduced visibility, decreased contrast, and altered colour distribution, which can impact the interpretability and aesthetic quality of the image, causing problems when the image is used in computer vision and robotics applications. In various applications where shadow detection and removal may be used, e.g., medical imaging or remote sensing, removing shadows can be an important step before accurate analysis or measurements can be performed.
[00108] Shadow detection and removal may help reduce the number of false positive objects being identified during the above applications. Shadow pixels may in fact disrupt how human colour constancy can be maintained. It is demonstrated that a shadow-free image tends to be robust and suitable for further applications, serving as input for illumination according to any aspect described herein in order to maintain human colour constancy. The shadow detection and removal algorithm may be suitably performed on the output of the image processing process described herein.
Shadow-free image [00109] Further improving image quality, the present invention may include a shadow detection and removal process, the result of which is a shadow-free image. One advantage of removing shadows, or of having a shadow-free image, is enhanced visibility of the input image. Simply put, shadows can obscure details in an image, making it difficult to distinguish objects or features. By removing shadows, important details of the image become clearer and more visible, improving image quality.
[00110] Moreover, removal of shadow from the input image also improves the image contrast. For example, shadow pixels of an image often cause a decrease in contrast between different parts of the image. Removing shadows can help restore a more balanced contrast, making objects stand out against their backgrounds. Shadows can also introduce colour variations and shifts due to changes in lighting conditions. The removal results in a more consistent colour representation across the image for maintaining colour constancy of the image or its natural appearance. For example, removing shadows can result in a more natural and evenly illuminated image that closely resembles how the scene might appear under uniform lighting conditions.
[00111] In sum, a shadow-free image has clear advantages: it is suitable for downstream applications and can be used in conjunction with other algorithms described herein to establish correct colour constancy, thereby enhancing the quality and utility of the raw images from one or more cameras. Eliminating the negative effects of shadows leads to improved visibility, contrast, colour consistency, and aesthetics. The following steps may be carried out to obtain the shadow-free image. It is understood that the algorithm is not limited to only these steps and may encompass other steps described herein.
[00112] In one aspect of Shadow Detection Algorithm: RGB image is the input to the algorithm. Convert RGB image to Lab colour space, extract L channel. Generate and smooth histogram of L channel. Identify local minima on smoothed histogram. Determine threshold based on minima distances. Create shadow mask by selecting pixels with L values less than threshold. Output the shadow region mask, as shown at least according to Figure 20.
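A hedged sketch of these detection steps is given below (it assumes OpenCV and SciPy; the smoothing width, the minima-distance parameter p and the fallback when no minimum is found are our simplifying assumptions, not the specified parameters):

import cv2
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import argrelmin

def detect_shadow_mask(rgb: np.ndarray, p: int = 20) -> np.ndarray:
    """Return a boolean shadow mask from the L channel of the Lab-converted image."""
    lab = cv2.cvtColor(rgb, cv2.COLOR_RGB2LAB)
    l_channel = lab[..., 0]
    hist, _ = np.histogram(l_channel, bins=256, range=(0, 256))
    smooth = gaussian_filter1d(hist.astype(float), sigma=3)   # Gaussian smoothing of the histogram
    minima = argrelmin(smooth)[0]                             # indices of local minima
    if len(minima) >= 2 and (minima[1] - minima[0]) < p:
        threshold = minima[0]        # first minimum if it is close to the second one
    elif len(minima) >= 2:
        threshold = minima[1]        # otherwise the second minimum
    elif len(minima) == 1:
        threshold = minima[0]
    else:
        threshold = int(l_channel.mean())  # fallback when no local minimum is found
    return l_channel < threshold     # shadow region mask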
[00113] In one aspect of Shadow Removal Algorithm: RGB image and shadow region mask serve as input to the algorithm. Convert RGB image to HSV colour space, split into shadow and lit regions. Segment image using k-means algorithm. Label the segments based on region intersections. Calculate distance between segments using texture features. Match labelled segments using distances. Perform histogram matching on labelled segments' channels. Merge adjusted shadow segments to obtain corrected segments. Convert corrected segments to RGB colour space. Iterate for all shadow segments, then merge all segments. Output the shadow-free image.
Implementation Details [00114] Herein we propose a shadow detection and removal algorithm. The primary objective of this method is to pinpoint regions in an image susceptible to shadow influence. Shadows typically exhibit darker and distinct colours compared to the rest of the scene. Through a comprehensive analysis of intensity variations in illuminated and shadowed regions within the LAB colour space, we arrive at a simple thresholding method to identify pixels affected by shadows. This threshold-based approach enables effective detection without necessitating extensive training.
[00115] Upon identifying shadows, our model/algorithm facilitates their removal by adjusting the colour information of shadowed regions to closely match the lit regions, considering texture and image structures. To determine the nearest lit region, we introduce a metric that quantifies region similarity based on texture patterns. To enhance metric accuracy, both shadowed and lit regions are segmented into smaller units, each assumed to exhibit a single texture pattern. [00116] The similarity assessment among segments incorporates diverse visual features such as texture density, frequency, entropy, and inter-segment centroid distance. We employ histogram matching to align pixel colours in shadow segments with the nearest lit segments, assuming comparable visual features between shadow and corresponding lit segments. The objective of histogram matching is to eliminate the shadow's impact on the original underlying texture. Following this procedure for all shadow segments results in a shadow-free image.
[00117] Integrating shadow detection and removal methods into visual place recognition and localization frameworks holds promise for enhancing robotic navigation. By bolstering visual feature consistency and mitigating shadow effects, these methods may improve navigation system accuracy and dependability. Consequently, robots can confidently recognise familiar locales and precisely estimate their positions, even when grappling with challenging lighting conditions.
[00118] Our proposed algorithm/model comprises at least two algorithms: a) a shadow detection algorithm and b) a shadow removal algorithm. The two algorithms work in tandem, where the first algorithm (steps 1 to 9) detects shadows from an RGB image serving as input and outputs a shadow mask. The shadow removal algorithm (steps 1 to 11) eliminates the detected shadows based on the shadow mask. The final output from both algorithms is a shadow-free image of the original. Exemplary steps for both algorithms are described as follows: [00119] In another aspect of the Shadow Detection Algorithm, I) the shadow detection algorithm carries out the following steps: 1. Input: RGB image I 2. Extract the dimensions of the image (h, w) 3. Convert I to the Lab colour space and extract the channels L, a and b 4. Generate a histogram of channel L 5. Smooth the histogram curve using a Gaussian window F 6. Identify the local minima points on the smoothed histogram 7. Choose the indices of the first minimum as the threshold T if its distance to the indices of the second minimum is less than p; otherwise, select the indices of the second minimum as the threshold T (the parameter p is chosen based on the image size) 8. Generate the mask S' by selecting the pixels in image I with L channel values less than the threshold T 9. Output: Shadow region mask S' [00120] In another aspect of the Shadow Removal Algorithm, II) the shadow removal algorithm carries out the following steps: 1. Input: RGB image I and shadow mask S' 2. Convert I to the HSV colour space and extract channels h, s and v 3. Split the RGB values of I into two parts: shadow region I_S and lit region I_L 4. Segment image I using the k-means algorithm: a. Flatten the image: convert the 2D RGB image into a 1D array of RGB tuples b. Choose the number of clusters (k) c. Initialize cluster centres: select from the RGB values present in the image I using a uniform distribution d. Assign pixels to clusters i = 1:k e. Calculate the distance between each pixel's value and the cluster centres f. Assign each pixel to the nearest cluster centre.
g. Update the cluster centres h. Repeat assignments and centre updates (d-g) i. Check convergence j. Perform post-processing to correct assignments based on the neighbourhood.
k. Add the 1D array of assignments to the flattened image from (a) l. Divide image I into segments I_1, ..., I_k, where I_i ∩ I_j = ∅ for any i ≠ j 5. Assign the label "shadow" (S) or "lit" (L) to the segments: for each segment I_k, assign the label "shadow" if I_k intersects the shadow region I_S, and the label "lit" if I_k intersects the lit region I_L 6. Calculate the distance D_{i,j} between the i-th and j-th segments, based on image texture features: a. Edge distance (D^E_{i,j}): calculate the distance between the edges extracted from the segments b. Colour distance (D^C_{i,j}): calculate the mean colour distance of colour channels a and b (in Lab colour space) for the i-th and j-th segments c. Entropy distance (D^H_{i,j}): calculate the entropy distance between the i-th and j-th segments d. Neighbourhood distance (D^N_{i,j}): calculate the pixel distance between the centres of the i-th and j-th segments e. Define the distance D_{i,j} = α1·D^E_{i,j} + α2·D^C_{i,j} + α3·D^H_{i,j} + α4·D^N_{i,j} such that α1 + α2 + α3 + α4 = 1 and 0 ≤ αi ≤ 1 7. For each shadow segment I_{k,S}, find the lit segment I_{k',L} that is placed at the minimum distance D_{k,k'} from the shadow segment I_{k,S} 8. Perform the histogram matching: a. Split the shadow segment I_{k,S} into its colour channels h, s, v and obtain the corresponding histograms Hist^h_{k,S}, Hist^s_{k,S} and Hist^v_{k,S}, respectively b. Split the lit segment I_{k',L} into its colour channels h, s, v and obtain the corresponding histograms Hist^h_{k',L}, Hist^s_{k',L} and Hist^v_{k',L}, respectively c. Obtain the cumulative distribution functions (cdf) of all histograms obtained in a and b: cdf^h_{k,S}, cdf^s_{k,S} and cdf^v_{k,S} for the shadow segment and cdf^h_{k',L}, cdf^s_{k',L} and cdf^v_{k',L} for the lit segment d. Match cdf^h_{k,S} with cdf^h_{k',L} to obtain ĥ_S, cdf^s_{k,S} with cdf^s_{k',L} to obtain ŝ_S, and cdf^v_{k,S} with cdf^v_{k',L} to obtain v̂_S e. Merge all recovered components ĥ_S, ŝ_S and v̂_S of the shadow segment to obtain Î_{k,S} f. Convert Î_{k,S} to RGB colour space.
9. Steps 7 and 8 are iteratively performed on all shadow segments.
10. Merge all colour-corrected segments Î_{k,S} and lit segments I_{k,L} into image Î 11. Output: Shadow-free image Î [00121] The specific steps outlined in the shadow detection and removal algorithms above, as well as in the other algorithm(s) described herein, might vary depending on the circumstances or the context in which the algorithm is used. Therefore, it is recognized that the various aspects described herein for the present invention may not be restricted to the sequence of steps as described, and certain steps may vary based on the specific application.
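As a hedged illustration of the histogram matching in step 8 above (the function names are ours; the segments are assumed to be given as N×3 arrays of HSV channel values for a shadow segment and its nearest lit segment), the per-channel CDF matching could be sketched as:

import numpy as np

def match_channel(shadow_vals: np.ndarray, lit_vals: np.ndarray) -> np.ndarray:
    """Map shadow-segment channel values so their CDF matches the lit segment's CDF."""
    s_hist, _ = np.histogram(shadow_vals, bins=256, range=(0, 256))
    l_hist, _ = np.histogram(lit_vals, bins=256, range=(0, 256))
    s_cdf = np.cumsum(s_hist) / max(shadow_vals.size, 1)
    l_cdf = np.cumsum(l_hist) / max(lit_vals.size, 1)
    # For each shadow value, find the lit value with the closest cumulative probability.
    mapping = np.searchsorted(l_cdf, s_cdf).clip(0, 255)
    return mapping[shadow_vals.astype(np.uint8)]

def match_segment_hsv(shadow_hsv: np.ndarray, lit_hsv: np.ndarray) -> np.ndarray:
    """Apply CDF matching independently to the h, s and v channels of a shadow segment."""
    return np.stack(
        [match_channel(shadow_hsv[:, c], lit_hsv[:, c]) for c in range(3)], axis=1
    )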
[00122] The present disclosure delves into the vital importance of handling illumination and shadow within the realm of visual place recognition and localization for robotic navigation. Our ongoing efforts involve the implementation of an adaptive algorithm designed for handling illumination and shadow. The following figures further provide example implementations of the present invention corresponding to the various aspects described herein.
[00123] Figure la is a flow diagram illustrating an example process 100 for achieving colour constancy according to the present invention. Process 100 illustrates an algorithm for stable colour appearance and high colour discrimination that is inspired by the visual systems of humans and mainly insects and is tailored to the functional features of the ODK camera. The diagram comprises a flowchart showing the details of process 100 in estimating multi-illumination integrating three mechanisms/models: a) chromatic adaptation, b) LMC normalisation model and c) double-opponent model.
[00124] Figure lb is a flow diagram illustrating an example process 150 corresponding to the algorithm proposed in Figure la. The flow diagram depicts the image processing method based on the model architecture according to aspects of the disclosure, which is based on colour constancy removing illumination from the images. The image processing method may comprise the following steps 151 to 165.
[00125] In step 151, the method obtains an image in a first colour space. The first colour space may be an RGB colour space. In step 153, the method converts the image to data in a second colour space. The second colour space is an LMS (Long-Medium-Short) colour space.
[00126] In step 155, the method transforms the data using chromatic adaptation, wherein said transforming the data using chromatic adaptation further comprises: estimating an illuminant of the data using a grey world model; and converting the data to a destination illuminant using the estimated illuminant. This may be done by applying one or more matrix transformations to the destination illuminant in accordance with the one or more cameras used for obtaining the image.
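A minimal sketch of such a grey-world chromatic adaptation is given below; the destination white point is a placeholder and the camera-specific matrix transformations mentioned above are not reproduced, so this is an assumption-laden illustration rather than the specified implementation.

import numpy as np

def grey_world_adapt(lms: np.ndarray, dest_white=(1.0, 1.0, 1.0)) -> np.ndarray:
    """Estimate the scene illuminant as the per-channel mean (grey-world assumption)
    and rescale each LMS channel toward the destination illuminant."""
    illuminant = lms.reshape(-1, 3).mean(axis=0)                       # grey-world estimate
    gains = np.asarray(dest_white, dtype=float) / np.maximum(illuminant, 1e-8)
    return lms * gains                                                  # von Kries-style diagonal scaling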
[00127] In step 157, the method performs a first normalisation on the transformed data, wherein the first normalisation comprises applying a dynamic spatial filtering technique to adjust the transformed data based on light intensity.
[00128] In step 159, the method applies a set of filters to the normalised data, wherein the set of filters is convoluted based on the normalised data in relation to the image. The set of filters is representative of two layers of a visual system.
[00129] The set of filters may comprise at least one centre-surround structure representative of receptive fields (RFs) encoding colour opponency. The RFs may encode red-green opponency, blue-yellow opponency, and achromatic opponency.
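One hedged way to realise such centre-surround opponent RFs is with difference-of-Gaussians filtering of opponent channels, as sketched below; the opponent weights and the Gaussian scales are illustrative assumptions, not the model's specified values.

import numpy as np
from scipy.ndimage import gaussian_filter

def centre_surround_opponent(lms: np.ndarray, sigma_c: float = 1.0, sigma_s: float = 3.0) -> np.ndarray:
    """Apply a difference-of-Gaussians centre-surround filter to three opponent channels
    (red-green, blue-yellow, achromatic) built from the L, M and S planes."""
    L, M, S = lms[..., 0], lms[..., 1], lms[..., 2]
    opponents = np.stack([
        L - M,                 # red-green opponency
        S - 0.5 * (L + M),     # blue-yellow opponency
        (L + M + S) / 3.0,     # achromatic channel
    ], axis=-1)
    centre = gaussian_filter(opponents, sigma=(sigma_c, sigma_c, 0))
    surround = gaussian_filter(opponents, sigma=(sigma_s, sigma_s, 0))
    return centre - surround   # centre-surround (double-opponent-like) response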
[00130] The set of filters may also comprise at least two filters positioned in series such that one of said at least two filters receives input from the other filter.
[00131] In step 161, the method performs a second normalisation on the filtered data to obtain an illumination estimation of the image in relation to the filtered data. In step 163, the method outputs the normalised data from the second normalisation, wherein the normalised data maintains colour constancy based on the illumination estimation, removing the illumination from the normalised data.
[00132] Optionally, in step 165, a standard transform between the RGB and LMS colour spaces may be used, whereby the method converts the normalised data from the second normalisation back to an image in the first colour space.
[00133] According to Figures 1a and 1b, the method may further identify a data representation of receptor responses in relation to the transformed data when performing the first normalisation. The data representation would comprise a uniform representation of the receptor responses across the image.
[00134] The data representation may also be applied to the transformed data, where the method integrates the data representation over a time period/frame and normalises the transformed data based on the integrated data representation using the dynamic spatial filtering technique representative of performing temporal coding adjustment for each receptor in respect of the light intensity.
[00135] The second normalisation may further comprise the step to correct the filtered data with an illumination vector generated using a pooling function, Equation 7, where the filtered data is divisible by the illumination vector.
[00136] The pooling function may comprise:

$$\frac{e^{\,f_{(x,y)}\left(T_i^{LMS}(x,y)\right)}}{\sum_i e^{\,f_{(x,y)}\left(T_i^{LMS}(x,y)\right)}}$$

[00137] where f_{(x,y)}(·) is a data representation of a canonical neural computation of max over the filtered data and T_i^{LMS}(x,y) exhibits the output of the double-opponent filters in the LMS colour space.
[00138] The method may receive a raw image from one or more cameras; by removing gamma correction from the raw image and converting the raw image to said image in the first colour space, the method obtains the input to step 151.
[00139] Figure 2 is a pictorial diagram of a model application to the Spyder Checkr under different illumination according to aspects of the disclosure. Various examples are included in relation to the model application and the results are shown. These are examples of corrected images obtained from applying the model's components to sample colour charts. As shown in the figure, the colour condition of the Spyder Checkr in corrected images on the right side would be almost constant. It indicates that the variability of colour patches at the Spyder Checkr on the left side is significantly reduced.
[00140] The left column 201 of the diagram represents examples of original images of colour charts under varying illumination, where images under various illuminations are shown in the respective rows. The global illumination of the images was excluded by the different components of the model and by the mixture/combined model (last column 209). The middle columns are intermediate results for chromatic adaptation (second column 203), the LMC normalisation model (third column 205), and the double-opponent model (fourth column 207) in accordance with Figures 1a and 1b. Chromatic adaptation 203, LMC 205, and the opponent model 207 exhibit a synergistic effect when combined, shown as the combined model 209.
[00141] Specifically, the sample images (left column 201) were captured using a Raspberry Pi V2 Camera in different light conditions. The final column 209 represents the corrected images after the exclusion of global illumination utilising the combined model. The global illumination of the images was excluded by different components of the model and the mixture model. Because the colour condition of the Spyder Checkr on corrected images on the right side of the figure is almost constant, it is indicative that the variability of colour patches at the Spyder Checkr in the left side is significantly reduced (comparing colour patches of the Spyder Checkr in the left and right columns). Hence, the combined model produced colour charts with almost constant colour patches. The same result was obtained from the datasets captured by the front and back cameras of the ODK system (see Figures 9 and 10 for the Environment dataset and Figure 14 for the simulation environment).
[00142] To quantitatively analyse the variability of the colour patch (Spyder Checkr) intensity before and after applying the model, the average pixel values of five patches (red, green, blue, yellow and white) were examined. The violin plot visualises the probability density of the (R,G,B) values of each selected patch, before and after the model (Fig. 3). Due to the large light variability, the colours of the original images have elongated distributions compared to the corrected images. As can be seen, the distributions of the corresponding RGB pixels were shortened after applying the model and the colour values of the patches increased. The Fano factor was then used to measure the relative variance of the colours (as shown in Figure 4).
[00143] Figure 3 is a pictorial diagram of a comparison of distributions of pixel intensity between the original and corrected images according to aspects of the disclosure. The results of the model application in Figure 2 are quantified and illustrated. The figure compares the distribution of pixel intensity between the original and corrected images. Each row shows the distribution of intensity of red (left 301), green (middle 303) and blue (right 305) colour channels of the same selected patch from the original images (left blue) as labelled and the corrected images (right orange) as labelled. The colour distribution of blue 311, green 313, red 315, yellow 317, and white 319 patches is ordered from the top to bottom column as shown.
[00144] Figure 4 is a pictorial diagram of Fano factors (relative variance) of the selected patches according to aspects of the disclosure. The bar graphs show the Fano factor of the selected patches, which represents the normalised variability of each colour channel (red: left 401, green: middle 403, and blue: right 405) for 5 selected patches 411-419 of the original images (blue bars) and the images corrected by the model (orange bars). It indicates that the model reduced the variability of most of the colour channels of the selected patches.
[00145] Specifically, the figure shows Fano factors for the (R,G,B) values of the 5 patches 411-419, which appear to be smaller in the corrected images, indicating that the variability of the same pixels was decreased for all patches over the images by applying the model. A similar result is obtained for the second dataset (set 2, see Figure 11).
[00146] Figure 5 is a pictorial diagram of colour distance between colour boards in an environment dataset (set 1, in relation to the examples of Figure 9) according to aspects of the disclosure. In Figure 5, the first row 501 exhibits an example of the original image 501a captured by the ODK camera and its corrected image 501b after being processed by the model. [00147] The matrices in the second row 503 show the colour distance between the 8 colour boards represented in the first row (left: original image 503a, right: corrected image 503b). The colour distance between each pair of colour boards is shown by an element of the matrix.
The third 505 and fourth 507 rows display the average colour distance (average of the colour distance matrix) of the 8 patches for all images in the dataset, in accordance with Figure 12. The box plots in the last row 509 (i.e., summarising the third and fourth rows) reveal that the model underlying the present invention increases the colour distance between colour boards.
[00149] The colour distances between patches were measured for both original and corrected images using Delta E (ΔE) as explained in a previous section. The matrices display the distances between all pairs of patches as shown. In the figure, the average colour distance between selected patches (i.e., the average of the matrix of colour distances) is plotted as a function of the images' index. The average colour distance between patches in the corrected images is nearly 2 times larger than the colour distance in the original images, a result which is borne out by the box plots as shown. Similar results were obtained using the dataset illustrated, for example, in Figure 12. From this, it is revealed that the present invention improves the separation between colours and can potentially enhance the colour discriminability of the ODK system.
[00150] Figure 6 is a pictorial diagram showing an example of the model application for image matching between two ODK cameras according to aspects of the disclosure. In the figure, the top row 601 shows images captured by the front and back IMX219 cameras installed in the ODK system. The bottom row 603 shows the images after being corrected by the present invention. Using a target image containing the Spyder Checkr, the colour distributions of both images were matched to the colour distribution of the target image.
[00151] The present invention is able to undertake the matching of colour distribution between the front and back ODK cameras, aside from improving colour discriminability. Doing so helps solve the challenge of overlap when constructing the cylindrical image from the two ODK cameras, by reducing the colour distance between the overlap region of the front and back images. This is shown in the second row, where the colours of both front and back images were matched to the colour of a template image. However, the model is still sensitive to the quality of the template image. The template image can be captured from the Spyder Checkr or Macbeth ColorChecker chart, as shown in Figure 7, under the (full) white light source.
[00152] Figure 7 is a pictorial diagram of a Spyder Checkr or Macbeth ColorChecker chart according to aspects of the disclosure. The Spyder Checkr is a Datacolor chart that contains a range of spectral reflectance colour patches representing most of the possible intensity ranges, suitable for many uniform illumination conditions. A model application under different illumination is provided in the form of the Spyder Checkr chart according to Figure 3.
[00153] Figure 8 is a pictorial diagram of an exemplary data collection setup scheme for obtaining the model results in relation to Figures 1 to 7. The location of ODK camera 805, coloured objects 801, 807a/b, windows (natural lights) 803a/b and lamps (artificial lights) 809a/b/c are shown in the scheme. The scheme setup comprises a radiator object 801 between two windows 803a/b with natural lights facing a plurality of box objects 807a/b. A plurality of lamps 809a/b/c is provided as artificial lights. Distances between the objects are shown accordingly. The ODK camera is situated at the centre of the scheme.
[00154] Figures 9 and 10 are pictorial diagrams of example images from an environment dataset (set 1 and 2) according to aspects of the disclosure. In each figure, five samples of the images 901a, 903a, 1001a, 1003a were captured from the colour boards (Top: front camera 901, 1001, Bottom: back camera 903, 1003), using the ODK camera under different light illuminations (see the design of the Environment data in Spyder Checkr with respect to Figure 2). The second row 901b, 903b, 1001 b, 1003b shows the corresponding images corrected by applying the proposed/combined model.
[00155] Figure 11 is a pictorial diagram of a comparison of distributions of pixel intensity between the original and corrected images, Pi Camera, according to aspects of the disclosure. In the figure, each column shows the distribution of intensity of the red (left 1101), green (middle 1103) and blue (right 1105) colour channels of the same selected patch from the original images (blue, captured by the Pi Camera) and the corrected images (orange). The colour distributions of the red 1111, green 1113, blue 1115, yellow 1117, and white 1119 patches are ordered from the top to the bottom rows.
[00156] Figure 12 is a pictorial diagram of the colour distance between colour boards in the Environment dataset (set 2) according to aspects of the disclosure and with respect to Figure 5. In Figure 12, the first row 1201 exhibits an example of the images captured by the ODK camera and its correction by the model. The matrices in the second row 1203 show the colour distance between the 8 colour boards represented in the first row 1201 (left: original image, right: corrected image). The colour distance between each pair of colour boards is shown by an element of the matrix. The third 1205 and fourth 1207 rows display the average colour distance (average of the colour distance matrix) of the 8 patches for all images in the dataset (set 2). The box plots in the last column (i.e., summarising the third and fourth columns) reveal that the model increases the colour distance between colour boards.
[00157] Figure 13 is a block diagram illustrating an example computing apparatus/system 1300 that may be used to implement one or more aspects of the present invention, apparatus, method(s), and/or process(es), combinations thereof, modifications thereof, and/or as described with reference to figures 1a to 12 and 14 to 21 and/or aspects as described herein.
Computing apparatus/system 1300 includes one or more processor unit(s) 1302, an input/output unit 1304, a communications unit/interface 1306, and a memory unit 1308, in which the one or more processor unit(s) 1302 are connected to the input/output unit 1304, the communications unit/interface 1306, and the memory unit 1308. In some embodiments, the computing apparatus/system 1300 may be a server, or one or more servers networked together. In some embodiments, the computing apparatus/system 1300 may be a computer or supercomputer/processing facility or hardware/software suitable for processing or performing the one or more aspects of the system(s), apparatus, method(s), and/or process(es), combinations thereof, modifications thereof, and/or as described with reference to figures 1a to 12 and 14 to 21 and/or aspects as described herein. The communications interface 1306 may connect the computing apparatus/system 1300, via a communication network, with one or more services, devices, the server system(s), cloud-based platforms, systems for implementing subject-matter databases and/or knowledge graphs for implementing the invention as described herein. The memory unit 1308 may store one or more program instructions, code or components such as, by way of example only but not limited to, an operating system and/or code/component(s) associated with the process(es)/method(s) as described with reference to figures 1a to 12 and 14 to 21, additional data, applications, application firmware/software and/or further program instructions, code and/or components associated with implementing the functionality and/or one or more function(s) or functionality associated with one or more of the method(s) and/or process(es) of the device, service and/or server(s) hosting the process(es)/method(s)/system(s), apparatus, mechanisms and/or system(s)/platforms/architectures for implementing the invention as described herein, combinations thereof, modifications thereof, and/or as described with reference to at least one of figures 1a to 12 and 14 to 21. [00158] Figure 14 is a pictorial diagram of model applications to a simulation environment under different illuminations according to aspects of the disclosure and with respect to Figure 2. Here, the sample images represented in the left column (original image 1401) were selected from the simulation environment at different brightness levels [10, 30, 100, 500, 1000], shown in the rows from top to bottom. The global illumination of the images was excluded by the different components of the model 1403, 1405, 1407 and the combined model (last column 1409). It is observed that the colour conditions of the corrected images on the right side are almost constant.
[00159] Figure 15a is a flow diagram of another model architecture for image processing in respect of colour constancy, especially for handling the local illumination of an input image. The figure shows the process 1500 of the proposed model. In process 1500, a patch 1503 is selected over a sequential mechanism from the input image 1501. Using Canny edge detection 1505, the colour of the edges within the patch 1503 is computed. The coloured edges are extracted 1507. Then, the colour of the edges is used to estimate the local illumination 1509, and ultimately to correct the colour of the selected patch following the Retinex theory. The output is an image with illumination corrected 1513. The colour of all patches is corrected through a sequential scanning process.
[00160] Figure 15b is a flow diagram of the image processing based on said another model architecture shown in Figure 15a. The figure shows the method 1550 for reducing the effect of illumination on images. The method comprises at least the following steps.
[00161] In step 1551, the method receives an input image. The input image may be a raw image obtained from one or more cameras. The input image may also be an image in a colour space as described herein.
[00162] In step 1553, the input image is partitioned into a plurality of regions. In this process, the input image is segmented using a k-means clustering algorithm based on similarity in colour or intensity values of pixels from the input image, which results in producing the plurality of regions.
[00163] In step 1555, the plurality of regions is analysed based on colour information and spatial position of pixels in each region. The analysis is performed by identifying salient regions from the plurality of regions using a clustering algorithm; generating a clustered map based on the identified salient regions; and analysing the salient regions to select the subset of regions that are influenced by the illuminant.
[00164] For example, a saliency map covering the plurality of regions can be applied. Salient regions are identified from the plurality of regions using the saliency map, which constrains the regions to a subset of partitions. The salient regions are further analysed to select the subset of regions that are influenced by the illuminant.
[00165] In step 1557, a subset of regions that are influenced by an illuminant is selected from the plurality of regions, based on the analysis from the previous step.
[00166] In step 1559, coloured edges are identified for at least said subset of regions. This may be accomplished using edge detection, specifically using a Canny edge detection algorithm configured to select a plurality of edges based on a threshold intensity. Sobel operators may also be used during edge detection for improved efficiency.
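As a brief, hedged usage example (OpenCV's Canny detector; the thresholds are illustrative only and the helper name is ours), the coloured edges of a region could be obtained as:

import cv2
import numpy as np

def coloured_edges(region_rgb: np.ndarray, low: int = 50, high: int = 150) -> np.ndarray:
    """Detect edges on the grayscale version of a region and keep the RGB values at edge pixels."""
    gray = cv2.cvtColor(region_rgb, cv2.COLOR_RGB2GRAY)
    edge_mask = cv2.Canny(gray, low, high) > 0
    out = np.zeros_like(region_rgb)
    out[edge_mask] = region_rgb[edge_mask]   # colour information at the edges only
    return out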
[00167] In step 1561, the colour information is extracted from the coloured edges. [00168] In step 1563, reflectance and illumination components of the input image are decomposed using the extracted colour information, where the reflectance and illumination components are the two main factors that contribute to the appearance of an object's colour in an image. The reflectance component, also known as surface reflectance or albedo, represents the inherent colour and material properties of an object's surface, while the illumination component refers to the lighting conditions under which an object is viewed, as described in the previous section. In this step, these components are decomposed.
[00169] In step 1565, the method corrects the illumination of the input image based on the decomposed reflectance and illumination components.
[00170] In step 1567, the method outputs an image with illumination corrected.
[00171] Optionally or additionally, the method may comprise identifying regions with the coloured edges from said at least one subset of regions of the input image; and correcting the illumination of the input image based on the identified regions with the coloured edges.
[00172] Figure 16 is a pictorial diagram of exemplary output of the image processing based on said another model architecture. The top panel 1601 displays the original image, while the middle panel 1603 exhibits the coloured edge extracted during the model's process. The last panel 1605 represents the final output after applying the Retinex-based Correction process.
[00173] Figure 17 is a pictorial diagram of violin plots 1700 showing the colour distance (represented by Delta E) between the images in each dataset for image processing. Delta E is a metric that measures the similarity or dissimilarity between images, before and after the correction process. The larger the Delta E, the greater the distance between the colours. Here, we evaluate the distance in the CIELab colour space, where it represents the relative perceived magnitude of colour difference.
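As a hedged sketch, the simple CIE76 form of Delta E (the Euclidean distance in CIELab) between two images of the same size can be computed as below; the evaluation in this disclosure may use a different Delta E variant, which is not specified here, and the helper name is ours.

import cv2
import numpy as np

def mean_delta_e(rgb_a: np.ndarray, rgb_b: np.ndarray) -> float:
    """Mean CIE76 Delta E between two RGB images (Euclidean distance in Lab space)."""
    to_lab = lambda im: cv2.cvtColor(im.astype(np.float32) / 255.0, cv2.COLOR_RGB2LAB)
    lab_a, lab_b = to_lab(rgb_a), to_lab(rgb_b)
    return float(np.sqrt(((lab_a - lab_b) ** 2).sum(axis=-1)).mean())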
[00174] The figure shows 5 different violin plots corresponding to dataset 1 to dataset 5: 1701, 1703, 1705, 1707, and 1709. The performance is evaluated with these 5 datasets shown with respect to the violin plots. Two lab datasets are in common with those used for the previous global colour constancy model. Three new lab datasets are added, covering a greater diversity of local colour constancy conditions. In the new datasets, different regions of the image were illuminated by different artificial lights, as per the example images in Figure 18.
[00175] Each violin plot depicts the colour distance (Delta E) between the images in each dataset. The left, middle and right panels exhibit the Delta E measurements between the original images (left), the images corrected by the global colour constancy model (middle), and the images corrected by both the local and global colour constancy models (right), respectively. It is assumed that the images of each dataset were captured from the same place under different lighting conditions, containing both global and local illuminations.
[00176] In sum, the effectiveness of the models was evaluated by measuring the colour distance between images before and after their implementation using the Delta E metric. This analysis was conducted for both the local colour constancy and global colour constancy models as described herein, and the results are depicted in the figure as such. Here, lower Delta E values indicate better colour constancy performance, with reduced differences between predicted and ground truth pixel values. The left violin plot represents the probability distribution of Delta E values between the original images, illustrating the inherent variability in colour appearance. The following two violin plots showcase the colour distances between the corrected images. The middle plot corresponds to the application of the global colour constancy model, while the right plot represents the results obtained after employing the local colour constancy model.
[00177] It is evident from the figure that the local colour constancy model supplements the global colour constancy model in establishing a stable colour appearance and high colour discrimination when images are processed. The combination of the models significantly enhances performance, resulting in even lower variability for all five datasets. These findings underscore the effectiveness of the local colour constancy model in achieving more consistent colour representations across different lighting conditions.
[00178] Figure 18 is a pictorial diagram of model applications for the environment under different illuminations. On the left side 1801 are example input and outputs of the colour constancy models. The top panel 1801a displays the original image randomly selected from the dataset as input. The middle panel 1801b and bottom panel 1801c depict the corrected images obtained after applying the global colour constancy and local colour constancy models, respectively.
[00179] On the right side 1803 are example images captured by the ODK under different lighting conditions. The example images provide a visual representation of the diverse lighting conditions present in an exemplary dataset, where such a dataset may be used for generating the output shown on the left side 1801 of the figure, namely the bottom panel 1801c. These example images demonstrate the variability and complexity of the lighting environments. This dataset thus consists of images captured under a range of local lighting conditions. The dataset was carefully selected to include an environment with multiple light sources, ensuring that our model/algorithm was tested under realistic and challenging lighting scenarios.
[00180] Figure 19a is a flow diagram of another model architecture for image processing. As shown, the model architecture has two steps: a first step (top) for detecting the shadows and a second step (bottom) for removing the detected shadows. The first step detects the shadows by analysing the original RGB image 1901 converted in LAB colour space 1903, generating histograms. A threshold-based method as described herein is then applied 1905 to filter the converted image in LAB colour space in relation to the histogram. Shadow pixels are selected 1907 to generate a shadow region mask 1909 as output.
[00181] In the second step, the image is first segmented into a plurality of regions using the k-means algorithm. Yellow curves in the figure specify the boundary of each segment within the image. As shown, the segmented image 1911 is split into lit regions (here referred to as light segment regions) and shadow regions (here referred to as shadow segment regions). Target shadow segments 1913b (based on the shadow region mask 1909 from the first step) are removed by adjusting their colour to match the lit regions, considering the texture features of the input image as described herein. Yellow pixels represent the pixels assigned to the shadow region by the shadow detection algorithm. A similarity metric D identifies the nearest lit regions 1913a to the target shadow segments 1913b based on their texture features/patterns. Histogram matching aligns the colours of the shadow segments to the corresponding lit ones. In turn, the corrected shadow segments obtained after histogram matching between each shadow segment and its nearest lit segment are produced and merged 1917 to form the output, a shadow-free image 1919 of the original RGB image.
[00182] Using the shadow-free image in applications such as robotic navigation indeed enhances accuracy and dependability by improving visual consistency. For example, the shadow-free image, used as input to or applied to the output from the local colour constancy and/or global colour constancy models as described herein in the context of robotic navigation, allows the robot to confidently recognise familiar places and to estimate its position precisely, even in challenging lighting conditions.
[00183] Figure 19b is a flow diagram of the image processing based on said another model architecture according to aspects of the disclosure. [00184] Figure 20 is a pictorial diagram of exemplary input and output of the image processing based on said another model architecture. In the figure, the original image 2001 displays an example input image that is transformed 2003 into a histogram in 'Lab' space, where the smoothing takes place. The middle histogram 2005 shows the result of smoothing the L channel histogram (p < 10). Here, the smoothed histogram is generated from the L channel of the image converted into the 'Lab' colour space. The solid red line exhibits the threshold labelling the pixels as belonging to the shadow or lit regions.
[00185] According to the histogram, the first minimum 2005a is between 0 and 5, indicated by the dashed red line, while the threshold T 2005b is 10. Following this, the pixel selection 2007 occurs and the shadow mask 2009a is produced. The image with the shadow region mask S 2009a is shown and can be compared to the ground truth shadow 2009b. Finally, in the figure, a version of the image after removal of the shadow 2011 is also shown.
[00186] Figure 21 is a pictorial diagram of exemplary input and output of the image processing for a different input image. The figure shows the original image (top left 2101) and the resultant image (bottom left 2103) from the steps applied to the original image to detect the shadow and then exclude the shadow from the image. The figure also shows the ground truth image (top right 2105) and the image with the shadow mask (bottom right 2107) in yellow.
[00187] In relation to the sections above, the present invention is further described herein as one or more of the following aspects and options. These aspects and options are disclosed according to any of the figures 1 to 21 as appropriate. These aspects and certain options may be combined with any other aspects and features described herein as would be apparent to a skilled person in the field of robotics and machine vision.
[00188] In one aspect is a method for processing images based on colour constancy removing illumination from the images, the method comprising: obtaining an image in a first colour space; converting the image to data in a second colour space; transforming the data using chromatic adaptation; performing a first normalisation on the transformed data, wherein the first normalisation comprises applying a dynamic spatial filtering technique to adjust the transformed data based on light intensity; applying a set of filters to the normalised data, wherein the set of filters is convoluted based on the normalised data in relation to the image; performing a second normalisation on the filtered data to obtain an illumination estimation of the image in relation to the filtered data; and outputting the normalised data from the second normalisation, wherein the normalised data maintains colour constancy based on the illumination estimation, removing the illumination from the normalised data.
[00189] In another aspect is a method for reducing effect of illumination on images, the method comprising: receiving an input image; partitioning the input image into a plurality of regions; analysing the plurality of regions based on colour information and spatial position of pixels in each region; selecting from the plurality of regions a subset of regions that are influenced by an illuminant based on the analysis; identifying coloured edges for at least said subset of regions; extracting the colour information from the coloured edges; decomposing reflectance and illumination components of the input image using the extracted colour information; correcting illumination of the input image based on the decomposed reflectance and illumination components; outputting an image with illumination corrected.
[00190] In another aspect is a method for providing a shadow-free image, the method comprising: receiving an input image in the first colour space, wherein the input image comprises at least one shadow region; generating shadow region masks for said at least one shadow region; removing shadow from shadow regions of the input image based on the shadow region masks; and outputting a shadow-free image.
[00191] In another aspect is an apparatus for processing images to maintain colour constancy of the images, the apparatus comprising: at least one model configured to perform steps according to any of herein described aspects.
[00192] In another aspect is a system for processing images to establish colour constancy for an image by removing illumination from the image, the system comprising: at least one processor; and a memory storing instructions that, when executed by the at least one processor, cause the system to perform any of the herein described aspects.
[00193] In another aspect is a non-transitory computer medium having computer program instructions stored thereon, and computer-implemented methods according to any of herein described aspects.
[00194] As an option, converting the normalised data from the second normalisation to an image in the first colour space. As another option, said performing the first normalisation, further comprising: identifying a data representation of receptor responses in relation to the transformed data, wherein the data representation comprises a uniform representation of the receptor responses across the image; and applying the data representation to the transformed data. As another option, said applying the data representation, further comprising: integrating the data representation over a time period; and normalising the transformed data based on the integrated data representation using the dynamic spatial filtering technique representative of performing temporal coding adjustment for each receptor in respect of the light intensity. As another option, the second normalisation comprises correcting the filtered data with an illumination vector generated using a pooling function, wherein the filtered data is divisible by the illumination vector. As another option, the pooling function comprising:

$$\frac{e^{\,f_{(x,y)}\left(T_i^{LMS}(x,y)\right)}}{\sum_i e^{\,f_{(x,y)}\left(T_i^{LMS}(x,y)\right)}}$$

where f_{(x,y)}(·) is a data representation of a canonical neural computation of max over the filtered data and T_i^{LMS}(x,y) exhibits an output of the double-opponent filters in the second colour space. As another option, further comprising: receiving a raw image from one or more cameras; removing gamma correction from the raw image; and converting the raw image to said image in the first colour space. As another option, the first colour space is a Red-Green-Blue colour space. As another option, the second colour space is a Long-Medium-Short colour space. As another option, said transforming the data using chromatic adaptation, further comprising: estimating an illuminant of the data using a grey world model; and converting the data to a destination illuminant using the illuminant. As another option, said converting the data to a destination illuminant, further comprising: applying one or more matrix transformations to the destination illuminant in accordance with one or more cameras for obtaining the image. As another option, the set of filters comprises at least one centre-surround structure representative of receptive fields (RFs) encoding colour opponency. As another option, the RFs comprise red-green opponency, blue-yellow opponency, and achromatic opponency. As another option, the set of filters comprises at least two filters positioned in series such that one of said at least two filters receives input from the other filter. As another option, the set of filters is representative of two layers of a visual system. As another option, said analysing the plurality of regions, further comprising: identifying salient regions from the plurality of regions using a clustering algorithm; generating a clustered map based on the identified salient regions; and analysing the salient regions to select the subset of regions that are influenced by the illuminant. As another option, the clustered map constrains the plurality of regions into a set number of partitions. As another option, said partitioning the input image into a plurality of regions, further comprising: segmenting the input image using a k-means clustering algorithm based on similarity in colour or intensity values of pixels from the input image.
As another option, said identifying coloured edges for at least said subset of regions, further comprising: performing edge detection using a Canny edge detection algorithm configured to select a plurality of edges based on a threshold intensity. As another option, said identifying coloured edges for at least said subset of regions, further comprising: performing edge detection using Sobel operators. As another option, further comprising: identifying regions with the coloured edges from said at least one subset of regions of the input image; and correcting illumination of the input image based on the identified regions with the coloured edges.
[00195] As another option, said removing shadow from shadow regions of the input image based on the shadow region masks, further comprising: converting the input image to data in a third colour space; partitioning the data of the input image into a plurality of regions; segmenting the plurality of regions into shadow segment regions and light segment regions according to the shadow region masks; determining distances between the shadow segment regions and the light segment regions based on texture features of the input image; pairing the shadow segment regions and the light segment regions based on the determined distance; performing histogram matching on the paired shadow segments and light segments to produce colour-adjusted segments; iteratively performing said pairing and histogram matching until every shadow segment has been matched; merging the colour-adjusted segments to form a shadow-free image; and converting the shadow-free image into an image in the first colour space. As another option, said pairing the shadow segment regions and the light segment regions based on the determined distance, further comprising: identifying one or more light segment regions of closest distance to each shadow segment region; and pairing said each shadow segment with said one or more identified light segment regions. As another option, further comprising: splitting the data into a light segment part and shadow segment part; and labelling each part for the segmentation. As another option, the texture features comprising: edge distance, colour distance, entropy distance, neighbourhood distance, and a combination thereof. As another option, said generating shadow region masks for said at least one shadow region, further comprising: identifying said at least one shadow region from the input image; and generating shadow region masks based on said at least one shadow region identified. As another option, further comprising: converting the input image to data in a fourth colour space; generating a histogram based on a colour channel extracted from the data; smoothing the histogram using a gaussian window; identifying local minima on the smoothed histogram; determining a threshold based on the identified local minima and a parameter associated with the input image, wherein the parameter is selected based on size of the input image; and selecting pixels to be masked based on the colour channel of the converted image being less than the threshold.
[00196] In the embodiments, aspects and examples of the invention as described above, the algorithm(s), model(s), process(es), method(s), system(s) and/or apparatus may be implemented on and/or comprise one or more cloud platforms, one or more server(s) or computing system(s) or device(s). A server may comprise a single server or a network of servers; the cloud platform may include a plurality of servers or a network of servers. In some examples the functionality of the server and/or cloud platform may be provided by a network of servers distributed across a geographical area, such as a worldwide distributed network of servers, and a user may be connected to an appropriate one of the network of servers based upon a user location and the like.
[00197] The above description discusses embodiments of the invention with reference to a single user for clarity. It will be understood that in practice the system may be shared by a plurality of users, and possibly by a very large number of users simultaneously.
[00198] The embodiments described above may be configured to be semi-automatic and/or are configured to be fully automatic. In some examples a user or operator of the querying system(s)/process(es)/method(s) may manually instruct some steps of the process(es)/method(s) to be carried out.
[00199] The described embodiments of the invention, a system, process(es), method(s) and/or tool for querying any data structure described thereof and the like according to the invention and/or as herein described, may be implemented as any form of a computing and/or electronic device. Such a device may comprise one or more processors which may be microprocessors, controllers or any other suitable type of processors for processing computer executable instructions to control the operation of the device in order to gather and record routing information. In some examples, for example where a system on a chip architecture is used, the processors may include one or more fixed function blocks (also referred to as accelerators) which implement a part of the process/method in hardware (rather than software or firmware). Platform software comprising an operating system or any other suitable platform software may be provided at the computing-based device to enable application software to be executed on the device.
[00200] Various functions described herein can be implemented in hardware, software, or any combination thereof. If implemented in software, the functions can be stored on or transmitted over as one or more instructions or code on a computer-readable medium or non-transitory computer-readable medium. Computer-readable media may include, for example, computer-readable storage media. Computer-readable storage media may include volatile or nonvolatile, removable or non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. A computer-readable storage media can be any available storage media that may be accessed by a computer. By way of example, and not limitation, such computer-readable storage media may comprise RAM, ROM, EEPROM, flash memory or other memory devices, CD-ROM or other optical disc storage, magnetic disc storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Disc and disk, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and blu-ray disc (BD). Further, a propagated signal is not included within the scope of computer-readable storage media. Computer-readable media also includes communication media including any medium that facilitates transfer of a computer program from one place to another. A connection or coupling, for instance, can be a communication medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of communication medium. Combinations of the above should also be included within the scope of computer-readable media.
[00201] Alternatively, or in addition, the functionality described herein can be performed, at least in part, by one or more hardware logic components. For example, and without limitation, hardware logic components that can be used may include Field-programmable Gate Arrays (FPGAs), Application-specific Integrated Circuits (ASICs), Application-specific Standard Products (ASSPs), System-on-a-chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), etc.
[00202] Although illustrated as a single system, it is to be understood that the computing device may be a distributed system. Thus, for instance, several devices may be in communication by way of a network connection and may collectively perform tasks described as being performed by the computing device.
[00203] Although illustrated as a local device, it will be appreciated that the computing device may be located remotely and accessed via a network or other communication link (for example using a communication interface).
[00204] The term 'computer' is used herein to refer to any device with processing capability such that it can execute instructions. Those skilled in the art will realise that such processing capabilities are incorporated into many different devices and therefore the term 'computer' includes PCs, servers, IoT devices, mobile telephones, personal digital assistants and many other devices.
[00205] Those skilled in the art will realise that storage devices utilised to store program instructions can be distributed across a network. For example, a remote computer may store an example of the process described as software. A local or terminal computer may access the remote computer and download a part or all of the software to run the program. Alternatively, the local computer may download pieces of the software as needed, or execute some software instructions at the local terminal and some at the remote computer (or computer network). Those skilled in the art will also realise that, by utilising conventional techniques known to those skilled in the art, all or a portion of the software instructions may be carried out by a dedicated circuit, such as a DSP, programmable logic array, or the like.
[00206] It will be understood that the benefits and advantages described above may relate to one embodiment or may relate to several embodiments. The embodiments are not limited to those that solve any or all of the stated problems or those that have any or all of the stated benefits and advantages. Variants should be considered to be included within the scope of the invention.
[00207] Any reference to 'an' item refers to one or more of those items. The term 'comprising' is used herein to mean including the method steps or elements identified, but such steps or elements do not comprise an exclusive list and a method or apparatus may contain additional steps or elements.
[00208] As used herein, the terms "component" and "system" are intended to encompass computer-readable data storage that is configured with computer-executable instructions that cause certain functionality to be performed when executed by a processor. The computer-executable instructions may include a routine, a function, or the like. It is also to be understood that a component or system may be localized on a single device or distributed across several devices. Further, as used herein, the terms "exemplary", "example" or "embodiment" are intended to mean "serving as an illustration or example of something". Further, to the extent that the term "includes" is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term "comprising" as "comprising" is interpreted when employed as a transitional word in a claim.
[00209] The figures illustrate exemplary methods. While the methods are shown and described as being a series of acts that are performed in a particular sequence, it is to be understood and appreciated that the methods are not limited by the order of the sequence. For example, some acts can occur in a different order than what is described herein. In addition, an act can occur concurrently with another act. Further, in some instances, not all acts may be required to implement a method described herein.
[00210] Moreover, the acts described herein may comprise computer-executable instructions that can be implemented by one or more processors and/or stored on a computer-readable medium or media. The computer-executable instructions can include routines, sub-routines, programs, threads of execution, and/or the like. Still further, results of acts of the methods can be stored in a computer-readable medium, displayed on a display device, and/or the like.
[00211] The order of the steps of the methods described herein is exemplary, but the steps may be carried out in any suitable order, or simultaneously where appropriate. Additionally, steps may be added or substituted in, or individual steps may be deleted from any of the methods without departing from the scope of the subject matter described herein. Aspects of any of the examples described above may be combined with aspects of any of the other examples described to form further examples without losing the effect sought.
[00212] It will be understood that the above description of a preferred embodiment is given by way of example only and that various modifications may be made by those skilled in the art.
[00213] What has been described above includes examples of one or more embodiments. It is, of course, not possible to describe every conceivable modification and alteration of the above devices or methods for purposes of describing the aforementioned aspects, but one of ordinary skill in the art can recognize that many further modifications and permutations of various aspects are possible. Accordingly, the described aspects are intended to embrace all such alterations, modifications, and variations that fall within the scope of the appended claims.
Claims (46)
CLAIMS
- 1. A computer-implemented method for processing images based on colour constancy by removing illumination from the images, the method comprising: obtaining an image in a first colour space; converting the image to data in a second colour space; transforming the data using chromatic adaptation; performing a first normalisation on the transformed data, wherein the first normalisation comprises applying a dynamic spatial filtering technique to adjust the transformed data based on light intensity; applying a set of filters to the normalised data, wherein the set of filters is convoluted based on the normalised data in relation to the image; performing a second normalisation on the filtered data to obtain an illumination estimation of the image in relation to the filtered data; and outputting the normalised data from the second normalisation, wherein the normalised data maintains colour constancy based on the illumination estimation, removing the illumination from the normalised data.
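By way of illustration only, and without limiting the claim, the sequence of operations recited in claim 1 may be sketched in Python/NumPy as follows. The RGB-to-LMS matrix values, the Gaussian scales and the decomposition into helper functions are assumptions introduced for this sketch and are not taken from the claim.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Illustrative RGB -> LMS matrix; the claim does not fix these values.
RGB_TO_LMS = np.array([[0.3811, 0.5783, 0.0402],
                       [0.1967, 0.7244, 0.0782],
                       [0.0241, 0.1288, 0.8444]])


def to_lms(rgb):
    """Convert an HxWx3 linear-RGB image to the second (LMS) colour space."""
    return rgb @ RGB_TO_LMS.T


def chromatic_adaptation(lms):
    """Grey-world style von Kries scaling towards a neutral illuminant."""
    means = lms.reshape(-1, 3).mean(axis=0)
    return lms * (means.mean() / np.maximum(means, 1e-8))


def first_normalisation(lms, sigma=15.0):
    """Adjust each receptor response by a smoothed local-intensity estimate
    (one possible reading of the claimed 'dynamic spatial filtering')."""
    local = np.stack([gaussian_filter(lms[..., c], sigma) for c in range(3)], axis=-1)
    return lms / np.maximum(local, 1e-8)


def opponent_filtering(norm, centre_sigma=1.0, surround_sigma=3.0):
    """Centre-surround (difference-of-Gaussians) filtering of the normalised channels."""
    centre = np.stack([gaussian_filter(norm[..., c], centre_sigma) for c in range(3)], axis=-1)
    surround = np.stack([gaussian_filter(norm[..., c], surround_sigma) for c in range(3)], axis=-1)
    return centre - surround


def second_normalisation(norm, filtered):
    """Pool the filtered responses into an illumination estimate and divide it out."""
    illuminant = np.abs(filtered).reshape(-1, 3).max(axis=0)
    illuminant = illuminant / max(illuminant.sum(), 1e-8)
    return norm / np.maximum(illuminant, 1e-8)


def colour_constancy(rgb):
    """End-to-end sketch of the claimed sequence of steps."""
    lms = to_lms(rgb.astype(np.float64))
    adapted = chromatic_adaptation(lms)
    normalised = first_normalisation(adapted)
    filtered = opponent_filtering(normalised)
    return second_normalisation(normalised, filtered)
```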
- 2. The method of claim 1, further comprising: converting the normalised data from the second normalisation to an image in the first colour space.
- 3. The method of claim 1 or 2, wherein said performing first normalisation, further comprising: identifying a data representation of receptor responses in relation to the transformed data, wherein the data representation comprises a uniform representation of the receptor responses across the image; and applying the data representation to the transformed data.
- 4. The method of claim 3, wherein said applying data representation, further comprising: integrating the data representation over a time period; and normalising the transformed data based on the integrated data representation using the dynamic spatial filtering technique representative of performing temporal coding adjustment for each receptor in respect of the light intensity.
- 5. The method of any preceding claims, wherein second normalisation comprises correcting the filtered data with an illumination vector generated using a pooling function, wherein the filtered data is divisible by the illumination vector.
- 6. The method of claim 5, wherein the pooling function comprising:

  $$ e_i = \frac{f_{(x,y)}\big(\eta^{i}_{LMS}(x,y)\big)}{\sum_{i} f_{(x,y)}\big(\eta^{i}_{LMS}(x,y)\big)} $$

  where $f$ is a data representation of a canonical neural computation of max over the filtered data and $\eta^{i}_{LMS}(x,y)$ exhibits an output of double-opponent filters in the second colour space.
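A minimal numerical reading of the pooling function of claim 6, assuming that the max is taken over all pixel positions per double-opponent channel and that the resulting vector is normalised by its sum, is sketched below; both assumptions are made only for this example.

```python
import numpy as np

def pool_illuminant(do_responses):
    """Take the maximum of each double-opponent channel over all pixel positions
    (the 'canonical neural computation of max'), then normalise the resulting
    vector by its sum so that the filtered data can be divided by it."""
    per_channel = np.abs(do_responses).reshape(-1, do_responses.shape[-1]).max(axis=0)
    return per_channel / max(per_channel.sum(), 1e-8)
```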
- 7. The method of any preceding claims, further comprising: receiving a raw image from one or more cameras; removing gamma correction from the raw image; and converting the raw image to said image in the first colour space.
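For the gamma-removal step of claim 7, one possible sketch assumes the standard sRGB transfer curve; the claim itself does not name a particular encoding, so the breakpoints and exponent below are assumptions.

```python
import numpy as np

def remove_gamma(srgb):
    """Invert the sRGB transfer curve to recover linear intensities (input in [0, 1])."""
    srgb = np.clip(srgb, 0.0, 1.0)
    return np.where(srgb <= 0.04045, srgb / 12.92, ((srgb + 0.055) / 1.055) ** 2.4)
```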
- 8. The method of any preceding claims, wherein the first colour space is a Red-Green-Blue colour space.
- 9. The method of any preceding claims, wherein the second colour space is a Long-MediumShort colour space.
- 10. The method of any preceding claims, wherein said transforming the data using chromatic adaptation, further comprising: estimating an illuminant of the data using a grey world model; and converting the data to a destination illuminant using the illuminant.
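The grey-world estimation and conversion to a destination illuminant of claim 10 may, for example, be realised with a von Kries style diagonal scaling; the equal-energy default destination is an assumption of this sketch.

```python
import numpy as np

def grey_world_adapt(lms, destination=None):
    """Estimate the scene illuminant as the per-channel mean (grey-world assumption)
    and rescale the data towards a destination illuminant (equal-energy by default)."""
    source = lms.reshape(-1, 3).mean(axis=0)
    if destination is None:
        destination = np.full(3, source.mean())
    return lms * (destination / np.maximum(source, 1e-8))
```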
- 11. The method of claim 10, wherein said converting the data to a destination illuminant, further comprising: applying one or more matrix transformations to the destination illuminant in accordance with one or more cameras for obtaining the image.
- 12. The method of any preceding claims, wherein the set of filters comprises at least one centre-surround structure representative of receptive fields (RFs) encoding colour opponency.
- 13. The method of claim 12, wherein the RFs comprise the red-green opponency, blue-yellow opponency, and achromatic opponency.
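As an illustrative realisation of the centre-surround receptive fields and opponency of claims 12 and 13, a difference-of-Gaussians over opponent channel combinations could be used; the channel weights and Gaussian scales below are assumptions, not claimed values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def opponent_channels(lms):
    """Form red-green, blue-yellow and achromatic signals from the L, M, S planes."""
    L, M, S = lms[..., 0], lms[..., 1], lms[..., 2]
    rg = L - M                   # red-green opponency
    by = S - 0.5 * (L + M)       # blue-yellow opponency
    ac = (L + M + S) / 3.0       # achromatic channel
    return np.stack([rg, by, ac], axis=-1)

def centre_surround(channels, centre_sigma=1.0, surround_sigma=3.0):
    """Centre-surround receptive field: small-scale Gaussian minus large-scale Gaussian."""
    out = np.empty_like(channels, dtype=np.float64)
    for c in range(channels.shape[-1]):
        centre = gaussian_filter(channels[..., c].astype(np.float64), centre_sigma)
        surround = gaussian_filter(channels[..., c].astype(np.float64), surround_sigma)
        out[..., c] = centre - surround
    return out
```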
- 14. The method of any preceding claims, wherein the set of filters comprises at least two filters positioned in series such that one of said at least two filters receives input from the other filter.
- 15. The method of any preceding claims, wherein the set of filters is representative of two layers of a visual system.
- 16. The method of any preceding claims, further comprising: obtaining an input image; partitioning the input image into a plurality of regions; analysing the plurality of regions based on colour information and spatial position of pixels in each region; selecting from the plurality of regions a subset of regions that are influenced by an illuminant based on the analysis; identifying coloured edges for at least said subset of regions; extracting the colour information from the coloured edges; decomposing reflectance and illumination components of the input image using the extracted colour information; correcting illumination of the input image based on the decomposed reflectance and illumination components; outputting an image with illumination corrected.
- 17. The method of any preceding claims, wherein the input image obtained is a raw image received from one or more cameras, said image in the first colour space obtained prior to the first normalisation, said normalised data from the output of the second normalisation, or said image in the first colour space obtained following the second normalisation.
- 18. The method of any preceding claims, wherein said analysing the plurality of regions, further comprising: identifying salient regions from the plurality of regions using a clustering algorithm; generating a clustered map based on the identified salient regions; and analysing the salient regions to select the subset of regions that are influenced by the illuminant.
- 19. The method of any preceding claims, further comprising: obtaining an input image in the first colour space; identifying shadow regions from the input image; generating shadow region masks based on the identified shadow regions; removing shadow from shadow regions of the input image based on the shadow region masks; and outputting a shadow-free image.
- 20. The method of claim 19, wherein the input image is a raw image or an image in the first colour space.
- 21. The method of claim 19 or 20, wherein said removing shadow from shadow regions of the input image based on the shadow region masks, further comprising: converting the input image to data in a third colour space; partitioning the data of the input image into a plurality of regions; segmenting the plurality of regions into shadow segment regions and light segment regions according to the shadow region masks; determining distances between the shadow segment regions and the light segment regions based on texture features of the input image; pairing the shadow segment regions and the light segment regions based on the determined distance; performing histogram matching on the paired shadow segments and light segments to produce colour-adjusted segments; iteratively performing said pairing and histogram matching until every shadow segment has been matched; merging the colour-adjusted segments to form a shadow-free image; and converting the shadow-free image into an image in the first colour space.
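The histogram-matching step of claim 21 can be illustrated, per colour channel, by quantile interpolation between a shadow segment and its paired light segment; the interpolation scheme and the channel-wise treatment are assumptions of this sketch.

```python
import numpy as np

def match_histogram(shadow_vals, light_vals):
    """Map the values of a shadow segment so that their empirical distribution
    approximates that of the paired light segment (applied per colour channel)."""
    s_sorted = np.sort(shadow_vals.ravel())
    l_sorted = np.sort(light_vals.ravel())
    # Quantile of each shadow pixel within its own segment...
    quantiles = np.searchsorted(s_sorted, shadow_vals.ravel()) / max(len(s_sorted) - 1, 1)
    # ...mapped onto the light segment's value at the same quantile.
    matched = np.interp(quantiles, np.linspace(0.0, 1.0, len(l_sorted)), l_sorted)
    return matched.reshape(shadow_vals.shape)
```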
- 22. The method of claims 19 to 21, wherein said identifying shadow regions from the input image, further comprising: converting the input image to data in a fourth colour space; generating a histogram based on a colour channel extracted from the data; smoothing the histogram using a Gaussian window; identifying local minima on the smoothed histogram; determining a threshold based on the identified local minima and a parameter associated with the input image, wherein the parameter is selected based on the size of the input image.
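One possible reading of the threshold determination of claims 22 and 23 is sketched below: the channel histogram is smoothed with a Gaussian window and the threshold is taken at a local minimum. Choosing the first (darkest) minimum stands in for the size-dependent parameter, which this sketch does not model, and the use of a lightness-like channel is also an assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import argrelmin

def shadow_threshold(channel, bins=256, sigma=3.0):
    """Choose a shadow/light threshold at a local minimum of the smoothed histogram
    of one colour channel (e.g. a lightness channel)."""
    hist, edges = np.histogram(channel.ravel(), bins=bins)
    smoothed = gaussian_filter1d(hist.astype(np.float64), sigma)
    minima = argrelmin(smoothed)[0]
    if len(minima) == 0:
        return edges[bins // 2]        # fall back to a mid-range value
    return edges[minima[0]]            # first (darkest) local minimum

def shadow_mask(channel, threshold):
    """Mask pixels whose channel value falls below the threshold."""
    return channel < threshold
```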
- 23. The method of claim 22, wherein said generating shadow region masks based on the identified shadow regions, further comprising: selecting pixels to be masked based on the colour channel of the converted image being less than the threshold.
- 24. An apparatus for establishing colour constancy of an image, the apparatus comprising: one or more cameras for capturing the image in a first colour space; a processing unit for converting the captured image to corresponding data in a second colour space; a first model, a second model, and a third model configured to process said data sequentially to establish colour constancy, wherein the first model is configured to transform the data using chromatic adaptation, the second model is configured to perform a first normalisation on the transformed data, wherein the first normalisation comprises applying a dynamic spatial filtering technique to adjust the transformed data based on light intensity, and the third model is configured to apply a set of filters to the normalised data, wherein the set of filters is convoluted based on the normalised data in relation to the image, and perform a second normalisation on the filtered data to obtain an illumination estimation of the image in relation to the filtered data; and an output module configured to output the normalised data from the second normalisation, wherein the normalised data maintains colour constancy based on the illumination estimation.
- 25. The apparatus of claim 24, wherein the processing unit is configured to perform methods according to any of the claims 2 to 23.
- 26. A system for processing images to establish colour constancy for an image by removing illumination from the image, the system comprising: at least one processor; and a memory storing instructions that, when executed by the at least one processor, cause the system to perform any method of claims 1 to 23.
- 27. A computer-implemented method for reducing effect of illumination on images, the method comprising: receiving an input image; partitioning the input image into a plurality of regions; analysing the plurality of regions based on colour information and spatial position of pixels in each region; selecting from the plurality of regions a subset of regions that are influenced by an illuminant based on the analysis; identifying coloured edges for at least said subset of regions; extracting the colour information from the coloured edges; decomposing reflectance and illumination components of the input image using the extracted colour information; correcting illumination of the input image based on the decomposed reflectance and illumination components; outputting an image with illumination corrected.
- 28. The method of claim 27, wherein the input image is received according to any method of claims 1 to 15.
- 29. The method of claim 27 or 28, wherein said analysing the plurality of regions, further comprising: identifying salient regions from the plurality of regions using a clustering algorithm; generating a clustered map based on the identified salient regions; and analysing the salient regions to select the subset of regions that are influenced by the illuminant based on the clustered map.
- 30. The method of claim 29, wherein the clustered map constrains the plurality of regions into a set number of partitions.
- 31. The method of claims 27 to 30, wherein the partitioning the input image into a plurality of regions, further comprising: segmenting the input image using a k-means clustering algorithm based on similarity in colour or intensity values of pixels from the input image.
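A minimal sketch of the k-means partitioning of claim 31, assuming pure colour features and an arbitrary region count, might look as follows.

```python
import numpy as np
from sklearn.cluster import KMeans

def partition_by_colour(image, n_regions=8):
    """Cluster pixels into n_regions partitions by colour similarity and return
    an HxW label map."""
    h, w, c = image.shape
    features = image.reshape(-1, c).astype(np.float64)
    labels = KMeans(n_clusters=n_regions, n_init=10, random_state=0).fit_predict(features)
    return labels.reshape(h, w)
```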
- 32. The method of claims 27 to 31, wherein the identifying coloured edges for at least said subset of regions, further comprising: performing edge detection using a Canny edge detection algorithm configured to select a plurality of edges based on a threshold intensity.
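The Canny-based edge identification of claim 32 could, for instance, be applied within a selected region as follows; the hysteresis thresholds and the use of OpenCV are assumptions of this sketch.

```python
import cv2
import numpy as np

def coloured_edges(image_bgr, region_mask, low=50, high=150):
    """Detect edges within a selected region using the Canny detector.
    image_bgr is an HxWx3 uint8 image; region_mask is an HxW boolean mask."""
    grey = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(grey, low, high)
    return cv2.bitwise_and(edges, edges, mask=region_mask.astype(np.uint8))
```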
- 33. The method of claims 27 to 32, wherein the identifying coloured edges for at least said subset of regions, further comprising: performing edge detection using Sobel operators.
- 34. The method of claims 27 to 33, further comprising: identifying regions with the coloured edges from said at least one subset of regions of the input image; and correcting illumination of the input image based on the identified regions with the coloured edges.
- 35. An apparatus for processing images to maintain colour constancy of the images, the apparatus comprising: at least one model configured to perform steps according to any method of claims 27 to 34.
- 36. A system for processing images to establish colour constancy for an image by removing illumination from the image, the system comprising: at least one processor; and a memory storing instructions that, when executed by the at least one processor, cause the system to perform any method of claims 27 to 34.
- 37. A computer-implemented method for providing a shadow-free image, the method comprising: receiving an input image in the first colour space, wherein the input image comprises at least one shadow region; generating shadow region masks for said at least one shadow region; removing shadow from shadow regions of the input image based on the shadow region masks; and outputting a shadow-free image.
- 38. The method of claim 37, wherein the input image is received according to any of the methods of claims 1 to 15 and/or 27 to 34.
- 39. The method of claim 37 or 38, wherein said removing shadow from shadow regions of the input image based on the shadow region masks, further comprising: converting the input image to data in a third colour space; partitioning the data of the input image into a plurality of regions; segmenting the plurality of regions into shadow segment regions and light segment regions according to the shadow region masks; determining distances between the shadow segment regions and the light segment regions based on texture features of the input image; pairing the shadow segment regions and the light segment regions based on the determined distance; performing histogram matching on the paired shadow segments and light segments to produce colour-adjusted segments; iteratively performing said pairing and histogram matching until every shadow segment has been matched; merging the colour-adjusted segments to form a shadow-free image; and converting the shadow-free image into an image in the first colour space.
- 40. The method of claim 39, wherein said pairing the shadow segment regions and the light segment regions based on the determined distance, further comprising: identifying one or more light segment regions of closest distance to each shadow segment region; and pairing said each shadow segment with said one or more identified light segment regions.
- 41. The method of claim 39 or 40, further comprising: splitting the data into a light segment part and shadow segment part; and labelling each part for the segmentation.
- 42. The method of claims 39 to 41, wherein the texture features comprise: edge distance, colour distance, entropy distance, neighbourhood distance, and a combination thereof.
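Two of the texture distances named in claim 42 (colour distance and entropy distance) might be computed as in the following simplified sketch; edge and neighbourhood distances are omitted and the exact definitions are assumptions made for illustration.

```python
import numpy as np

def colour_distance(seg_a, seg_b):
    """Euclidean distance between the mean colours of two segments (Nx3 pixel arrays)."""
    return float(np.linalg.norm(seg_a.mean(axis=0) - seg_b.mean(axis=0)))

def entropy(values, bins=64):
    """Shannon entropy of a segment's intensity histogram."""
    hist, _ = np.histogram(values, bins=bins)
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def entropy_distance(seg_a, seg_b):
    """Absolute difference in intensity entropy between two segments (1-D arrays)."""
    return abs(entropy(seg_a) - entropy(seg_b))
```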
- 43. The method of claims 39 to 42, wherein said generating shadow region masks for said at least one shadow region, further comprising: identifying said at least one shadow region from the input image; and generating shadow region masks based on said at least one shadow region identified.
- 44. The method of claims 39 to 43, further comprising: converting the input image to data in a fourth colour space; generating a histogram based on a colour channel extracted from the data; smoothing the histogram using a Gaussian window; identifying local minima on the smoothed histogram; determining a threshold based on the identified local minima and a parameter associated with the input image, wherein the parameter is selected based on the size of the input image; and selecting pixels to be masked based on the colour channel of the converted image being less than the threshold.
- 45. An apparatus for processing images to maintain colour constancy of the images, the apparatus comprising: at least one model configured to perform steps according to any method of claims 37 to 44.
- 46. A system for processing images to establish colour constancy for an image by removing illumination from the image, the system comprising: at least one processor; and a memory storing instructions that, when executed by the at least one processor, cause the system to perform any method of claims 37 to 44.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB2313382.0A GB2633100A (en) | 2023-09-01 | 2023-09-01 | Method for reducing image variability under global and local illumination changes |
PCT/GB2024/052253 WO2025046235A1 (en) | 2023-09-01 | 2024-08-29 | Method for reducing image variability under global and local illumination changes |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB2313382.0A GB2633100A (en) | 2023-09-01 | 2023-09-01 | Method for reducing image variability under global and local illumination changes |
Publications (2)
Publication Number | Publication Date |
---|---|
GB202313382D0 GB202313382D0 (en) | 2023-10-18 |
GB2633100A true GB2633100A (en) | 2025-03-05 |
Family
ID=88296816
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB2313382.0A Pending GB2633100A (en) | 2023-09-01 | 2023-09-01 | Method for reducing image variability under global and local illumination changes |
Country Status (2)
Country | Link |
---|---|
GB (1) | GB2633100A (en) |
WO (1) | WO2025046235A1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5771312A (en) * | 1996-07-01 | 1998-06-23 | Ramot University Authority For Applied Research & Industrial Development Ltd. | Method for automatic partial white balance correction |
US20020167615A1 (en) * | 2001-02-07 | 2002-11-14 | Ramot University Authority For Applied Research And Industrial Development Ltd. | Method for automatic color and intensity contrast adjustment of still and video images |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
IL103763A (en) * | 1992-11-16 | 1999-03-12 | Technion Res & Dev Foundation | Apparatus and method for enhancing color images |
- 2023-09-01: GB application GB2313382.0A filed (published as GB2633100A, status: Pending)
- 2024-08-29: PCT application PCT/GB2024/052253 filed (published as WO2025046235A1)
Also Published As
Publication number | Publication date |
---|---|
GB202313382D0 (en) | 2023-10-18 |
WO2025046235A1 (en) | 2025-03-06 |