METHOD AND DEVICE FOR AUTOMATIC ZOOMING
FIELD
[0001] The invention relates to a method and a device for automatic zooming.
BACKGROUND
[0002] IPC classes G03B and H04N describe various solutions for implementing automatic follow-up and automatic zooming of an object to be captured in an image. US patent 4,719,485 discloses a camera that employs a thermal sensor for detecting living beings in an image area. Detection is based on the difference in temperature between the living being and the ambient temperature. After the living being is detected, its distance from the camera is measured. The camera is then turned on a motorized base to follow the motion of the object. A motorized zoom is employed to keep the size of the object constant in the frame irrespective of the distance of the object from the camera. A cost-related problem with the solution is that it requires a thermal sensor, a motorized base and means for measuring distances.
BRIEF DESCRIPTION
[0003] An object of the invention is to provide an improved method for automatic zooming and an improved device for automatic zooming.
[0004] One aspect of the invention is a method for automatic zooming, the method comprising: forming difference data representing the difference in magnitude between each luminance pixel of a previous image and a corresponding luminance pixel of the present image; adding the formed difference data to a cumulative motion register, in which a data element corresponds to each luminance pixel of the image; detecting as motion the data elements whose value exceeds a predetermined threshold value; and zooming to the detected motion.
[0005] One aspect of the invention is a device for automatic zooming, the device comprising: means for forming difference data representing the difference in magnitude between each luminance pixel of a previous image and a corresponding luminance pixel of the present image; means for adding the formed difference data to a cumulative motion register, in which a data element corresponds to each luminance pixel of the image; means for detecting as motion the data elements whose value exceeds a predetermined threshold value; and means for zooming to the detected motion.
[0006] One aspect of the invention is a device for automatic zooming, the device being configured to form difference data representing the difference in magnitude between each luminance pixel of a previous image and a corresponding luminance pixel of the present image; to add the formed difference data to a cumulative motion register, in which a data element corresponds to each luminance pixel of the image; to detect as motion the data elements whose value exceeds a predetermined threshold value; and to zoom to the detected motion.
[0007] The preferred embodiments of the invention are disclosed in the dependent claims.
[0008] The invention is based on analyzing differences between luminance pixels of successive images, whereby it is possible to detect motion automatically. The procedure slightly resembles the motion estimation used in video coding, but the motion register used is cumulative, and therefore, a person who moves for a short while and then stops is also detected as motion in the image.
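The cumulative update described above can be sketched in C as follows. This is a minimal illustration, not the claimed implementation; the names (`update_register`, `W`, `H`) and the decay amount of one per frame are assumptions for the example.

```c
/* Minimal sketch of the cumulative motion register update: add the
 * signed luminance difference of successive frames, then decay each
 * non-zero element towards zero so that a stopped object eventually
 * merges back into the background. All names are illustrative. */
#define W 176  /* qcif width  */
#define H 144  /* qcif height */

void update_register(const unsigned char prev[H][W],
                     const unsigned char curr[H][W],
                     int reg[H][W])
{
    for (int i = 0; i < H; i++)
        for (int j = 0; j < W; j++) {
            /* accumulate the signed luminance difference */
            reg[i][j] += (int)curr[i][j] - (int)prev[i][j];
            /* decay towards zero by one per frame */
            if (reg[i][j] > 0) reg[i][j]--;
            else if (reg[i][j] < 0) reg[i][j]++;
        }
}
```

Because the register is cumulative, a pixel that changed once keeps a non-zero value for many frames before the decay erases it, which is exactly why a person who moves and then stops is still detected.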
[0009] Several advantages are achieved with the method and the device of the invention. The procedure allows zooming to a detected motion without any special components, for instance without the thermal sensors required by the above-mentioned US patent. The invention does not necessarily require an optical zoom, because zooming can be performed digitally. Zooming to detected motion has not actually been known previously; the known solutions operate on focus, in other words they zoom to a so-called nearest focal point. The procedure of the invention also allows automatic zooming to points other than the frame centre, whereas the prior art solutions zoom to the frame centre.
[0010] The embodiments of the procedure also have other advantages, which will be described in greater detail below.
LIST OF DRAWINGS
[0011] In the following, the invention will be described in greater detail in connection with preferred embodiments, with reference to the attached drawings, wherein
Figure 1 is a block diagram of a device for automatic zooming;
Figure 2 is a flow chart illustrating a method for automatic zooming;
Figure 3 illustrates luminance pixel processing in a motion register;
Figures 4A, 5A, 6A and 7A are four images selected from a sequence of successive images;
Figures 4B, 5B, 6B and 7B show how a motion detected in Figures 4A, 5A, 6A and 7A would appear in a black-and-white image;
Figures 4C, 5C, 6C and 7C show an image zoomed to the motion detected in Figures 4A, 5A, 6A and 7A;
Figure 8 illustrates the effect of the camera being moved;
Figure 9 illustrates the effect of the use of an optical zoom in the camera; and
Figure 10 illustrates luminance pixel processing in a motion register in connection with the example of Figures 4A, 5A, 6A and 7A.
DESCRIPTION OF THE EMBODIMENTS
[0012] With reference to Figure 1, a device for automatic zooming will be described. The basic principle of the device is simple. It is configured to form difference data by subtracting from each luminance pixel of a previous image a corresponding luminance pixel of the present image. The device is also configured to add the formed difference data to a cumulative motion register. In the cumulative motion register, a data element corresponds to each luminance pixel of the image. The device is configured to detect as motion the data elements whose value exceeds a predetermined threshold value. The actual zooming is carried out such that the device is configured to zoom to the detected motion.
[0013] The device shown in Figure 1 is now studied in greater detail. The device communicates with an image source, in our example with a video camera 158. The image source used can be any known device which provides successive images 100. Because the invention analyses changes in luminance pixels of successive images, the information obtained from the image source must contain luminance information: either as separate luminance pixels, or the luminance pixels must be distinguishable or convertible from the information received from the image source.
[0014] Coding of successive images, for instance of a video image, is employed to reduce the amount of data, so that it can be more effectively stored on a storage medium or transferred over a data communication link. One example of a video coding standard is MPEG-4 (Moving Picture Experts Group). Various image sizes are used: for instance, the cif size comprises 352 x 288 pixels and the qcif size comprises 176 x 144 pixels. Typically, an individual image is divided into blocks, which include data on luminance, colour and location. The data in the blocks is compressed blockwise using a desired coding method. The compression is based on deleting less significant data. The compression methods are principally divided into three classes: spectral redundancy reduction, spatial redundancy reduction and temporal redundancy reduction. Typically, the compression employs various combinations of these methods.
[0015] To reduce the spectral redundancy, a YUV colour model is applied, for instance. The YUV model utilizes the fact that the human eye is more sensitive to variation in luminance than to changes in chrominance, i.e. in colour. The YUV model comprises one luminance component (Y) and two chrominance components (U and V, or Cb and Cr). For instance, a luminance block in accordance with the H.263 video coding standard comprises 16 x 16 pixels, and both of the chrominance blocks, which cover the same area as the luminance block, comprise 8 x 8 pixels. The combination of one luminance block and two chrominance blocks is called a macro block. Each pixel, both in the luminance block and the chrominance block, can have a value within the range of 0 to 255, i.e. it takes eight bits to represent one pixel. For instance, the luminance pixel value 0 refers to black and the value 255 refers to white.
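As a rough illustration of these sampling relations, the number of samples per plane for a 4:2:0 image can be computed as below; the function and struct names are hypothetical, chosen only for this sketch.

```c
/* Illustrative sample counts for a YUV 4:2:0 image as used by
 * H.263/MPEG-4 style coders: one full-resolution luminance plane
 * and two chrominance planes subsampled by two in each direction. */
struct yuv_sizes {
    int y;   /* luminance samples   */
    int cb;  /* Cb samples          */
    int cr;  /* Cr samples          */
};

struct yuv_sizes yuv420_sizes(int width, int height)
{
    struct yuv_sizes s;
    s.y  = width * height;             /* e.g. 352*288 for cif     */
    s.cb = (width / 2) * (height / 2); /* quarter resolution       */
    s.cr = s.cb;
    return s;
}
```

For a cif image this gives 101376 luminance samples, which is the plane the motion register of the invention operates on.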
[0016] The successive images 100 entering the device can thus be in accordance with the YUV model, in which the luminance pixels already appear as a specific component; for instance, a cif size image contains 352 x 288 luminance pixels.
[0017] The image 100 of the image source is applied to a frame buffer 106, where the luminance and chrominance components of the image are stored.
[0018] The luminance component of the preceding image in the frame buffer 106 is applied 108 to a second frame buffer 110.
[0019] Then, difference data, representing the difference in magnitude between each luminance pixel of the previous image and the corresponding luminance pixel of the present image, is formed in block 116. The formation of the difference data can be carried out, for instance, by subtracting from each luminance pixel of the previous image 114 the corresponding luminance pixel
of the present image 112, or by using any other mathematical operation that gives the same result.
[0020] The formed difference data 118 are then added to a cumulative motion register 126. The cumulative motion register 126 is a memory whose size is the frame height x the frame width. A possible value range of a data element is [-255, 255] if an image of YUV format is used. The memory can also be smaller, but in that case the incoming data has to be preprocessed.
[0021] In one embodiment the device is configured to reduce the values of the data elements in the motion register differing from zero towards zero by a predetermined magnitude in connection with each difference data addition. In the example of Figure 1, this is represented by block 120, to which the formed difference data 118 is applied and in which a predetermined number is subtracted from each data element received 125 from the motion register 126; the obtained remainder, added to the difference data 124, is then entered in the motion register 126 as a new data element value. The predetermined number can be one or any number higher than one. Naturally, the subtraction can also be performed such that the difference data is added to the motion register 126, in which a predetermined number is then subtracted from each data element differing from zero. The device comprises a control block 136, which controls the operation of the different device parts. The predetermined number 122, which will be subtracted from each data element, is brought to block 120 from the control block 136.
[0022] The previous image and the present image having been processed in the above-described manner, the control block 136 scans 128 the motion register 126 in a predetermined manner, e.g. by lines or by columns. The data elements of the motion register 126, whose value exceeds the predetermined threshold value, are then detected as motion in the control block 136. In one embodiment the control block detects motion blockwise, for instance, by luminance macro blocks. The blocks may also overlap. In one embodiment motion is detected in an image block if at least a predetermined number of data elements in the motion register 126, corresponding to said block, exceeds the predetermined threshold value.
[0023] Figure 3 illustrates luminance pixel processing in the motion register 126. The example is based on prototypes and experiments made by the applicant. A chain of events, filmed with a video camera and showing a light table, serves as a basis for this example. A person arrives and leaves a
black briefcase on the table. In Figure 3, the numbers 1 to 150 of the successive frames appear on the horizontal axis, and the value ranges of both the luminance pixel and the motion register data element appear on the vertical axis. Variations in the luminance pixel value to be observed are represented by curve 300. Variation in the motion register 126 data element value corresponding to said luminance pixel is represented by curve 302.
[0024] In frames 1 to 13, the luminance pixel selected for observation is part of the light table. Slight variation in the pixel value is noise. In frames 14 to 18, the luminance pixel is part of the black briefcase that is set on the table. Likewise, in frames 19 to 150, the luminance pixel is part of the black briefcase that was left on the table.
[0025] As curve 302 reveals, our example employs the embodiment, in which the data element values differing from zero in the motion register 126 are reduced towards zero by a predetermined quantity in connection with each difference data addition. In our example, the predetermined quantity is one. The reduction of the data element value causes the briefcase to disappear completely from the motion register 126 in frame 137, i.e. it has merged into the background.
[0026] Because the device detects as motion the data elements whose value exceeds a predetermined threshold value, it is interesting to consider what a suitable threshold value would be for the example of Figure 3. A suitable threshold value removing the noise, but not the motion, would be 10 to 15, for instance, whereby motion would no longer be detected in and around frame 130.
[0027] When motion is detected in the image, the control block 136 controls 144 the frame buffer 106 to transfer the zoom area 130 of the present frame to a zooming block 132. The control data 144 includes information on which area to be zoomed in the frame will be transferred as the frame 130 to the zooming block 132. From the control block 136, control data 134 is transferred to the zooming block 132; this control data includes the ratio between the sizes of the incoming and outgoing frames. Said ratio is 1:1 if zooming is not used. The zooming block 132 thus zooms to the detected motion, for instance by interpolating the zoomed area to the original frame size using known interpolation methods.
[0028] In motion detection, an area where motion takes place is detected. This area is framed such that the ratio of the height to the width always remains the same with respect to the original image. In addition, it is advisable to have a zoom area that is slightly larger than the area where motion takes place. The enlargement must also have a maximum, which depends on the frame size used and the accuracy of the camera. For instance, a 100-fold enlargement of a qcif-size image is not sensible.
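A framing step along these lines might look as follows. This is only a sketch: the ten-percent margin, the half-frame cap on enlargement, and the function name are illustrative choices, not values prescribed by the text.

```c
/* Sketch: enlarge a detected motion box to a zoom area that keeps
 * the frame's aspect ratio, adds a small margin, and bounds the
 * enlargement. All constants are illustrative.                    */
typedef struct { int w, h; } box;

box frame_zoom_area(box motion, int frame_w, int frame_h)
{
    box z;
    /* add roughly 10% of the frame as margin around the motion */
    z.w = motion.w + frame_w / 10;
    z.h = motion.h + frame_h / 10;
    /* widen whichever side is needed to keep the frame aspect ratio */
    if (z.w * frame_h < z.h * frame_w)
        z.w = (z.h * frame_w) / frame_h;
    else
        z.h = (z.w * frame_h) / frame_w;
    /* cap the enlargement: never zoom tighter than half the frame */
    if (z.w < frame_w / 2) z.w = frame_w / 2;
    if (z.h < frame_h / 2) z.h = frame_h / 2;
    /* and never larger than the frame itself */
    if (z.w > frame_w) z.w = frame_w;
    if (z.h > frame_h) z.h = frame_h;
    return z;
}
```

The half-frame floor plays the role of the maximum enlargement mentioned above: in a qcif frame the zoom can never get closer than a factor of two, so absurd enlargements such as 100-fold are ruled out by construction.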
[0029] In one embodiment the control block 136 and/or the zooming block 132 store the zooming position in a memory so as to ensure a controlled change in the zoom area. Information on the zooming position can be utilized by allowing only a change of a predetermined degree in the zooming position between two successive frames. In addition to or instead of the zooming position, the control block 136 and/or the zooming block 132 may store a zooming ratio in the memory. Information on the zooming ratio can be utilized by allowing only a change of a predetermined degree in the zooming ratio between two successive frames. By limiting the change in the zooming position and/or the variation in the zooming ratio, the frame will be easier to look at, because the frame does not change too quickly after the detected motion, but in phases of a given degree, for instance. Rate control is an optimisation task balancing the follow-up rate of the detected motion against the quality and information content of the image.
[0030] In addition, it is advisable to allow the motion area to shift freely inside the zoom area within certain limits. The above methods make it possible to achieve an image that moves smoothly.
[0031] The zoomed or unzoomed frame 142 is then applied to an optional encoding block 150, in which the image can be encoded, if desired. The encoding of the zoomed image provides the advantage that the final result of the encoding is improved, because irrelevant information is deleted from the image.
[0032] The encoding block 150 can be any known video encoder, for instance an MPEG-4 encoder. Frame 142 can also be applied to a memory means comprised by the device or to a memory means coupled to the device, for instance to a computer hard disk, to be stored therein. Frame 142 can also be applied to a viewing apparatus, for instance to a monitor. Depending on the representation format, it may be necessary to convert the image.
[0033] In one embodiment, the predetermined threshold value control used in connection with motion detection controls the sensitivity of the motion detection. Generally, this can be expressed, for instance, such that in the control block 136 a threshold value is formed for the present image using a quantity representing the average value of the luminance pixels of the present image and the threshold value of the previous image. For instance, the average value of the luminance pixels for image k is given by

a_k = (Σ p_ij) / (n x m), (1)

where p_ij is a luminance pixel and n x m is the size of the image. Calculation of the average value does not require all the luminance pixels of the image.
[0034] A threshold value for image k is given by

t_k = p·a_k + q, and (2)

t_k = (r·t_k + s·t_(k-1)) / (r + s), (3)

where t_(k-1) is the threshold value of the previous image. Formula 2 employs a linear function, where p and q are constants, but it is also possible to employ a non-linear function. In accordance with formula 3, it is also possible to weight the threshold value; r and s are constants, and by changing them it is possible to adjust the threshold value and hence the sensitivity of the device to great changes in luminance. The user interface of the device may comprise a controller, by which the threshold value can be controlled continuously to achieve the desired sensitivity. It is worthwhile to perform the weighting in order that the system does not react to quick, extensive variations in luminance, such as a change in luminance produced by lightning or a cloud. It is not necessary to use the mean in formulae 2 and 3; any other statistical quantity representing an average can be used. Noise can also be measured from random pixels and the sensitivity can be adjusted accordingly.
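The sensitivity adaptation of formulae 1 to 3 can be sketched as follows; the concrete constants P, Q, R, S and the tiny 4 x 4 image size are illustrative assumptions, since the text leaves them as tunable parameters.

```c
/* Sketch of the adaptive threshold of formulae (1)-(3).
 * The constants P, Q, R, S and the image size are illustrative. */
#define N 4  /* image height (tiny, for illustration) */
#define M 4  /* image width                            */

/* formula (1): average luminance of image k */
int average_luminance(unsigned char lum[N][M])
{
    int sum = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            sum += lum[i][j];
    return sum / (N * M);
}

/* formulae (2) and (3): linear threshold from the average,
 * then weighted with the previous image's threshold t_prev */
int adapt_threshold(unsigned char lum[N][M], int t_prev)
{
    const int P = 1, Q = 10;  /* formula (2): t = P*avg + Q */
    const int R = 1, S = 3;   /* formula (3) weights        */
    int t = P * average_luminance(lum) + Q;
    return (R * t + S * t_prev) / (R + S);
}
```

With S larger than R, the threshold reacts slowly to a sudden brightness change, which is the weighting the text recommends against reactions to lightning or a passing cloud.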
[0035] In one embodiment the device is configured, for instance the control block 136 is ordered, to act as follows if the image area of the present frame has shifted with respect to the image area of the previous frame, i.e. the camera 158, 160 producing the frames was moved: to transfer the luminance pixels of the previous frame to the place of the luminance pixels representing the same image area in the present frame, to transfer the corresponding data elements in the motion register 126 to the place of said same image area, and to reset to zero only those data elements in the motion register 126 that correspond to the new luminance pixels of the present frame.
[0036] In one embodiment the device is configured, for instance the control block 136 is ordered, to modify the image area of a previous frame to correspond to the image area of the present frame by interpolating thereto the missing luminance pixels corresponding to the image area of the present frame, and to modify the image area in the motion register 126 to correspond to the image area of the present frame by interpolating thereto the missing data elements corresponding to the frame area of the present frame, if the image area of the present frame was zoomed optically with respect to the image area of the previous frame, i.e. the camera 158, 160 producing the frames employed optical zooming.
[0037] In one embodiment the device is configured, for instance the control block 136 commands 140 a selection block 102 to control 138 the camera 160 producing the frames to move in the direction of motion detected in the frame. In that case the camera 160 comprises e.g. an electric motor, by which the camera 160 can be oriented in the direction of the motion. The control command 138 indicates the magnitude of the necessary movement in degrees, for instance.
[0038] In one embodiment the device is configured, for instance the control block 136 commands 140 the selection block 102 to control 138 the camera 160 producing the frames, to zoom optically to motion detected in the frame. In that case the camera 160 comprises an optical zoom with electrical control, by which the image produced by the camera 160 can be zoomed to the motion. The control command 138 indicates the required change in the zooming ratio.
[0039] In one embodiment the device comprises a block 154 to transmit successive frames and/or a zoomed frame using a data transmission link 156. The transmitted frame can be a frame coming from the original image source or a zoomed frame. It is also possible that the camera that produced the frames has been moved, or has been optically zoomed, in the direction of motion detected in the frame. The data transmission link employs known solutions, for instance a wireless radio link, a local area network, the Internet, a fixed dedicated line, or any other known manner of transferring data between two points.
[0040] The device blocks shown in Figure 1 can be implemented as one or more application-specific integrated circuits (ASIC). Other implementations are also possible, for instance a circuit composed of separate logic components, or a processor with software. A combination of these different implementations is also possible. When selecting an implementation, the person skilled in the art considers the requirements set for the size and power consumption of the device, the required processing power, manufacturing costs and production capacity. It should be noted that Figure 1 mainly illustrates functional entities, and in practice the parts of the equipment may deviate from what is shown, because, ultimately, it is the degree of integration that counts: how the application concerned can implement a device with the desired feature for automatic zooming most efficiently and at reasonable cost.
[0041] The above-described device is applicable to various purposes. It is also possible to manufacture a low-cost version of the device. One low-cost version is a device that also comprises a cheap camera and is connectable to mains current. In addition, the device comprises parts which enable it to serve as a subscriber terminal in a mobile network. Hence, for instance, a cottage owner can set the device to keep the cottage under surveillance, and the device will transmit an image to the subscriber terminal of the owner immediately on detecting motion and zooming to it. Thus the device acts as a burglar alarm and a surveillance device, by which the situation at the cottage can be checked. Naturally, other detectors can also be connected to the device, such as a fire alarm device. The person skilled in the art will also find other applications for the basic device described above.
[0042] Next, the implementation of some functions of the device is described using pseudocode according to the syntax of the C programming language.
Header
#define mmin(a,b) ((a)<(b)?(a):(b))  /* macro for minimum of two values */
#define mmax(a,b) ((a)>(b)?(a):(b))  /* macro for maximum of two values */

typedef struct
{
    int **lum;         /* luminance */
    int **cb;          /* chrominance Cb */
    int **cr;          /* chrominance Cr */
} data;

typedef struct
{
    int mid[2];        /* midpoint */
    int hor;           /* horizontal magnitude */
    int ver;           /* vertical magnitude */
} mot;

typedef struct
{
    int lines;         /* frame height */
    int columns;       /* frame width */
    mot motion;        /* area where motion takes place */
    mot zoom;          /* area where zoom is */
    data previous;     /* previous frame */
    data present;      /* present frame */
    int **movement;    /* motion register */
    int **shape;       /* filtered motion register */
    int sensitivity;   /* sensitivity */
} secur;               /* variable defined in main function */
Motion register function
It adds the difference of the frames to the motion register and reduces each non-zero data element in the register towards zero by one.
int Add_motion(secur *frame)
{
    int i, j;                /* counters */
    int r = frame->lines;    /* lines in frame */
    int c = frame->columns;  /* columns in frame */

    for(i=0; i<r; i++)
    {
        for(j=0; j<c; j++)
        {
            frame->movement[i][j] +=
                ((frame->present).lum[i][j] - (frame->previous).lum[i][j]);
            if(frame->movement[i][j]>0) frame->movement[i][j] -= 1;
            if(frame->movement[i][j]<0) frame->movement[i][j] += 1;
        }
    }
    return(1);
}
Motion check function
int Motion_check(secur *frame)
{
    int i, j, k, l;                  /* counters */
    int r = frame->lines;            /* lines in frame */
    int c = frame->columns;          /* columns in frame */
    int found=0;                     /* motion detected or not */
    int sum;                         /* number of motion pixels in block */
    int min_k, max_k, min_l, max_l;  /* variables of detected motion */
    int sens = frame->sensitivity;   /* sensitivity */

    for(i=0; i<r; i++)               /* copy and filter motion register */
    {
        for(j=0; j<c; j++)
        {
            frame->shape[i][j] = frame->movement[i][j]/sens;
        }
    }
    min_k = c;                       /* initialize minima and maxima */
    max_k = 0;
    min_l = r;
    max_l = 0;
    for(k=0; k<c-7; k+=4)            /* find motion and its area in 8x8 blocks */
    {                                /* blocks overlap */
        for(l=0; l<r-7; l+=4)
        {
            sum=0;
            for(i=0; i<64; i++)
            {
                if(frame->shape[l+i/8][k+i%8]!=0) sum++;
            }
            if(sum>sens)
            {
                found=1;
                if(k<min_k) min_k=k;
                if(k>max_k) max_k=k+3;
                if(l<min_l) min_l=l;
                if(l>max_l) max_l=l+3;
            }
        }
    }
    if(found==1)  /* if motion is found, calculate its width, height and midpoint */
    {
        (frame->motion).mid[0] = (min_k+max_k)/2;
        (frame->motion).mid[1] = (min_l+max_l)/2;
        (frame->motion).hor = max_k-min_k;
        (frame->motion).ver = max_l-min_l;
    }
    else  /* if no motion is found, width, height and midpoint are those of frame */
    {
        (frame->motion).hor = c;
        (frame->motion).ver = r;
        (frame->motion).mid[0] += mmax(((c/2)-(frame->motion).mid[0])/8,1);
        (frame->motion).mid[1] += mmax(((r/2)-(frame->motion).mid[1])/8,1);
    }
    return(1);
}
Zooming function
int Zoom_control(secur *frame)
{
    int r = frame->lines;            /* lines in frame */
    int c = frame->columns;          /* columns in frame */
    int left, right, up, down;       /* zoom borders */
    int i;                           /* counter */

    /* calculate new zoom height and width (with respect to motion) */
    if((frame->motion).hor < (c*(frame->motion).ver)/r)
    {
        if((frame->zoom).ver >= (frame->motion).ver + r/6)
        {
            (frame->zoom).ver -= mmin(abs(((frame->zoom).ver-(frame->motion).ver-r/6)/4),4);
            (frame->zoom).ver = mmax(mmin((frame->zoom).ver,r),r/2);
            (frame->zoom).hor = (c*(frame->zoom).ver)/r;
            (frame->zoom).hor = mmax(mmin((frame->zoom).hor,c),c/2);
            (frame->zoom).hor = ((frame->zoom).hor/2)*2;
            (frame->zoom).ver = ((frame->zoom).ver/2)*2;
        }
        if((frame->zoom).ver < (frame->motion).ver + r/10)
        {
            (frame->zoom).ver += mmin(abs(((frame->zoom).ver-(frame->motion).ver-r/10)/4),8);
            (frame->zoom).ver = mmax(mmin((frame->zoom).ver,r),r/2);
            (frame->zoom).hor = (c*(frame->zoom).ver)/r;
            (frame->zoom).hor = mmax(mmin((frame->zoom).hor,c),c/2);
            (frame->zoom).hor = ((frame->zoom).hor/2)*2;
            (frame->zoom).ver = ((frame->zoom).ver/2)*2;
        }
    } /* end of if-clause */
    else
    {
        if((frame->zoom).hor >= (frame->motion).hor + c/6)
        {
            (frame->zoom).hor -= mmin(abs(((frame->zoom).hor-(frame->motion).hor-c/6)/4),4);
            (frame->zoom).hor = mmax(mmin((frame->zoom).hor,c),c/2);
            (frame->zoom).ver = (r*(frame->zoom).hor)/c;
            (frame->zoom).ver = mmax(mmin((frame->zoom).ver,r),r/2);
            (frame->zoom).hor = ((frame->zoom).hor/2)*2;
            (frame->zoom).ver = ((frame->zoom).ver/2)*2;
        }
        if((frame->zoom).hor < (frame->motion).hor + c/10)
        {
            (frame->zoom).hor += mmin(abs(((frame->zoom).hor-(frame->motion).hor-c/10)/4),8);
            (frame->zoom).hor = mmax(mmin((frame->zoom).hor,c),c/2);
            (frame->zoom).ver = (r*(frame->zoom).hor)/c;
            (frame->zoom).ver = mmax(mmin((frame->zoom).ver,r),r/2);
            (frame->zoom).hor = ((frame->zoom).hor/2)*2;
            (frame->zoom).ver = ((frame->zoom).ver/2)*2;
        }
    } /* end of else-clause */

    /* move zoom focal point to midpoint of motion */
    for(i=0; i<2; i++)
    {
        if((frame->zoom).mid[i]<(frame->motion).mid[i]-4)
        {
            (frame->zoom).mid[i] += mmin(abs(((frame->zoom).mid[i]-(frame->motion).mid[i])/4),8);
        }
        if((frame->zoom).mid[i]>(frame->motion).mid[i]+4)
        {
            (frame->zoom).mid[i] -= mmin(abs(((frame->zoom).mid[i]-(frame->motion).mid[i])/4),8);
        }
    }

    /* keep zoom inside frame */
    (frame->zoom).mid[0] =
        mmin(mmax((frame->zoom).mid[0],(frame->zoom).hor/2),c-(frame->zoom).hor/2);
    (frame->zoom).mid[1] =
        mmin(mmax((frame->zoom).mid[1],(frame->zoom).ver/2),r-(frame->zoom).ver/2);

    left  = (frame->zoom).mid[0]-(frame->zoom).hor/2;
    right = (frame->zoom).mid[0]+(frame->zoom).hor/2;
    up    = (frame->zoom).mid[1]-(frame->zoom).ver/2;
    down  = (frame->zoom).mid[1]+(frame->zoom).ver/2;

    Convert(frame,left,right,up,down);  /* call interpolation function */
    return(1);
}
[0043] Next, a method for automatic zooming is described with reference to the flow chart in Figure 2. The method starts in 200. In 204, an image is then read from a camera. The image should be such that it allows direct luminance component reading, or such that a luminance component can be generated on the basis of the information in the image. The luminance component is stored in a memory in 206.
[0044] Then, in 208 the next image is read from the camera. Sensitivity adjustment of the system, which is carried out in the above-described manner, is performed in 210.
[0045] Next, 212 is proceeded to, where difference data representing the difference in magnitude between each luminance pixel of a previous image and the corresponding luminance pixel of the present image is formed, for instance, by subtracting from each luminance pixel of the previous image the corresponding luminance pixel of the present image. In the same 212, the formed difference data are added to a cumulative motion register, where a data element corresponds to each luminance pixel of the image.
[0046] Thereafter, the data elements whose values exceed a predetermined threshold value are detected as motion in 214. Motion detection is checked in 216. If motion is detected, 230 is proceeded to, otherwise 222 is proceeded to.
[0047] Zooming to motion is performed in 230 in the previously described manner. After optional zooming the process proceeds to 232, where a zoomed frame is applied to an encoder. In one embodiment the camera can be movable or have an optical zoom, so in 234 it is checked whether these features exist.
[0048] If these features do not exist, the process proceeds to 208, where the next frame is read from the camera. If said features are in use, the process proceeds to 236, where the camera is controlled in a desired manner: for instance, in the previously described manner, the camera is turned in the direction of the detected motion or zoomed optically to the motion. Also, in the previously described manner, the values of the previous frame and the motion register are modified in 240, in order that motion detection operates correctly and motion resulting from camera movement or optical zooming is not detected as motion. From 240 the process proceeds to 208, where the next frame is read from the camera.
[0049] Figure 2 does not illustrate the ending of the method, because in principle it can be ended at any point. A natural ending point is when there is no longer a need to examine successive frames. A device of the above-described type is well suited for performing the method, but other devices can also be applied in implementing the method. Advantageous embodiments of the method are disclosed in the attached dependent method claims. Their operation is described above in connection with the device, and therefore it is unnecessary to repeat the description here.
[0050] In the following, an example of image processing is described extensively in order to make the operating principle of automatic zooming completely clear.
[0051] Figures 4A, 5A, 6A and 7A show four images selected from a sequence of successive images. Because it would take too much space to show here all the frames of the sequence, four representative frames have been selected from said sequence.
[0052] A camera takes images of a table in the corner of a room. As appears from Figure 4A, there is a radio recorder on the table.
[0053] Next, Figure 5A shows a frame appearing later on in the sequence, where a person has walked to the table.
[0054] In the frame of Figure 6A the person has taken the radio recorder from the table and holds it in his arms.
[0055] In the frame of Figure 7A the person and the radio recorder have disappeared from the image, which only shows the corner of the room and the empty table.
[0056] Figure 10 illustrates the processing of a luminance pixel in the motion register in connection with the example of Figures 4A, 5A, 6A and 7A. Figure 10 is formed according to the same principle as Figure 3, i.e. numbers 1 to 30 of the successive frames appear on the horizontal axis, and the value range of the luminance pixel and of the motion register data element appears on the vertical axis. Variation in the value of the observed luminance pixel is represented by curve 1000. Variation in the value of the motion register data element corresponding to said luminance pixel is represented by curve 1002.

[0057] In frames 1 to 13, the luminance pixel selected for observation is part of the radio recorder. The slight variation in the pixel value is noise. In frames 14 to 15, the luminance pixel is part of the person's hand. In frames 16 to 30, the luminance pixel is part of the table.
[0058] As curve 1002 reveals, our example again employs the embodiment in which the data element values differing from zero in the motion register are reduced towards zero by a predetermined quantity in connection with each difference data addition. In our example, the predetermined quantity is five.
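The behaviour of curve 1002 can be reproduced for a single data element with a short sketch. The luminance values in the usage example below are invented for illustration; only the reduction rule, with the quantity five, comes from the text.

```python
def trace_element(luminances, decay=5):
    """Motion-register values of one data element over successive frames:
    the absolute luminance difference between consecutive frames is
    added, and the result is reduced towards zero by `decay` on each
    difference data addition."""
    values, element = [], 0
    for prev, cur in zip(luminances, luminances[1:]):
        element = max(0, element + abs(cur - prev) - decay)
        values.append(element)
    return values
```

A steady pixel stays at zero; a single luminance jump of 100 raises the element to 95, after which it decays by five per frame, giving the falling slope seen in Figure 10.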
[0059] Figures 4B, 5B, 6B and 7B show how the motion detected in Figures 4A, 5A, 6A and 7A appears in a black-and-white image. In our example, black is selected to indicate motion and white to indicate immobility.
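The rendering described here amounts to thresholding the motion register element by element. In the sketch below, 1 stands for black (motion) and 0 for white (immobility); the default threshold value is an assumption, since the text leaves it adjustable.

```python
def motion_image(register, threshold=100):
    """Render the motion register as a black-and-white image:
    1 (black) marks data elements whose value exceeds the threshold,
    0 (white) marks immobility."""
    return [[1 if v > threshold else 0 for v in row] for row in register]
```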
[0060] Because there is no motion in Figure 4A, Figure 4B shows only white.
[0061] In Figure 5A motion is detected, and it shows as a black figure of a person in Figure 5B.

[0062] In Figure 6B it can be seen that the person has taken the radio recorder. As can also be seen in Figure 6B, carrying the radio recorder away still shows as motion in the place where it stood. As can be seen in Figure 7B, although the motion of the person has already disappeared, it still shows that the radio recorder was carried away. In the earlier described manner, the duration of the motion detection can be adjusted by adjusting the threshold value of the motion detection and the predetermined quantity used for reduction in connection with the difference data addition. If the whole sequence of black-and-white frames is considered a moving image, it shows the motion of the person as a black figure that comes to the table, grabs the radio recorder and leaves the corner of the room in the image area captured by the camera. The reduction of the data element values eventually causes the detected motion to disappear from the black-and-white image, i.e. when the values of the data elements in the motion register have reduced sufficiently, the black radio recorder also disappears from the black-and-white image.

[0063] Figures 4C, 5C, 6C and 7C show the image when zoomed to the motion detected in Figures 4A, 5A, 6A and 7A.
[0064] As the comparison of Figures 4A and 4C shows, zooming to motion is not used in Figure 4C, because no motion is detected yet in Figure 4B.
[0065] Figure 5C shows how zooming of the image starts in the direction of the detected motion, i.e. towards the person who has walked into the image.
[0066] In Figure 6C zooming is continued.

[0067] In Figure 7C zooming is directed to the radio recorder, which was carried away, because in Figure 7B the carrying away remains visible as motion.
[0068] When motion is no longer detected in the image, the image area can be restored, for instance, by continuous zooming back to the original image area.
[0069] Next, Figure 8 illustrates the effect of camera movements. Using the earlier described embodiment, the camera that produced the images is controlled to move in the direction of the motion detected in the image. The previous image is framed by a frame 800 and the present image by a frame 802. The diamond-patterned area 804 represents the old image area that is left out of the motion detection observation. The chequered area 806 appears only in the new image; the data elements that correspond to the luminance pixels existing only in the present image are zeroed. A motion vector 808 between the previous image 800 and the present image 802 defines the direction in which the camera was moved. The common area 810 of the previous image 800 and the present image 802 is the area from which the luminance pixels of the previous image 800 are transferred to the place of the luminance pixels representing the same image area in the present image 802, and likewise the data elements in the motion register are transferred to the place of said same image area. In practice, the data elements in the motion register and the luminance pixels of the previous image 800 in the frame buffer are moved for a distance defined by the motion vector in the opposite direction.
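The transfer described above can be sketched as a shift of a two-dimensional buffer opposite to the motion vector. This is an illustrative sketch only; the sign convention of the vector and the zero fill for the uncovered area 806 are assumptions made here.

```python
def shift_buffer(grid, dx, dy, fill=0):
    """Move a 2-D buffer (previous-frame luminances or motion-register
    data elements) for the distance (dx, dy) of the motion vector in
    the opposite direction; cells with no source data (area 806 of
    Figure 8) are set to `fill`, i.e. zeroed for the motion register."""
    h, w = len(grid), len(grid[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = x + dx, y + dy      # sample from the common area 810
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = grid[sy][sx]
    return out
```

For example, after a camera pan of one pixel to the right, the stored values shift one pixel to the left and the newly exposed column is zeroed.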
[0070] Finally, Figure 9 illustrates the effect of the use of an optical zoom in the camera on the above-described procedures. Using the earlier described embodiment, the camera that produced the images is controlled to zoom optically in the direction of motion detected in the image. The previous image is framed by a frame 900 and the present image by a frame 902. The chequered area 904 is the area that appears only in the previous image 900. Area 906 is the common area of the previous image 900 and the present image 902. The image area of the previous image 900 is modified to correspond
to the image area of the present image 902 by leaving out the area 904 and by interpolating thereto the missing luminance pixels corresponding to the image area of the present image 902; likewise, the image area of the motion register is modified to correspond to the image area of the present image 902 by interpolating thereto the missing data elements.
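The modification for an optical zoom-in can be sketched as stretching the common area 906 of a buffer to the present image area. Nearest-neighbour sampling is used below as an assumption; the text requires interpolation of the missing values but does not specify the interpolation method.

```python
def zoom_compensate(grid, common, out_h, out_w):
    """Stretch the common area (x0, y0, x1, y1) of a buffer (previous
    frame or motion register) to the present image area of out_h by
    out_w elements, filling in the missing values by nearest-neighbour
    interpolation."""
    x0, y0, x1, y1 = common
    ch, cw = y1 - y0, x1 - x0
    return [[grid[y0 + y * ch // out_h][x0 + x * cw // out_w]
             for x in range(out_w)] for y in range(out_h)]
```

For example, stretching the inner 2-by-2 region of a 4-by-4 buffer back to 4-by-4 duplicates each common-area element, so that the modified previous buffer again covers the same scene as the zoomed present image.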
[0071] Even though the invention is described above with reference to the example of the attached drawings, it is apparent that the invention is not limited thereto, but it can be modified in a variety of ways within the inventive idea disclosed in the attached claims.