WO2024240353A1 - Method and devices for media capturing
- Publication number: WO2024240353A1 (application PCT/EP2023/064040)
- Authority: WIPO (PCT)
- Prior art keywords: media, capturing device, media capturing, partial, module
Classifications
- H—ELECTRICITY / H04—ELECTRIC COMMUNICATION TECHNIQUE / H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION / H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof / H04N23/60—Control of cameras or camera modules:
  - H04N23/61—Control of cameras or camera modules based on recognised objects
  - H04N23/611—Control based on recognised objects where the recognised objects include parts of the human body
  - H04N23/63—Control of cameras or camera modules by using electronic viewfinders
  - H04N23/631—Graphical user interfaces [GUI] specially adapted for controlling image capture or setting capture parameters
  - H04N23/633—Displaying additional information relating to control or operation of the camera
  - H04N23/635—Region indicators; Field of view indicators
  - H04N23/64—Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
  - H04N23/69—Control of means for changing angle of the field of view, e.g. optical zoom objectives or electronic zooming
  - H04N23/695—Control of camera direction for changing a field of view, e.g. pan, tilt or based on tracking of objects
  - H04N23/698—Control for achieving an enlarged field of view, e.g. panoramic image capture
Definitions
- the technology disclosed herein relates generally to the field of media capturing, and in particular to devices and methods for capturing media by a media capturing device.
- the user is notified that, e.g., a part of a face is outside the picture and may then adapt camera parameters so that no features, or at least fewer features, are lost while capturing, thereby giving the best possible result.
- consider a user taking a picture of a soccer team having 19 players and 3 leaders. If the user risks capturing an image that includes several partial faces, a known solution is to re-adjust camera parameters in order to avoid the partial faces.
- An objective of embodiments herein is to address and improve various aspects relating to picture capturing.
- a particular objective is to improve quality of a photograph by avoiding partial objects and/or persons being photographed.
- Still another objective is to enable, preferably automatically, a user to select which persons to prioritize, such as friends and relatives, even in a crowd.
- Another objective is to provide a user who is in the action of capturing a photo or a video using, e.g., a mobile device with one or more suggested execution alternatives.
- a method for capturing media by a media capturing device is performed in a device in, or connected to, the media capturing device.
- the method comprises identifying presence, on a screen of the media capturing device, of one or more partial objects in the media to be captured and obtaining information about relations between a person and the one or more partial objects.
- the method further comprises associating, based on the obtained information, a significance value to each identified partial object, and suggesting an adjustment action of the media capturing device based on the one or more significance values.
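The four claimed steps (identify partial objects, obtain relation information, associate significance values, suggest an adjustment) can be sketched as a small pipeline. Every helper name, data shape, and significance value below is an illustrative assumption, not something defined by the claims:

```python
def identify_partial_objects(frame):
    # Step 1: a real device would run face/object recognition on the screen
    # contents; here the frame simply lists its partial objects.
    return frame["partial_objects"]

def obtain_relations(person, objects):
    # Step 2: in practice fetched from social contacts, photo albums, etc.
    return {obj: person["relations"].get(obj, "unknown") for obj in objects}

def significance_value(relation):
    # Step 3: map a relation to a significance value (assumed scale).
    return {"family": 100, "friend": 60, "unknown": 10}.get(relation, 10)

def suggest_adjustment(significance):
    # Step 4: suggest acting on the most significant partial object.
    target = max(significance, key=significance.get)
    return f"re-adjust to fully include {target}"

frame = {"partial_objects": ["face_A", "face_B"]}
person = {"relations": {"face_A": "family"}}
relations = obtain_relations(person, identify_partial_objects(frame))
scores = {obj: significance_value(rel) for obj, rel in relations.items()}
print(suggest_adjustment(scores))  # re-adjust to fully include face_A
```

The family member wins the suggestion because its assumed significance value (100) exceeds that of the unknown face (10).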
- the computer program comprises computer code which, when run on processing circuitry of a device, causes the device to perform a method according to the first aspect.
- a computer program product comprising a computer program as above, and a computer readable storage medium on which the computer program is stored.
- a device for a media capturing device is configured to identify the presence, on a screen of the media capturing device, of one or more partial objects in the media to be captured, to obtain information about relations between a person and the one or more partial objects, to associate, based on the obtained information, a significance value to each identified partial object, and to suggest an adjustment action of the media capturing device based on the one or more significance values.
- these aspects enable a user who is capturing a photo or a video to obtain a more valuable photograph in that the persons and/or objects that are captured are, in some sense, prioritized or preferred by the user.
- a user using e.g., a mobile device is provided with one or more suggested execution alternatives.
- One such execution alternative may be to assess a number of faces that risk being only partially captured and then suggest mitigations for this detected scenario.
- Persons known and related to the user may be swiftly and automatically identified and selected over persons unknown to the user.
- the user may be presented with execution alternatives to reduce e.g., partial faces or number of unknown persons.
- Fig. 1 illustrates partial face recognition according to embodiments.
- Fig. 2 illustrates partial face recognition using adjustment features according to embodiments.
- Fig. 3 illustrates a partial object avoidance feature according to embodiments.
- Fig. 4 illustrates a zooming feature according to embodiments.
- Fig. 5 illustrates a feature on prioritization of combined objects according to embodiments.
- Fig. 6 is a flowchart of various embodiments of a method.
- Fig. 7 is a schematic diagram showing functional units of a device according to an embodiment.
- Fig. 8 is a schematic diagram showing functional modules of a device according to an embodiment.
- Fig. 9 shows one example of a computer program product comprising computer readable means according to an embodiment.
- method and means such as devices, software and/or hardware, are disclosed for providing an improved way to detect and prioritize one or more partial objects, such as partial faces and/or objects located in edge regions of a capturing area of a media capturing device.
- a prioritization feature is provided, wherein, for instance, a user’s known relationships are used for suggesting objects to be included in the photograph.
- a user-specific adjustment of camera settings is disclosed, that may be used for optimizing a metric such that less user-prioritized objects are excluded or partial and that more user-prioritized objects are fully captured.
- a person taking a picture is provided with a method and means to select, in an automated way, who and/or what should be photographed, thus enabling the photographer to capture selectable persons in a photograph.
- some persons are more important than others, e.g., it is more important to include family members in a photo than strangers.
- the herein described method comprises steps of prioritizing detected partial objects in relation to the user’s various relationships, e.g., known relationships such as friends, colleagues, and family, but also potential future relationships, such as friends of a friend (second ring of friends), work colleagues and their contacts, etc.
- Information on such relations may be obtained (e.g., fetched or received) in a number of different ways, for instance via social contacts and networks, possibly in several stages/tiers, photo albums, previously captured photos, etc.
- the information is obtained in order to enable adjustment of camera settings that optimizes a partial object significance metric, such that less user-prioritized objects are excluded or remain partial, and such that more user-prioritized objects (having a higher significance metric) are fully captured to the largest extent possible.
- the method and means disclosed herein suggest that, for instance, a camera re-direction or re-adjustment is evaluated, which re-direction/re-adjustment is optimized such that the user’s prioritized faces are fully included in the on-going media capturing, thus providing a “best solution”, from the user’s viewpoint, among otherwise similarly good choices.
- FIG. 1 illustrates partial face recognition according to embodiments.
- a media capturing device e.g., a smart phone, a tablet, a smart lens, a digital camera or the like, comprises some type of media capturing capabilities. It may, for instance, be enabled for taking still photos and/ or recording videos.
- the captured media is denoted media stream, and in figure 1 an exemplary scenery 3 for a photo or video capturing session is shown. This scenery 3 comprises a crowd of faces but could of course be any type of scenery comprising any type of objects.
- the media capturing device 1 comprises an object- and/or face recognition feature for determining presence of a partial object/partial face in the media stream, in the following denoted face recognition feature for simplicity.
- for the face recognition feature, visual aids 5, 6 are provided: a full-face indicator 6, shown in the figure as a rectangle around any recognized full face currently visible on a screen 2 of the media capturing device 1, and a partial face indicator 5, shown in the figure as a dashed rectangle around a face recognized as being only partly visible on the screen 2.
- a module 4 is also indicated in figure 1, in which methods according to the present teachings may be implemented.
- the module 4 may implement all the disclosed steps or some of the steps in cooperation with e.g., a managing server 7, e.g., a cloud node acting as such a managing server 7.
- the module 4 may be an application (“app”) residing in, for instance, the media capturing device 1 and performing the steps of the method disclosed herein.
- the media capturing device 1 comprises means for obtaining information on the user’s relationships.
- Such means may, for instance, be the application, which in turn may be part of the module 4 in the media capturing device 1.
- the information on the user’s relationships may be obtained, e.g., by fetching, searching and/or requesting it from one or more of: the managing server 7, various databases, social media servers, social contact lists, photo albums, previously captured photos, e-mail texts, addresses, contacts, shared playlists, e.g., through a streaming application, and other information available in and/or by the media capturing device 1.
- the module 4 may then apply the information on these known relationships between the user and any partial face(s) to the partial faces captured in the current media stream when determining which partial faces should be included or excluded. For instance, a first partial face may be identified as a family member, while a second partial face is identified as an unknown person.
- the module 4 may associate a significance metric value to each such detected face and/or partial face.
- the significance metric value may consider the user’s social relationships, acquired via social contacts, contact information, message application entries, endorsements, photo albums, etc., and/or in relation to user context.
- the significance metric value may further consider the user’s link(s) and/ or affiliation to non-human objects (pets, cars etc.) acquired in different ways, as has been exemplified.
- the significance metric value may further take into account the context of the media capturing such as the geo-location (indoors, outdoors, home, at a venue etc.), office hours, off-work, weekdays, weekend etc.
- the significance metric value for a work colleague could be higher during office hours than outside working hours, while the significance metric value for children and coaches would be higher during a soccer game.
- the significance metric value may consider if the occasion is private or business related, or an event (acquired from external servers). There is thus a vast number of ways to associate a significance metric value to a person.
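The context-dependent examples above (a colleague ranking higher during office hours, children and coaches ranking higher during a soccer game) could be modelled roughly as follows; the categories, base values and bonuses are illustrative assumptions only:

```python
def significance(relation, context):
    # Assumed base values per relation category (not from the description).
    base = {"family": 100, "colleague": 40, "stranger": 10}.get(relation, 10)
    if relation == "colleague" and context.get("office_hours"):
        base += 30  # colleagues rank higher during office hours
    if relation == "family" and context.get("event") == "soccer game":
        base += 20  # children and coaches rank higher during a soccer game
    return base

print(significance("colleague", {"office_hours": True}))   # 70
print(significance("colleague", {"office_hours": False}))  # 40
print(significance("family", {"event": "soccer game"}))    # 120
```

The point is only that the same person can carry different significance metric values depending on the capturing context.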
- the significance metric value typically has different values associated to different persons.
- the significance metric value may, for instance, be:
- a medium priority value for less-occurring faces, that is, faces that occur less frequently in contacts, interactions, media flows, etc.; these faces may be frequently occurring but are not VIP.
- the significance metric value is applied for the detected partial faces (and/or objects).
- re-adjustment actions are provided. These re-adjustment actions may be based on the obtained significance metric values for the detected partial faces (and/or objects). The aggregated sum of the partial object significance metric values present in the media capturing should be maximized. For instance, a degree of a partial face may be given a certain score, a degree of a partial object another score, etc. The scores may then be used for prioritizing what to include in the picture.
- a face of a family member is partially shown and given a score of 65 points, which is above a set threshold score of 50 points; the threshold score is set, for instance, such that at least 50 points must be reached for the partially visible family member to be relevant for inclusion in the picture.
- significance metric values for deciding if and in what priority order to include different persons and objects.
- significance metric value for family members and an associated threshold value, thr_family, and another metric value and threshold for persons not being a family member.
- An optimization procedure may then be implemented, wherein one full family face and one partial family face are worth more than a total stranger’s face and two partial family faces.
- Exemplary significance metric values could be that a fully visible family face is given a maximum value, e.g., 100 points, and a partially visible family face is given a value of 50 points. A total stranger’s fully visible face may be given 40 points, while a partially visible face is given 20 points. A number of potential camera actions may be based on such significance metric values.
- a first possible action may be that a fully visible family member’s face, a partially visible family member’s face and stranger’s face are captured. This first possible action is then given 170 points according to the given exemplary values.
- a second possible action may be that two partially visible family member’s faces and a fully visible stranger’s face are captured. This second possible action is thus given 140 points.
- a feature in or for the media capturing device 1 may be that the action to take, e.g., the photo to be captured, is the one obtaining the highest score.
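Using the exemplary point values above (100/50 for a full/partial family face, 40/20 for a full/partial stranger’s face), picking the highest-scoring action can be sketched as follows; the tuple representation of faces is an assumption for illustration:

```python
# Exemplary significance values from the description (assumed encoding).
SCORES = {
    ("family", "full"): 100,
    ("family", "partial"): 50,
    ("stranger", "full"): 40,
    ("stranger", "partial"): 20,
}

def action_score(faces):
    """Sum the significance metric values of all faces an action would capture."""
    return sum(SCORES[(relation, visibility)] for relation, visibility in faces)

# First possible action: full family face, partial family face, partial stranger.
action_a = [("family", "full"), ("family", "partial"), ("stranger", "partial")]
# Second possible action: two partial family faces, full stranger's face.
action_b = [("family", "partial"), ("family", "partial"), ("stranger", "full")]

best = max([action_a, action_b], key=action_score)
print(action_score(action_a))  # 170
print(action_score(action_b))  # 140
```

The first action wins with 170 points against 140, matching the worked example in the text.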
- category “family” is given a first threshold and category “unknown” is given a second threshold, so that if category “family” sums up to, e.g., value1 (> thr1), the associated Action A is given priority, provided that category “unknown” sums up to value2 (< thr2), i.e., is not above its threshold value.
- the re-adjustment actions may comprise zooming out; turning the media capturing device 1 left, right, up or down, or any combinations thereof; rotating the media capturing device 1 clockwise or counter-clockwise; sliding photo position left or right; creating a panorama picture to capture relevant faces (e.g., if zoom is not an option) by stitching consecutive adjacent image captures. It is realized that there are several other types of re-adjustment actions that can be made to include faces that are currently only partly included and that have, e.g., a high priority.
- Fig. 2 illustrates partial face recognition using adjustment features according to embodiments.
- the figure shows a particular example, wherein three partial faces are indicated as PF1 (Partial Face 1), PF2 (Partial Face 2), and PF3 (Partial Face 3).
- a face recognition program has been applied and the relevance of these partial faces is determined. For instance, PF1 was found to be a social contact with a high priority, PF2 a social contact with a medium priority, and PF3 an unknown person having, e.g., a low priority, or for whom priority is not applicable at all.
- a significance metric value may thus be given to each of the partial faces.
- At least one re-adjustment action is suggested so that the aggregated significance metric value is maximized.
- an arrow “Adjustment Feature” is shown, indicating that adjustments may be suggested and be made by the user, e.g., zooming out, zooming in, adjusting the field-of-view direction, etc. It might, depending on the scenery and relative placement of the persons, happen that one or more known relations cannot be included by any re-adjustment of the media capturing device 1; for instance, only PF1 and PF2 can be included.
- One exemplary re-adjustment action that would maximize the aggregated significance metric value could be to zoom out. If many solutions with equal significance are found, various criteria on how to select among these could be used.
- the solution which minimizes the distance that the media capturing device 1 would need to move could be one criterion; another could be the solution with the least number of necessary re-adjustments.
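The tie-breaking criteria just mentioned (least movement, then fewest re-adjustments) might be combined as a lexicographic sort key among equally scored solutions; the candidate actions and their attributes below are made-up examples:

```python
from dataclasses import dataclass

@dataclass
class Adjustment:
    # Assumed attributes per candidate re-adjustment action.
    name: str
    score: int          # aggregated significance metric value it achieves
    movement_cm: float  # how far the device would need to move
    num_steps: int      # number of separate re-adjustments required

candidates = [
    Adjustment("zoom out", 170, 0.0, 1),
    Adjustment("pan left + zoom out", 170, 12.0, 2),
    Adjustment("pan right", 140, 5.0, 1),
]

# Keep only the solutions with the maximal significance score...
best_score = max(a.score for a in candidates)
tied = [a for a in candidates if a.score == best_score]
# ...then prefer the least movement, and the fewest steps as a final tie-break.
chosen = min(tied, key=lambda a: (a.movement_cm, a.num_steps))
print(chosen.name)  # zoom out
```

The same pattern also yields the prioritized order of suggestions mentioned next, by sorting instead of taking the minimum.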
- the media capturing device 1 may suggest any such re-adjustment action in a prioritized order.
- Fig. 3 illustrates a partial object avoidance feature according to embodiments.
- a trade-off may be required between objects that have the same priority values (significance metric values), or between clusters of objects of certain types.
- multiple objects 10, 12, 14 are associated with the user and there is a need for further prioritization within these objects 10, 12, 14.
- a user preference and user-specific priority rules may consider a trade-off between the objects “husband”, “car” and “family dog”. Referring to the uppermost part of the figure, only the family dog is an object that is detected and classified, while the car and the husband are detected and only partly classified. In the lower-most part of the figure, a full view of a possible image is shown, which includes all three objects.
- Fig. 4 illustrates a zooming feature according to embodiments and related to figure 3. Depending on the user’s partial object significance metric values, the result may be that the suggested re-adjustment is to zoom in on a specific object, in this case, illustrated in figure 4, the car and not the possible full view shown in lowermost part of figure 3.
- the herein presented partial object avoidance solution may, according to a further aspect, consider other photo preferences and optimization metrics than those of the user taking the photo or video.
- the module 4 may also be configured to obtain preferences from other users so that the first user’s media capturing device 1 may adjust according to any suggested adjustments. Examples on such adjustments may comprise: elevating (moving up or down), strafing (moving left or right), surging (moving forwards or backwards), pitching (looking up or down), rolling (head pivoting from side to side), yawing (head turning along horizontal axis).
- the user is requested to take a picture with some known faces included, whereas another person, the second person, is the one requesting the picture.
- the user typically does not know how to prioritize among (partial) objects in the camera’s field of view (FOV), as the user does not have relations to these objects.
- the second person may typically have other social interactions to consider than the user.
- an as-is feature is provided. Assuming that the first user has a social relation to the second person, who is the requester of the media, the user may in a FOV of the media capturing device 1 indicate and/or select a face as as-is media capturer. In a following step, the user of the media capturing device 1 may request information from the managing server 7.
- object/face recognition attributes and the associated social relations metric for the second person are requested from the managing server 7; having acquired that information, the user may, in the prioritization step, apply the settings of the second person.
- the user provides a request to the managing server 7 in order to obtain re-adjustment setting(s) associated with a person in the field of view.
- the social connections of the person indicated in the picture are known by the managing server 7 and are used by the media capturing device 1 (e.g., camera).
- the managing server 7, which may comprise or be an application manager, may provide a process execution response containing any of ACK/NACK for a requester (e.g., the first user).
- multiple versions of a capture may be stored, associated with different prioritization profiles. For example, for the same full-FOV venue scenery, e.g., as shown in figure 3, one car-enthusiast version, shown in figure 4, may be optimized and stored, another optimization and media capture for a dog-enthusiast profile, and, e.g., a third “parent’s view” optimizing and storing media with a kid’s face/doings.
- the module 4 and/or the managing server 7 may request information of e.g., public or known faces (e.g., celebrities, officials) for which a partial face optimization may be considered.
- the media capturing device 1 may output a first indicative sound, for instance a buzzing sound, and/or haptic feedback, or the like, if partial faces, partial objects, etc. are detected in the FOV of the media capturing device 1.
- the user may then, in response to an output or notification from suggested re-adjustments, move or adjust the media capturing device 1 according to a direction (according to significance metric values).
- the media capturing device 1 may output a second indicative sound, e.g., buzzing and/or haptic feedback, indicative of an increased user significance metric value. Correspondingly, if the user moves or adjusts the media capturing device 1 in a direction giving a diminishing significance metric value, it may output a third indicative sound (buzzing or haptic feedback) indicative of a decreased user significance metric value.
- a device such as smart glasses, smartwatch, smart ring, or enhanced clothing (e.g., gloves) using smart fabrics, etc., is paired and/or operating in conjunction with the media capturing device 1.
- Such device may render first, second and/or third haptic signals on instructions from the media capturing device 1 and/or the managing server 7.
- a user setting in the device 1 and/or the managing server 7 may describe, e.g., a “fraction of partial faces”, wherein, e.g., a 25 % setting might correspond to “only 1 in 4 detected faces is allowed to be partial/non-full” for the described method to be active.
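The “fraction of partial faces” setting could be evaluated as a simple activation rule; the function name and signature below are assumptions, with the 25 % default taken from the example above:

```python
def method_active(num_partial: int, num_total: int,
                  max_fraction: float = 0.25) -> bool:
    """Return True when the share of partial faces exceeds the allowed
    fraction, i.e., when re-adjustment suggestions should be activated."""
    if num_total == 0:
        return False
    return (num_partial / num_total) > max_fraction

print(method_active(2, 4))  # True: 2 of 4 faces partial exceeds the 25 % limit
print(method_active(1, 4))  # False: exactly 1 in 4 partial is still allowed
```

With the default setting, the method stays inactive as long as at most one face in four is partial.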
- a high priority value may be associated with objects or faces, with another relative weight for how to prioritize among fractions of partial objects or faces.
- the herein disclosed method may combine objects with different priority values in order to identify preferred zooming/panning of the media capturing device 1.
- Fig. 5 illustrates a feature on prioritization of combined objects according to embodiments.
- the media capturing device 1 detects and classifies several objects in its FOV, e.g., four kids and one football.
- the boxes drawn with solid lines indicate objects detected and classified and the boxes drawn with dashed lines indicate objects partly detected, prioritized, and classified.
- the method combines known image analysis with the significance metric values for the detected objects “kid” and “ball”, based on the user’s preferences.
- the analysis may, for instance and as shown in the lower part of Figure 5, result in instructions to zoom and pan to the combined object with the highest significance metric value; in this case, to zoom in on the kid with the ball and two of the other kids.
- the disclosed embodiments may consider the context of the media capturing session in order to determine that objects “kid” and “ball” implies “soccer tournament”, e.g., via out-of-band information such as event notification, schedule entries, tickets, etc.
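Prioritization of combined objects, as in the kid-with-ball example, could be sketched with a pairing bonus on top of per-object values; all weights and the bonus below are illustrative assumptions, not values from the description:

```python
# Assumed per-object significance values and a bonus for combined objects.
BASE = {"kid": 30, "ball": 10}
COMBO_BONUS = {("kid", "ball"): 25}  # a kid together with the ball scores extra

def combined_score(objects):
    """Score a candidate framing: base values plus any combination bonuses."""
    score = sum(BASE[o] for o in objects)
    for pair, bonus in COMBO_BONUS.items():
        if all(p in objects for p in pair):
            score += bonus
    return score

# Candidate framings of the Fig. 5 scene:
frame_a = ["kid", "ball", "kid", "kid"]  # kid with the ball + two other kids
frame_b = ["kid", "kid", "kid", "kid"]   # four kids, ball excluded
print(combined_score(frame_a))  # 125
print(combined_score(frame_b))  # 120
```

The bonus is what tips the suggestion towards zooming in on the kid with the ball rather than fitting a fourth kid into the frame.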
- Fig. 6 is a flowchart of various embodiments of a method 50 according to the herein presented teachings.
- the method 50 may be used for capturing media by a media capturing device 1.
- the method 50 may be performed in a module 4 that is connected to the media capturing device 1 or in a module 4 that is a part of the media capturing device 1, i.e., integrated therein.
- the module 4 may be an application (“app”) residing in, for instance, the media capturing device 1, or in a smartphone, tablet, smart lens or the like.
- the module 4 may, in various embodiments, be partially deployed in a cloud node, which acts as a managing server 7.
- the module 4 may thus comprise hardware and/or software, and the method 50 may be a computer-implemented method performing any of the herein described steps, e.g., the steps 51, 52, 53, 54 described next.
- the method 50 comprises identifying 51 the presence, on a screen of the media capturing device 1, of one or more partial objects in the media to be captured.
- the identifying can be made in different ways, for instance by applying object recognition and/or face recognition to determine presence of a partial object, for instance a face in a media stream, as described earlier, e.g., in relation to figure 1.
- the method comprises obtaining 52 information about relations between a person and the one or more partial objects.
- the information may be obtained in many different ways, for instance by receiving information from external sources such as databases or cloud entities, by receiving information as an input from a user, or by fetching and/or searching the information from external sources.
- the obtaining 52 of the information comprises searching, in digital media, for any connections between the person and the one or more partial objects in the media. Such connections may, for instance, be family connections, work relations, friends etc. Various other examples have also been given earlier.
- the information may be obtained by fetching it from a memory that is available in/from one or more of: a memory of the media capturing device 1, a digital photo album of the media capturing device 1, any information or photos available from a cloud entity, social media via an application of the media capturing device 1, contact information of the media capturing device 1.
- the method 50 entails a number of advantages.
- the method 50 enables a user to capture photos and videos with as high relevance as possible, wherein the user may set (and change) relevance scores in any desired way.
- the user is provided with execution alternatives to select among ensuring the best possible picture or video to be taken.
- the number of partial faces included in, e.g., a photo may be reduced or eliminated, family members can be prioritized, etc., in accordance with the user settings.
- the obtaining 52 of information comprises searching, in digital media, for any connections between the person and the one or more partial objects in the media.
- the obtaining 52 information may comprise obtaining information available from one or more of: a memory of the media capturing device 1, a photo album of the media capturing device 1, any information or photos available from a cloud entity, social media via an application of the media capturing device 1, contact information of the media capturing device 1.
- the method 50 comprises, in the associating 53, basing the significance value on a value given to one or more of: the person's social relations to each identified partial object, the number of obtained social interactions with each respective identified partial object, the number of occurrences of each identified partial object, and the number of found media flows for each identified partial object.
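As an illustrative sketch of the associating 53 (not part of the claimed method; the weights, field names, and scale below are hypothetical), the significance value could be computed as a weighted sum over the listed factors:

```python
# Hypothetical sketch of the associating step 53: the significance value is
# based on a value given to each listed factor. All weights are invented
# for illustration only.
WEIGHTS = {
    "social_relation": 50,   # value given to the person's social relation
    "interactions": 2,       # per obtained social interaction
    "occurrences": 1,        # per occurrence of the partial object
    "media_flows": 5,        # per media flow found for the partial object
}

def significance_value(relation_value, interactions, occurrences, media_flows):
    """Combine the factors listed in the embodiment into one value."""
    return (WEIGHTS["social_relation"] * relation_value
            + WEIGHTS["interactions"] * interactions
            + WEIGHTS["occurrences"] * occurrences
            + WEIGHTS["media_flows"] * media_flows)

# A family member (relation_value 1.0) with frequent interactions scores
# higher than an unknown person (relation_value 0.0) with none.
family = significance_value(1.0, interactions=30, occurrences=12, media_flows=4)
stranger = significance_value(0.0, interactions=0, occurrences=0, media_flows=0)
```

Any monotone combination of the four factors would serve equally well here; the point is only that each identified partial object receives one comparable number.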
- the obtaining 52 of information about relations comprises obtaining relations between a selectable person and the one or more partial objects.
- the selectable person may, for instance, be the person performing the media capturing.
- the information on the relations may comprise a social relation selected among: family member or other relatives, friends, work colleagues, neighbours etc.
- the method 50 comprises informing that the number of partial objects exceeds a set threshold.
- the user may thereby take proper action or be given suggestions on which partial objects to select (e.g., according to significance metric value).
- the method 50 comprises suggesting 54 as adjustment action to zoom in on an object, which is partially or fully included, based on a given priority of the object.
- the media capturing device 1 is a camera for taking one or both of pictures and videos, and the media thus comprises one or both of pictures and videos.
- the module 4 is configured to identify presence, on a screen 2 of the media capturing device 1, of one or more partial objects in the media to be captured.
- the screen 2 is thus showing a scenery in real-life, in the direction that the user points it.
- the module 4 is configured to obtain information about relations between a person and the one or more partial objects, and to associate, based on the obtained information, a significance value to each identified partial object.
- the module 4 is then configured to, based on the one or more significance values, suggest an adjustment action of the media capturing device 1.
- the module 4 is, in various embodiments, configured to obtain information by searching, in digital media, for any connections between the person and the one or more partial objects in the media.
- the module 4 is, in various embodiments, configured to obtain information by obtaining information available from one or more of: a memory of the media capturing device 1, a photo album of the media capturing device 1, any information or photos available from a cloud entity, social media via an application of the media capturing device 1, contact information of the media capturing device 1.
- the module 4 is, in various embodiments, configured to, in the associating, base the significance value on a value given to one or more of: the person's social relations to each identified partial object, the number of obtained social interactions with each respective identified partial object, the number of occurrences of each identified partial object, and the number of found media flows for each identified partial object.
- the module 4 is, in various embodiments, configured to suggest an adjustment action by prioritized inclusion of partial objects in the media according to their respective significance value, wherein a higher significance value represents a higher priority.
- the module 4 is, in various embodiments, configured to identify by using one or both of a first indicator 5 for indicating partial objects and a second indicator 6 for indicating fully included objects.
- the adjustment action is one or more of: zooming out, redirecting the media capturing device 1 in any suggested direction, such as forward, backwards, left, right, up or down, rotating the media capturing device 1 clockwise or counter-clockwise, sliding the media capturing position left or right, and/or creating a panorama media.
- the module 4 is, in various embodiments, configured to obtain information about relations by obtaining relations between a selectable person and the one or more partial objects.
- the selectable person is the person performing the media capturing.
- the module 4 is, in various embodiments, configured to inform that the number of partial objects exceeds a set threshold.
- the information may be a sound, a text, an alarm, a haptic signal, etc.
- the user is thereby made aware of this and can act accordingly, e.g., to sort among the partial objects, e.g., removing some or making re-adjustments.
- the module 4 is, in various embodiments, configured to suggest, as an adjustment action, zooming in on an object, partially or fully included, based on the priority of the object. This facilitates for the user to select among objects.
- in various embodiments, the media capturing device 1 is a camera for taking one or both of pictures and videos, the media thus comprising one or both of pictures and videos.
- Fig. 7 schematically illustrates, in terms of a number of functional units, the components of a module 4, as has been described, according to an embodiment.
- Processing circuitry 110 is provided using any combination of one or more of a suitable central processing unit (CPU), multiprocessor, microcontroller, digital signal processor (DSP), etc., capable of executing software instructions stored in a computer program product 330 (as shown in Fig. 9), e.g., in the form of a storage medium 130.
- the processing circuitry 110 may further be provided as at least one application specific integrated circuit (ASIC), or field programmable gate array (FPGA).
- the processing circuitry 110 is configured to cause the module 4 to perform a set of operations, or actions, as disclosed herein.
- the storage medium 130 may store the set of operations, and the processing circuitry 110 may be configured to retrieve the set of operations from the storage medium 130 to cause the module 4 to perform the set of operations.
- the set of operations may be provided as a set of executable instructions.
- the processing circuitry 110 is thereby arranged to execute methods as herein disclosed.
- the storage medium 130 may also comprise persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid state memory or even remotely mounted memory.
- the module 4 may further comprise a communications interface 120 for communications with other entities, functions, nodes, and devices, over suitable interfaces.
- the communications interface 120 may comprise one or more transmitters and receivers, comprising analogue and digital components.
- the processing circuitry 110 controls the general operation of the module 4, e.g., by sending data and control signals to the communications interface 120 and the storage medium 130, by receiving data and reports from the communications interface 120, and by retrieving data and instructions from the storage medium 130.
- Other components, as well as the related functionality, of the module 4 are omitted in order not to obscure the concepts presented herein.
- Fig. 8 schematically illustrates, in terms of a number of functional modules, the components of a module 4 according to an embodiment.
- the module 4 of Fig. 8 comprises a number of functional modules: an apply module 210 may, for instance, be configured to apply a recognition means; an obtain module 220 is configured to obtain information about relations between a person and the one or more partial objects; an associate module 230 is configured to associate, based on the obtained information, a significance value to each identified partial object; and a suggest module 240 is configured to suggest an adjustment action of the media capturing device 1 based on the one or more significance values.
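The cooperation of the four functional modules 210-240 can be sketched as follows (module names follow Fig. 8; the recognition result, the relation lookup, and the significance scale are stubbed with invented placeholder data, not taken from the disclosure):

```python
# Hypothetical sketch mirroring the functional modules 210-240 of Fig. 8.
# The recognition result and the relation lookup are stubbed; only the
# control flow between the modules is shown.
class Module4:
    def apply_recognition(self, frame):                  # apply module 210
        """Return identifiers of partial objects detected in the frame."""
        return frame.get("partial_objects", [])

    def obtain_relations(self, relations_db, objects):   # obtain module 220
        """Look up the person's relation to each identified partial object."""
        return {obj: relations_db.get(obj, "unknown") for obj in objects}

    def associate(self, relations):                      # associate module 230
        """Map each relation to a significance value (invented scale)."""
        scale = {"family": 100, "friend": 60, "unknown": 0}
        return {obj: scale[rel] for obj, rel in relations.items()}

    def suggest(self, values):                           # suggest module 240
        """Suggest an adjustment prioritising the highest-valued object."""
        best = max(values, key=values.get)
        return f"adjust to fully include {best}"

module = Module4()
frame = {"partial_objects": ["PF1", "PF2"]}
relations_db = {"PF1": "family"}                         # PF2 is unknown
values = module.associate(
    module.obtain_relations(relations_db, module.apply_recognition(frame)))
suggestion = module.suggest(values)
```

Each method corresponds to one functional module; in an actual implementation these could equally be realized in hardware or distributed between the device and a managing server.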
- the module 4 of Fig. 8 may further comprise a number of optional functional modules, for performing the method as disclosed herein.
- each functional module 210 - 240 may be implemented in hardware or in software.
- one or more or all functional modules 210 - 240 may be implemented by the processing circuitry 110, possibly in cooperation with the communications interface 120 and the storage medium 130.
- the processing circuitry 110 may thus be arranged to fetch, from the storage medium 130, instructions as provided by a functional module 210 - 240 and to execute these instructions, thereby performing any actions of the module 4 as disclosed herein.
- Fig. 9 shows one example of a computer program product 330 comprising computer readable means 340.
- on the computer readable means 340, a computer program 320 can be stored, which computer program 320 can cause the processing circuitry 110 and thereto operatively coupled entities and devices, such as the communications interface 120 and the storage medium 130, to execute methods according to embodiments described herein.
- the computer program 320 and/or computer program product 330 may thus provide means for performing any actions of the module 4 as disclosed herein.
- the computer program product 330 is illustrated as an optical disc, such as a CD (compact disc) or a DVD (digital versatile disc) or a Blu-Ray disc.
- the computer program product 330 could also be embodied as a memory, such as a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or an electrically erasable programmable read-only memory (EEPROM) and more particularly as a non-volatile storage medium of a device in an external memory such as a USB (Universal Serial Bus) memory or a Flash memory, such as a compact Flash memory.
- while the computer program 320 is here schematically shown as a track on the depicted optical disc, the computer program 320 can be stored in any way which is suitable for the computer program product 330.
Abstract
A method for capturing media by a media capturing device is disclosed. The method is performed in a device in, or connected to, the media capturing device. The method comprises identifying presence, on a screen of the media capturing device, of one or more partial objects in the media to be captured; obtaining information about relations between a person and the one or more partial objects; associating, based on the obtained information, a significance value to each identified partial object; and suggesting an adjustment action of the media capturing device (1) based on the one or more significance values. A device for a media capturing device, computer program and computer program product are also disclosed.
Description
METHOD AND DEVICES FOR MEDIA CAPTURING
TECHNICAL FIELD
The technology disclosed herein relates generally to the field of media capturing, and in particular to devices and methods for capturing media by a media capturing device.
BACKGROUND
Nowadays it is more common than not to have a picture capturing device, such as a mobile device, readily available. Taking a picture or capturing a video using the mobile device may however be frustrating in many ways. It is not uncommon that the user ends up having taken a photo of only parts of a face or body and not the intended full face and/or body.
It is known to identify missing features of an object by using a camera preview while capturing the image. The user is notified that, e.g., a part of a face is outside the picture and may then adapt camera parameters so that no features, or at least fewer, are lost while capturing, thereby giving the best possible result. For example, a user is taking a picture of a soccer team having 19 players and 3 leaders. If the user somehow risks capturing an image including several partial faces, a known solution is that re-adjustments of camera parameters are made in order to avoid the partial faces.
In this very common and popular activity of taking pictures there is still a need for finding ways to facilitate for the user to take the best possible picture.
SUMMARY
An objective of embodiments herein is to address and improve various aspects relating to picture capturing. A particular objective is to improve quality of a photograph by avoiding partial objects and/or persons being photographed. Still another objective is to enable, preferably automatically, a user to select which persons to prioritize, such as friends and relatives, even in a crowd. Another objective is to provide a user, in the action of capturing a photo or a video using, e.g., a mobile device, with one or more suggested execution alternatives. These objectives and others are achieved by the methods, devices, computer programs and computer program products according to the appended independent claims, and by the embodiments according to the dependent claims.
According to a first aspect there is presented a method for capturing media by a media capturing device. The method is performed in a device in, or connected to, the media capturing device. The method comprises identifying presence, on a screen of the media capturing device, of one or more partial objects in the media to be captured and obtaining information about relations between a person and the one or more partial objects. The method further comprises associating, based on the obtained information, a significance value to each identified partial object, and suggesting an adjustment action of the media capturing device based on the one or more significance values.
According to a second aspect there is presented a computer program for a media capturing device. The computer program comprises computer code which, when run on processing circuitry of a device, causes the device to perform a method according to the first aspect.
According to a third aspect there is presented a computer program product comprising a computer program as above, and a computer readable storage medium on which the computer program is stored.
According to a fourth aspect there is presented a device for a media capturing device. The device is configured to identify presence, on a screen of the media capturing device, of one or more partial objects in the media to be captured, to obtain information about relations between a person and the one or more partial objects, to associate, based on the obtained information, a significance value to each identified partial object, and to suggest an adjustment action of the media capturing device based on the one or more significance values.
Advantageously, these aspects enable a user in the action of capturing a photo or a video to obtain a more valuable photograph in that the persons and/or objects that are captured are - in some sense - prioritized or preferred by the user. A user using, e.g., a mobile device is provided with one or more suggested execution alternatives. One such execution alternative may be to assess a number of faces that risk being only partially captured and then suggest mitigations for this detected scenario. Persons known and related to the user may be swiftly and automatically identified and selected over persons unknown to the user. The user may be presented with execution alternatives to reduce, e.g., partial faces or the number of unknown persons.
Other objectives, features and advantages of the enclosed embodiments will be apparent from the following detailed disclosure, from the attached dependent claims as well as from the drawings.
Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/an/the element, apparatus, component, means, module, action, etc." are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, module, action, etc., unless explicitly stated otherwise. The actions of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
BRIEF DESCRIPTION OF THE DRAWINGS
The inventive concept is now described, by way of example, with reference to the accompanying drawings, in which:
Fig. 1 illustrates partial face recognition according to embodiments.
Fig. 2 illustrates partial face recognition using adjustment features according to embodiments.
Fig. 3 illustrates a partial object avoidance feature according to embodiments.
Fig. 4 illustrates a zooming feature according to embodiments.
Fig. 5 illustrates a feature on prioritization of combined objects according to embodiments.
Fig. 6 is a flowchart of various embodiments of a method.
Fig. 7 is a schematic diagram showing functional units of a device according to an embodiment.
Fig. 8 is a schematic diagram showing functional modules of a device according to an embodiment.
Fig. 9 shows one example of a computer program product comprising computer readable means according to an embodiment.
DETAILED DESCRIPTION
The inventive concept will now be described more fully hereinafter with reference to the accompanying drawings, in which certain embodiments of the inventive concept are shown. This inventive concept may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided by way of example so that this disclosure will be thorough and complete, and will fully convey the scope of the inventive concept to those skilled in the art. Like numbers refer to like elements throughout the description. Any action or feature illustrated by dashed lines should be regarded as optional.
Briefly, methods and means, such as devices, software and/or hardware, are disclosed for providing an improved way to detect and prioritize one or more partial objects, such as partial faces and/or objects located in edge regions of a capturing area of a media capturing device. A prioritization feature is provided, wherein, for instance, a user's known relationships are used for suggesting objects to be included in the photograph. A user-specific adjustment of camera settings is disclosed, which may be used for optimizing a metric such that less user-prioritized objects are excluded or remain partial and that more user-prioritized objects are fully captured.
A person taking a picture is provided with a method and means to select, in an automated way, who and/or what should be photographed, thus enabling the photographer to catch selectable persons in a photograph. Typically, some persons are more important than others, e.g., it is more important to include family members in a photo than strangers.
In steps of evaluating and determining which partial object(s) to focus on, the herein described method comprises steps of prioritizing detected partial objects in relation to the user's various relationships, e.g., known relationships such as friends, colleagues, and family, but also potential future relationships, such as friends of a friend (2nd ring of friends), work colleagues and their contacts, etc. Information on such relations may be obtained (e.g., fetched or received) in a number of different ways, for instance via social contacts and networks, possibly in several stages/tiers, photo albums, previously captured photos, etc. The information is obtained in order to enable adjustment of camera settings that optimizes a partial object significance metric, such that less user-prioritized objects are excluded or remain partial, and/or that more user-prioritized objects (having a higher significance metric) are fully captured to the largest extent possible.
For example, returning to the scenario described in the background section, a user is about to take a picture of a soccer team with 19 players and 3 leaders, among which two players are the user's kids. Then, if the user somehow is about to capture an image with several partial faces, the method and means disclosed herein suggest that, for instance, a camera re-direction or re-adjustment is evaluated, which re-direction/re-adjustment is optimized such that the user's prioritized faces are fully included in the on-going media capturing, thus providing a "best solution", from the viewpoint of the user, among otherwise similarly good choices.
Fig. 1 illustrates partial face recognition according to embodiments. A media capturing device 1, e.g., a smart phone, a tablet, a smart lens, a digital camera or the like, comprises some type of media capturing capabilities. It may, for instance, be enabled for taking still photos and/or recording videos. In the following, the captured media is denoted media stream, and in figure 1 an exemplary scenery 3 for a photo or video capturing session is shown. This scenery 3 comprises a crowd of faces but could of course be any type of scenery comprising any type of objects.
The media capturing device 1 comprises an object- and/or face recognition feature for determining presence of a partial object/partial face in the media stream, in the following denoted face recognition feature for simplicity. For the face recognition feature, visual aids 5, 6 are provided: a full-face indicator 6 shown in the figure as a rectangle around any recognized full face that is currently visible on a screen 2 of the media capturing device 1, and a partial face indicator 5, in the figure shown as a rectangle with dashed lines around a face recognized as being only partly visible on the screen 2.
A module 4 is also indicated in figure 1, in which methods according to the present teachings may be implemented. The module 4 may implement all the disclosed steps, or some of the steps in cooperation with, e.g., a managing server 7, e.g., a cloud node acting as such a managing server 7. The module 4 may be an application ("app") residing in, for instance, the media capturing device 1 and performing the steps of the method disclosed herein.
The media capturing device 1 comprises means for obtaining information on the user's relationships. Such means may, for instance, be the application, which in turn may be part of the module 4 in the media capturing device 1. The information on the user's relationships may be obtained, e.g., by fetching, searching and/or requesting it from one or more of: the managing server 7, various databases, social media servers, social contact lists, photo albums, previously captured photos, e-mail texts, addresses, contacts, shared playlists (e.g., through a streaming application) and other information available in and/or by the media capturing device 1.
The module 4 (e.g., being an application) may then apply the information on known relationships between the user and any partial face(s) captured in the current media stream when determining which partial faces should be included or excluded. For instance, a first partial face may be identified as a family member, while a second partial face is identified as an unknown person.
The module 4 may associate a significance metric value to each such detected face and/or partial face. The significance metric value may consider the user's social relationships, acquired via social contacts, contact information, message application entries, endorsements, photo albums, etc., and/or in relation to user context. The significance metric value may further consider the user's link(s) and/or affiliation to non-human objects (pets, cars etc.) acquired in different ways, as has been exemplified. The significance metric value may further take into account the context of the media capturing, such as the geo-location (indoors, outdoors, home, at a venue etc.), office hours, off-work, weekdays, weekend etc. For instance, if taking a picture during work hours (e.g., geo-located at a known job site, or information on a work-related meeting obtained from a calendar), the significance metric value for a work colleague could be higher during office hours than outside working hours, while the significance metric value for children and coaches would be higher during a soccer game. Still further, the significance metric value may consider if the occasion is private or business related, or an event (acquired from external servers). There is thus a vast number of ways to associate a significance metric value with a person.
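The context dependence described above can be sketched as follows (the categories, contexts, and numbers are illustrative assumptions, not values from the disclosure):

```python
# Hypothetical context-dependent significance: the same person can score
# differently depending on when/where the media is captured.
BASE = {"family": 100, "colleague": 40, "unknown": 0}

def context_significance(category, context):
    value = BASE[category]
    if category == "colleague" and context == "office_hours":
        value *= 2          # colleagues weighted up during working hours
    if category == "family" and context == "soccer_game":
        value += 20         # kids and coaches weighted up at the game
    return value

at_work = context_significance("colleague", "office_hours")
off_work = context_significance("colleague", "weekend")
```

In practice the context (geo-location, calendar entries, time of day) would be obtained from the device and external servers as described above; here it is simply passed in as a string.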
The significance metric value typically has different values associated to different persons. As a set of particular examples, the significance metric value may be:
- A high priority value for faces that are frequently occurring in the user's contacts, social interactions and media flows. Such faces may be family, kids, or friends, and these may all be VIP tagged.
- A medium priority value for less occurring faces. That is, faces that are less frequently occurring in contacts, interactions, media flows etc., and these faces may be frequently occurring but not VIP.
- A low priority value for seldomly occurring faces, occurring infrequently in contacts, interactions and/or media flows. These faces are known but not favored.
- A not-applicable priority value for faces that are not present in any media flow interactions for the user, i.e., unknown faces.
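The four-tier scheme above can be sketched as a simple classifier over occurrence statistics (the numeric thresholds and the VIP flag are hypothetical placeholders):

```python
# Hypothetical mapping from occurrence statistics to the four priority tiers.
def priority_tier(occurrences, vip_tagged=False):
    if occurrences == 0:
        return "not-applicable"   # unknown face: no media flow interactions
    if vip_tagged:
        return "high"             # frequently occurring family/kids/friends
    if occurrences >= 10:
        return "medium"           # frequently occurring but not VIP
    return "low"                  # known but seldom occurring, not favoured

tiers = [priority_tier(50, vip_tagged=True),   # VIP-tagged family member
         priority_tier(25),                    # frequent but not VIP
         priority_tier(3),                     # known but seldom seen
         priority_tier(0)]                     # unknown face
```

The choice of where the medium/low boundary sits (10 occurrences here) is arbitrary; any monotone rule over the counted statistics yields the same four-tier structure.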
The significance metric value is applied for the detected partial faces (and/or objects).
In the media capturing application of the media capturing device 1, suggested camera re-adjustment actions are provided. These re-adjustment actions may be based on the obtained significance metric values for the detected partial faces (and/or objects). The aggregated sum of the partial object significance metric values present in the media capturing should be maximized. For instance, a degree of a partial face may be given a certain score, a degree of a partial object another score, etc. The scores may then be used for prioritizing what to include in the picture. As a particular example, a partially shown face of a family member is given a score of 65 points, which is above a certain set threshold score of 50 points, the threshold score being set, for instance, such that for the partially visible family member to be relevant for inclusion in the picture at least 50 points should be reached. There may be various such significance metric values for deciding if, and in what priority order, to include different persons and objects.
As another example, there may be a significance metric value for family members and an associated threshold value, thr_family, and another metric value and threshold for persons not being family members. An optimization procedure may then be implemented, wherein it is worth more to have one full family face and one partial family face than a total stranger's full face and two partial family faces.
Exemplary significance metric values could be that a fully visible family face is given a maximum value, e.g., 100 points, and a partially visible family face is given a value of 50 points. A total stranger’s fully visible face may be given 40 points, while a partially visible face is given 20 points. A number of potential camera actions may be based on such significance metric values.
For instance, a first possible action may be that a fully visible family member's face, a partially visible family member's face and a partially visible stranger's face are captured. This first possible action is then given 170 points according to the given exemplary values.
A second possible action may be that two partially visible family member's faces and a fully visible stranger's face are captured. This second possible action is thus given 140 points.
A feature in or for the media capturing device 1 may be that the action to take, i.e., the photo (for instance) to be captured, is the one obtaining the highest score.
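The worked example above can be reproduced in a short sketch: each candidate action lists the faces it would capture, the aggregated significance is summed, and the highest-scoring action is selected (the point values come from the example; the tuple representation is a hypothetical choice):

```python
# Scores from the worked example: full/partial faces of family vs stranger.
SCORES = {("family", "full"): 100, ("family", "partial"): 50,
          ("stranger", "full"): 40, ("stranger", "partial"): 20}

def action_score(faces):
    """Aggregate significance of all faces an action would capture."""
    return sum(SCORES[f] for f in faces)

# Action A: full family face, partial family face, partial stranger face.
# Action B: two partial family faces, full stranger face.
actions = {
    "A": [("family", "full"), ("family", "partial"), ("stranger", "partial")],
    "B": [("family", "partial"), ("family", "partial"), ("stranger", "full")],
}
scores = {name: action_score(faces) for name, faces in actions.items()}
best = max(scores, key=scores.get)   # the action to suggest to the user
```

This reproduces the 170-point vs 140-point comparison of the text, with action A being the suggested one.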
There may also be rules such that the category "family" is given a first threshold and the category "unknown" a second threshold, so that if the category "family" sums up to, e.g., value1 (> thr1), the associated Action A is given priority, given that the category "unknown" sums up to value2 (< thr2), i.e., does not exceed its threshold value.
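This per-category rule can be sketched as a predicate: Action A is prioritized only when the "family" total exceeds its threshold while the "unknown" total stays below its own (thr1/thr2 are the placeholders from the text; the numeric defaults are invented):

```python
# Hypothetical per-category threshold rule: prioritise Action A when the
# "family" total exceeds thr1 AND the "unknown" total stays below thr2.
def prioritise_action_a(category_sums, thr1=100, thr2=60):
    return (category_sums.get("family", 0) > thr1
            and category_sums.get("unknown", 0) < thr2)

ok = prioritise_action_a({"family": 150, "unknown": 40})      # rule satisfied
blocked = prioritise_action_a({"family": 150, "unknown": 80}) # too many unknowns
```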
The re-adjustment actions may comprise zooming out; turning the media capturing device 1 left, right, up or down, or any combinations thereof; rotating the media capturing device 1 clockwise or counter-clockwise; sliding the photo position left or right; and creating a panorama picture to capture relevant faces (e.g., if zoom is not an option) by stitching consecutive adjacent image captures. It is realized that there are several other types of re-adjustment actions that can be made to include faces that are currently only partly included and that have, e.g., a high priority.
Fig. 2 illustrates partial face recognition using adjustment features according to embodiments. The figure shows a particular example, wherein three partial faces are indicated as PF1 (Partial Face 1), PF2 (Partial Face 2), and PF3 (Partial Face 3). A face recognition program has been applied and the relevance of these partial faces is determined. For instance, PF1 was found to be a social contact with a high priority, PF2 a social contact with a medium priority, and PF3 an unknown person having, e.g., a low priority or for which priority is not applicable at all. A significance metric value may thus be given to each of the partial faces.
Next, at least one re-adjustment action is suggested so that the aggregated significance metric value is maximized. In the figure, an arrow "Adjustment Feature" is shown, indicating that adjustments may be suggested and be made by the user, e.g., zooming out, zooming in, adjusting the field-of-view direction, etc. It might, depending on the scenery and the relative placement of the persons, happen that one or more known relations cannot be included by any re-adjustment of the media capturing device 1, but, for instance, only PF1 and PF2. One exemplary re-adjustment action that would maximize the aggregated significance metric value could be to zoom out. If many solutions with equal significance are found, various criteria on how to select among these could be used. For instance, one criterion could be the solution which minimizes the distance that the media capturing device 1 would need to move; another could be the solution with the least number of necessary re-adjustments. The media capturing device 1 may suggest any such re-adjustment actions in a prioritized order.
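The two tie-breaking criteria mentioned above (smallest required movement, fewest individual re-adjustments) can be sketched as a composite sort key over candidate actions (the candidate dictionaries and their fields are a hypothetical representation):

```python
# Hypothetical tie-breaking among equally significant re-adjustment actions:
# prefer higher significance, then smaller movement, then fewer adjustments.
candidates = [
    {"name": "zoom out",        "significance": 170, "movement_cm": 0,  "steps": 1},
    {"name": "pan left",        "significance": 170, "movement_cm": 30, "steps": 1},
    {"name": "step back + pan", "significance": 170, "movement_cm": 50, "steps": 2},
]

def rank_key(c):
    # Negate significance so that higher values sort first.
    return (-c["significance"], c["movement_cm"], c["steps"])

suggested = sorted(candidates, key=rank_key)   # prioritized order of suggestions
```

With all three candidates at equal significance, the minimum-movement criterion decides the order, which matches the prioritized suggestion list described in the text.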
Fig. 3 illustrates a partial object avoidance feature according to embodiments. In some cases, a trade-off may be required between objects that have the same priority values (significance metric values), or between clusters of objects of certain types. In the scenario shown in figure 3, multiple objects 10, 12, 14 are associated with the user and there is a need for further prioritization within these objects 10, 12, 14.
In the illustrated case, a user preference and user-specific priority rules may consider a trade-off between the objects "husband", "car" and "family dog". Referring to the uppermost part of the figure, only the family dog is an object that is detected and classified, while the car and the husband are detected and only partly classified. In the lowermost part of the figure, a full view of a possible image is shown, which includes all three objects.
Fig. 4 illustrates a zooming feature according to embodiments and related to figure 3. Depending on the user’s partial object significance metric values, the result may be that the suggested re-adjustment is to zoom in on a specific object, in this case, illustrated in figure 4, the car and not the possible full view shown in lowermost part of figure 3.
The herein presented partial object avoidance solution may, according to a further aspect, consider other photo preferences and optimization metrics than those of the user taking the photo or video. The module 4 may also be configured to obtain preferences from other users so that the first user's media capturing device 1 may adjust according to any suggested adjustments. Examples of such adjustments may comprise: elevating (moving up or down), strafing (moving left or right), surging (moving forwards or backwards), pitching (looking up or down), rolling (head pivoting from side to side), and yawing (head turning along the horizontal axis).
For example, the user is requested to take a picture with some known faces included, whereas another person, a second person, is the one requesting the picture. However, the user typically does not know how to prioritize among (partial) objects in the camera’s field of view (FOV), as the user has no relations to these objects. The second person may typically have other social interactions to consider than the user.
Therefore, in order to support execution of partial object avoidance when capturing media on behalf of other persons, in this example the second person, an as-is feature is provided. Assuming that the first user has a social relation to the second person, who is the requester of the media, the user may, in a FOV of the media capturing device 1, indicate and/or select a face as the as-is media capturer. In a following step, the user of the media capturing device 1 may request information from the managing server 7.
In one embodiment, object/face recognition attributes and an associated social relations metric for the second person are requested from the managing server 7, and having acquired that information, the user may, in the prioritization step, apply the settings of the second person.
In another embodiment, the user provides a request to the managing server 7 in order to obtain re-adjustment setting(s) associated with a person in the field of view.
The social connections of the person indicated in the picture are known by the managing server 7 and are used by the media capturing device 1 (e.g., camera).
In any of the described embodiments, the managing server 7, which may comprise or be an application manager, may provide a process execution response containing an ACK/NACK for a requester (e.g., the first user).
Further, in the process where certain partial objects are detected and actions are potentially taken with respect to optimization, multiple versions of the captured media may be stored, associated with different prioritization profiles. For example, for the same full-FOV venue scenery, e.g., as shown in figure 3, one car-enthusiast version, shown in figure 4, may be optimized and stored; another version may be optimized and captured for a dog-enthusiast profile; and, e.g., a third “parent’s view” version may optimize and store media with the kid’s face/doings.
In a step where, e.g., no known or identified “personal relation” objects/faces are present in the media capturing, the module 4 and/or the managing server 7 may request information on, e.g., public or known faces (e.g., celebrities, officials) for which a partial face optimization may be considered.
The media capturing device 1 may output a first indicative sound, for instance a buzzing sound, and/or haptic feedback, or the like, if partial faces, partial objects, etc. are detected in the FOV of the media capturing device 1. The user may then, in response to an output or notification with suggested re-adjustments, move or adjust the media capturing device 1 according to a direction (according to significance metric values). The media capturing device 1 may output a second indicative sound, e.g., buzzing and/or haptic feedback, indicative of an increased user significance metric value. Correspondingly, if the user moves or adjusts the media capturing device 1 in a direction giving a diminishing significance metric value, it may output a third indicative sound (buzzing or haptic feedback) indicative of a decreased user significance metric value.
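One hypothetical way to select among the three indicative signals is to compare the significance metric value before and after the user's movement; the signal names below are placeholders, not terms from the application:

```python
# Illustrative sketch (assumed logic, not from the application): map the
# change in significance metric value to one of three indicative signals.

def feedback_signal(prev_significance, new_significance, partial_detected):
    """Return which indicative signal (sound/haptic) to emit.
    prev_significance is None before any adjustment has been made."""
    if prev_significance is None:
        # Initial detection of partial faces/objects in the FOV.
        return "first_signal" if partial_detected else None
    if new_significance > prev_significance:
        return "second_signal"  # moving toward increased significance
    if new_significance < prev_significance:
        return "third_signal"   # moving toward decreased significance
    return None                 # no change, no feedback
```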
In a related aspect, a device (not illustrated), such as smart glasses, a smartwatch, a smart ring, or enhanced clothing (e.g., gloves) using smart fabrics, etc., is paired and/or operating in conjunction with the media capturing device 1. Such a device may render the first, second and/or third haptic signals on instructions from the media capturing device 1 and/or the managing server 7.
A user setting in the device 1 and/or the managing server 7 may describe, e.g., a “fraction of partial faces”, wherein, e.g., a 25 % setting might correspond to “only 1 in 4 detected faces are allowed to be partial/non-full” for the described method to be active. In this aspect, a high priority value may be associated with objects or faces having another relative weight in how to prioritize among fractions of partial objects or faces.
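Under the reading that the method activates once the fraction of partial faces exceeds the configured setting (an assumption, since the setting is only exemplified above), the check could be sketched as:

```python
# Illustrative sketch of the "fraction of partial faces" setting. The
# activation rule (fraction strictly exceeding the setting) is an assumed
# interpretation of the 25 % = "1 in 4 allowed" example.

def method_active(num_partial, num_detected, fraction_setting=0.25):
    """Return True if the partial-object avoidance method should activate."""
    if num_detected == 0:
        return False
    return (num_partial / num_detected) > fraction_setting
```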
In a further example, the herein disclosed method may combine objects with different priority values in order to identify preferred zooming/panning of the media capturing device 1.
Fig. 5 illustrates a feature on prioritization of combined objects according to embodiments. To the left in the uppermost view of figure 5, the media capturing device 1 detects and classifies several objects in its FOV, e.g., four kids and one football. Here, the boxes drawn with solid lines indicate objects detected and classified, and the boxes drawn with dashed lines indicate objects partly detected, prioritized, and classified. In an aspect, the method combines known image analysis with the significance metric values for the detected objects “kid” and “ball” based on the user’s preferences.
The analysis may, for instance and as shown in the lower part of figure 5, result in instructions for zooming and panning toward the combined object with the highest significance metric value, in this case to zoom in on the kid with the ball and two of the other kids.
The disclosed embodiments may consider the context of the media capturing session in order to determine that the objects “kid” and “ball” imply “soccer tournament”, e.g., via out-of-band information such as event notifications, schedule entries, tickets, etc.
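The combination of detected objects into a single combined object with a joint significance can be sketched as below. This is an illustrative pairwise combination with an optional context bonus; the scoring model is an assumption, not part of the application:

```python
# Hypothetical sketch: combine detected objects (e.g., "kid" + "ball") into
# pairs and pick the pair with the highest summed significance, optionally
# boosted by out-of-band context (e.g., a "soccer tournament" event entry).

from itertools import combinations

def best_combined_object(objects, context_bonus=0.0):
    """objects: list of (label, significance) tuples.
    Returns the labels of the highest-scoring pair."""
    best_pair, best_score = None, float("-inf")
    for (label_a, sig_a), (label_b, sig_b) in combinations(objects, 2):
        score = sig_a + sig_b + context_bonus
        if score > best_score:
            best_pair, best_score = (label_a, label_b), score
    return best_pair
```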
Fig. 6 is a flowchart of various embodiments of a method 50 according to the herein presented teachings. The method 50 may be used for capturing media by a media capturing device 1. The method 50 may be performed in a module 4 that is connected to the media capturing device 1 or in a module 4 that is a part of the media capturing device 1, i.e., integrated therein. The module 4 may be an application (“app”) residing
in, for instance, the media capturing device 1, or in a smartphone, tablet, smart lens or the like. The module 4 may, in various embodiments, be partially deployed in a cloud node, which acts as a managing server 7. The module 4 may thus comprise hardware and/or software, and the method 50 may be a computer-implemented method performing any of the herein described steps, e.g., the steps 51, 52, 53, 54 described next.
The method 50 comprises identifying 51 the presence of one or more partial objects in the media to be captured on a screen of the media capturing device 1. The identifying can be made in different ways, for instance by applying object recognition and/or face recognition to determine presence of a partial object, for instance a face in a media stream, as described earlier, e.g., in relation to figure 1.
The method comprises obtaining 52 information about relations between a person and the one or more partial objects. The information may be obtained in many different ways, for instance by receiving information from external sources such as databases or cloud entities, by receiving information as an input from a user, or by fetching and/or searching the information from external sources. In some embodiments, the obtaining 52 of the information comprises searching, in digital media, for any connections between the person and the one or more partial objects in the media. Such connections may, for instance, be family connections, work relations, friends, etc. Various other examples have also been given earlier.
In further examples the information may be obtained by fetching it from a memory that is available in/from one or more of: a memory of the media capturing device 1, a digital photo album of the media capturing device 1, any information or photos available from a cloud entity, social media via an application of the media capturing device 1, contact information of the media capturing device 1.
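A minimal sketch of aggregating relation information from several of the sources listed above could look as follows; the source callables are hypothetical stand-ins for device memory, photo albums, cloud entities, social media and contact information:

```python
# Illustrative sketch (names are assumptions): query a prioritized list of
# information sources and keep the first relation found per partial object.

def obtain_relations(person, partial_objects, sources):
    """sources: iterable of callables (person, obj) -> relation or None,
    tried in order for each partial object."""
    relations = {}
    for obj in partial_objects:
        for source in sources:
            relation = source(person, obj)
            if relation is not None:
                relations[obj] = relation
                break  # first source with an answer wins
    return relations
```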
The method 50 comprises associating 53 a significance value to each identified partial object, based on the obtained information. In various embodiments, the associating 53 comprises basing the significance value on a value given to one or more of: the person’s social relations to each identified partial object, number of obtained social interactions with each respective identified partial object, number of occurrences of each identified partial object, number of found media flows for each identified partial object.
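The factors listed for step 53 could, as one illustrative option, be combined as a weighted sum; the weights and factor names below are assumptions for the sketch, not values from the application:

```python
# Illustrative sketch: weighted combination of the significance factors
# listed above. Factor names and weights are hypothetical.

def significance_value(factors, weights=None):
    """factors: dict with scores (0..1) for 'social_relation',
    'interactions', 'occurrences' and 'media_flows'."""
    weights = weights or {"social_relation": 0.4, "interactions": 0.3,
                          "occurrences": 0.2, "media_flows": 0.1}
    return sum(weights[k] * factors.get(k, 0.0) for k in weights)
```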
The method 50 comprises suggesting 54 an adjustment action of the media capturing device 1 based on the one or more significance values. Several examples of adjustment actions have been given earlier.
The method 50 entails a number of advantages. The method 50 enables a user to capture photos and videos with as high relevance as possible, wherein the user may set (and change) relevance scores in any desired way. The user is provided with execution alternatives to select among, ensuring the best possible picture or video is taken. The number of partial faces included in, e.g., a photo may be reduced or eliminated, family members can be prioritized, etc., in accordance with the user settings.
In various embodiments, the obtaining 52 of information comprises searching, in digital media, for any connections between the person and the one or more partial objects in the media. The obtaining 52 information may comprise obtaining information available from one or more of: a memory of the media capturing device 1, a photo album of the media capturing device 1, any information or photos available from a cloud entity, social media via an application of the media capturing device 1, contact information of the media capturing device 1.
In various embodiments, the method 50 comprises, in the associating 53, basing the significance value on a value given to one or more of: the person’s social relations to each identified partial object, number of obtained social interactions with each respective identified partial object, number of occurrences of each identified partial object, number of found media flows for each identified partial object.
In various embodiments, the suggesting 54 an adjustment action comprises suggesting prioritizing inclusion of partial objects in the media according to their respective significance value, wherein the higher significance value, the higher priority.
In various embodiments, the identifying comprises using one or both of a first indicator 5 for indicating partial objects and a second indicator 6 for indicating fully included objects. Such indicators have been exemplified by rectangles herein, but various other indicators may be used instead of or in addition to rectangles.
In various embodiments, the adjustment action is one or more of: zooming out, redirecting the media capturing device 1 in any suggested direction, such as forward, backwards, left, right, up or down, rotating the media capturing device 1 clockwise or counter clockwise, sliding media capturing position left or right, and/or creating a panorama media.
In various embodiments, the obtaining 52 information about relations comprises obtaining relations between a selectable person and the one or more partial objects. The selectable person may, for instance, be the person performing the media capturing. The information on the relations may comprise a social relation selected among: family member or other relatives, friends, work colleagues, neighbours etc.
In various embodiments, the method 50 comprises informing that the number of partial objects exceeds a set threshold. The user may thereby take proper action or be given suggestions on which partial objects to select (e.g., according to significance metric value).
In various embodiments, the method 50 comprises suggesting 54, as an adjustment action, zooming in on an object, which is partially or fully included, based on a given priority of the object.
In various embodiments, the media capturing device 1 is a camera for taking one or both of pictures and videos, and the media thus comprising one or both of pictures and videos.
A module 4 for a media capturing device 1 is also disclosed, the module 4 being configured to perform any of the embodiments, as has been described. As has been described herein, the module 4 may, for instance, be an application (“app”) residing in, for instance, the media capturing device 1 and performing the steps of the method disclosed herein.
The module 4 is configured to identify the presence, on a screen 2 of the media capturing device 1, of one or more partial objects in the media to be captured. The screen 2 thus shows a real-life scenery in the direction in which the user points the device.
The module 4 is configured to obtain information about relations between a person and the one or more partial objects, and to associate, based on the obtained information, a significance value to each identified partial object.
The module 4 is then configured to, based on the one or more significance values, suggest an adjustment action of the media capturing device 1.
The module 4 is, in various embodiments, configured to obtain information by searching, in digital media, for any connections between the person and the one or more partial objects in the media.
The module 4 is, in various embodiments, configured to obtain information by obtaining information available from one or more of: a memory of the media capturing device 1, a photo album of the media capturing device 1, any information or photos available from a cloud entity, social media via an application of the media capturing device 1, contact information of the media capturing device 1.
The module 4 is, in various embodiments, configured to, in the associating, base the significance value on a value given to one or more of: the person’s social relations to each identified partial object, number of obtained social interactions with each respective identified partial object, number of occurrences of each identified partial object, number of found media flows for each identified partial object.
The module 4 is, in various embodiments, configured to suggest an adjustment action by prioritizing inclusion of partial objects in the media according to their respective significance value, wherein a higher significance value represents a higher priority.
The module 4 is, in various embodiments, configured to identify by using one or both of a first indicator 5 for indicating partial objects and a second indicator 6 for indicating fully included objects.
In various embodiments, the adjustment action is one or more of: zooming out, redirecting the media capturing device 1 in any suggested direction, such as forward, backwards, left, right, up or down, rotating the media capturing device (1) clockwise or counter clockwise, sliding media capturing position left or right, and/or creating a panorama media.
The module 4 is, in various embodiments, configured to obtain information about relations by obtaining relations between a selectable person and the one or more partial objects. In some embodiments, the selectable person is the person performing the media capturing.
The module 4 is, in various embodiments, configured to inform that the number of partial objects exceeds a set threshold. The information may be a sound, text, alarm, haptics, etc. The user is thereby made aware of this and can act accordingly, e.g., to sort among the partial objects, e.g., removing some or making re-adjustments.
The module 4 is, in various embodiments, configured to suggest, as an adjustment action, zooming in on an object, partially or fully included, based on the priority of the object. This facilitates for the user to select among objects.
In various embodiments, the media capturing device 1 is a camera for taking one or both of pictures and videos, the media thus comprising one or both of pictures and videos.
Fig. 7 schematically illustrates, in terms of a number of functional units, the components of a module 4, as has been described, according to an embodiment. Processing circuitry 110 is provided using any combination of one or more of a suitable central processing unit (CPU), multiprocessor, microcontroller, digital signal processor (DSP), etc., capable of executing software instructions stored in a computer program product 330 (as shown in Fig. 9), e.g., in the form of a storage medium 130. The processing circuitry 110 may further be provided as at least one application specific integrated circuit (ASIC), or field programmable gate array (FPGA).
Particularly, the processing circuitry 110 is configured to cause the module 4 to perform a set of operations, or actions, as disclosed herein. For example, the storage medium 130 may store the set of operations, and the processing circuitry 110 may be configured to retrieve the set of operations from the storage medium 130 to cause the module 4 to perform the set of operations. The set of operations may be provided as a set of executable instructions. The processing circuitry 110 is thereby arranged to execute methods as herein disclosed.
The storage medium 130 may also comprise persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid state memory or even remotely mounted memory.
The module 4 may further comprise a communications interface 120 for communications with other entities, functions, nodes, and devices, over suitable interfaces. As such the communications interface 120 may comprise one or more transmitters and receivers, comprising analogue and digital components.
The processing circuitry 110 controls the general operation of the module 4, e.g., by sending data and control signals to the communications interface 120 and the storage medium 130, by receiving data and reports from the communications interface 120, and by retrieving data and instructions from the storage medium 130. Other components, as well as the related functionality, of the module 4 are omitted in order not to obscure the concepts presented herein.
Fig. 8 schematically illustrates, in terms of a number of functional modules, the components of a module 4 according to an embodiment. The module 4 of Fig. 8 comprises a number of functional modules: an apply module 210 may, for instance, be configured to apply a recognition means; an obtain module 220 is configured to obtain information about relations between a person and the one or more partial objects; an associate module 230 is configured to associate, based on the obtained information, a significance value to each identified partial object; and a suggest module 240 is configured to suggest an adjustment action of the media capturing device 1 based on the one or more significance values. The module 4 of Fig. 8 may further comprise a number of optional functional modules, for performing the method as disclosed herein. In general terms, each functional module 210 - 240 may be implemented in hardware or in software. Preferably, one or more or all functional modules 210 - 240 may be implemented by the processing circuitry 110, possibly in cooperation with the communications interface 120 and the storage medium 130. The processing circuitry 110 may thus be arranged to fetch, from the storage medium 130, instructions as provided by a functional module 210 - 240 and to execute these instructions, thereby performing any actions of the module 4 as disclosed herein.
Fig. 9 shows one example of a computer program product 330 comprising computer readable means 340. On this computer readable means 340, a computer program
320 can be stored, which computer program 320 can cause the processing circuitry 110 and thereto operatively coupled entities and devices, such as the communications interface 120 and the storage medium 130, to execute methods according to embodiments described herein. The computer program 320 and/or computer program product 330 may thus provide means for performing any actions of the module 4 as disclosed herein.
In the example of Fig. 9, the computer program product 330 is illustrated as an optical disc, such as a CD (compact disc) or a DVD (digital versatile disc) or a Blu-Ray disc. The computer program product 330 could also be embodied as a memory, such as a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or an electrically erasable programmable read-only memory (EEPROM) and more particularly as a non-volatile storage medium of a device in an external memory such as a USB (Universal Serial Bus) memory or a Flash memory, such as a compact Flash memory. Thus, while the computer program 320 is here schematically shown as a track on the depicted optical disk, the computer program 320 can be stored in any way which is suitable for the computer program product 330.
The inventive concept has mainly been described above with reference to a few embodiments. However, as is readily appreciated by a person skilled in the art, other embodiments than the ones disclosed above are equally possible within the scope of the inventive concept, as defined by the appended patent claims.
Claims
1. A method (50) for capturing media by a media capturing device (1), the method (50) being performed in a module (4) in, or connected to, the media capturing device (1), the method (50) comprising:
- identifying (51) presence, on a screen (2) of the media capturing device (1), of one or more partial objects in the media to be captured,
- obtaining (52) information about relations between a person and the one or more partial objects,
- associating (53), based on the obtained information, a significance value to each identified partial object, and
- suggesting (54) an adjustment action of the media capturing device (1) based on the one or more significance values.
2. The method (50) as claimed in claim 1, wherein the obtaining (52) information comprises searching, in digital media, for any connections between the person and the one or more partial objects in the media.
3. The method (50) as claimed in claim 1 or 2, wherein the obtaining (52) information comprises obtaining information available from one or more of: a memory of the media capturing device (1), a photo album of the media capturing device (1), any information or photos available from a cloud entity, social media via an application of the media capturing device (1), contact information of the media capturing device (1).
4. The method (50) as claimed in any of the preceding claims, comprising, in the associating (53), basing the significance value on a value given to one or more of: the persons social relations to each identified partial object, number of obtained social interactions with each respective identified partial object, number of occurrences of each identified partial object, number of found media flows for each identified partial object.
5. The method (50) as claimed in any of the preceding claims, wherein the suggesting (54) an adjustment action comprises suggesting prioritizing inclusion of partial objects in the media according to their respective significance value, wherein the higher significance value, the higher priority.
6. The method (50) as claimed in any of the preceding claims, wherein the identifying comprises using one or both of a first indicator (5) for indicating partial objects and a second indicator (6) for indicating fully included objects.
7. The method (50) as claimed in any of the preceding claims, wherein the adjustment action is one or more of: zooming out, zooming in, re-directing the media capturing device (1) in any suggested direction, such as left, right, up or down, rotating the media capturing device (1) clockwise or counter clockwise, sliding media capturing position left or right, and/or creating a panorama media.
8. The method (50) as claimed in any of the preceding claims, wherein the obtaining (52) information about relations comprises obtaining relations between a selectable person and the one or more partial objects.
9. The method (50) as claimed in claim 8, wherein the selectable person is the person performing the media capturing.
10. The method (50) as claimed in any of the preceding claims, comprising informing on that number of partial objects is exceeding a set threshold.
11. The method as claimed in any of the preceding claims, comprising suggesting (54) as adjustment action zooming in on an object, partially or fully included, based on the priority of the object.
12. The method (50) as claimed in any of the preceding claims, wherein the media capturing device (1) is a camera for taking one or both of pictures and videos, and the media thus comprising one or both of pictures and videos.
13. A computer program (320) for capturing media by a media capturing device (1), the computer program comprising computer code which, when run on processing circuitry (110) of a module (4) in, or connected to, the media capturing device (1) causes the media capturing device (1) to:
- identify presence, on a screen (2) of the media capturing device (1), of one or more partial objects in the media to be captured,
- obtain information about relations between a person and the one or more partial objects,
- associate, based on the obtained information, a significance value to each identified partial object, and
- suggest an adjustment action of the media capturing device (1) based on the one or more significance values.
14. A computer program product (330) comprising a computer program (320) according to claim 13, and a computer readable storage medium (340) on which the computer program (320) is stored.
15. A module (4) for a media capturing device (1), the module (4) being configured to:
- identify presence, on a screen (2) of the media capturing device (1), one or more partial objects in the media to be captured,
- obtain information about relations between a person and the one or more partial objects,
- associate, based on the obtained information, a significance value to each identified partial object, and
- suggest an adjustment action of the media capturing device (1) based on the one or more significance values.
16. The module (4) as claimed in claim 15, configured to obtain information by searching, in digital media, for any connections between the person and the one or more partial objects in the media.
17. The module (4) as claimed in claim 15 or 16, configured to obtain information by obtaining information available from one or more of: a memory of the media capturing device (1), a photo album of the media capturing device (1), any information or photos available from a cloud entity, social media via an application of the media capturing device (1), contact information of the media capturing device (1).
18. The module (4) as claimed in any of claims 15 - 17, configured to, in the associating, basing the significance value on a value given to one or more of: the persons social relations to each identified partial object, number of obtained social interactions with each respective identified partial object, number of occurrences of each identified partial object, number of found media flows for each identified partial object.
19. The module (4) as claimed in any of claims 15 - 18, configured to suggest an adjustment action by prioritized inclusion of partial objects in the media
according to their respective significance value, wherein a higher significance value represents a higher priority.
20.The module (4) as claimed in any of claims 15 - 19, configured to identify by using one or both of a first indicator (5) for indicating partial objects and a second indicator (6) for indicating fully included objects.
21. The module (4) as claimed in any of claims 15 - 20, wherein the adjustment action is one or more of: zooming out, re-directing the media capturing device (1) in any suggested direction, such as left, right, up or down, rotating the media capturing device (1) clockwise or counter clockwise, sliding media capturing position left or right, and/or creating a panorama media.
22. The module (4) as claimed in any of claims 15 - 21, configured to obtain information about relations by obtaining relations between a selectable person and the one or more partial objects.
23. The module (4) as claimed in claim 22, wherein the selectable person is the person performing the media capturing.
24. The module (4) as claimed in any of claims 15 - 23, configured to inform on that number of partial objects is exceeding a set threshold.
25. The module (4) as claimed in any of claims 15 - 24, configured to suggest, as adjustment action, zooming in on an object, partially or fully included, based on the priority of the object.
26. The module (4) as claimed in any of claims 15 - 25, wherein the media capturing device (1) is a camera for taking one or both of pictures and videos, and the media thus comprising one or both of pictures and videos.
27. A media capturing device (1) comprising a module (4) as claimed in any of claims 15 - 26.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/EP2023/064040 WO2024240353A1 (en) | 2023-05-25 | 2023-05-25 | Method and devices for media capturing |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024240353A1 true WO2024240353A1 (en) | 2024-11-28 |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150281566A1 (en) * | 2013-07-11 | 2015-10-01 | Sightera Technologies Ltd. | Method and system for capturing important objects using a camera based on predefined metrics |
US20200244873A1 (en) * | 2015-08-24 | 2020-07-30 | Samsung Electronics Co., Ltd. | Technique for supporting photography in device having camera, and device therefor |