Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, some embodiments of the present application will be described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
A first embodiment of the present application relates to a method for establishing a map, which is applied to a terminal, and as shown in fig. 1, the method for establishing a map includes the following steps:
Step 101: N maps describing the same space are obtained, where N is an integer greater than 1.
Specifically, the terminal collects a plurality of image sequences in the same space through the vision sensor, extracts the information and space invariant features of map points in each group of image sequences respectively, and establishes a map corresponding to each image sequence according to the extracted information and space invariant features of the map points.
It should be noted that the map may be a map that is established by the terminal according to an image sequence acquired by the visual sensor, or a map that is transmitted to the terminal by the cloud or other terminals, and the source of the map is not limited in this embodiment.
Step 102: The respective space invariant features of the N maps are extracted.
Specifically, the space invariant feature may be any one or any combination of a line feature, a semantic feature, and marker information. Line features are features of line segments in the map and include information such as the lengths, angles, and intersections of the line segments. Semantic features are features assigned to the same map points in different maps: the terminal identifies the same map points in different maps by an image recognition method and assigns the same features to them. Marker information is the information, in a map, of positioning markers in the space. The positioning markers are markers arranged at a plurality of fixed positions in the space and may be Quick Response (QR) codes or Data Matrix (DM) codes.
Step 103: The information of the map points in the N maps is merged according to the respective space invariant features of the N maps to obtain a merged map.
The following exemplifies a method for obtaining a merged map by merging the information of map points in the N maps according to the respective space invariant features of the N maps.
Method 1: The terminal creates a new map as a reference map and determines the relative pose relationship between the reference map and a map A, where map A is any one of the N maps. The terminal adds the space invariant features and the map point information of map A to the reference map according to the relative pose relationship between the reference map and map A. The terminal then matches the space invariant features in each of the N-1 maps other than map A with the space invariant features in the reference map, and determines the relative pose relationship between each of the N-1 maps and the reference map according to the corresponding matching result. Finally, the terminal merges the map point information of the N-1 maps into the reference map according to these relative pose relationships to obtain a merged map.
Method 2: The terminal selects one of the N maps as a reference map, matches the space invariant features in each of the N-1 maps other than the reference map with the space invariant features in the reference map, and determines the relative pose relationship between each of the N-1 maps and the reference map according to the corresponding matching result. The terminal then merges the map point information of the N-1 maps into the reference map according to these relative pose relationships to obtain a merged map.
For clarity, this embodiment takes the determination of the relative pose relationship between a map B and a map C as an example to describe how the terminal determines the relative pose relationships between the N-1 maps and the reference map in Method 1 and Method 2. The terminal matches the space invariant features in map B with the space invariant features in map C and, according to the matching result, solves the relative pose relationship between map B and map C by using a Perspective-n-Point (PnP) pose measurement algorithm. Optionally, in the process of solving with the PnP pose measurement algorithm, the relative pose relationship is optimized by using a Bundle Adjustment (BA) algorithm.
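By way of illustration only, once a relative pose relationship is known, merging map points into the reference map amounts to re-expressing them in the reference frame. The following is a minimal sketch, assuming the pose is given as a 3×3 rotation matrix R and a translation vector t; the function names are hypothetical, and the pose estimation itself (PnP, BA) is not shown.

```python
def transform_point(R, t, p):
    """Return R * p + t for a 3x3 rotation R (nested lists), translation t, point p."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def merge_into_reference(reference_points, other_points, R, t):
    """Map the points of another map into the reference frame and append them.

    Returns a new list; the input lists are left unchanged.
    """
    return list(reference_points) + [transform_point(R, t, p) for p in other_points]


# Hypothetical pose: rotate 90 degrees about the z-axis, then shift by 1 along x.
R = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
t = (1.0, 0.0, 0.0)
merged = merge_into_reference([(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], R, t)
```

In this sketch the merged list simply concatenates the points; identical map points that end up close together would still need to be fused in a later step.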
It should be noted that, in the process of merging the map point information of the N-1 maps into the reference map, identical map points may fail to be merged into a single map point. To handle this situation, after obtaining the merged map, the terminal may merge map points in the merged map whose distance from each other is smaller than a preset value, with the position information of the merged map point determined according to the position information of the map points before merging. The preset value may be determined according to actual needs. For example, the spatial coordinates of map point a in the merged map are (xa, ya, za) and the spatial coordinates of map point b are (xb, yb, zb), where xa, ya and za represent the abscissa, the ordinate and the vertical coordinate of map point a, and xb, yb and zb represent the abscissa, the ordinate and the vertical coordinate of map point b. The terminal calculates the distance between map point a and map point b and judges whether the distance is smaller than the preset value. If so, map point a and map point b are merged to obtain a map point c, whose spatial coordinates are ((xa + xb)/2, (ya + yb)/2, (za + zb)/2) and whose description information set comprises the description information set of map point a and the description information set of map point b.
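The merging of nearby map points described above can be sketched as follows. This is an illustrative sketch only: the dictionary layout, the function name `merge_close_points` and the greedy pairwise order are assumptions, not part of the described method.

```python
import math


def merge_close_points(points, threshold):
    """Greedily merge map points whose distance is smaller than `threshold`.

    Each point is a dict with 'xyz' (3-tuple) and 'descriptors' (set).
    A merged point takes the midpoint of the pair and the union of the
    two description information sets.
    """
    merged = [{"xyz": tuple(p["xyz"]), "descriptors": set(p["descriptors"])}
              for p in points]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                a, b = merged[i], merged[j]
                if math.dist(a["xyz"], b["xyz"]) < threshold:
                    merged[i] = {
                        "xyz": tuple((u + v) / 2 for u, v in zip(a["xyz"], b["xyz"])),
                        "descriptors": a["descriptors"] | b["descriptors"],
                    }
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged


points = [
    {"xyz": (0.0, 0.0, 0.0), "descriptors": {"ka"}},
    {"xyz": (0.02, 0.0, 0.0), "descriptors": {"kb"}},
    {"xyz": (5.0, 5.0, 5.0), "descriptors": {"kc"}},
]
result = merge_close_points(points, threshold=0.05)
```

The quadratic pairwise scan is chosen for brevity; for a large map a spatial index would be the more practical choice.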
It is worth mentioning that the position information of the map points after combination is determined according to the position information of the map points before combination, which is equivalent to collecting the position information of the same map point for many times, and the accuracy of the position information of the map points is improved.
It is worth mentioning that the merged map is established according to the map point information of the N maps, so that the management efficiency of the map is improved. After the merged map is established by the method for establishing a map according to the embodiment, when the terminal needs to expand the map, the position of the information needing to be expanded in the merged map is determined, the information needing to be expanded is added at the position, and the expansion of the map information for each map of the N maps is not needed. When the terminal needs to delete or update the information of the map points, only the information of the map points in the combined map needs to be deleted or updated, and the information of the map points does not need to be deleted or updated for each map of the N maps.
Compared with the prior art, in the method for establishing the map provided by the embodiment, because the information of the map points in the plurality of maps is stored in the merged map, the terminal only needs to operate the merged map when performing operations such as expansion, deletion and update of the information of the map points, and the management efficiency of the map is improved. In addition, when the terminal uses the combined map for positioning, the map for matching does not need to be switched, and the positioning efficiency is improved.
A second embodiment of the present application relates to a method for establishing a map, and this embodiment is a further refinement of the first embodiment, and specifically describes a process of merging map point information of N-1 maps into a reference map according to relative pose relationships between the N-1 maps and the reference map, respectively.
Specifically, a flowchart of a method for merging map point information of N-1 maps into a reference map is shown in fig. 2, and includes the following steps:
Step 201: The correspondence between the map points in the N-1 maps and the map points in the reference map is determined according to the relative pose relationships between the N-1 maps and the reference map.
Step 202: The description information set of the map points in the merged map is determined according to the correspondence between the map points in the N-1 maps and the map points in the reference map and according to the description information of the map points in the N-1 maps.
It should be noted that, after obtaining the description information sets of the map points, the terminal may cluster the description information in the description information sets of the map points through a clustering algorithm to obtain clustered description information sets, and determine the merged map according to the clustered description information sets of the map points.
Specifically, the terminal clusters the description information in the description information set of the map points, classifies the description information of adjacent map points into one class, and records the center point of the class as the clustered description information. For example, after determining that unclassified description information exists, the terminal randomly selects one piece of unclassified description information as a center point and performs the following operations for this center point: finding all description information whose distance from the center point is within a first preset value and recording it as a set M; determining the vector from the center point to each element in the set M and adding all the vectors to obtain an offset vector; moving the center point along the direction of the offset vector by a distance equal to half of the modulus of the offset vector; and judging whether the modulus of the offset vector is smaller than a second preset value. If so, the center point is recorded; otherwise, the terminal again determines the vector from the current center point to each element in the set M, adds all the vectors to obtain the offset vector, and moves the center point along the direction of the offset vector by a distance equal to half of the modulus of the offset vector, and so on, until the modulus of the offset vector is smaller than the second preset value. After determining that all the description information of the map points has been classified, the terminal determines the clustered description information according to the recorded center points.
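The iterative procedure above resembles a mean-shift style clustering and can be sketched as follows. This is a sketch under stated assumptions rather than the described implementation: description information is treated as numeric vectors, the offset vector is taken as the mean (not the raw sum) of the vectors to the elements of M so that the half-step iteration stays stable, the set M is recomputed at each iteration, and `max_iter` and `seed` are hypothetical safeguards.

```python
import math
import random


def cluster_descriptors(descriptors, radius, eps, max_iter=100, seed=0):
    """Cluster descriptor vectors; return one recorded center per class."""
    rng = random.Random(seed)
    unclassified = set(range(len(descriptors)))
    centers = []
    while unclassified:
        # Randomly pick an unclassified descriptor as the initial center.
        start = rng.choice(sorted(unclassified))
        center = list(descriptors[start])
        members = {start}
        for _ in range(max_iter):
            # Set M: unclassified descriptors within `radius` of the center.
            near = [i for i in unclassified
                    if math.dist(center, descriptors[i]) <= radius]
            if not near:
                break
            members.update(near)
            # Assumption: mean of the vectors from the center to M.
            offset = [sum(descriptors[i][k] - center[k] for i in near) / len(near)
                      for k in range(len(center))]
            if math.hypot(*offset) < eps:
                break
            # Move along the offset direction by half of its modulus.
            center = [c + 0.5 * o for c, o in zip(center, offset)]
        centers.append(tuple(center))
        unclassified -= members
    return centers


descriptors = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (10.0, 10.0), (10.2, 10.0)]
centers = sorted(cluster_descriptors(descriptors, radius=1.0, eps=1e-3))
```

Here `radius` plays the role of the first preset value and `eps` the role of the second preset value.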
It is worth mentioning that the description information of similar map points is merged through a clustering algorithm, so that the data volume of the map is reduced.
In another specific implementation, the information of a map point further includes shooting information corresponding to the description information of the map point, where the shooting information includes shooting brightness and shooting angle. After determining the description information set of the map points, the terminal clusters the description information in the description information set through a clustering algorithm to obtain a clustered description information set. If the clustering algorithm clusters L pieces of description information into one class, the terminal determines the shooting brightness corresponding to the clustered description information according to the shooting brightness respectively corresponding to the L pieces of description information, and determines the shooting angle corresponding to the clustered description information according to the shooting angles respectively corresponding to the L pieces of description information. The terminal then determines the shooting information corresponding to each piece of description information in the clustered description information set according to the shooting brightness and the shooting angle corresponding to the clustered description information. In specific implementation, the terminal calculates the average of the shooting brightness values corresponding to the L pieces of description information as a first average value and uses the first average value as the shooting brightness corresponding to the clustered description information. The terminal likewise calculates the average of the shooting angles corresponding to the L pieces of description information as a second average value and uses the second average value as the shooting angle corresponding to the clustered description information.
The following describes a process of determining shooting information corresponding to the clustered description information, with reference to an actual scene.
For example, the description information corresponding to map point P includes first description information (ka), second description information (kb), third description information (kc), fourth description information (kd), fifth description information (ke), and so on. The shooting brightness of ka is a first shooting brightness (hka), the shooting brightness of kb is a second shooting brightness (hkb), the shooting brightness of kc is a third shooting brightness (hkc), and so on. The shooting angle of ka is a first shooting angle (tka), the shooting angle of kb is a second shooting angle (tkb), the shooting angle of kc is a third shooting angle (tkc), and so on. If ka, kb and kc are classified into one class by the clustering algorithm, the shooting brightness of the clustered description information is (hka + hkb + hkc)/3, and the shooting angle of the clustered description information is (tka + tkb + tkc)/3.
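The averaging in this example can be sketched as below. The function name and the (brightness, angle) tuple layout are illustrative assumptions; note also that a plain arithmetic mean of angles ignores wrap-around (for example near 0 and 360 degrees), which a practical implementation may need to handle.

```python
def merge_shooting_info(infos):
    """Average the shooting information of descriptors clustered into one class.

    `infos` is a non-empty list of (shooting_brightness, shooting_angle) pairs.
    Returns (first average value, second average value).
    """
    n = len(infos)
    brightness = sum(b for b, _ in infos) / n
    angle = sum(a for _, a in infos) / n
    return brightness, angle


# Hypothetical values for (hka, tka), (hkb, tkb), (hkc, tkc).
clustered = merge_shooting_info([(100, 10.0), (120, 20.0), (140, 30.0)])
```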
It is worth mentioning that the storage space of the maps is reduced by combining the N maps.
The reason why merging the N maps reduces the storage space is illustrated below. In the N unmerged maps, the description information of a map point is stored in the following format: (shooting information 1, map point id, position information, description information 1 of map point); (shooting information 2, map point id, position information, description information 2 of map point); (shooting information 3, map point id, position information, description information 3 of map point); ... (shooting information p, map point id, position information, description information p of map point), where p represents the number of pieces of description information of the map point, and the map point id is the number that the terminal assigns to a map point so that the same map point can be identified in different maps. After the merged map is obtained by using the method for establishing a map mentioned in this embodiment, the description information of the map points in the merged map is stored in the form of a description information set, in the following format: { map point id, position information, (shooting information 1, description information 1 of map point), (shooting information 2, description information 2 of map point), (shooting information 3, description information 3 of map point), ... (shooting information q, description information q of map point) }. Comparing the merged map with the unmerged maps, q is less than or equal to p after the description information in the description information set of the merged map is clustered.
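The change of storage format can be illustrated with the following sketch, which regroups per-record tuples into one entry per map point so that the map point id and position information are stored once. The dictionary layout and the function name are assumptions made for illustration, and the clustering that reduces p to q is not included.

```python
def group_by_map_point(records):
    """Regroup (shooting_info, point_id, position, descriptor) records.

    Returns {point_id: {"position": position,
                        "descriptions": [(shooting_info, descriptor), ...]}}.
    """
    merged = {}
    for shooting_info, point_id, position, descriptor in records:
        entry = merged.setdefault(point_id,
                                  {"position": position, "descriptions": []})
        entry["descriptions"].append((shooting_info, descriptor))
    return merged


# Hypothetical records: two observations of point 7, one of point 8.
records = [
    (1, 7, (0.0, 1.0, 2.0), b"\x01" * 8),
    (2, 7, (0.0, 1.0, 2.0), b"\x02" * 8),
    (1, 8, (3.0, 4.0, 5.0), b"\x03" * 8),
]
merged_map = group_by_map_point(records)
```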
Assume that the shooting information is unsigned character type data (unsigned char) with a data size of 1 byte; the map point id is unsigned integer type data (unsigned int) with a data size of 4 bytes; the position information is the spatial coordinates of the map point, each coordinate being floating point type data (float) with a data size of 4 bytes, so that the data size of the position information is 4 × 3 = 12 bytes; and the description information of the map point is of type unsigned char with a data size of 8 bytes. The data size occupied by the N maps before merging is [p + p × (4 + 12 + 8)] × number of map points = (p + 24 × p) × number of map points, and the data size of the merged map is [4 + 12 + q × (1 + 8)] × number of map points = (16 + q × 9) × number of map points. Since the number of map points is usually large (for a large scene, generally on the order of one hundred thousand), the ratio k of the data size of the merged map to the data size of the N maps is approximately (16 + q × 9)/(24 × p). When q = p > 1, k is about 37.5% to 70.8%, so the merged map occupies less storage space.
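The arithmetic above can be checked with a short sketch. The function names are hypothetical; the byte sizes are those assumed in the text, and `approx_ratio` reproduces the approximate ratio as stated (with 24 × p in the denominator) rather than the exact quotient of the two sizes.

```python
SHOOTING_INFO_BYTES = 1   # unsigned char
POINT_ID_BYTES = 4        # unsigned int
POSITION_BYTES = 4 * 3    # three float coordinates
DESCRIPTOR_BYTES = 8      # description information of a map point


def unmerged_bytes(p, num_points):
    """Size before merging: p full records per map point."""
    per_record = (SHOOTING_INFO_BYTES + POINT_ID_BYTES
                  + POSITION_BYTES + DESCRIPTOR_BYTES)
    return p * per_record * num_points


def merged_bytes(q, num_points):
    """Size after merging: id and position once, then q (info, descriptor) pairs."""
    per_point = (POINT_ID_BYTES + POSITION_BYTES
                 + q * (SHOOTING_INFO_BYTES + DESCRIPTOR_BYTES))
    return per_point * num_points


def approx_ratio(p, q):
    """Approximate ratio k = (16 + 9q) / (24p) as given in the text."""
    return (16 + 9 * q) / (24 * p)


k_small = approx_ratio(2, 2)        # q = p = 2
k_large = approx_ratio(1000, 1000)  # q = p large
```

With q = p = 2 the approximate ratio is about 70.8%, and it approaches 37.5% as p grows, which matches the range stated above.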
It should be noted that, after the merged map is established by using the method for establishing a map of the first embodiment or the second embodiment, the terminal may perform positioning based on the merged map. The relationship between the method for establishing a map and the positioning method is shown in fig. 3. In the figure, multi-state map merging refers to the merging of maps corresponding to different shooting information, and single-state positioning refers to positioning according to the shooting information at the time the terminal initiates a positioning request. The method by which the terminal performs positioning using the merged map is shown in fig. 4 and includes the following steps:
Step 301: An image for positioning is acquired.
Step 302: Description information of map points in the image is extracted.
Step 303: Description information in the merged map that matches the description information of the map points in the image is determined.
In specific implementation, the terminal matches the description information of the map points in the image with all the description information in the merged map, and determines the matched description information.
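For illustration, matching against all description information in the merged map can be sketched as a brute-force nearest-neighbour search. The sketch assumes 8-byte binary descriptors compared by Hamming distance and a hypothetical acceptance threshold `max_distance`; the text does not specify the distance measure, so these are assumptions.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def match_descriptors(image_descs, map_descs, max_distance):
    """Match each image descriptor to its nearest map descriptor.

    `map_descs` is a list of (map_point_id, descriptor) pairs.
    Returns a list of (image_index, map_point_id) for accepted matches.
    """
    matches = []
    for idx, desc in enumerate(image_descs):
        best = min(map_descs, key=lambda m: hamming(desc, m[1]), default=None)
        if best is not None and hamming(desc, best[1]) <= max_distance:
            matches.append((idx, best[0]))
    return matches


# Hypothetical data: two map points and three image descriptors.
map_descs = [(1, bytes([0] * 8)), (2, bytes([255] * 8))]
image_descs = [bytes([1, 0, 0, 0, 0, 0, 0, 0]),
               bytes([255] * 7 + [254]),
               bytes([15] * 8)]
matches = match_descriptors(image_descs, map_descs, max_distance=4)
```

The third image descriptor differs from both map descriptors by 32 bits and is therefore rejected by the threshold.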
In another specific implementation, when extracting the description information of the map points in the image, the terminal determines the shooting information of the image by parsing the image or by means of a sensor on the terminal (e.g., a photosensitive sensor). The terminal determines the shooting information corresponding to the description information of the map points in the image according to the shooting information of the image. According to the shooting information corresponding to the description information of the map points in the image and the shooting information corresponding to each piece of description information in the description information set of the map points in the merged map, the terminal screens out description information for matching from the description information set of the map points in the merged map. The terminal determines a temporary map according to the description information for matching and performs positioning according to the temporary map. The terminal determines, from the description information for matching in the temporary map, the description information that matches the description information of the map points in the image. When the shooting information includes shooting brightness and shooting angle, the terminal determines the description information for matching as follows: the terminal calculates a first difference between the shooting brightness of the image and the shooting brightness corresponding to each piece of description information, and a second difference between the shooting angle of the image and the shooting angle corresponding to each piece of description information; according to the first difference and the second difference, the terminal selects M pieces of description information from the description information set of the map points as the description information for matching, where M is a positive integer.
The following describes, with reference to an actual scene, how the terminal screens the feature information set of the map points in the map to obtain the description information for matching.
Assume that the terminal obtains a shooting brightness H and a shooting angle L for the map point in the second image, and that the feature information set of the map point in the map is { (description information a1, shooting brightness H1, shooting angle L1), (description information a2, shooting brightness H2, shooting angle L2), (description information a3, shooting brightness H3, shooting angle L3), ... }; that is, description information a1 corresponds to shooting brightness H1 and shooting angle L1, description information a2 corresponds to shooting brightness H2 and shooting angle L2, and so on. The terminal calculates the difference between H1 and H and the difference between L1 and L, and sets different weights for the two differences according to actual needs, thereby determining the distance dt between the shooting information of the map point in the second image and the shooting information of the map point in the map, that is, dt = a × (H1 - H) + b × (L1 - L), where a is the weight of the shooting brightness difference and b is the weight of the shooting angle difference. By analogy, the terminal calculates the distance between the shooting information of the map point in the second image and the shooting information corresponding to each piece of description information in the feature information set of the map point in the map. The terminal sorts the description information in the feature information set of the map points in the map in ascending order of distance and selects the first M pieces of description information as the description information for matching.
It should be noted that, as those skilled in the art will understand, in practical applications the feature information set may also be screened by setting a preset distance value; that is, description information whose shooting information is at a distance smaller than the preset distance value from the shooting information corresponding to the description information of the map point in the second image is used as the description information for matching.
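Both screening variants (the M closest entries, or all entries within a preset distance value) can be sketched as follows. The function names and the tuple layout are illustrative assumptions, and the sketch takes the absolute value of each difference so that the weighted sum behaves as a distance; the text does not state this explicitly.

```python
def shooting_distance(entry, brightness, angle, a, b):
    """Weighted distance between an entry's shooting info and the query's.

    `entry` is (description, shooting_brightness, shooting_angle).
    Assumption: absolute differences are used.
    """
    _, h_i, l_i = entry
    return a * abs(h_i - brightness) + b * abs(l_i - angle)


def select_top_m(entries, brightness, angle, a, b, m):
    """Return the m entries with the smallest distance, in ascending order."""
    ranked = sorted(entries,
                    key=lambda e: shooting_distance(e, brightness, angle, a, b))
    return ranked[:m]


def select_within(entries, brightness, angle, a, b, max_distance):
    """Return the entries whose distance is smaller than `max_distance`."""
    return [e for e in entries
            if shooting_distance(e, brightness, angle, a, b) < max_distance]


# Hypothetical feature information set and query shooting information.
entries = [("a1", 100, 10), ("a2", 180, 40), ("a3", 110, 15)]
top2 = select_top_m(entries, brightness=105, angle=12, a=1, b=1, m=2)
close = select_within(entries, brightness=105, angle=12, a=1, b=1, max_distance=7.5)
```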
Step 304: and acquiring the position information of the map point corresponding to the matched description information from the merged map, and determining a positioning result according to the acquired position information.
Specifically, the terminal determines a positioning result by using a pose estimation algorithm, for example, a PnP pose measurement algorithm, according to a matching result between a map point in the map and a map point in the second image. And the positioning result comprises pose information of the terminal.
In specific implementation, before the position information of the map points corresponding to the matched description information is acquired from the merged map and the positioning result is determined according to the acquired position information, the terminal can also determine that at least T map points in the image are successfully matched with the map. Where T is a positive integer, e.g., T equals 10.
It is worth mentioning that after the number of map points successfully matched reaches T, the pose information of the terminal is calculated, so that resource waste caused by determining a positioning result under the condition that the number of map points successfully matched is insufficient and positioning cannot be performed is avoided.
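The check on the number of successfully matched map points can be sketched as a simple guard in front of the pose solver. In this sketch `solve_pose` is a hypothetical callback standing in for a pose estimation routine such as a PnP solver; it is not an API of any particular library.

```python
def locate_if_enough_matches(matches, min_matches, solve_pose):
    """Run pose estimation only when at least `min_matches` points matched.

    Returns the solver's result, or None when there are too few matches,
    so that no computation is spent on a positioning attempt that cannot succeed.
    """
    if len(matches) < min_matches:
        return None
    return solve_pose(matches)


# Hypothetical usage with T = 10 and a stand-in solver.
calls = []


def fake_solver(matches):
    calls.append(len(matches))
    return "pose"


too_few = locate_if_enough_matches([0] * 9, 10, fake_solver)
enough = locate_if_enough_matches([0] * 10, 10, fake_solver)
```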
Compared with the prior art, in the method for establishing the map provided by the embodiment, because the information of the map points in the plurality of maps is stored in the merged map, the terminal only needs to operate the merged map when performing operations such as expansion, deletion and update of the information of the map points, and the management efficiency of the map is improved. In addition, when the terminal uses the combined map for positioning, the map for matching does not need to be switched, and the positioning efficiency is improved. In addition, the terminal merges the description information of similar map points through a clustering algorithm, so that the data volume of the map is reduced.
A third embodiment of the present application relates to a terminal, as shown in fig. 5, comprising at least one processor 401; and a memory 402 communicatively coupled to the at least one processor 401. The memory 402 stores instructions executable by the at least one processor 401, and the instructions are executed by the at least one processor 401 to enable the at least one processor 401 to perform the above-mentioned map building method.
In this embodiment, the processor 401 is exemplified by a Central Processing Unit (CPU), and the Memory 402 is exemplified by a Random Access Memory (RAM). The processor 401 and the memory 402 may be connected by a bus or other means, and fig. 5 illustrates the connection by a bus as an example. The memory 402 is a non-volatile computer-readable storage medium for storing non-volatile software programs, non-volatile computer-executable programs, and modules, and the description information of the map points in the embodiment of the present application is stored in the memory 402. The processor 401 executes various functional applications of the device and data processing by running non-volatile software programs, instructions, and modules stored in the memory 402, that is, implements the above-described method of creating a map.
The memory 402 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store a list of options, etc. Further, the memory 402 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, memory 402 may optionally include memory located remotely from the processor, which may be connected to an external device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
One or more modules are stored in the memory and, when executed by the one or more processors, perform a method of building a map in any of the method embodiments described above.
The above product can execute the method provided by the embodiments of the present application and has the functional modules and beneficial effects corresponding to the executed method. For technical details not described in detail in this embodiment, refer to the method provided by the embodiments of the present application.
A fourth embodiment of the present application relates to a computer-readable storage medium storing a computer program. The computer program, when executed by a processor, implements the method of building a map as described in any of the method embodiments above.
That is, those skilled in the art can understand that all or part of the steps of the methods in the foregoing embodiments may be implemented by a program instructing related hardware. The program is stored in a storage medium and includes several instructions that enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the present application, and that various changes in form and details may be made therein without departing from the spirit and scope of the present application in practice.