CN115641362B - Trajectory prediction method, device and storage medium - Google Patents
Abstract
The present disclosure provides a trajectory prediction method, device and storage medium. The method predicts the future occupancy grid information, optical flow information and track information of a plurality of dynamic objects in a target scene from the static information of the scene together with the occupancy grid information, dynamic information and historical track information of those objects; the predicted optical flow information is optimized using the occupancy grid information, and the track information is then verified against the optimized optical flow information to obtain the target track of each dynamic object. This avoids predicting target tracks that are infeasible in the actual environment and improves the prediction accuracy of the target track.
Description
Technical Field
The present disclosure relates to the technical field of vehicle driving, and in particular to a trajectory prediction method, device and storage medium.
Background
With the increasing maturity of intelligent driving technology, intelligent driving vehicles are being applied in more and more scenes, such as urban roads, closed campuses and highways, which have clear lane markings and well-established traffic rules. In scenes without standard lane lines and strictly enforced traffic rules, however, surrounding vehicles undergo more complicated dynamic changes. In a parking lot, for example, the distance between vehicles is relatively small and parking maneuvers are relatively frequent, so existing trajectory prediction methods cannot satisfy the vehicle trajectory prediction task.
Disclosure of Invention
To overcome the problems in the related art, the present disclosure provides a trajectory prediction method, apparatus, and storage medium.
According to a first aspect of the present disclosure, there is provided a trajectory prediction method, the method comprising:
acquiring grid map information of a target scene over multiple frames, wherein the grid map information comprises static information of the target scene and occupancy grid information, dynamic information and historical track information of a plurality of dynamic objects in the target scene, the dynamic information being obtained by mapping information of the plurality of dynamic objects sensed by a target vehicle onto a grid map;
predicting, according to the grid map information of the multiple frames, the occupancy grid information, optical flow information and track information of the dynamic object over multiple future frames;
optimizing the optical flow information by using the occupancy grid information of the dynamic object; and
verifying the track information by using the optimized optical flow information to obtain the target track of the dynamic object.
In some embodiments, the static information comprises semantic information of each pixel point in the grid map, and the method further comprises:
acquiring semantic information of each pixel point in the grid map;
acquiring the positions of the plurality of dynamic objects in the target scene and the size information of the dynamic objects;
mapping the positions and size information of the dynamic objects into the grid map to obtain the occupancy grid information of the dynamic objects; and
acquiring the dynamic information of the dynamic objects and marking it on each pixel point within the occupancy grid of each dynamic object to obtain the grid map information of the target scene.
In some embodiments, the dynamic information includes a velocity, an acceleration and an attitude angle of the dynamic object;
the marking of the dynamic information on each pixel point within the occupancy grid of the dynamic object comprises:
determining, according to the velocity, acceleration and attitude angle of the dynamic object and a preset time interval, a displacement vector for each pixel point within the occupancy grid of the dynamic object; and
taking the components of the displacement vector of each pixel point in the x-axis and y-axis directions as the optical flow information of that pixel point.
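As a minimal sketch of the steps above, the per-pixel displacement over one preset time interval can be computed under a constant-acceleration model from the speed, acceleration and attitude angle, and split into x- and y-axis components. The function name and the kinematic model are illustrative assumptions, not the patent's implementation:

```python
import math

def pixel_optical_flow(speed, accel, yaw, dt):
    # Constant-acceleration displacement over one preset time interval,
    # projected onto the grid's x/y axes via the attitude angle (yaw).
    # Every pixel inside the object's occupancy grid is assigned this vector.
    s = speed * dt + 0.5 * accel * dt * dt
    return (s * math.cos(yaw), s * math.sin(yaw))
```

For instance, an object moving at 2 m/s with zero acceleration and zero yaw over a 0.5 s interval yields a flow component of 1 m along x and 0 along y.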
In some embodiments, predicting the occupancy grid information, optical flow information and track information of the dynamic object over multiple future frames according to the grid map information of the multiple frames comprises:
performing feature extraction on the grid map information of the multiple frames to obtain features of the grid map information;
inputting the features into a pre-trained occupancy grid and optical flow prediction network to obtain the occupancy grid information and optical flow information of the dynamic object over multiple future frames; and
inputting the features into a pre-trained track prediction network to obtain the track information of the dynamic object over multiple future frames.
In some embodiments, performing feature extraction on the grid map information of the multiple frames to obtain features of the grid map information comprises:
downsampling the grid map information of each frame multiple times to obtain sub-features of different scales; and
fusing the sub-features of different scales to obtain the features of the grid map information.
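The multi-scale downsampling and fusion can be sketched as follows, using 2×2 average pooling as the downsampling operator and nearest-neighbour upsampling plus averaging as the fusion. Both operator choices are assumptions for illustration; the patent does not specify them:

```python
import numpy as np

def multiscale_features(grid, levels=3):
    # Repeatedly 2x2-average-pool the per-frame grid map to obtain
    # sub-features of different scales.
    feats, cur = [], np.asarray(grid, dtype=float)
    for _ in range(levels):
        feats.append(cur)
        h, w = cur.shape
        cur = cur[: h // 2 * 2, : w // 2 * 2].reshape(
            h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    # Fuse: nearest-neighbour upsample each scale back to full
    # resolution and average across scales.
    fused = np.zeros(grid.shape, dtype=float)
    for f in feats:
        rep = grid.shape[0] // f.shape[0]
        fused += np.kron(f, np.ones((rep, rep)))[: grid.shape[0], : grid.shape[1]]
    return fused / len(feats)
```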
In some embodiments, the occupancy grid and optical flow prediction network is trained on sample images marked with occupancy grid labels and optical flow labels, the optical flow labels indicating the displacement difference of each pixel point between two adjacent frames of sample images;
inputting the features into the pre-trained occupancy grid and optical flow prediction network to obtain the occupancy grid information and optical flow information of the dynamic object over multiple future frames comprises:
fusing the features corresponding to the grid map information of the multiple frames by using the occupancy grid and optical flow prediction network; and
predicting the occupancy grid information and optical flow information of the dynamic object over multiple future frames according to the fused features.
In some embodiments, the track prediction network is trained on sample images marked with real tracks;
inputting the features into the pre-trained track prediction network to obtain the track information of the dynamic object over multiple future frames comprises:
performing semantic enhancement on the features corresponding to the grid map information of each frame by using the track prediction network; and
predicting the track information of the dynamic object over multiple future frames according to the enhanced features.
In some embodiments, the dynamic object corresponds to a plurality of pieces of optical flow information, each piece of optical flow information corresponding to a piece of occupancy grid information;
optimizing the optical flow information by using the occupancy grid information of the dynamic object comprises:
acquiring, from the occupancy grid information of the dynamic object, target occupancy grid information that overlaps the occupancy grids of other dynamic objects; and
deleting the optical flow information corresponding to the target occupancy grid information.
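The pruning described above can be sketched as follows, representing each piece of optical flow information as an (occupied cells, flow) pair and deleting every pair whose cells overlap the cells occupied by other dynamic objects. The data layout is an illustrative assumption:

```python
def prune_flow_hypotheses(hypotheses, other_occupancy):
    # hypotheses: list of (set of (row, col) cells, flow) pairs for one
    # dynamic object; other_occupancy: set of cells occupied by other
    # dynamic objects. Hypotheses whose occupancy intersects another
    # object's occupancy violate the scene's spatial topology and are
    # deleted.
    return [(occ, flow) for occ, flow in hypotheses
            if not (occ & other_occupancy)]
```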
According to a second aspect of the present disclosure, there is provided a trajectory prediction device, the device comprising:
an acquisition unit, configured to acquire grid map information of a target scene over multiple frames, wherein the grid map information comprises static information of the target scene and occupancy grid information, dynamic information and historical track information of a plurality of dynamic objects in the target scene, the dynamic information being obtained by mapping information of the plurality of dynamic objects sensed by a target vehicle onto a grid map;
a prediction unit, configured to predict, according to the grid map information of the multiple frames, the occupancy grid information, optical flow information and track information of the dynamic object over multiple future frames;
an optimizing unit, configured to optimize the optical flow information by using the occupancy grid information of the dynamic object; and
a verification unit, configured to verify the track information by using the optimized optical flow information to obtain the target track of the dynamic object.
According to a third aspect of the present disclosure, there is provided an electronic device comprising a processor and a memory for storing instructions executable by the processor, wherein the processor is configured to perform the method of any embodiment of the present disclosure.
According to a fourth aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement a method according to any embodiment of the present disclosure.
The technical scheme provided by the present disclosure can have the following advantages. The occupancy grid information, optical flow information and track information of the dynamic objects over multiple future frames are predicted from the static information of the target scene and the occupancy grid information, dynamic information and historical track information of the plurality of dynamic objects in the target scene. The optical flow information is optimized using the occupancy grid information of the dynamic objects, which takes into account the spatial topological relation between the dynamic objects and the target scene. The track information is then verified using the optimized optical flow information to obtain the target track of each dynamic object, preventing the predicted target track from being infeasible in the actual environment and improving the prediction accuracy of the target track.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and, together with the description, serve to explain the technical aspects of the disclosure.
Fig. 1 is a flowchart of a trajectory prediction method according to an exemplary embodiment of the present disclosure.
Fig. 2 is a schematic diagram of occupancy grid information in a parking lot, as shown by the present disclosure, according to an example embodiment.
FIG. 3 is a schematic diagram of optical flow information in a parking lot, according to an example embodiment of the present disclosure.
Fig. 4 is a specific flowchart illustrating a trajectory prediction method according to an exemplary embodiment of the present disclosure.
Fig. 5 is a schematic structural view of a trajectory prediction device according to an exemplary embodiment of the present disclosure.
Fig. 6 is a schematic diagram of an electronic device structure of trajectory prediction according to an exemplary embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in this disclosure to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The term "if" as used herein may be interpreted as "upon", "when" or "in response to determining", depending on the context.
At present, when the moving track of a dynamic object is predicted, track points are usually predicted from map information and the historical track information of the dynamic object to generate the moving track. This approach performs poorly in scenes where the surrounding environment changes quickly. In a parking lot, for example, space is cramped compared with a highway, surrounding obstacles are closer to the vehicle, and parking maneuvers are usually more complex than road driving, so predicting the target track of a dynamic object in a parking lot with a conventional trajectory prediction method yields low accuracy and increases the risk of collision.
In view of this, the present disclosure provides a trajectory prediction method that can predict the trajectories of vehicles performing parking maneuvers around a target vehicle, as well as of other dynamic obstacles, so that risks can be anticipated and accidents avoided.
The following embodiments will explain a trajectory prediction method provided by the present disclosure with reference to the accompanying drawings.
Fig. 1 is a flowchart of a trajectory prediction method according to an exemplary embodiment of the present disclosure, and as shown in fig. 1, the trajectory prediction method includes the following steps 101 to 104.
In step 101, grid map information of a multi-frame target scene is obtained, where the grid map information includes static information of the target scene, occupied grid information of a plurality of dynamic objects in the target scene, dynamic information and historical track information, and the dynamic information is obtained by mapping information of the plurality of dynamic objects sensed by a target vehicle onto a grid map.
The target track of a dynamic object over multiple future frames can be predicted from multiple frames of historical track information; however, because dynamic objects move frequently in some target scenes, the static information of the target scene and the occupancy grid information and dynamic information of the plurality of dynamic objects can also be taken into account when predicting the track.
In this embodiment, a map of the target scene may be obtained in advance and sampled at one pixel every m meters to obtain n pixel points; semantic information is acquired for each pixel point, and the pixel points carrying semantic information are combined into the grid map.
Each dynamic object is mapped onto the grid map according to its coordinates, and its occupancy grid information on the grid map is determined according to its size information.
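As an illustrative sketch, the occupancy grid cells of a dynamic object can be derived from its position and size by rasterizing an axis-aligned box onto the grid. The rotated footprint of a real vehicle is ignored here, and the function name is hypothetical:

```python
def occupancy_cells(x, y, length, width, cell_size):
    # Cells of the grid map covered by an object approximated as an
    # axis-aligned box centred at (x, y) in metres; cell_size is the
    # metres-per-pixel sampling step of the grid map.
    c0 = int((x - length / 2) // cell_size)
    c1 = int((x + length / 2) // cell_size)
    r0 = int((y - width / 2) // cell_size)
    r1 = int((y + width / 2) // cell_size)
    return {(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)}
```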
In this embodiment, the running information of a plurality of dynamic objects sensed by the target vehicle may be acquired, and the running information may be mapped to the grid map to obtain the dynamic information of the dynamic objects.
In step 102, the occupancy grid information, optical flow information and track information of the dynamic object over multiple future frames are predicted from the grid map information of the multiple frames of the target scene.
Wherein the optical flow information is used for representing the motion vector of the pixel point of the dynamic object.
In this embodiment, a second number of future frames of grid map information may be predicted from a first number of frames of grid map information. The first and second numbers can be chosen according to actual requirements; for example, the first number may be 10 or 11 and the second number 8 or 9. Typically the first number is greater than the second number, and the last frame of the first number is adjacent to the first frame of the second number.
In embodiments of the present disclosure, pre-trained network models may be utilized to predict occupancy grid information, optical flow information, and trajectory information for future multi-frame dynamic objects.
The grid map information of the multiple frames can be used to predict several pieces of future optical flow information and track information that conform to the spatial relations of the dynamic object, and each piece of optical flow information and track information can carry a probability value describing how likely the dynamic object is to follow it.
In step 103, the optical flow information is optimized using the occupancy grid information of the dynamic object.
The occupied grid information of the dynamic object can represent the position of the dynamic object in the current frame, and the optical flow information can be optimized according to the occupied grid information of the dynamic object, so that the optical flow information conforming to the space topological relation in the target scene is obtained.
In step 104, the track information is verified by using the optimized optical flow information, so as to obtain the target track of the dynamic object.
Track information inconsistent with the optimized optical flow information can be deleted, and the track information whose movement trend matches that represented by the optimized optical flow information is taken as the target track of the dynamic object.
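A minimal stand-in for this verification step scores each candidate track by the cosine similarity between its net displacement and the optimized optical-flow vector and keeps the best match. The scoring rule is an assumption for illustration, not the patent's exact check:

```python
import math

def select_target_track(tracks, flow):
    # tracks: list of candidate waypoint lists [(x, y), ...];
    # flow: (dx, dy), an optimized optical-flow vector summarising the
    # object's movement trend. The track whose net displacement points
    # most nearly along the flow is returned as the target track.
    def score(track):
        dx = track[-1][0] - track[0][0]
        dy = track[-1][1] - track[0][1]
        norm = math.hypot(dx, dy) * math.hypot(flow[0], flow[1])
        return ((dx * flow[0] + dy * flow[1]) / norm) if norm else -1.0
    return max(tracks, key=score)
```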
In the method described above, the occupancy grid information, optical flow information and track information of the dynamic objects over multiple future frames are predicted from the static information of the target scene and the occupancy grid information, dynamic information and historical track information of the plurality of dynamic objects. The optical flow information is optimized using the occupancy grid information of the dynamic objects, which takes into account the spatial topological relation between the dynamic objects and the target scene; the track information is then verified using the optimized optical flow information to obtain the target track of each dynamic object. This prevents the predicted target track from being infeasible in the actual environment and improves the prediction accuracy of the target track.
In some embodiments, the static information may include semantic information of each pixel point in the grid map. In this case, the method further includes: obtaining the semantic information of each pixel point in the grid map; obtaining the positions of the plurality of dynamic objects in the target scene and their size information; mapping the positions and size information of the dynamic objects into the grid map to obtain their occupancy grid information; and obtaining the dynamic information of the dynamic objects and marking it on each pixel point within each object's occupancy grid to obtain the grid map information of the target scene.
In this embodiment, the semantic information of each pixel point in the pre-stored grid map may be updated according to the obtained semantic information of each pixel point in the grid map, where the semantic information may include a road that can be driven, an open parking space, a static obstacle (e.g., a vehicle parked in a parking space), and the like. If the acquired semantic information of the pixel point is an obstacle, for example, a vehicle or a pedestrian, and the prestored semantic information of the pixel point is a road capable of running, and the obstacle is described to cover the road capable of running, the semantic information of the pixel point in the grid map is updated to be the obstacle.
The position information (i.e., GPS coordinates) of the target vehicle is acquired, and the sensing module of the target vehicle acquires the positions of the surrounding dynamic objects relative to the target vehicle. The position coordinates of the target vehicle on the grid map can then be determined from the conversion between the world coordinate system and the image coordinate system, and the position coordinates of each dynamic object on the grid map from its position relative to the target vehicle. The occupancy grid information corresponding to the dynamic object is determined from its position coordinates on the grid map and its size.
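The coordinate conversion above can be sketched as follows, assuming a grid map whose (0, 0) pixel sits at a known world position and a perception frame aligned with the world axes. The names and axis convention are illustrative assumptions:

```python
def to_grid(ego_xy, rel_xy, origin_xy, cell_size):
    # ego_xy: world (GPS-derived) position of the target vehicle;
    # rel_xy: a dynamic object's offset from the target vehicle in the
    # perception frame; origin_xy: world position of the grid map's
    # (0, 0) pixel; cell_size: metres per pixel.
    wx = ego_xy[0] + rel_xy[0]
    wy = ego_xy[1] + rel_xy[1]
    return (int((wy - origin_xy[1]) // cell_size),
            int((wx - origin_xy[0]) // cell_size))
```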
In some embodiments, the dynamic information includes a speed, an acceleration and an attitude angle of the dynamic object, and in this case, the marking the dynamic information on each pixel point in the range of the occupancy grid of the dynamic object may include determining, according to the speed, the acceleration and the attitude angle of the dynamic object and a preset time interval, a displacement vector of each pixel point in the range of the occupancy grid of the dynamic object, and taking components of the displacement vector of each pixel point in an x-axis direction and a y-axis direction as optical flow information of the pixel point.
The sensing module of the target vehicle collects the dynamic information of the dynamic objects at a set frequency while the target vehicle is driving; the dynamic information may include speed, acceleration, attitude angle, angular velocity, angular acceleration, and the like. To simplify calculation, the dynamic information collected by the sensing module can be converted into data centered on the target vehicle.
In one embodiment, a grid map of a target scene may be cached to reduce redundant computation and reduce memory usage. And mapping the acquired static information of the target scene, the dynamic information of a plurality of dynamic objects in the target scene and the position information of the dynamic objects onto a pre-stored grid map during prediction to obtain grid map information of the target scene.
In some embodiments, predicting the occupancy grid information, optical flow information and track information of the dynamic object over multiple future frames from the grid map information may include: extracting features from the grid map information of the multiple frames to obtain features of the grid map information; inputting the features into a pre-trained occupancy grid and optical flow prediction network to obtain the occupancy grid information and optical flow information of the dynamic object over multiple future frames; and inputting the features into a pre-trained track prediction network to obtain the track information of the dynamic object over multiple future frames.
In the embodiment, the grid map information of each frame of target scene can be downsampled for multiple times to obtain the sub-features of different scales, and the sub-features of different scales are fused to obtain the features of the grid map information.
In some embodiments, the occupancy grid and optical flow prediction network is trained from sample images labeled with occupancy grid labels and optical flow labels that are used to indicate the displacement differences of pixel points in two adjacent frames of sample images.
A feature map corresponding to a sample image used to train the occupancy grid and optical flow prediction network may include multiple dimensions, such as a channel dimension, a height dimension, a width dimension and a batch dimension; its format may be expressed, for example, as [B, H, W], where B is the batch dimension, H the height dimension and W the width dimension. The occupancy grid labels are divided into two: an observable grid of shape [Batch, H, W, 1] and an occluded grid of shape [Batch, H, W, 1], where the "1" is a per-cell category value of 0 or 1 indicating whether a grid cell is present at that location on the grid map. The optical flow label is the real flow, of shape [Batch, H, W, 2], where the "2" holds dx and dy, the displacement difference between the previous frame and the current frame.
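The label shapes described above can be sketched with NumPy arrays as follows; the batch size and grid resolution are arbitrary example values:

```python
import numpy as np

batch, H, W = 2, 64, 64
# Occupancy labels: per-cell 0/1 category values.
observable_occ = np.zeros((batch, H, W, 1), dtype=np.float32)
occluded_occ = np.zeros((batch, H, W, 1), dtype=np.float32)
# Optical flow label: per-cell (dx, dy) displacement difference
# between the previous frame and the current frame.
flow_label = np.zeros((batch, H, W, 2), dtype=np.float32)
flow_label[0, 10, 10] = (1.5, -0.5)  # one moving pixel
```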
In this case, the feature input pre-trained occupied grid and optical flow prediction network predicts to obtain occupied grid information and optical flow information of the dynamic object in the future multiframes, and the method comprises the steps of utilizing the occupied grid and the optical flow prediction network to fuse the features corresponding to multiframe grid map information, and predicting the occupied grid information and the optical flow information of the dynamic object in the future multiframes according to the fused features.
Fig. 2 is a schematic diagram of occupancy grid information in a parking lot according to an exemplary embodiment of the present disclosure, and as shown in fig. 2, an occupancy grid 21 is used to represent occupancy grid information corresponding to a vehicle parked in a parking space, and an occupancy grid 22 and an occupancy grid 23 are used to represent occupancy grid information corresponding to a vehicle in motion.
FIG. 3 is a schematic diagram of optical flow information in a parking lot, as shown in FIG. 3, from which optical flow information of dynamic objects in the parking lot may be predicted, according to an exemplary embodiment of the present disclosure.
In some embodiments, the track prediction network is trained on sample images marked with real tracks. For example, the track label may be the ground-truth track over future frames, of shape [Batch, 1, 60, 2], where the "2" holds the x and y coordinates of each track point.
In this embodiment, the step of inputting the features into the pre-trained track prediction network to predict the track information of the dynamic object in the future multiframe includes the steps of using the track prediction network to semantically enhance the features corresponding to the grid map information of each frame, and predicting the track information of the dynamic object in the future multiframe according to the enhanced features.
In this embodiment, the features corresponding to the grid map information of each frame may be input into an encoder-decoder structure, in which the encoder performs the semantic enhancement of the features; the encoder may be a Transformer encoder. The enhanced features are input into a decoder, which may be a convolutional neural network, a recurrent neural network, a Transformer decoder, or the like, to generate the predicted track information.
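A toy stand-in for this encoder-decoder structure is sketched below: a single-head self-attention step with identity projections plays the role of the semantic enhancement, and a trivial linear readout plays the role of the decoder. Neither matches the trained networks the patent assumes; both are illustrative only:

```python
import numpy as np

def self_attention(x):
    # Toy "semantic enhancement": softmax(x x^T / sqrt(d)) x, i.e. a
    # single attention head with identity Q/K/V projections.
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ x

def decode_trajectory(feat, steps=3):
    # Toy decoder: read a mean per-step displacement from the enhanced
    # feature's first two channels and emit waypoints by accumulation.
    step = feat.mean(axis=0)[:2]
    return [(float(step[0] * (t + 1)), float(step[1] * (t + 1)))
            for t in range(steps)]
```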
In some embodiments, the dynamic object corresponds to a plurality of pieces of optical flow information, each piece of optical flow information corresponds to an occupied grid information, and in this case, the optimizing the optical flow information by using the occupied grid information of the dynamic object may include acquiring target occupied grid information overlapping with occupied grids of other dynamic objects in the occupied grid information of the dynamic object, and deleting the optical flow information corresponding to the target occupied grid information.
Fig. 4 is a specific flow diagram of a trajectory prediction method according to an exemplary embodiment of the present disclosure, as shown in fig. 4, in an implementation, trajectory prediction may be performed according to the following steps.
In step 401, information acquired by a sensing module of a target vehicle is acquired, and the acquired information is preprocessed to obtain static information of a target scene, occupation grid information, dynamic information and historical track information of a plurality of dynamic objects in the target scene.
In step 402, grid map information of a multi-frame target scene is acquired, and feature extraction is performed on the grid map information of the multi-frame target scene to obtain features of the grid map information.
In step 403, the features of the multi-frame grid map information are aggregated to obtain aggregated features.
In step 404a, the aggregated features are input into a pre-trained occupancy grid and optical flow prediction network to predict, so as to obtain occupancy grid information and optical flow information of the dynamic object for multiple frames in the future, and the occupancy grid information of the dynamic object is utilized to optimize the optical flow information.
In step 404b, the aggregated features are input into a pre-trained track prediction network to predict, so as to obtain track information of the dynamic object for multiple frames in the future.
In step 405, the track information is verified by using the optimized optical flow information, so as to obtain the target track of the dynamic object.
Corresponding to the embodiments of the aforementioned method, the present disclosure also provides embodiments of the apparatus and the terminal to which it is applied.
Fig. 5 is a schematic structural diagram of a trajectory prediction device according to an exemplary embodiment of the present disclosure, and as shown in fig. 5, the trajectory prediction device includes an acquisition unit 501, a prediction unit 502, an optimization unit 503, and a verification unit 504.
An obtaining unit 501, configured to obtain raster map information of a multi-frame target scene, where the raster map information includes static information of the target scene, occupied raster information of a plurality of dynamic objects in the target scene, dynamic information, and historical track information, where the dynamic information is obtained by mapping information of the plurality of dynamic objects sensed by a target vehicle onto a raster map;
the prediction unit 502 is configured to predict, according to the grid map information of the multi-frame target scene, occupancy grid information, optical flow information, and track information of the dynamic object in a future multi-frame;
An optimizing unit 503, configured to optimize the optical flow information by using the occupancy grid information of the dynamic object;
and the verification unit 504 is configured to verify the track information by using the optimized optical flow information, so as to obtain a target track of the dynamic object.
In some embodiments, the static information includes semantic information of each pixel point in the grid map, and the obtaining unit 501 is further configured to:
acquiring semantic information of each pixel point in the grid map;
acquiring the positions of the plurality of dynamic objects in the target scene and the size information of the dynamic objects;
mapping the positions and the size information of the dynamic objects into the grid map to obtain the occupancy grid information of the dynamic objects;
and acquiring the dynamic information of the dynamic objects, and marking the dynamic information on each pixel point within the occupancy grid of each dynamic object, to obtain the grid map information of the target scene.
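The mapping of object position and size into the grid map, with the dynamic information stamped onto every occupied pixel, might look like the sketch below. The axis-aligned footprint, the field names, and the 0.5 m cell size are illustrative assumptions, and the attitude angle is ignored for brevity:

```python
import numpy as np

def rasterize_objects(objects, grid_shape=(100, 100), cell_size=0.5):
    """Mark each object's footprint in an occupancy channel and stamp its
    dynamic state (vx, vy) on every pixel inside that footprint."""
    occupancy = np.zeros(grid_shape, dtype=np.uint8)
    dynamics = np.zeros(grid_shape + (2,), dtype=np.float32)   # (vx, vy) per pixel
    for obj in objects:
        cx, cy = obj["position"]                               # centre in metres
        length, width = obj["size"]
        r0 = int((cy - width / 2) / cell_size)                 # footprint rows
        r1 = int(np.ceil((cy + width / 2) / cell_size))
        c0 = int((cx - length / 2) / cell_size)                # footprint columns
        c1 = int(np.ceil((cx + length / 2) / cell_size))
        occupancy[r0:r1, c0:c1] = 1                            # occupancy grid info
        dynamics[r0:r1, c0:c1] = obj["velocity"]               # per-pixel dynamic info
    return occupancy, dynamics
```

Stacking the occupancy and dynamics channels with the static semantic channel yields one frame of grid map information in the sense of the embodiments above.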
In some embodiments, the dynamic information includes velocity, acceleration, and attitude angle of the dynamic object;
the acquiring unit 501 is specifically configured to:
determining, according to the speed, acceleration, and attitude angle of the dynamic object and a preset time interval, the displacement vector of each pixel point within the occupancy grid of the dynamic object;
and taking the components of the pixel point's displacement vector along the x-axis and y-axis directions as the optical flow information of that pixel point.
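The displacement vector above can be computed, for instance, from a constant-acceleration motion model; the scalar-displacement-along-heading formulation below is an assumption about how speed, acceleration, and attitude angle combine over the preset time interval:

```python
import numpy as np

def pixel_optical_flow(v, a, yaw, dt):
    """Displacement of a pixel over one time interval dt under constant
    acceleration, expressed in the map frame; its x/y components are
    the optical flow information of that pixel."""
    s = v * dt + 0.5 * a * dt * dt                        # displacement along heading
    return np.array([s * np.cos(yaw), s * np.sin(yaw)])   # (flow_x, flow_y)
```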
In some embodiments, the prediction unit 502 is specifically configured to:
extracting features of grid map information of a multi-frame target scene to obtain features of the grid map information;
inputting the features into a pre-trained occupancy grid and optical flow prediction network to obtain occupancy grid information and optical flow information of the dynamic object for multiple future frames;
and inputting the features into a pre-trained track prediction network to obtain track information of the dynamic object for multiple future frames.
In some embodiments, the prediction unit 502 is specifically configured to:
downsampling the grid map information of each frame of the target scene multiple times to obtain sub-features at different scales;
and fusing the sub-features at different scales to obtain the features of the grid map information.
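A minimal numpy sketch of this multi-scale scheme is shown below, with average pooling as the downsampling, nearest-neighbour upsampling, and channel-wise stacking as the fusion. A real network would learn these operators; the grid dimensions are assumed divisible by the largest pooling factor:

```python
import numpy as np

def multiscale_features(grid, num_scales=3):
    """Average-pool the grid at several scales, upsample each result back
    to full resolution, and stack them along a new channel axis."""
    h, w = grid.shape
    feats = []
    for s in range(num_scales):
        k = 2 ** s                                     # pooling factor at this scale
        pooled = grid[: h - h % k, : w - w % k] \
            .reshape(h // k, k, w // k, k).mean(axis=(1, 3))
        feats.append(np.kron(pooled, np.ones((k, k))))  # nearest-neighbour upsample
    return np.stack(feats, axis=-1)                     # (H, W, num_scales)
```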
In some embodiments, the occupancy grid and optical flow prediction network is trained on sample images labeled with occupancy grid labels and optical flow labels, where an optical flow label indicates the displacement of a pixel point between two adjacent sample frames, and the prediction unit 502 is specifically configured to:
fusing the features corresponding to the multi-frame grid map information by using the occupancy grid and optical flow prediction network;
and predicting the occupancy grid information and optical flow information of the dynamic object for multiple future frames from the fused features.
In some embodiments, the track prediction network is trained from a sample image marked with a real track, and the prediction unit 502 is specifically configured to:
carrying out semantic enhancement on the features corresponding to each frame of grid map information by using the track prediction network;
and predicting the track information of the dynamic object for multiple future frames from the enhanced features.
In some embodiments, the dynamic object corresponds to a plurality of pieces of optical flow information, each piece of optical flow information corresponding to a piece of occupancy grid information, and the optimizing unit 503 is specifically configured to:
acquiring, from the occupancy grid information of the dynamic object, target occupancy grid information that overlaps the occupancy grids of other dynamic objects;
and deleting the optical flow information corresponding to the target occupancy grid information.
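The deletion rule can be sketched as follows, assuming each hypothesis for one object is an (occupancy, flow) pair and overlap is tested cell-wise against the combined occupancy of the other objects; the function name and data layout are illustrative:

```python
import numpy as np

def prune_flow_hypotheses(hypotheses, others_occupancy):
    """hypotheses: list of (occupancy, flow) pairs for one dynamic object.
    others_occupancy: combined occupancy grid of all other dynamic objects.
    Drop every hypothesis whose occupancy overlaps another object's grid."""
    return [(occ, flow) for occ, flow in hypotheses
            if not np.any((occ > 0) & (others_occupancy > 0))]
```

Hypotheses that would place the object inside a cell already claimed by another object are removed, so the surviving flow field only describes physically reachable motion.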
Fig. 6 is a schematic structural diagram of an electronic device for trajectory prediction provided in at least one embodiment of the present disclosure. As shown in fig. 6, the electronic device includes a processor and a memory for storing computer instructions executable on the processor; when the computer instructions are executed, the processor implements the trajectory prediction method of any of the embodiments of the present disclosure.
At least one embodiment of the present disclosure also proposes a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the trajectory prediction methods of the present disclosure.
One skilled in the art will appreciate that one or more embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Moreover, one or more embodiments of the present disclosure may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
"And/or" in this disclosure means at least one of the two, e.g., "A and/or B" includes three schemes A, B, and "A and B".
The embodiments in this disclosure are described in a progressive manner; identical or similar parts of the embodiments may be cross-referenced, and each embodiment focuses on its differences from the others. In particular, the description of the data processing apparatus embodiments is relatively brief because they are substantially similar to the method embodiments; refer to the corresponding parts of the method embodiments for details.
The foregoing has described certain embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the acts or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Embodiments of the subject matter and the functional operations described in this disclosure may be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware including the structures disclosed in this disclosure and their structural equivalents, or in combinations of one or more of them. Embodiments of the subject matter described in this disclosure can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, data processing apparatus. Alternatively or additionally, the program instructions may be encoded on a manually-generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode and transmit information to suitable receiver apparatus for execution by data processing apparatus. The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
The processes and logic flows described in this disclosure can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Computers suitable for executing computer programs include, for example, general purpose and/or special purpose microprocessors, or any other type of central processing unit. Typically, the central processing unit will receive instructions and data from a read only memory and/or a random access memory. The essential elements of a computer include a central processing unit for carrying out or executing instructions and one or more memory devices for storing instructions and data. Typically, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks, etc. However, a computer does not have to have such a device. Furthermore, the computer may be embedded in another device, such as a mobile phone, a Personal Digital Assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name a few.
Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices including, for example, semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disk or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
While this disclosure contains many specific implementation details, these should not be construed as limiting the scope of any invention or of the claims, but rather as describing features of particular embodiments of a particular invention. Certain features that are described in this disclosure in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Furthermore, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or a variation of a subcombination.
Similarly, although operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system modules and components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Thus, particular embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. Furthermore, the processes depicted in the accompanying drawings are not necessarily required to be in the particular order shown, or sequential order, to achieve desirable results. In some implementations, multitasking and parallel processing may be advantageous.
The foregoing description of the preferred embodiment(s) of the present disclosure is merely intended to illustrate the embodiment(s) of the present disclosure, and any modifications, equivalents, improvements, etc. that fall within the spirit and principles of the embodiment(s) of the present disclosure are intended to be included within the scope of the present disclosure.
Claims (10)
1. A method of trajectory prediction, the method comprising:
Acquiring grid map information of a multi-frame target scene, wherein the grid map information comprises static information of the target scene, occupation grid information of a plurality of dynamic objects in the target scene, dynamic information and historical track information, and the dynamic information is obtained by mapping information of the plurality of dynamic objects sensed by a target vehicle onto a grid map;
predicting according to the grid map information of the multi-frame target scene to obtain the occupied grid information, the optical flow information and the track information of the dynamic object in the future multi-frame;
Optimizing the optical flow information by utilizing the occupancy grid information of the dynamic object;
and verifying the track information by utilizing the optimized optical flow information to obtain the target track of the dynamic object.
2. The method of claim 1, wherein the static information comprises semantic information for each pixel in the grid map, the method further comprising:
acquiring semantic information of each pixel point in the grid map;
Acquiring the positions of a plurality of dynamic objects in a target scene and the size information of the dynamic objects;
Mapping the position of the dynamic object and the size information into the grid map to obtain the occupied grid information of the dynamic object;
And acquiring dynamic information of the dynamic object, and marking the dynamic information on each pixel point in the range of the occupied grid of the dynamic object to obtain grid map information of the target scene.
3. The method of claim 2, wherein the dynamic information includes a velocity, an acceleration, and an attitude angle of the dynamic object;
the marking the dynamic information on each pixel point in the range of the occupied grid of the dynamic object comprises the following steps:
According to the speed, the acceleration and the attitude angle of the dynamic object and a preset time interval, determining the displacement vector of each pixel point in the range of the occupied grid of the dynamic object;
and taking the components of the displacement vector of the pixel point in the x-axis direction and the y-axis direction as optical flow information of the pixel point.
4. The method according to claim 1, wherein predicting the occupancy grid information, the optical flow information, and the trajectory information of the dynamic object in the future from the grid map information of the multi-frame target scene includes:
extracting features of grid map information of a multi-frame target scene to obtain features of the grid map information;
inputting the features into a pre-trained occupancy grid and optical flow prediction network to obtain occupancy grid information and optical flow information of the dynamic object for multiple future frames;
and inputting the features into a pre-trained track prediction network to obtain track information of the dynamic object for multiple future frames.
5. The method according to claim 4, wherein the feature extraction of the grid map information of the multi-frame target scene to obtain the features of the grid map information includes:
downsampling the grid map information of each frame of the target scene multiple times to obtain sub-features at different scales;
and fusing the sub-features at different scales to obtain the features of the grid map information.
6. The method of claim 4, wherein the occupancy grid and optical flow prediction network is trained on sample images labeled with occupancy grid labels and optical flow labels, the optical flow labels indicating the displacement of each pixel point between two adjacent sample frames;
the step of inputting the characteristics into a pre-trained occupation grid and optical flow prediction network to predict, so as to obtain occupation grid information and optical flow information of the dynamic object in the future multiframes, which comprises the following steps:
fusing the features corresponding to the multi-frame grid map information by using the occupancy grid and optical flow prediction network;
and predicting the occupancy grid information and optical flow information of the dynamic object for multiple future frames from the fused features.
7. The method of claim 4, wherein the trajectory prediction network is trained from sample images labeled with real trajectories;
The step of inputting the characteristics into a pre-trained track prediction network for prediction to obtain track information of the dynamic object in a future multiframe comprises the following steps:
carrying out semantic enhancement on the features corresponding to each frame of grid map information by using the track prediction network;
and predicting the track information of the dynamic object for multiple future frames from the enhanced features.
8. The method of any one of claims 1 to 7, wherein the dynamic object corresponds to a plurality of pieces of optical flow information, each piece of optical flow information corresponding to a piece of occupancy grid information;
the optimizing the optical flow information by using the occupancy grid information of the dynamic object includes:
acquiring, from the occupancy grid information of the dynamic object, target occupancy grid information that overlaps the occupancy grids of other dynamic objects;
and deleting the optical flow information corresponding to the target occupancy grid information.
9. A trajectory prediction device, the device comprising:
an acquisition unit, configured to acquire grid map information of a multi-frame target scene, wherein the grid map information comprises static information of the target scene, and occupancy grid information, dynamic information, and historical track information of a plurality of dynamic objects in the target scene, the dynamic information being obtained by mapping information of the plurality of dynamic objects sensed by a target vehicle onto a grid map;
The prediction unit is used for predicting and obtaining the occupied grid information, the optical flow information and the track information of the dynamic object in the future multiframes according to the grid map information of the multiframe target scene;
The optimizing unit is used for optimizing the optical flow information by using the occupancy grid information of the dynamic object;
And the verification unit is used for verifying the track information by utilizing the optimized optical flow information to obtain the target track of the dynamic object.
10. An electronic device, the device comprising:
A processor;
a memory for storing instructions executable by the processor, wherein the processor is configured to execute the instructions to perform the method of any one of claims 1 to 8.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211401325.1A CN115641362B (en) | 2022-11-09 | 2022-11-09 | Trajectory prediction method, device and storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211401325.1A CN115641362B (en) | 2022-11-09 | 2022-11-09 | Trajectory prediction method, device and storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN115641362A CN115641362A (en) | 2023-01-24 |
| CN115641362B true CN115641362B (en) | 2025-08-22 |
Family
ID=84948020
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202211401325.1A Active CN115641362B (en) | 2022-11-09 | 2022-11-09 | Trajectory prediction method, device and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN115641362B (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN120635679B (en) * | 2025-08-13 | 2025-11-07 | 北京人形机器人创新中心有限公司 | Model training methods, indoor scene occupancy prediction methods, equipment and media |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113753077A (en) * | 2021-08-17 | 2021-12-07 | 北京百度网讯科技有限公司 | Method and device for predicting movement locus of obstacle and automatic driving vehicle |
| CN115146873A (en) * | 2022-07-30 | 2022-10-04 | 重庆长安汽车股份有限公司 | Vehicle track prediction method and system |
Family Cites Families (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11465633B2 (en) * | 2018-11-14 | 2022-10-11 | Huawei Technologies Co., Ltd. | Method and system for generating predicted occupancy grid maps |
| CN111046919B (en) * | 2019-11-21 | 2023-05-12 | 南京航空航天大学 | A system and method for predicting surrounding dynamic vehicle trajectories based on behavioral intentions |
| US12001958B2 (en) * | 2020-03-19 | 2024-06-04 | Nvidia Corporation | Future trajectory predictions in multi-actor environments for autonomous machine |
| CN111626097A (en) * | 2020-04-09 | 2020-09-04 | 吉利汽车研究院(宁波)有限公司 | A method, device, electronic device and storage medium for predicting the future trajectory of an obstacle |
| CN115246416B (en) * | 2021-05-13 | 2023-09-26 | 上海仙途智能科技有限公司 | Trajectory prediction method, device, equipment and computer-readable storage medium |
| CN114647246A (en) * | 2022-03-24 | 2022-06-21 | 重庆长安汽车股份有限公司 | Local path planning method and system for time-space coupling search |
- 2022
  - 2022-11-09 CN CN202211401325.1A patent/CN115641362B/en — Active
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113753077A (en) * | 2021-08-17 | 2021-12-07 | 北京百度网讯科技有限公司 | Method and device for predicting movement locus of obstacle and automatic driving vehicle |
| CN115146873A (en) * | 2022-07-30 | 2022-10-04 | 重庆长安汽车股份有限公司 | Vehicle track prediction method and system |
Also Published As
| Publication number | Publication date |
|---|---|
| CN115641362A (en) | 2023-01-24 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN112015847B (en) | Obstacle trajectory prediction method and device, storage medium and electronic equipment | |
| Krajewski et al. | The round dataset: A drone dataset of road user trajectories at roundabouts in germany | |
| CN108073950A (en) | Recognition methods, identification device, identifier generation method and identifier generating means | |
| US20210216798A1 (en) | Using captured video data to identify pose of a vehicle | |
| CN108021858A (en) | Mobile object recognition methods and object flow analysis method | |
| CN112212874A (en) | Vehicle track prediction method and device, electronic equipment and computer readable medium | |
| CN112444258B (en) | Method for determining drivable area, intelligent driving system and intelligent car | |
| CN113424209B (en) | Trajectory prediction using deep learning multi-predictor fusion and Bayesian optimization | |
| EP4137845A1 (en) | Methods and systems for predicting properties of a plurality of objects in a vicinity of a vehicle | |
| KR102517086B1 (en) | Apparatus and method for generating a lane polyline using a neural network model | |
| CN115523934A (en) | Vehicle track prediction method and system based on deep learning | |
| JP2024019629A (en) | Prediction device, prediction method, program and vehicle control system | |
| CN115115084B (en) | Predicting future movement of agents in an environment using occupied flow fields | |
| CN117015792A (en) | System and method for generating object detection tags for automated driving with concave image magnification | |
| Chamola et al. | Overtaking mechanisms based on augmented intelligence for autonomous driving: Data sets, methods, and challenges | |
| CN113112524A (en) | Method and device for predicting track of moving object in automatic driving and computing equipment | |
| Friji et al. | A dqn-based autonomous car-following framework using rgb-d frames | |
| JP2023548516A (en) | Methods for providing information about road users | |
| WO2019195191A1 (en) | Dynamic image region selection for visual inference | |
| CN115641362B (en) | Trajectory prediction method, device and storage medium | |
| EP4148600A1 (en) | Attentional sampling for long range detection in autonomous vehicles | |
| KR102499023B1 (en) | Apparatus and method for determining traffic flow by lane | |
| CN119283893A (en) | Vehicle driving state prediction method and related device, equipment, and storage medium | |
| US12276511B2 (en) | Method of selecting a route for recording vehicle | |
| WO2020073271A1 (en) | Snapshot image of traffic scenario |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant | ||
| CP03 | Change of name, title or address | | |

Address after: 314000 Zhejiang Province, Jiaxing City, Nanhu District, Yuxin Town, Jiangxian Road 570, Building 4, Room 516
Patentee after: Jiaxing Xiantu Intelligent Technology Co.,Ltd.
Country or region after: China
Address before: 201600, 5, 13 building, 68 Chuang Chuang Road, Songjiang District, Shanghai
Patentee before: SHANGHAI XIANTU INTELLIGENT TECHNOLOGY Co.,Ltd.
Country or region before: China