Detailed Description
The present application is described in further detail below with reference to the attached figures.
In a typical configuration of the present application, the terminal, the device serving the network, and the trusted party each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include computer readable media in the form of volatile memory, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer readable media do not include transitory computer readable media (transitory media), such as modulated data signals and carrier waves.
In one implementation of the present application, a user device for generating augmented reality video information of a user scene is provided; in another implementation of the application, a network device for generating augmented reality video information of a user scene is also provided; further, in an implementation of the present application, a system for generating augmented reality video information of a user scene is also provided, where the system includes the one or more user devices and the network device. The user device may include, but is not limited to, various mobile devices such as a smartphone, a tablet, and a smart wearable device. In one implementation, the user equipment includes an acquisition module, such as a camera for image or video acquisition, and may further include a microphone for audio acquisition. The network device may include, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud server, where the cloud server is a virtual supercomputer running in a distributed system and composed of a group of loosely coupled computers, and is used to provide a simple, efficient, secure, reliable computing service with scalable processing power. In the present application, the user device may be referred to as user equipment 1, and the network device may be referred to as network device 2 (refer to FIG. 1).
FIG. 1 illustrates a system diagram for generating augmented reality video information of a user scene in accordance with an aspect of the present application. The system comprises a user equipment 1 and a network device 2. The user equipment 1 comprises a video key frame sending device 11, a scene object related information obtaining device 12, an image calibration recognition device 13, and a synthesizing device 14; the network device 2 comprises a video key frame obtaining device 21, an image matching identification device 22, and a scene object related information sending device 23.
The video key frame sending device 11 may send the video key frame of the first video stream corresponding to the user scene to the corresponding network device 2; correspondingly, the video key frame obtaining device 21 may obtain a video key frame corresponding to a user scene of the user equipment 1; then, the image matching identification device 22 may perform image matching identification on the video key frame to determine the scene object related information corresponding to the video key frame; then, the scene object related information sending means 23 may send the scene object related information to the user equipment 1; correspondingly, the scene object related information obtaining device 12 may obtain the scene object related information corresponding to the video key frame, which is determined by the network device 2 based on the image matching identification; then, the image calibration recognition device 13 may perform image calibration recognition on the target frame of the second video stream acquired by the user equipment 1 based on the scene object related information; then, the synthesizing device 14 may synthesize the corresponding virtual object and the second video stream into the augmented reality video information based on the result of the image calibration recognition.
In the present application, the generated augmented reality video information of the user scene may be applied to the scene video presentation of a single user, such as a single-user video recording mode, and may also be shared by each user with other users when multiple users interact, such as a multi-user video chat mode, so that the other users can see the augmented reality video information of that user's scene. In addition, any other mode to which the augmented reality video information of the user scene can be applied may serve as an application scene of the present application and is included in the protection scope of the present application.
Specifically, the video key frame sending device 11 may send the video key frame of the first video stream corresponding to the user scene to the corresponding network device 2. Then, correspondingly, the video key frame obtaining device 21 may obtain a video key frame corresponding to the user scene of the user device 1.
In one implementation, the user equipment 1 further includes a capturing device (not shown) configured to capture a first video stream corresponding to a user scene. Here, the capturing device is used for collecting video information, namely the video stream, of the corresponding user during video recording or interaction with other users. In this application, the first video stream may be a video stream captured at any time. In one implementation, the first video stream of the user scene may be captured by various types of cameras, or a combination of cameras, on the user device 1. Here, the first video stream corresponds to a plurality of consecutive frames, each frame corresponds to corresponding image information, and each object in the image information is a scene object in the user scene. In one implementation, the user equipment 1 may acquire, in real time, the first video stream corresponding to the scene objects.
The user equipment 1 then further comprises a video key frame determination device (not shown), which may determine video key frames from the first video stream. Here, a video key frame may be one or more frames in the first video stream, and the criteria for identifying a video key frame may be customized based on different scene needs. In one implementation, when the image information of a frame of the first video stream changes greatly compared with that of a previous frame, for example, a scene object is added or removed, or a scene object moves enough to reach a preset image information change threshold, the frame is determined to be a video key frame. Then, the video key frame sending device 11 may send the video key frame corresponding to the scene objects to the corresponding network device 2, so that image matching identification can be performed on the video key frame in the network device 2; the image matching identification is used to effectively determine the core information for identifying the scene objects, such as attribute information, position information, and surface information of the scene objects. In contrast, a frame whose image information does not change much compared to the previous frame may be determined as a non-video key frame that does not need to be uploaded; in actual operation, such a non-video key frame may simply be ignored, or it may be recognized on the user equipment 1 through image calibration recognition. In the application, only a small number of video key frames need to be transmitted between the user equipment 1 and the corresponding network device 2, so the transmission data volume is small, the network delay is low, the burden on data communication is light, and the user experience is not affected; meanwhile, the strong computing and storage capacity of the network device 2 effectively compensates for the inability of the user equipment 1 to perform a large amount of complex image recognition.
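By way of illustration only, the following sketch shows one possible customization of the key-frame criterion described above, assuming frames are available as numpy arrays; the differencing metric and the threshold value are illustrative assumptions rather than values specified by the present application.

```python
import numpy as np

# Illustrative threshold: fraction of mean absolute pixel change above which
# a frame is treated as having "changed greatly" relative to the last key frame.
CHANGE_THRESHOLD = 0.12  # hypothetical value, tuned per scene

def is_video_key_frame(frame: np.ndarray, last_key_frame: np.ndarray) -> bool:
    """Decide whether `frame` should become a new video key frame.

    Both arguments are H x W x C uint8 images from the same video stream.
    The rule below (normalized mean absolute difference) is only one possible
    realization of the "image information change" criterion in the text.
    """
    diff = np.abs(frame.astype(np.int16) - last_key_frame.astype(np.int16))
    change_ratio = diff.mean() / 255.0
    return change_ratio >= CHANGE_THRESHOLD

def select_key_frames(frames):
    """Yield (index, frame) pairs for frames selected as video key frames."""
    last_key = None
    for i, frame in enumerate(frames):
        if last_key is None or is_video_key_frame(frame, last_key):
            last_key = frame
            yield i, frame
        # Non-key frames are not uploaded; they may be ignored or handled
        # locally by image calibration recognition, as described above.
```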
In one implementation, an information transmission channel may be established between the network device 2 and one or more user devices, and between multiple user devices that interact with each other through video. The information transmission channel may include a signaling channel and a data channel, where the signaling channel is responsible for transmitting small-volume contents such as control instructions, and the data channel is responsible for transmitting large-volume contents such as video key frames, video streams, and virtual object sets.
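As a non-limiting sketch of how the two channels might be distinguished in code, the following separates small control messages from bulky payloads; the message fields and kind names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class SignalingMessage:
    """Small control messages, e.g. session setup or a key-frame notification."""
    kind: str                               # e.g. "start_session", "keyframe_ready"
    params: Dict[str, Any] = field(default_factory=dict)

@dataclass
class DataMessage:
    """Bulky payloads: encoded key frames, video streams, virtual object sets."""
    kind: str                               # e.g. "video_key_frame", "virtual_object_set"
    payload: bytes = b""

def route(message) -> str:
    """Pick the channel for a message; bulky content goes over the data channel."""
    return "data_channel" if isinstance(message, DataMessage) else "signaling_channel"
```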
In one implementation, the user equipment 1 may acquire a video stream corresponding to the scene object in real time. Further, there may be video key frames in each video stream. For example, one or more key frames may be present in both the first video stream and the subsequent second video stream. Furthermore, in one implementation, the video key frame may be determined in real time, and the video key frame may be set to be sent to the corresponding network device 2. For example, the determination and uploading of video key frames in the first video stream may be performed as described above; in another example, the determination and uploading of the video key frame may also be performed on the subsequent second video stream.
Then, the image matching identification device 22 may perform image matching identification on the video key frame to determine the scene object related information corresponding to the video key frame; then, the scene object related information sending device 23 may send the scene object related information to the user equipment 1; correspondingly, the scene object related information obtaining device 12 may obtain the scene object related information corresponding to the video key frame, which is determined by the network device 2 based on image matching identification. In one implementation, the image matching identification may be performed on the video key frame through a scene object database preset in or callable by the network device 2, or through image recognition models preset in the network device 2 and trained through machine learning on a large amount of data, so as to recognize one or more scene objects in the video key frame and match corresponding scene object related information to those scene objects.
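The following sketch illustrates, under simplifying assumptions, how image matching identification on a key frame might look on the network device 2 side: a toy global descriptor is matched against a preset scene object database. The descriptor, the database contents, and the result fields are hypothetical stand-ins for a real scene object database or trained recognition model.

```python
import numpy as np

def compute_descriptor(image: np.ndarray) -> np.ndarray:
    """Toy global descriptor: a normalized gray-level histogram of the image.
    A real system would use a trained recognition model instead."""
    gray = image.mean(axis=2) if image.ndim == 3 else image
    hist, _ = np.histogram(gray, bins=32, range=(0, 255))
    return hist / max(hist.sum(), 1)

# Hypothetical preset scene object database: object name -> reference descriptor.
SCENE_OBJECT_DB = {}

def register_reference(name: str, reference_image: np.ndarray) -> None:
    SCENE_OBJECT_DB[name] = compute_descriptor(reference_image)

def match_scene_objects(key_frame: np.ndarray, max_distance: float = 0.5):
    """Return candidate scene objects recognized in a video key frame.

    Each result carries the attribute information (what the object is) plus
    placeholders for position and surface information, which a real matcher
    would also estimate from the key frame.
    """
    desc = compute_descriptor(key_frame)
    results = []
    for name, ref in SCENE_OBJECT_DB.items():
        distance = float(np.abs(desc - ref).sum())
        if distance <= max_distance:
            results.append({
                "attribute": name,
                "position": None,   # e.g. image coordinates of the object
                "surface": None,    # e.g. contour of its upper surface
                "score": 1.0 - distance,
            })
    return sorted(results, key=lambda r: r["score"], reverse=True)
```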
In one implementation, the scene object related information includes at least any one of: 1) attribute information of the scene object; 2) position information of the scene object; 3) surface information of the scene object. For example, a table image in a video key frame needs to be identified as a table object, and the position coordinates of the table in the image as well as the orientation of the table surface, e.g., the orientation of the table top, need to be identified, so that a virtual object can subsequently be placed on the table and interaction can be provided.
Specifically, in one implementation, the attribute information of the scene object may indicate what the scene object is. Here, fuzzy matching may be implemented, for example identifying the scene object as a building, furniture, or plant; further, more accurate matching may also be achieved, for example identifying the scene object as a tower, a table, or a tree. In one implementation, the position information of the scene object may include image position information of the scene object in the video key frame, such as coordinate information, for example the contour coordinates of a tower or the position coordinates of a table. In one implementation, the surface information of the scene object may include surface contour information of the object, where the surface contour of the scene object to be identified may be specified; for example, the upper surface of a table needs to be identified so that a virtual object can subsequently be added on the table top, and in that case the identified surface information mainly includes the information of the table's upper surface.
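For illustration, the three kinds of scene object related information could be carried in a structure such as the following; the field types and the example values for a recognized table are assumptions chosen for clarity.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SceneObjectInfo:
    """Scene object related information returned with a video key frame."""
    attribute: str                          # what the object is, e.g. "table"
    position: Tuple[int, int, int, int]     # image-space bounding box (x, y, w, h)
    surface: List[Tuple[int, int]]          # contour of the surface of interest,
                                            # e.g. the table's upper surface

# Example: a table recognized in a key frame, with its top-surface outline,
# so that a virtual object can later be placed on the table top.
table_info = SceneObjectInfo(
    attribute="table",
    position=(120, 260, 300, 180),
    surface=[(120, 260), (420, 260), (420, 310), (120, 310)],
)
```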
Here, those skilled in the art should understand that the attribute information, position information, and surface information of the scene object are only examples, and other existing or future forms of scene object related information, as applicable to the present application, should also be included in the protection scope of the present application and are hereby incorporated by reference.
Then, the image calibration recognition device 13 may perform image calibration recognition on the target frame of the second video stream acquired by the user equipment 1, based on the scene object related information. Here, the image calibration recognition is a supplement to the image matching identification of the network device 2: the image matching identification is only performed on video key frames, whereas on the user device 1, during video recording, video chat, or other interaction, the capturing device, such as a camera, collects the video stream in real time, that is, collects multiple consecutive frames in real time, and the picture information of each frame may change compared with the previous frame. Such changes may be slight and may be identified without complex image matching operations; in that case, image calibration recognition may be used in cooperation. Here, based on the scene object related information already identified for the video key frame through image matching identification, such as the attribute information, position information, and surface information of the scene objects, image calibration recognition may be performed on the target frame of the second video stream, which is the new video stream currently acquired by the user equipment 1. The image calibration recognition aims to determine the scene object related information of the target frame, and in particular to identify slight changes in the position information, surface information, and the like of the scene objects, so that, based on the scene object related information of the target frame determined from the recognition result, virtual objects can be overlaid and synthesized, rendering the second video stream with an augmented reality effect. In one implementation, each frame in the second video stream may be set as a target frame, or one or more frames in the second video stream may be set as target frames.
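A minimal sketch of one possible image calibration recognition step is given below, assuming the object's position in the key frame is known from image matching identification: the previously identified region is re-located by a small local search in the target frame. The search radius, step size, and sum-of-absolute-differences cost are illustrative choices, not requirements of the application.

```python
import numpy as np

def calibrate_position(target_frame, key_frame, box, search_radius=16):
    """Re-locate a scene object in a target frame of the second video stream.

    `box` is the object's (x, y, w, h) bounding box known from image matching
    identification on the key frame. Because frame-to-frame changes are usually
    slight, a small local search around the previous position is enough; the
    sum-of-absolute-differences criterion below is only one possible choice.
    """
    x, y, w, h = box
    template = key_frame[y:y + h, x:x + w].astype(np.int32)
    best, best_pos = None, (x, y)
    for dy in range(-search_radius, search_radius + 1, 2):
        for dx in range(-search_radius, search_radius + 1, 2):
            nx, ny = x + dx, y + dy
            if nx < 0 or ny < 0 or ny + h > target_frame.shape[0] or nx + w > target_frame.shape[1]:
                continue
            patch = target_frame[ny:ny + h, nx:nx + w].astype(np.int32)
            cost = np.abs(patch - template).mean()
            if best is None or cost < best:
                best, best_pos = cost, (nx, ny)
    # Calibrated position information of the scene object in the target frame.
    return (*best_pos, w, h)
```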
Then, the synthesizing device 14 may synthesize the corresponding virtual object and the second video stream into the augmented reality video information based on the result of the image calibration recognition. In one implementation, one or more target frames in the second video stream that have undergone image calibration recognition may each be composited with the corresponding virtual object. For example, the image information of one target frame is superimposed with the image information corresponding to a virtual object or model, thereby synthesizing augmented reality image information corresponding to the image information of that target frame. The augmented reality video information corresponding to the second video stream may include one or more frames of augmented reality image information, for example augmented reality image information corresponding to consecutive frames of the video stream. In one implementation, the image information of the target frame of the second video stream may be replaced with the augmented reality image information. In addition, in one implementation, the virtual object may come from a set of virtual objects acquired from the network device 2 or another third-party device, such as various virtual article images or models; in another implementation, the virtual object may also be extracted from the user equipment 1, for example a picture in a picture application of the user equipment 1, such as a photo in a mobile phone album. Furthermore, in one implementation, the corresponding virtual object may be a single virtual object or a combination of multiple virtual objects; for example, a virtual photo frame determined from a virtual object set may be combined with a photo in the user's mobile phone album to form a framed photo.
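The following sketch illustrates one way the synthesis could be performed per target frame, assuming the virtual object is an RGBA image and the anchor point (e.g. a point on a table's upper surface) comes from image calibration recognition; the alpha blending shown is only one possible compositing method.

```python
import numpy as np

def composite_virtual_object(target_frame, virtual_rgba, anchor_xy):
    """Overlay a virtual object (RGBA image) onto one target frame.

    `anchor_xy` is where the object should appear, e.g. a point on the table's
    upper surface obtained from image calibration recognition. The result is
    one frame of augmented reality image information; repeating this for each
    target frame yields the augmented reality video information.
    """
    frame = target_frame.copy()
    x, y = anchor_xy
    h, w = virtual_rgba.shape[:2]
    h = min(h, frame.shape[0] - y)
    w = min(w, frame.shape[1] - x)
    if h <= 0 or w <= 0:
        return frame
    obj = virtual_rgba[:h, :w]
    alpha = obj[..., 3:4].astype(np.float32) / 255.0
    region = frame[y:y + h, x:x + w].astype(np.float32)
    blended = alpha * obj[..., :3].astype(np.float32) + (1.0 - alpha) * region
    frame[y:y + h, x:x + w] = blended.astype(frame.dtype)
    return frame
```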
Herein, the video key frame corresponding to the scene objects is sent to the corresponding network device 2, and the scene object related information corresponding to the video key frame, such as the attribute information, position information, and surface information of the scene objects, determined by the network device 2 based on image matching identification, is acquired; then, the user device 1 performs image calibration recognition on each target frame in the second video stream currently acquired by the user device 1 in real time, in combination with the scene object related information acquired from the network device 2, and synthesizes the corresponding virtual object and the second video stream into augmented reality video information based on the image calibration recognition result. Here, the approach of combining the image matching identification of the network device 2 with the image calibration recognition of the user device 1 breaks through the limitation in the prior art that only simple face recognition can be realized because of the limited computing power and storage capacity of mobile devices, so that the range of recognizable objects can be effectively expanded to any scene object in the user scene. On one hand, the core information for identifying the scene objects, such as attribute information, position information, and surface information, can be effectively determined by using the computing and storage capacity of the network device 2, which is stronger than that of the user device 1, to perform image matching identification on the video key frame; on the other hand, the user equipment 1 may further perform image calibration recognition, aimed at deviation correction, on the video stream updated in real time in the user equipment 1, such as the target frames of the second video stream, based on the result of the image matching identification of the network device 2, so that scene objects in each frame of image on the current user equipment 1 can be accurately recognized; then, based on the result of the image calibration recognition, the corresponding virtual object is synthesized with the second video stream and rendered as augmented reality video information that can be presented to the user. In the application, because any scene object corresponding to the user equipment 1 can be recognized and synthesized, the augmented reality video information presented by the application offers an obvious visual breakthrough compared with traditional video applications or existing augmented reality video chat applications, and the variability of the augmented reality video information seen by the user is greatly enhanced, thereby increasing the user's interest in interaction and optimizing the user's intelligent video experience.
Meanwhile, only a small amount of video key frames or scene object related information corresponding to the video key frames need to be transmitted between the user equipment 1 and the corresponding network equipment 2, so that the transmission data volume is small, the network delay is small, the burden on data communication is small, and the user experience is not influenced.
In one implementation, the image calibration recognition device 13 includes a first image calibration recognition unit (not shown) and a first determination unit (not shown). The first image calibration recognition unit may perform image calibration recognition on a first target frame of the second video stream acquired by the user equipment 1, based on the scene object related information; the first determination unit may determine the scene object related information corresponding to the first target frame based on the image calibration recognition performed on the first target frame.
In particular, in this implementation, a target frame in the second video stream, such as the first target frame, may undergo image calibration recognition with reference to the scene object related information of a video key frame of the first video stream. First, the image information of the first target frame is compared with the image information of the video key frame to determine the difference between the two, for example by comparing the outlines of the scene objects or comparing the positions of the scene objects; then, based on the known scene object related information of the video key frame, such as the attribute information, position information, and surface information of the scene objects, each specific item of scene object related information corresponding to the first target frame is calculated. For example, when the first target frame is compared with the video key frame and the image position of a scene object, such as a table, has moved, the comparison yields the position offset of the table between the two frames, and the actual position coordinates of the table in the first target frame can be determined by combining this offset with the known position coordinates of the table in the video key frame. In one implementation, any target frame in the second video stream may serve as the first target frame, so that one or more first target frames may be identified based on the scene object related information with reference to a video key frame of the first video stream.
Then, the synthesizing device 14 may synthesize the corresponding virtual object and the first target frame into first augmented reality image information based on the scene object related information corresponding to the first target frame; then, augmented reality video information is generated based on the first augmented reality image information. In an implementation manner, the image information included in the augmented reality video information may consist entirely of augmented reality image information similar or identical to the first augmented reality image information, or may also include some ordinary image information without an augmented reality effect.
Further, in one implementation, the image calibration identification device 13 further includes a second image calibration identification unit (not shown), and a second determination unit (not shown). The second image calibration identification unit may perform image calibration identification on a second target frame of a second video stream acquired by the user equipment 1 based on scene object related information corresponding to the first target frame; next, the second determining unit may determine scene object related information corresponding to the second target frame based on image calibration recognition performed on the second target frame.
In particular, in this implementation, a target frame in the second video stream, such as the second target frame, may undergo image calibration recognition with reference to the scene object related information of the first target frame. In one implementation, the second target frame may be a frame in the second video stream that follows the first target frame in sequence. In this case, the first target frame appears closer in time to the second target frame than the video key frame of the first video stream does, so it is reasonable to expect that the image information of the first target frame is more likely to be similar to the image information of the second target frame.
Further, in an implementation manner, if the user equipment 1 acquires a new video key frame after the video key frame of the first video stream, and the new video key frame appears after the first target frame in sequence, then the image information of the new video key frame is more likely to be close to the image information of the second target frame than that of the first target frame is; in this case, the new video key frame may be preferentially used as the reference for identifying the image information of the second target frame.
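A trivial sketch of this reference selection, under the assumption that each candidate reference carries its frame index, could look as follows; the tuple layout is hypothetical.

```python
def choose_calibration_reference(references):
    """Pick the reference most likely to resemble the next target frame.

    `references` is a list of (frame_index, frame, scene_object_info) tuples
    that may mix video key frames and already-calibrated target frames. The
    simple rule sketched here follows the text: the most recently acquired
    reference (e.g. a new key frame appearing after the first target frame)
    is preferred over older ones.
    """
    return max(references, key=lambda ref: ref[0])
```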
Then, the synthesizing device 14 may synthesize the corresponding virtual object and the second target frame into second augmented reality image information based on the scene object related information corresponding to the second target frame; then, augmented reality video information is generated based on the first augmented reality image information and the second augmented reality image information. In an implementation manner, the image information included in the augmented reality video information may consist entirely of augmented reality image information similar or identical to the first augmented reality image information or the second augmented reality image information, or may also include some ordinary image information without an augmented reality effect.
In one implementation, the user equipment 1 further comprises a presentation device (not shown); the presenting means may present the augmented reality video information corresponding to the second video stream.
Specifically, the user equipment 1 may play the augmented reality video information in real time on the display screen of the corresponding device. For example, while the user equipment 1, such as a mobile phone, is shooting and recording, the application performs augmented reality effect processing on the video stream acquired in real time and presents the corresponding augmented reality video information on the mobile phone in real time; for another example, when the user video chats with another user through the user equipment 1, the user's mobile phone may present a video picture with an augmented reality effect, and the mobile phone of the other user interacting with the user may also view the augmented reality video information.
In one implementation, the user equipment 1 further includes a user interaction device (not shown), which may provide the augmented reality video information to one or more other user equipments corresponding to the user equipment 1. In the application, the user scene video presentation based on augmented reality may be not only a scene video presentation of a single user, such as a single-user video recording mode, but also a user scene video sharing mode in which each user shares its own user scene video with other users during multi-user interaction, such as a multi-user video chat mode. In an implementation manner, the augmented reality video information, for example an augmented reality video stream, may be sent by the user equipment 1 to the corresponding network device, such as the network device 2, and then the network device 2 forwards the augmented reality video information to the corresponding other user equipments. In another implementation, the user equipment 1 and the other user equipments may also directly exchange their respective augmented reality video information without the intermediation of the network device 2.
In one implementation, the user equipment 1 further includes a scene interaction device (not shown), and the scene interaction device can obtain operation instruction information of the user on the virtual object and execute a corresponding operation based on the operation instruction information. For example, a user may control a virtual object in a video recording scene or a video chat scene by touch or voice; for instance, a virtual pet may be placed on a table surface in the real environment, and a user who records the video or participates in the chat may control the virtual pet to perform a series of actions by touching the screen, speaking, and the like. In one implementation, the interaction with the virtual object in the augmented reality video information may be performed by the user corresponding to the user equipment 1; in another implementation, if the user interacts with other users, such as in a multi-user video chat, the interaction with the virtual object may also be carried out by the other users based on the shared augmented reality video information.
Further, in one implementation, the scene interaction device includes at least any one of the following. A first scene interaction unit (not shown) may acquire touch screen operation information of the user and determine the operation instruction information of the user on the virtual object based on the touch screen operation information; for example, if the virtual object is a pet puppy, the user may instruct the puppy in the video to perform a corresponding reaction by tapping a preset region of the screen, such as the region where the puppy is located, and the virtual puppy may, for instance, wag its tail when the user taps the screen; for another example, if the virtual object is a photo set in the user's mobile phone, photos may be switched through a sliding operation on the touch screen. A second scene interaction unit (not shown) may obtain gesture information of the user through the camera device of the user equipment and determine the operation instruction information of the user on the virtual object based on the gesture information; for example, the user's hand motion is captured by a camera, gesture information such as tapping or clicking is extracted from the captured images, and the operation instruction information is then determined based on a preset correspondence between gesture information and operation instruction information. A third scene interaction unit (not shown) may acquire voice information of the user and determine the operation instruction information of the user on the virtual object based on the voice information, where the voice information of the user may be acquired through a microphone built into the user equipment 1, and the operation instruction information is determined based on a preset correspondence between voice information and operation instruction information. Therefore, the interaction experience of the user can be further enriched through the interaction between the user and the virtual object in the augmented reality video information.
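Purely as an illustration of how the three interaction units might map user input to operation instruction information, the following dispatch sketch uses hypothetical instruction names and input encodings.

```python
# Hypothetical correspondences between user input and operation instructions.
TOUCH_INSTRUCTIONS = {
    "tap_on_object": "wag_tail",     # e.g. tapping the virtual puppy's region
    "swipe_left": "next_photo",      # e.g. switching photos in a virtual frame
}
GESTURE_INSTRUCTIONS = {"tap": "wag_tail", "click": "select_object"}
VOICE_INSTRUCTIONS = {"sit": "pet_sit", "jump": "pet_jump"}

def to_operation_instruction(channel: str, event: str):
    """Translate touch / gesture / voice input into an operation instruction."""
    table = {
        "touch": TOUCH_INSTRUCTIONS,
        "gesture": GESTURE_INSTRUCTIONS,
        "voice": VOICE_INSTRUCTIONS,
    }[channel]
    return table.get(event)

def execute(instruction: str) -> None:
    """Apply the instruction to the virtual object (placeholder)."""
    print(f"virtual object performs: {instruction}")

execute(to_operation_instruction("touch", "tap_on_object"))  # -> wag_tail
```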
In one implementation, the user equipment 1 further includes a virtual object set obtaining device (not shown) and a target virtual object determining device (not shown), and the network device 2 further includes a virtual object set sending device (not shown). Specifically, the virtual object set sending device may send a virtual object set matching the scene object related information corresponding to the video key frame to the user equipment 1, and the virtual object set is correspondingly obtained by the virtual object set obtaining device. For example, the network device 2 may screen out a set of virtual objects matching the scene object determined in the video key frame based on the attribute information of that scene object; if the scene object is a tree, a set of virtual objects including various small virtual animals may be screened out according to the needs of the user scene. For another example, filtering parameters such as the size of the virtual object may be set in combination with scene object related information such as the position information and surface information of the scene object. Then, the target virtual object determining device may determine one or more target virtual objects from the virtual object set, and the synthesizing device 14 may synthesize the target virtual objects and the second video stream into augmented reality video information based on the result of the image calibration recognition. Here, matching a corresponding virtual object set for the user equipment 1 can enrich the composite presentation effect of the augmented reality video information and, at the same time, optimize the user's intelligent experience.
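The following sketch illustrates, with a hypothetical catalogue and hypothetical filtering fields, how the network device 2 might screen a virtual object set against a scene object's attribute and a size constraint derived from its position or surface information.

```python
# Hypothetical catalogue of virtual objects on the network device; the
# "fits_on" and "max_size" fields are illustrative filtering parameters.
VIRTUAL_OBJECT_CATALOGUE = [
    {"name": "virtual_squirrel", "fits_on": {"tree"}, "max_size": 80},
    {"name": "virtual_bird", "fits_on": {"tree", "table"}, "max_size": 60},
    {"name": "virtual_photo_frame", "fits_on": {"table"}, "max_size": 120},
]

def match_virtual_object_set(scene_object_attribute: str, surface_size: int):
    """Screen out virtual objects matching a recognized scene object.

    `scene_object_attribute` is the attribute information (e.g. "tree");
    `surface_size` is an illustrative size derived from position/surface
    information and used here to bound the virtual object's size.
    """
    return [
        obj for obj in VIRTUAL_OBJECT_CATALOGUE
        if scene_object_attribute in obj["fits_on"] and obj["max_size"] <= surface_size
    ]

# Example: virtual objects that could be placed on a recognized tree.
print(match_virtual_object_set("tree", surface_size=100))
```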
Fig. 2 shows a flowchart of a method for generating augmented reality video information of a user scene at a user device side and a network device side according to another aspect of the present application. The method includes step S301, step S302, step S303, step S304, step S401, step S402, and step S403.
In step S301, the user equipment 1 may send a video key frame of a first video stream corresponding to a user scene to a corresponding network device 2; correspondingly, in step S401, the network device 2 may obtain a video key frame corresponding to a user scene of the user device 1; next, in step S402, the network device 2 may perform image matching recognition on the video key frame to determine scene object related information corresponding to the video key frame; next, in step S403, the network device 2 may send the scene object related information to the user device 1; correspondingly, in step S302, the user equipment 1 may obtain scene object related information corresponding to the video key frame, which is determined by the network equipment 2 based on image matching identification; next, in step S303, the user equipment 1 may perform image calibration recognition on a target frame of the second video stream acquired by the user equipment 1 based on the scene object related information; next, in step S304, the user equipment 1 may synthesize the corresponding virtual object and the second video stream into augmented reality video information based on the result of the image calibration recognition.
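To summarize the interplay of steps S301 through S403, the following condensed sketch strings the steps together with direct function calls; `select_key_frames`, `network_match`, `calibrate`, and `composite` are placeholders for the key-frame selection, image matching recognition, image calibration recognition, and synthesis steps described in this application, not an actual API.

```python
def generate_ar_video(first_stream, second_stream, virtual_object,
                      select_key_frames, network_match, calibrate, composite):
    """Condensed client-side view of steps S301-S304 and S401-S403."""
    # S301 / S401-S403: upload key frames, receive scene object related info.
    scene_info = None
    for _, key_frame in select_key_frames(first_stream):
        scene_info = network_match(key_frame)              # S402 on the network device

    # S302-S304: calibrate each target frame locally and composite the result.
    ar_frames = []
    for target_frame in second_stream:
        scene_info = calibrate(target_frame, scene_info)   # S303
        ar_frames.append(composite(target_frame, virtual_object, scene_info))  # S304
    return ar_frames
```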
In the present application, the generated augmented reality video information of the user scene may be applied to the scene video presentation of a single user, such as a single-user video recording mode, and may also be shared by each user with other users when multiple users interact, such as a multi-user video chat mode, so that the other users can see the augmented reality video information of that user's scene. In addition, any other mode to which the augmented reality video information of the user scene can be applied may serve as an application scene of the present application and is included in the protection scope of the present application.
Specifically, in step S301, the user equipment 1 may send a video key frame of a first video stream corresponding to a user scene to a corresponding network device. In step S401, the network device 2 may obtain a video key frame corresponding to the user scene of the user device 1.
In one implementation, the method further includes step S306 (not shown), and in step S306, the user equipment 1 may capture a first video stream corresponding to a user scene. Here, the capturing device is used for collecting video information, namely the video stream, of the corresponding user during video recording or interaction with other users. In this application, the first video stream may be a video stream captured at any time. In one implementation, the first video stream of the user scene may be captured by various types of cameras, or a combination of cameras, on the user device 1. Here, the video stream corresponds to a plurality of consecutive frames, each frame corresponds to corresponding image information, and each object in the image information is a scene object in the user scene. In one implementation, the user equipment 1 may acquire, in real time, the first video stream corresponding to the scene objects.
Next, the method further comprises step S307 (not shown), where in step S307 the user equipment 1 may determine video key frames from the first video stream. Here, a video key frame may be one or more frames in the first video stream, and the criteria for identifying a video key frame may be customized based on different scene needs. In one implementation, when the image information of a frame of the first video stream changes greatly compared with that of a previous frame, for example, a scene object is added or removed, or a scene object moves enough to reach a preset image information change threshold, the frame is determined to be a video key frame. Next, in step S301, the user equipment 1 may send the video key frame corresponding to the scene objects to the corresponding network device 2, so that image matching recognition can be performed on the video key frame in the network device 2; the image matching recognition is used to effectively determine the core information for identifying the scene objects, such as attribute information, position information, and surface information of the scene objects. In contrast, a frame whose image information does not change much compared to the previous frame may be determined as a non-video key frame that does not need to be uploaded; in actual operation, such a non-video key frame may simply be ignored, or it may be recognized on the user equipment 1 through image calibration recognition. In the application, only a small number of video key frames need to be transmitted between the user equipment 1 and the corresponding network device 2, so the transmission data volume is small, the network delay is low, the burden on data communication is light, and the user experience is not affected; meanwhile, the strong computing and storage capacity of the network device 2 effectively compensates for the inability of the user equipment 1 to perform a large amount of complex image recognition.
In one implementation, an information transmission channel may be established between the network device 2 and one or more user devices, and between multiple user devices that interact with each other through video. The information transmission channel may include a signaling channel and a data channel, where the signaling channel is responsible for transmitting small-volume contents such as control instructions, and the data channel is responsible for transmitting large-volume contents such as video key frames, video streams, and virtual object sets.
In one implementation, the user equipment 1 may acquire a video stream corresponding to the scene object in real time. Further, there may be video key frames in each video stream. For example, one or more key frames may be present in both the first video stream and the subsequent second video stream. Furthermore, in one implementation, the video key frame may be determined in real time, and the video key frame may be set to be sent to the corresponding network device 2. For example, the determination and uploading of video key frames in the first video stream may be performed as described above; in another example, the determination and uploading of the video key frame may also be performed on the subsequent second video stream.
Next, in step S402, the network device 2 may perform image matching recognition on the video key frame to determine the scene object related information corresponding to the video key frame; next, in step S403, the network device 2 may send the scene object related information to the user device 1; correspondingly, in step S302, the user equipment 1 may acquire the scene object related information corresponding to the video key frame, which is determined by the network device 2 based on image matching identification. In one implementation, the image matching recognition may be performed on the video key frame through a scene object database preset in or callable by the network device 2, or through image recognition models preset in the network device 2 and trained through machine learning on a large amount of data, so as to recognize one or more scene objects in the video key frame and match corresponding scene object related information to those scene objects.
In one implementation, the scene object related information includes at least any one of: 1) attribute information of the scene object; 2) position information of the scene object; 3) surface information of the scene object. For example, a table image in a video key frame needs to be identified as a table object, and the position coordinates of the table in the image as well as the orientation of the table surface, e.g., the orientation of the table top, need to be identified, so that a virtual object can subsequently be placed on the table and interaction can be provided.
Specifically, in one implementation, the attribute information of the scene object may indicate what the scene object is. Here, fuzzy matching may be implemented, for example identifying the scene object as a building, furniture, or plant; further, more accurate matching may also be achieved, for example identifying the scene object as a tower, a table, or a tree. In one implementation, the position information of the scene object may include image position information of the scene object in the video key frame, such as coordinate information, for example the contour coordinates of a tower or the position coordinates of a table. In one implementation, the surface information of the scene object may include surface contour information of the object, where the surface contour of the scene object to be identified may be specified; for example, the upper surface of a table needs to be identified so that a virtual object can subsequently be added on the table top, and in that case the identified surface information mainly includes the information of the table's upper surface.
Here, those skilled in the art should understand that the attribute information, position information, and surface information of the scene object are only examples, and other existing or future forms of scene object related information, as applicable to the present application, should also be included in the protection scope of the present application and are hereby incorporated by reference.
Next, in step S303, the user equipment 1 may perform image calibration recognition on the target frame of the second video stream acquired by the user equipment 1, based on the scene object related information. Here, the image calibration recognition is a supplement to the image matching identification of the network device 2: the image matching identification is only performed on video key frames, whereas on the user device 1, during video recording, video chat, or other interaction, the capturing device, such as a camera, collects the video stream in real time, that is, collects multiple consecutive frames in real time, and the picture information of each frame may change compared with the previous frame. Such changes may be slight and may be identified without complex image matching operations; in that case, image calibration recognition may be used in cooperation. Here, based on the scene object related information already identified for the video key frame through image matching identification, such as the attribute information, position information, and surface information of the scene objects, image calibration recognition may be performed on the target frame of the second video stream, which is the new video stream currently acquired by the user equipment 1. The image calibration recognition aims to determine the scene object related information of the target frame, and in particular to identify slight changes in the position information, surface information, and the like of the scene objects, so that, based on the scene object related information of the target frame determined from the recognition result, virtual objects can be overlaid and synthesized, rendering the second video stream with an augmented reality effect. In one implementation, each frame in the second video stream may be set as a target frame, or one or more frames in the second video stream may be set as target frames.
Next, in step S304, the user equipment 1 may synthesize the corresponding virtual object and the second video stream into augmented reality video information based on the result of the image calibration recognition. In one implementation, one or more target frames in the second video stream that have undergone image calibration recognition may each be composited with the corresponding virtual object. For example, the image information of one target frame is superimposed with the image information corresponding to a virtual object or model, thereby synthesizing augmented reality image information corresponding to the image information of that target frame. The augmented reality video information corresponding to the second video stream may include one or more frames of augmented reality image information, for example augmented reality image information corresponding to consecutive frames of the video stream. In one implementation, the image information of the target frame of the second video stream may be replaced with the augmented reality image information. In addition, in one implementation, the virtual object may come from a set of virtual objects acquired from the network device 2 or another third-party device, such as various virtual article images or models; in another implementation, the virtual object may also be extracted from the user equipment 1, for example a picture in a picture application of the user equipment 1, such as a photo in a mobile phone album. Furthermore, in one implementation, the corresponding virtual object may be a single virtual object or a combination of multiple virtual objects; for example, a virtual photo frame determined from a virtual object set may be combined with a photo in the user's mobile phone album to form a framed photo.
Herein, the video key frame corresponding to the scene objects is sent to the corresponding network device 2, and the scene object related information corresponding to the video key frame, such as the attribute information, position information, and surface information of the scene objects, determined by the network device 2 based on image matching identification, is acquired; then, the user device 1 performs image calibration recognition on each target frame in the second video stream currently acquired by the user device 1 in real time, in combination with the scene object related information acquired from the network device 2, and synthesizes the corresponding virtual object and the second video stream into augmented reality video information based on the image calibration recognition result. Here, the approach of combining the image matching identification of the network device 2 with the image calibration recognition of the user device 1 breaks through the limitation in the prior art that only simple face recognition can be realized because of the limited computing power and storage capacity of mobile devices, so that the range of recognizable objects can be effectively expanded to any scene object in the user scene. On one hand, the core information for identifying the scene objects, such as attribute information, position information, and surface information, can be effectively determined by using the computing and storage capacity of the network device 2, which is stronger than that of the user device 1, to perform image matching identification on the video key frame; on the other hand, the user equipment 1 may further perform image calibration recognition, aimed at deviation correction, on the video stream updated in real time in the user equipment 1, such as the target frames of the second video stream, based on the result of the image matching identification of the network device 2, so that scene objects in each frame of image on the current user equipment 1 can be accurately recognized; then, based on the result of the image calibration recognition, the corresponding virtual object is synthesized with the second video stream and rendered as augmented reality video information that can be presented to the user. In the application, because any scene object corresponding to the user equipment 1 can be recognized and synthesized, the augmented reality video information presented by the application offers an obvious visual breakthrough compared with traditional video applications or existing augmented reality video chat applications, and the variability of the augmented reality video information seen by the user is greatly enhanced, thereby increasing the user's interest in interaction and optimizing the user's intelligent video experience.
Meanwhile, only a small amount of video key frames or scene object related information corresponding to the video key frames need to be transmitted between the user equipment 1 and the corresponding network equipment 2, so that the transmission data volume is small, the network delay is small, the burden on data communication is small, and the user experience is not influenced.
In one implementation, the step S303 includes a step S3031 (not shown), and a step S3032 (not shown). In step S3031, the user equipment 1 may perform image calibration recognition on a first target frame of a second video stream acquired by the user equipment 1 based on the scene object related information; in step S3032, the user equipment 1 may determine scene object related information corresponding to the first target frame based on image calibration identification performed on the first target frame.
In particular, in this implementation, a target frame in the second video stream, such as the first target frame, may undergo image calibration recognition with reference to the scene object related information of a video key frame of the first video stream. First, the image information of the first target frame is compared with the image information of the video key frame to determine the difference between the two, for example by comparing the outlines of the scene objects or comparing the positions of the scene objects; then, based on the known scene object related information of the video key frame, such as the attribute information, position information, and surface information of the scene objects, each specific item of scene object related information corresponding to the first target frame is calculated. For example, when the first target frame is compared with the video key frame and the image position of a scene object, such as a table, has moved, the comparison yields the position offset of the table between the two frames, and the actual position coordinates of the table in the first target frame can be determined by combining this offset with the known position coordinates of the table in the video key frame. In one implementation, any target frame in the second video stream may serve as the first target frame, so that one or more first target frames may be identified based on the scene object related information with reference to a video key frame of the first video stream.
Next, in step S304, the user equipment 1 may synthesize a corresponding virtual object and the first target frame into first augmented reality image information based on the scene object related information corresponding to the first target frame; then, augmented reality video information is generated based on the first augmented reality image information. In an implementation manner, the image information included in the augmented reality video information may consist entirely of augmented reality image information similar or identical to the first augmented reality image information, or may also include some ordinary image information without an augmented reality effect.
Further, in one implementation, the step S303 further includes a step S3033 (not shown), and a step S3034 (not shown). In step S3033, the user equipment 1 may perform image calibration identification on a second target frame of a second video stream acquired by the user equipment 1 based on the scene object related information corresponding to the first target frame; next, in step S3034, the user equipment 1 may determine scene object related information corresponding to the second target frame based on image calibration identification performed on the second target frame.
In particular, in this implementation, a target frame in the second video stream, such as the second target frame, may undergo image calibration recognition with reference to the scene object related information of the first target frame. In one implementation, the second target frame may be a frame in the second video stream that follows the first target frame in sequence. In this case, the first target frame appears closer in time to the second target frame than the video key frame of the first video stream does, so it is reasonable to expect that the image information of the first target frame is more likely to be similar to the image information of the second target frame.
Further, in an implementation manner, if the user equipment 1 acquires a new video key frame after the video key frame of the first video stream, and the new video key frame appears after the first target frame in sequence, then the image information of the new video key frame is more likely to be close to the image information of the second target frame than that of the first target frame is; in this case, the new video key frame may be preferentially used as the reference for identifying the image information of the second target frame.
Next, in step S304, the user equipment 1 may synthesize a corresponding virtual object and the second target frame into second augmented reality image information based on the scene object related information corresponding to the second target frame; then, augmented reality video information is generated based on the first augmented reality image information and the second augmented reality image information. In an implementation manner, the image information included in the augmented reality video information may consist entirely of augmented reality image information similar or identical to the first augmented reality image information or the second augmented reality image information, or may also include some ordinary image information without an augmented reality effect.
In one implementation, the method further includes step S305 (not shown); in step S305, the user equipment 1 may present the augmented reality video information corresponding to the second video stream.
Specifically, the user equipment 1 may play the augmented reality video information in real time on the display screen of the corresponding device. For example, while the user equipment 1, such as a mobile phone, is shooting and recording, the application performs augmented reality effect processing on the video stream acquired in real time and presents the corresponding augmented reality video information on the mobile phone in real time; for another example, when the user video chats with another user through the user equipment 1, the user's mobile phone may present a video picture with an augmented reality effect, and the mobile phone of the other user interacting with the user may also view the augmented reality video information.
In one implementation, the method further includes step S308 (not shown); in step S308, the user device 1 may provide the augmented reality video information to one or more other user devices corresponding to the user device 1. In the present application, the augmented-reality-based presentation of a user scene video may be not only a scene video presentation for a single user, such as a single-user video recording mode, but also a sharing mode in which, during interaction among multiple users, each user shares its own user scene video with the other users, such as a multi-user video chat mode. In an implementation manner, the augmented reality video information, for example an augmented reality video stream, may be sent by the user equipment 1 to a corresponding network device, such as the network device 2, and the network device 2 then forwards the augmented reality video information to the corresponding other user equipment. In another implementation, the user equipment 1 and the other user equipments may also exchange their respective augmented reality video information directly, without the network device 2 acting as an intermediary.
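A hedged sketch of step S308: each synthesized AR frame is encoded and either pushed to the network device, which relays it to the other user devices in the session, or sent to a peer device directly. The relay URL, JPEG framing, and use of HTTP POST are illustrative assumptions; an actual deployment would more likely use a streaming protocol such as RTP or WebRTC.

```python
import cv2
import requests

RELAY_URL = "http://example-network-device/relay"  # hypothetical relay endpoint

def share_ar_frame(ar_frame_bgr, session_id, via_network_device=True, peer_url=None):
    """Encode one augmented reality frame as JPEG and forward it, either
    through the network device or directly to a peer user device."""
    ok, jpeg = cv2.imencode(".jpg", ar_frame_bgr)
    if not ok:
        return False
    target = RELAY_URL if via_network_device else peer_url
    resp = requests.post(
        target,
        data=jpeg.tobytes(),
        headers={"X-Session-Id": session_id, "Content-Type": "image/jpeg"})
    return resp.status_code == 200
```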
In one implementation, the method further includes step S309 (not shown); in step S309, the user equipment 1 may acquire operation instruction information of the user on the virtual object, and execute a corresponding operation based on the operation instruction information. For example, a user may control a virtual object in a recorded video scene or a video chat scene by touching it or speaking to it; for instance, a virtual pet may be placed on a table surface in the real environment, and a user who records the video or participates in the chat may control the virtual pet to perform a series of actions by touch, voice, and the like. In one implementation, the interaction with the virtual object in the augmented reality video information may be performed by the user corresponding to the user equipment 1; in another implementation, if the user interacts with other users, such as in a multi-user video chat, the other users may also interact with the virtual object based on the shared augmented reality video information.
Further, in an implementation manner, step S309 further includes at least any one of step S3091 (not shown), step S3092 (not shown), and step S3093 (not shown). In step S3091, the user equipment 1 may obtain touch screen operation information of the user, and determine the operation instruction information of the user on the virtual object based on the touch screen operation information; for example, if the virtual object is a pet puppy, the user may instruct the puppy in the video to perform a corresponding reaction by tapping a preset region of the screen, such as the region where the puppy is located, and the virtual puppy may, for instance, wag its tail when the user taps the screen. For another example, if the virtual object is a photo set on the user's mobile phone, switching between photos may be performed through a sliding operation on the touch screen. In step S3092, the user equipment 1 may obtain gesture information of the user through the camera device of the user equipment, and determine the operation instruction information of the user on the virtual object based on the gesture information; for example, the camera captures the user's hand motion, gesture information such as tapping or pointing is extracted from the captured images, and the operation instruction information is then determined based on a preset correspondence between the gesture information and the operation instruction information. In step S3093, the user equipment 1 may acquire voice information of the user, and determine the operation instruction information of the user on the virtual object based on the voice information, where the voice information may be acquired through a microphone built into the user equipment 1, and the operation instruction information is determined based on a preset correspondence between the voice information and the operation instruction information. In this way, the interaction between the user and the virtual object in the augmented reality video information can further enrich the user's interaction experience.
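The three input paths of steps S3091 to S3093 all reduce to looking up a preset correspondence between raw user input and an operation instruction; the event names, instruction strings, and dispatch tables below are assumptions used purely to illustrate that mapping, not part of the method itself.

```python
# Preset correspondences between user input and operation instruction
# information (hypothetical values for illustration).
TOUCH_MAP = {"tap_pet_region": "wag_tail", "swipe_left": "next_photo"}
GESTURE_MAP = {"beckon": "come_here", "point": "sit"}
VOICE_MAP = {"sit": "sit", "roll over": "roll_over"}

def instruction_from_touch(event_name):
    """Step S3091: map a touch-screen event to an operation instruction."""
    return TOUCH_MAP.get(event_name)

def instruction_from_gesture(gesture_name):
    """Step S3092: map a camera-extracted gesture to an operation instruction."""
    return GESTURE_MAP.get(gesture_name)

def instruction_from_voice(transcript):
    """Step S3093: map a recognized voice phrase to an operation instruction."""
    return VOICE_MAP.get(transcript.strip().lower())

def execute_instruction(virtual_object, instruction):
    """Have the virtual object perform the named action, if any was matched."""
    if instruction is not None:
        virtual_object.perform(instruction)  # perform() is assumed on the object
```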
In one implementation, the method further includes step S310 (not shown), step S311 (not shown), and step S404 (not shown).
Specifically, in step S404, the network device 2 may send the set of virtual objects matching the scene object related information corresponding to the video key frame to the user device 1, and correspondingly, in step S310, the user device 1 obtains the set of virtual objects. For example, the network device 2 may screen out a set of virtual objects matching the scene object determined in the video key frame based on the attribute information of that scene object; if the scene object is determined to be a tree, a set of virtual objects including various small virtual animals may be screened out to suit the user scene. For another example, filtering parameters such as the size of the virtual object may be set in combination with scene object related information such as the position information and surface information of the scene object. Next, in step S311, the user equipment 1 may determine one or more target virtual objects from the set of virtual objects, so that, in step S304, the user equipment 1 may synthesize the target virtual objects and the second video stream into the augmented reality video information based on the result of the image calibration identification. By matching a corresponding set of virtual objects for the user equipment 1, this implementation can enrich the composite rendering effect of the augmented reality video information and, at the same time, improve the intelligent experience of the user.
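One way to read steps S404, S310, and S311 is as a filter over a catalogue of virtual objects on the network device side, followed by a selection on the user device side; the attribute names and the size-fit heuristic below are assumptions chosen only to illustrate that flow.

```python
def match_virtual_objects(catalogue, scene_object):
    """Network device side (step S404): keep only virtual objects whose
    declared habitats include the scene object type and whose footprint
    fits the detected surface."""
    return [v for v in catalogue
            if scene_object["type"] in v["habitats"]
            and v["footprint"] <= scene_object["surface_area"]]

def choose_target(virtual_object_set, preferred_name=None):
    """User device side (step S311): pick the user's preferred virtual object
    if it is in the matched set, otherwise fall back to the first entry."""
    for v in virtual_object_set:
        if v["name"] == preferred_name:
            return v
    return virtual_object_set[0] if virtual_object_set else None

# Example: a tree scene object matches only small tree-dwelling virtual animals.
catalogue = [{"name": "squirrel", "habitats": {"tree"}, "footprint": 0.1},
             {"name": "elephant", "habitats": {"savannah"}, "footprint": 4.0}]
scene_object = {"type": "tree", "surface_area": 0.5}
target = choose_target(match_virtual_objects(catalogue, scene_object), "squirrel")
```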
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the apparatus claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.