CN114596576A

CN114596576A - An image processing method, device, electronic device and storage medium

Info

Publication number: CN114596576A
Application number: CN202210242840.3A
Authority: CN
Inventors: 王晓燕; 吕鹏原; 范森; 章成全; 姚锟
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2022-03-11
Filing date: 2022-03-11
Publication date: 2022-06-07

Abstract

The present disclosure provides an image processing method, device, electronic device and storage medium, which relate to the technical field of artificial intelligence, and further to the technical fields of computer vision and deep learning, so as to at least solve the technical problem of low target object recognition efficiency in the related art. The specific implementation scheme is: acquiring a target image, wherein the target image includes an object to be recognized; detecting the target image to obtain target pixel data, wherein the target pixel data is used to represent the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized; and correcting the target image based on the target pixel data to obtain a correction result.

Description

An image processing method, device, electronic device and storage medium

Technical Field

The present disclosure relates to the technical field of artificial intelligence, further to the technical fields of computer vision and deep learning, and in particular to an image processing method, apparatus, electronic device, and storage medium.

Background

Text recognition for express waybills generally includes two parts: text detection and text recognition. However, during actual transportation and sorting, express packages are placed at random and the shooting angle is not fixed, so the captured pictures may be upright, upside down, tilted, distorted, and so on. Performing text detection and recognition directly is difficult, and recognition after manual straightening greatly increases labor and time costs. Therefore, the accuracy of detecting and recognizing express waybills with the prior art is low.

Summary of the Invention

The present disclosure provides an image processing method, apparatus, electronic device and storage medium, so as to at least solve the technical problem of low accuracy in detecting express objects in the related art.

According to an aspect of the present disclosure, an image processing method is provided, including: acquiring a target image, wherein the target image includes an object to be recognized; detecting the target image to obtain target pixel data, wherein the target pixel data is used to represent the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized; and correcting the target image based on the target pixel data to obtain a correction result.

According to yet another aspect of the present disclosure, an image processing apparatus is provided, including: an acquisition module, configured to acquire a target image, wherein the target image includes an object to be recognized; a detection module, configured to detect the target image to obtain target pixel data, wherein the target pixel data is used to represent the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized; and a correction module, configured to correct the target image based on the target pixel data to obtain a correction result.

According to yet another aspect of the present disclosure, an electronic device is provided, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the image processing method proposed by the present disclosure.

According to yet another aspect of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, wherein the computer instructions are used to cause a computer to execute the image processing method proposed by the present disclosure.

According to yet another aspect of the present disclosure, a computer program product is provided, including a computer program which, when executed by a processor, implements the image processing method proposed by the present disclosure.

In the present disclosure, a target image in a target scene is first acquired, wherein the target image includes an object to be recognized; the target image is then detected to obtain target pixel data, wherein the target pixel data is used to represent the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized; finally, the target image is corrected based on the target pixel data to obtain a correction result. This improves the recognition efficiency for the target image. Notably, the target pixel data can be used to represent the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized, and correcting the target image based on the target pixel data can further improve recognition accuracy and reduce false detections, thereby solving the technical problem of low accuracy in detecting express objects in the related art.

It should be understood that what is described in this section is not intended to identify key or critical features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.

Brief Description of the Drawings

The accompanying drawings are provided for a better understanding of the present solution and do not constitute a limitation of the present disclosure. In the drawings:

FIG. 1 is a partial view of an express waybill according to an embodiment of the present disclosure;

FIG. 2 is a block diagram of the hardware structure of a computer terminal (or mobile device) for implementing a data processing method according to an embodiment of the present disclosure;

FIG. 3 is a flowchart of a data processing method according to the first embodiment of the present disclosure;

FIG. 4a is a diagram of an irregularly placed express waybill according to an embodiment of the present disclosure;

FIG. 4b is a diagram of an irregularly placed express waybill according to an embodiment of the present disclosure;

FIG. 5a is an outer-frame detection diagram of an express waybill according to an embodiment of the present disclosure;

FIG. 5b is an outer-frame correction diagram of an express waybill according to an embodiment of the present disclosure;

FIG. 5c is a flowchart of another data processing method according to the second embodiment of the present disclosure;

FIG. 6a is a sample image according to an embodiment of the present disclosure;

FIG. 6b is a diagram of the central Gaussian distribution region of a sample image according to an embodiment of the present disclosure;

FIG. 6c is a flowchart of another data processing method according to the third embodiment of the present disclosure;

FIG. 7 is a structural block diagram of a data processing apparatus according to an embodiment of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings. Various details of the embodiments are included to facilitate understanding and should be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and structures are omitted from the following description for clarity and conciseness.

It should be noted that the terms "first", "second" and the like in the description and claims of the present disclosure and in the above drawings are used to distinguish similar objects, and are not necessarily used to describe a particular order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments of the present disclosure described herein can be practiced in orders other than those illustrated or described herein. Furthermore, the terms "comprising" and "having", and any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device comprising a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such a process, method, product or device.

With the development of e-commerce and transportation, the express delivery industry is booming. In 2021, the volume of express deliveries in China exceeded 10 billion, playing a very important role in promoting consumption and accelerating economic circulation. There are dozens of common express companies on the market, each with multiple styles of express waybill, so the information is varied and complex. Every day, in the course of distributing and delivering parcels at an express station, each worker handles hundreds of parcels. Manually entering tracking numbers and recipient information, updating the logistics status in the logistics company's internal information management system, or assigning deliveries to areas according to telephone numbers, addresses and similar information consumes a great deal of labor and time. Where fast delivery is required, manual operation is error-prone and may lead to complaints.

At present, the main methods for improving target detection and recognition are as follows:

Method 1: four-direction classification. A four-direction classification model outputs one of four orientations of the picture (up, down, left, right), and the express waybill picture is then corrected by rotating it 90°, 180° or 270° according to that orientation.

Method 2: regression. A regression model directly detects the four vertices of the main body of the express waybill.

Method 3: segmentation. Based on a segmentation algorithm, the method outputs the position of the main-body region, together with the positions of the half region and of the upper-left quarter region defined with the text upright. The main-body region position and the quarter-region position are combined to determine the four vertex coordinates of the main body and the starting vertex.

Each of these related-art methods has problems. Method 1, four-direction classification: it is difficult to classify pictures that were taken with a tilted camera and therefore carry an affine-transformation angle, or pictures rotated by about 45 degrees. Even when the classification is correct, the rotated text still has a certain tilt, which affects the accuracy of subsequent text detection and recognition. Method 2, regression: when express waybills come in many layouts and complex styles, the vertex positions may be inaccurate. Method 3, segmentation: FIG. 1 is a partial view of an express waybill; in scenarios with small or partial waybill images such as the one shown in FIG. 1, the features are scattered, the segmentation accuracy is easily affected by large-area image features such as barcodes, and the quarter-region map is error-prone, which leads to errors in determining the starting vertex.

According to an embodiment of the present disclosure, an image processing method is provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.

The method embodiments provided by the embodiments of the present disclosure may be executed in a mobile terminal, a computer terminal, or a similar electronic device. Electronic devices are intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only, and are not intended to limit the implementations of the present disclosure described and/or claimed herein. FIG. 2 shows a block diagram of the hardware structure of a computer terminal (or mobile device) for implementing the image processing method.

As shown in FIG. 2, the computer terminal 200 includes a computing unit 201, which can perform various appropriate actions and processing according to a computer program stored in a read-only memory (ROM) 202 or a computer program loaded from a storage unit 208 into a random access memory (RAM) 203. The RAM 203 may also store various programs and data required for the operation of the computer terminal 200. The computing unit 201, the ROM 202, and the RAM 203 are connected to one another through a bus 204. An input/output (I/O) interface 205 is also connected to the bus 204.

A number of components in the computer terminal 200 are connected to the I/O interface 205, including: an input unit 206, such as a keyboard or a mouse; an output unit 207, such as various types of displays and speakers; a storage unit 208, such as a magnetic disk or an optical disk; and a communication unit 209, such as a network card, a modem, or a wireless communication transceiver. The communication unit 209 allows the computer terminal 200 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.

The computing unit 201 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 201 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, microcontroller, and the like. The computing unit 201 performs the image processing method described herein. For example, in some embodiments, the image processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 208. In some embodiments, part or all of the computer program may be loaded and/or installed on the computer terminal 200 via the ROM 202 and/or the communication unit 209. When the computer program is loaded into the RAM 203 and executed by the computing unit 201, one or more steps of the image processing method described herein may be performed. Alternatively, in other embodiments, the computing unit 201 may be configured to perform the image processing method in any other appropriate manner (for example, by means of firmware).

Various implementations of the systems and techniques described herein may be realized in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

It should be noted here that, in some optional embodiments, the electronic device shown in FIG. 2 may include hardware elements (including circuits), software elements (including computer code stored on a computer-readable medium), or a combination of both hardware and software elements. It should be pointed out that FIG. 2 is only one specific example, and is intended to illustrate the types of components that may be present in the electronic device described above.

In the above operating environment, the present disclosure provides the image processing method shown in FIG. 3, which can be executed by the computer terminal shown in FIG. 2 or a similar electronic device. FIG. 3 is a flowchart of an image processing method provided according to the first embodiment of the present disclosure. As shown in FIG. 3, the method may include the following steps:

Step S301: acquiring a target image, wherein the target image includes an object to be recognized.

The target image may be an image of an express package containing the object to be recognized, where the object to be recognized may be the express waybill on the package.

The express waybill may contain information such as the tracking number and recipient information.

The object to be recognized may also be an invoice, electronic card, poster, document or the like in the image.

In an optional embodiment, the target image may be acquired by a photographing device, such as a mobile phone or a camera.

In another optional embodiment, during actual transportation and sorting, express packages are placed at arbitrary angles and the camera shooting angle is not fixed, so the express package in the captured image may appear upright, upside down, tilted, or even distorted. FIG. 4a and FIG. 4b are images of irregularly placed express waybills according to the present disclosure. In the scenarios shown in FIG. 4a and FIG. 4b, performing text detection and recognition directly on the express package image is difficult, and recognition after manual straightening greatly increases labor and time costs. In the present disclosure, after the express package image is acquired, the express waybill in the image may be corrected to obtain an upright express waybill, thereby improving the detection accuracy for the express waybill.

Step S302: detecting the target image to obtain target pixel data, wherein the target pixel data is used to represent the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized.

The target pixel data may be the positional relationship between all pixels in the object to be recognized and the vertex coordinates of the object to be recognized. It may also be the positional relationship between a single pixel in the object to be recognized and the vertex coordinates, or between multiple pixels in the object to be recognized and the vertex coordinates. Optionally, the target pixel data may be the differences between the horizontal and vertical coordinates of the pixels inside the main-body frame of the express waybill and those of the four vertices of the main-body frame.
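The coordinate differences described above can be illustrated with a minimal sketch. The function name `vertex_offsets`, the sign convention (vertex minus pixel) and the sample coordinates are assumptions made for illustration; the disclosure only states that differences of horizontal and vertical coordinates are used.

```python
import numpy as np


def vertex_offsets(pixel_xy, vertices_xy):
    """Return the (dx, dy) differences from one pixel to each of four vertices.

    Assumed convention: offset = vertex - pixel, so pixel + offset = vertex.
    """
    pixel = np.asarray(pixel_xy, dtype=float)        # shape (2,)
    vertices = np.asarray(vertices_xy, dtype=float)  # shape (4, 2)
    return vertices - pixel                          # shape (4, 2)


# Hypothetical waybill frame (four corners) and one pixel inside it.
frame_vertices = [(10, 10), (110, 20), (100, 90), (0, 80)]
inner_pixel = (50, 40)
offsets = vertex_offsets(inner_pixel, frame_vertices)
```

Under this convention, a single pixel yields eight numbers (four dx, four dy), and adding them back to the pixel position recovers the four vertices.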

In an optional embodiment, the at least one pixel may be a pixel in the central region of the express waybill. Because the pixels in the central region contain more text information, using them allows the positional relationship between the pixels and the vertex coordinates to be represented more accurately.

In another optional embodiment, a detection model may be used to detect the target image to obtain the target pixel data. Optionally, the detection model may use multi-channel segmentation to output the differences between the horizontal and vertical coordinates of the pixels inside the main-body frame and those of the four vertices, from which the coordinates of the four vertices of the main body are calculated and the starting vertex among them is determined. The starting vertex may be determined according to the orientation of the text; optionally, the starting vertex among the four vertices may be the upper-left corner when the text is upright.
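How four vertex coordinates might be calculated from such multi-channel output can be sketched as follows. This is only an illustration under stated assumptions: the 8-channel layout (dx0, dy0, ..., dx3, dy3), the use of a boolean mask to select contributing pixels, and the averaging of per-pixel estimates are all choices made here, not details given in the disclosure. `decode_vertices` and `make_offset_map` are hypothetical helper names, and the offset map is synthetic rather than produced by a trained model.

```python
import numpy as np


def decode_vertices(offset_map, mask):
    """Estimate four vertices from per-pixel offsets.

    offset_map: array of shape (8, H, W); channels 2k and 2k+1 hold the
                assumed (dx, dy) from each pixel to vertex k.
    mask:       boolean array of shape (H, W) selecting contributing pixels.
    """
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        raise ValueError("mask selects no pixels")
    pixels = np.stack([xs, ys], axis=1).astype(float)     # (N, 2) as (x, y)
    offsets = offset_map[:, ys, xs].T.reshape(-1, 4, 2)   # (N, 4, 2)
    votes = pixels[:, None, :] + offsets                  # per-pixel vertex estimates
    return votes.mean(axis=0)                             # (4, 2)


def make_offset_map(vertices, height, width):
    """Build a synthetic, noise-free offset map for testing the decoder."""
    ys, xs = np.mgrid[0:height, 0:width]
    channels = []
    for vx, vy in vertices:
        channels.append(vx - xs)
        channels.append(vy - ys)
    return np.stack(channels).astype(float)


true_vertices = np.array([(2, 3), (15, 4), (16, 12), (1, 11)], dtype=float)
offset_map = make_offset_map(true_vertices, height=20, width=20)
center_mask = np.zeros((20, 20), dtype=bool)
center_mask[6:10, 6:12] = True   # hypothetical central region of the waybill
decoded = decode_vertices(offset_map, center_mask)
```

With real model output the per-pixel estimates would disagree slightly, which is why some aggregation step such as the mean used here is assumed.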

The detection model may be a convolutional neural network (CNN), where a convolutional neural network is a type of feedforward neural network that includes convolution computations and has a deep structure.

The starting vertex may be determined by the text direction. In an optional embodiment, the upper-left vertex when the text is upright may be regarded as the starting vertex.

The vertex coordinates are the coordinates of the points at the four corners of the express waybill.
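The starting-vertex convention can be sketched as a reordering of the vertex list. The sketch assumes the four vertices are already listed in a consistent cyclic order and that the index of the starting vertex is known; `order_from_start` is a hypothetical helper name.

```python
import numpy as np


def order_from_start(vertices_xy, start_index):
    """Cyclically reorder vertices so the starting vertex comes first."""
    vertices = np.asarray(vertices_xy, dtype=float)
    return np.roll(vertices, -start_index, axis=0)


# An upside-down waybill: the corner that is upper-left when the text is
# upright sits at the lower-right of the image (index 2 in image order).
image_order = [(0, 0), (10, 0), (10, 5), (0, 5)]
text_order = order_from_start(image_order, start_index=2)
```

After reordering, the first vertex is the one that should land in the upper-left corner of the corrected image.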

In another optional embodiment, the target object may be detected comprehensively from different angles based on pixel information, so as to obtain the target pixel data corresponding to the target image. Further, because the target pixel data is the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized, the four vertex coordinates and the starting vertex among them can be determined from the target pixel data. The object to be recognized in the target image can then be corrected according to the four vertex coordinates and the starting vertex, so that the object to be recognized is upright.

Step S303: correcting the target image based on the target pixel data to obtain a correction result.

In an optional embodiment, the four vertices and the starting vertex of the object to be recognized in the target image may be obtained by logical calculation from the target pixel data.

In another optional embodiment, the object to be recognized in the target image may be corrected according to the four vertices and the starting vertex, so that the object to be recognized is displayed upright. Because the text information in an upright object to be recognized is also upright, the accuracy of detecting that text information can be improved.

In another optional embodiment, the object to be recognized in the target image may be corrected through an affine transformation to obtain the correction result, where an affine transformation, in geometry, is a linear transformation of one vector space followed by a translation, mapping it into another vector space. The above steps greatly reduce the difficulty of detecting and recognizing text, since the text is now upright, and can significantly improve the accuracy of text recognition without manual straightening.
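The definition of the affine transformation (a linear map followed by a translation) can be made concrete with a short sketch that fits such a map from the four detected vertices to the corners of an upright rectangle. The least-squares fit, the output size, and the names `fit_affine` and `apply_affine` are assumptions for illustration; the disclosure does not specify how the transformation is estimated. The sketch transforms point coordinates only; resampling the image pixels would additionally need an image library.

```python
import numpy as np


def fit_affine(src_xy, dst_xy):
    """Least-squares fit of dst = src @ A.T + t; returns (A, t)."""
    src = np.asarray(src_xy, dtype=float)
    dst = np.asarray(dst_xy, dtype=float)
    design = np.hstack([src, np.ones((len(src), 1))])      # rows [x, y, 1]
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)  # shape (3, 2)
    return params[:2].T, params[2]                         # A (2x2), t (2,)


def apply_affine(points_xy, A, t):
    """Apply the linear map A followed by the translation t."""
    return np.asarray(points_xy, dtype=float) @ A.T + t


# Upright target rectangle, listed from the starting (upper-left) vertex.
width, height = 100.0, 60.0
upright = np.array([(0, 0), (width, 0), (width, height), (0, height)])

# Hypothetical detected vertices: the same rectangle rotated 30 degrees and shifted.
theta = np.deg2rad(30.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
detected = upright @ rotation.T + np.array([40.0, 25.0])

A, t = fit_affine(detected, upright)
corrected = apply_affine(detected, A, t)
```

An affine map reproduces rotation, scaling, shear and translation exactly. For a waybill photographed with perspective distortion, the four corners generally cannot all be matched by an affine map, so the fit above would only be an approximation in that case.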

In another optional embodiment, after the corrected target image is obtained, an express waybill recognition software development kit (SDK) may be integrated into hardware such as mobile phones, handheld scanners and document cameras. The SDK can automatically extract the tracking number, recipient information and the like from the express waybill in real time, which both ensures high recognition accuracy and greatly reduces the manual checking workload.

FIG. 5a is an outer-frame detection diagram of an express waybill in the present disclosure, and FIG. 5b is an outer-frame correction diagram of an express waybill in the present disclosure. As shown in FIG. 5a and FIG. 5b, in the present disclosure the target image is detected by the detection model to obtain the target pixel data, so that the four vertices of an express waybill region in any orientation (for example upright, upside down, tilted, or distorted) can be detected quickly, and the starting vertex and the vertex order can be determined according to the text direction, where the starting vertex may be the upper-left corner when the text is upright. The express waybill region can thus be corrected so that the text is upright, improving the accuracy of subsequent text detection and recognition.

According to steps S301 to S303 of the present disclosure, a target image in a target scene is first acquired, wherein the target image includes an object to be recognized; the target image is then detected to obtain target pixel data, wherein the target pixel data is used to represent the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized; finally, the target image is corrected based on the target pixel data to obtain a correction result. This improves the recognition efficiency for the target image. Notably, the target pixel data can be used to represent the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized, and correcting the target image based on the target pixel data can further improve recognition accuracy and reduce false detections, thereby solving the technical problem of low accuracy in detecting express objects in the related art.

Fig. 5c is a flowchart of an image processing method according to a second embodiment of the present disclosure. As shown in Fig. 5c, the method includes the following steps:

Step S501: acquire a target image, wherein the target image includes an object to be recognized.

Step S502: detect the target image to obtain target pixel data, wherein the target pixel data represents the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized.

Step S503: correct the target image based on the target pixel data to obtain a correction result.

Step S504: recognize the target image based on the correction result to obtain a recognition result, wherein the recognition result represents the text information of the object to be recognized in the target image.
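The flow of steps S501 to S504 can be sketched as a chain of three stages. This is only an illustrative skeleton: the function name `run_pipeline` and the three callables passed to it are hypothetical placeholders, not components named in the disclosure.

```python
from typing import Any, Callable


def run_pipeline(
    image: Any,
    detect: Callable[[Any], Any],
    correct: Callable[[Any, Any], Any],
    recognize: Callable[[Any], Any],
) -> Any:
    """Chain detection, correction and recognition (hypothetical helper)."""
    # S501: `image` is the acquired target image containing the object.
    # S502: detection yields the target pixel data.
    pixel_data = detect(image)
    # S503: correction uses the pixel data to produce an upright image.
    corrected = correct(image, pixel_data)
    # S504: recognition reads the text from the corrected image.
    return recognize(corrected)
```

Keeping the stages as separate callables mirrors the way the disclosure treats detection, correction, and recognition as distinct steps.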

Optionally, the target image is recognized based on the correction result to obtain a recognition result, wherein the recognition result represents the text information of the object to be recognized in the target image.

In an optional embodiment, an upright image of the object to be recognized can be determined from the correction result. Recognizing this upright image yields the text information recorded on the object, such as recipient information and the waybill number, so that the target image can be recognized accurately and highly accurate text information can be obtained.

When the target image is corrected, the correction can use the coordinates of the four vertices of the detected subject together with the starting-point coordinate information; an affine transformation can then correct the subject region of a waybill picture in any orientation into a picture with upright text. This greatly reduces the difficulty of detecting and recognizing the text, can significantly improve recognition accuracy, and requires no manual straightening. Once the correction is complete, the target image can be recognized based on the correction result, and the recognition result represents the text information of the object to be recognized. In an optional embodiment, recognition may also be performed on the more accurate object image, which improves the accuracy of the information recognized from the object.

Optionally, correcting the target image based on the target pixel data to obtain a correction result includes: determining, based on the target pixel data, the vertex coordinates of the object to be recognized and the sorting order of the vertex coordinates; and correcting the target image based on the vertex coordinates and their sorting order to obtain the correction result.

The sorting order may be clockwise or counterclockwise.

In an optional embodiment, the four vertex coordinates (x1, y1), (x2, y2), (x3, y3), and (x4, y4) of the subject region of the waybill picture can be obtained through logical calculation, and these coordinates can be sorted clockwise or counterclockwise. Here (x1, y1) is the top-left vertex when the text is upright and can be regarded as the starting point.
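To make the ordering convention concrete, the following minimal sketch rotates a list of four polygon vertices so that it begins at the starting vertex and runs in the requested direction. It assumes image coordinates with the y axis pointing down and assumes the starting vertex index is already known; the helper names `signed_area` and `order_from_start` are illustrative and do not come from the disclosure.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def signed_area(vertices: List[Point]) -> float:
    """Shoelace formula. With the y axis pointing down (image coordinates),
    a positive value means the vertices run clockwise on screen."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2.0


def order_from_start(
    vertices: List[Point], start_index: int, clockwise: bool = True
) -> List[Point]:
    """Return the vertices beginning at the start vertex, in the given direction."""
    n = len(vertices)
    rotated = [vertices[(start_index + i) % n] for i in range(n)]
    if (signed_area(rotated) > 0) != clockwise:
        # Keep the start vertex first and reverse the remaining ones.
        rotated = [rotated[0]] + rotated[:0:-1]
    return rotated
```

For an upside-down waybill, the text's top-left corner appears at the bottom-right of the image, so the starting vertex is the one at that image position.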

In another optional embodiment, based on the obtained vertex information and the corresponding starting-point information, an affine transformation can correct the subject region of a waybill picture in any orientation so that the text is upright, which yields the correction result. An affine transformation, also called an affine mapping, is a geometric transformation in which a vector space undergoes a linear transformation followed by a translation and is thereby mapped into another vector space. Upright text greatly reduces the difficulty of text detection and recognition and can significantly improve recognition accuracy without manual straightening.
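Written out for the two-dimensional image case, the definition above takes the following form. The symbols A, t, and the coefficient names a through f are notation introduced here for illustration; the disclosure does not give a formula.

```latex
\begin{pmatrix} x' \\ y' \end{pmatrix}
=
\underbrace{\begin{pmatrix} a & b \\ d & e \end{pmatrix}}_{A}
\begin{pmatrix} x \\ y \end{pmatrix}
+
\underbrace{\begin{pmatrix} c \\ f \end{pmatrix}}_{t}
```

The map has six unknown coefficients, so three non-collinear point correspondences, each contributing two equations, are enough to determine it.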

Optionally, correcting the target image based on the vertex coordinates and their sorting order to obtain a correction result further includes: determining target coordinates among the vertex coordinates according to the sorting order, wherein the target coordinates are the starting-point coordinates among the vertex coordinates; and correcting the object to be recognized based on the target coordinates and the vertex coordinates to obtain the correction result.

The target coordinates among the vertices may be the starting-point coordinates, where the starting point is the top-left corner when the text is upright.

Applying an affine transformation based on the four obtained vertex coordinates (x1, y1), (x2, y2), (x3, y3), (x4, y4) and the starting point (x1, y1) corrects the subject region of a waybill picture in any orientation into a picture with upright text, further improving the recognition accuracy for express objects.
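As a hedged illustration of this step, the sketch below solves for the six affine coefficients from three of the four ordered vertices (top-left, top-right, bottom-left) and maps them onto an upright rectangle. The disclosure does not say how the transformation is computed; the use of Cramer's rule, the clockwise vertex order, the output size, and all function names are assumptions made here. Because an affine map is fixed by three correspondences, a strongly distorted quadrilateral may need a perspective transform instead, which this sketch does not cover.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def det3(m: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def solve_affine(src: List[Point], dst: List[Point]) -> Tuple[float, ...]:
    """Return (a, b, c, d, e, f) with x' = a*x + b*y + c and y' = d*x + e*y + f,
    computed from three point correspondences using Cramer's rule."""
    m = [[x, y, 1.0] for x, y in src]
    det = det3(m)
    if abs(det) < 1e-12:
        raise ValueError("source points are collinear")
    coeffs = []
    for k in (0, 1):  # k = 0 solves the x' row, k = 1 the y' row
        rhs = [p[k] for p in dst]
        for col in range(3):
            mk = [row[:] for row in m]
            for r in range(3):
                mk[r][col] = rhs[r]
            coeffs.append(det3(mk) / det)
    return tuple(coeffs)


def apply_affine(coeffs: Sequence[float], point: Point) -> Point:
    a, b, c, d, e, f = coeffs
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


def upright_transform(vertices: List[Point], width: float, height: float):
    """vertices: four points ordered clockwise, starting at the text's top-left."""
    top_left, top_right, _, bottom_left = vertices
    return solve_affine(
        [top_left, top_right, bottom_left],
        [(0.0, 0.0), (width, 0.0), (0.0, height)],
    )
```

For a waybill photographed upside-down, the resulting map is a 180-degree rotation plus a translation, so the fourth vertex lands on the bottom-right corner of the upright rectangle.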

Optionally, detecting the target image to obtain the target pixel data includes: detecting the target image with a detection model to obtain the target pixel data.

In an optional embodiment, the detection model can be used to detect the pixels of the object to be recognized in the target image to obtain the target pixel data.

In an optional embodiment, the detection model can detect multiple target images at the same time, which can greatly improve detection efficiency.

In another optional example, the detection model can also detect multiple parcels in a single target image at the same time, further improving the efficiency of detecting the target image.

Optionally, an original sample is acquired, wherein the original sample includes a sample image and sample coordinates corresponding to the sample image, the sample coordinates being the vertex coordinates of the object to be recognized in the sample image; sample pixel data is determined based on the sample image and the sample coordinates, wherein the sample pixel data represents the positional relationship between the pixels in the sample image and the object to be recognized; training data is determined based on the sample pixel data and the sample image; and an initial model is trained based on the training data to obtain the detection model.

The sample image may be an image of an express parcel bearing the object to be recognized, wherein the object in the sample image may be upright, upside-down, tilted, or even distorted.

The sample coordinates may be the vertex coordinates of the object to be recognized in the sample image.

The sample pixel data may be the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized.

The training data is the data used to train the initial model.

In an optional embodiment, an original sample can be acquired, which may contain multiple sample images and the sample coordinates corresponding to each of them. The sample coordinates may be manually annotated: after the sample images are acquired, an annotator can label coordinates on each sample image to obtain its sample coordinates, and the original sample is then generated from the sample images and their corresponding sample coordinates.

Further, a logical calculation can be performed on at least one pixel in the sample image and the sample coordinates to determine the sample pixel data. To reduce computation, the calculation can be restricted to the pixels of a target region in the sample image. The training data can be determined from multiple sample images and their corresponding sample pixel data. Optionally, each sample image and its sample pixel data are combined into a sample pair; multiple sample pairs are collected to generate the training data, and the initial model is trained on the training data to obtain the detection model.

Further, the detection model can be used to detect the target image and obtain the target pixel data of the object to be recognized in it. Because the target pixel data contains the positional relationship between pixels and coordinates, the pixels of the object to be recognized can be identified and a logical calculation performed on them according to the target pixel data, yielding the sorting order of the vertex coordinates corresponding to those pixels and the starting-point coordinates among the vertex coordinates. Based on the sorting order and the starting-point coordinates, the object to be recognized can be corrected so that it is displayed upright, which facilitates its recognition and can greatly improve the efficiency of detecting the target image.
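One plausible reading of this "logical calculation" is sketched below: each detected pixel carries four predicted offsets to the vertices, and the vertex positions are recovered by adding the offsets to the pixel position and averaging over pixels. The averaging rule, the offset convention (vertex minus pixel), the assumption that the first offset always refers to the starting vertex, and the name `decode_vertices` are all assumptions for illustration; the disclosure does not specify them.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def decode_vertices(predictions: Dict[Point, List[Point]]) -> List[Point]:
    """predictions maps a pixel (x, y) to four (dx, dy) offsets, one per vertex.
    Returns the four vertices, averaged over all pixels, in offset order."""
    if not predictions:
        raise ValueError("no pixels were detected")
    sums = [[0.0, 0.0] for _ in range(4)]
    for (px, py), offsets in predictions.items():
        for i, (dx, dy) in enumerate(offsets):
            # Pixel position plus predicted offset gives one vote for vertex i.
            sums[i][0] += px + dx
            sums[i][1] += py + dy
    count = len(predictions)
    return [(sx / count, sy / count) for sx, sy in sums]
```

Under these assumptions the order of the returned list already encodes both the sorting order and the starting point, since index 0 is the starting vertex.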

Optionally, determining the sample pixel data based on the sample image and the sample coordinates includes: acquiring a target region of the object to be recognized in the sample image; and obtaining the differences between the pixels in the target region and the sample coordinates to determine the sample pixel data.

In an optional embodiment, the target region may be the central region of the object to be recognized. The central region of the object in the sample image is acquired, and the differences between the pixels of the central region and the sample coordinates are obtained. Optionally, the difference between the coordinates at which a pixel is located and the sample coordinates is determined; this difference gives the positional relationship between the pixel and the sample coordinates, from which the sample pixel data can be determined.
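The difference computation can be illustrated with a short sketch. It assumes the difference is taken as vertex coordinate minus pixel coordinate, which is a sign convention chosen here and not stated in the disclosure; the function names are likewise hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float]


def vertex_offsets(pixel: Point, vertices: List[Point]) -> List[Point]:
    """Differences between each annotated vertex and one pixel position."""
    px, py = pixel
    return [(vx - px, vy - py) for vx, vy in vertices]


def sample_pixel_data(
    region_pixels: Iterable[Point], vertices: List[Point]
) -> Dict[Point, List[Point]]:
    """Offsets to the four vertices for every pixel in the target region."""
    return {pixel: vertex_offsets(pixel, vertices) for pixel in region_pixels}
```

Each pixel therefore contributes eight numbers, one horizontal and one vertical difference per vertex, which could serve as the regression target for that pixel.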

The target region may be a region of the object to be recognized in which the text is relatively large, a region in which the text is relatively clear, the central region of the object, or a region that carries a lot of information. In an optional embodiment, the target region may be treated as a central Gaussian distribution region, where the Gaussian distribution is the normal distribution.

Fig. 6a is a sample image according to the present disclosure, and Fig. 6b shows the central Gaussian distribution region of that sample image. In the present disclosure, only the pixels in the Gaussian distribution region of the object to be recognized are extracted from the sample image, and the differences between the coordinates of those pixels and the horizontal and vertical coordinates of the four vertices in the sample coordinates are computed to determine the sample pixel data. This greatly reduces the number of candidate positive samples that are output, thereby reducing computation and improving detection performance.
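A minimal sketch of selecting such a central region follows. It assumes an axis-aligned two-dimensional Gaussian centred on the object and a fixed weight threshold; the centre, the standard deviations, the threshold value of 0.5, and the function names are all illustrative choices, since the disclosure does not state how the region is parameterised.

```python
import math
from typing import List, Tuple

Pixel = Tuple[int, int]


def gaussian_weight(
    pixel: Pixel, center: Tuple[float, float], sigma_x: float, sigma_y: float
) -> float:
    """Unnormalised 2D Gaussian weight, equal to 1.0 at the center."""
    dx = pixel[0] - center[0]
    dy = pixel[1] - center[1]
    return math.exp(-(dx * dx / (2 * sigma_x ** 2) + dy * dy / (2 * sigma_y ** 2)))


def central_region(
    width: int,
    height: int,
    center: Tuple[float, float],
    sigma_x: float,
    sigma_y: float,
    threshold: float = 0.5,
) -> List[Pixel]:
    """Pixels whose Gaussian weight reaches the threshold (assumed criterion)."""
    return [
        (x, y)
        for y in range(height)
        for x in range(width)
        if gaussian_weight((x, y), center, sigma_x, sigma_y) >= threshold
    ]
```

Only the pixels returned by `central_region` would then be used as positive samples, which is how restricting attention to the central region shrinks the candidate set.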

In the present disclosure, a target image that includes an object to be recognized is acquired; the target image is detected to obtain target pixel data, which represents the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object; and the target image is corrected based on the target pixel data to obtain a correction result. When the object to be recognized in the target image is detected, pixel-level direction supervision can be introduced: detecting the target image with the detection model yields the positional relationship between the pixels of the object and its vertex coordinates, and processing the target image with the position information in the target pixel data produces a more accurate object image, thereby improving the accuracy of the obtained object image.

Fig. 6c is a flowchart of an image processing method according to a third embodiment of the present disclosure. As shown in Fig. 6c, the method includes the following steps:

Step S601: acquire an original sample, wherein the original sample includes a sample image and sample coordinates corresponding to the sample image, the sample coordinates being the vertex coordinates of the object to be recognized in the sample image.

Step S602: determine sample pixel data based on the sample image and the sample coordinates, wherein the sample pixel data represents the positional relationship between the pixels in the sample image and the object to be recognized.

Step S603: determine training data based on the sample pixel data and the sample image.

Step S604: train an initial model based on the training data to obtain a detection model.

Step S605: acquire a target image, wherein the target image includes the object to be recognized.

Step S606: detect the target image to obtain target pixel data, wherein the target pixel data represents the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized.

Step S607: correct the target image based on the target pixel data to obtain a correction result.

Step S608: recognize the target image based on the correction result to obtain a recognition result, wherein the recognition result represents the text information of the object to be recognized in the target image.

In the technical solutions of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of any user personal information involved comply with relevant laws and regulations and do not violate public order and good customs.

From the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the essence of the technical solutions of the present disclosure, or the part that contributes to the prior art, can be embodied as a software product. The computer software product is stored in a storage medium and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods of the embodiments of the present disclosure.

The present disclosure also provides an image processing apparatus for implementing the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may refer to a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

Fig. 7 is a structural block diagram of an image processing apparatus according to one embodiment of the present disclosure. As shown in Fig. 7, the image processing apparatus 700 includes an acquisition module 701, a detection module 702, and a correction module 703.

The acquisition module 701 is configured to acquire a target image, wherein the target image includes an object to be recognized.

The detection module 702 is configured to detect the target image to obtain target pixel data, wherein the target pixel data represents the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized.

The correction module 703 is configured to correct the target image based on the target pixel data to obtain a correction result.

Optionally, the correction module 703 includes: a first determining unit configured to determine, based on the target pixel data, the vertex coordinates of the object to be recognized and the sorting order of the vertex coordinates; and a correction unit configured to correct the target image based on the vertex coordinates and their sorting order to obtain the correction result.

Optionally, the correction unit includes: a determining subunit configured to determine target coordinates among the vertex coordinates according to the sorting order, wherein the target coordinates are the starting-point coordinates among the vertex coordinates; and a correction subunit configured to correct the object to be recognized based on the target coordinates and the vertex coordinates to obtain the correction result.

Optionally, the detection module includes: a detection unit configured to detect the target image with the detection model to obtain the target pixel data.

Optionally, the detection module includes: an acquisition unit configured to acquire an original sample, wherein the original sample includes a sample image and sample coordinates corresponding to the sample image, the sample coordinates being the vertex coordinates of the object to be recognized in the sample image; and a second determining unit configured to determine sample pixel data based on the sample image and the sample coordinates, wherein the sample pixel data represents the positional relationship between the pixels in the sample image and the object to be recognized. The second determining unit is further configured to determine training data based on the sample pixel data and the sample image, and to train an initial model based on the training data to obtain the detection model.

Optionally, the second determining unit includes: an acquisition subunit configured to acquire the target region of the object to be recognized in the sample image, and further configured to obtain the differences between the pixels in the target region and the sample coordinates to determine the sample pixel data.

Optionally, the apparatus further includes: a recognition module configured to recognize the target image based on the correction result to obtain a recognition result, wherein the recognition result represents the text information of the object to be recognized in the target image.

It should be noted that each of the above modules can be implemented in software or hardware. For hardware, this can be done in the following ways, without being limited to them: the modules are all located in the same processor; or the modules are distributed across different processors in any combination.

According to an embodiment of the present disclosure, the present disclosure also provides an electronic device including a memory and at least one processor, wherein the memory stores computer instructions and the processor is configured to run the computer instructions to perform the steps of any one of the above method embodiments.

Optionally, the electronic device may further include a transmission device and an input/output device, each of which is connected to the processor.

Optionally, in the present disclosure, the processor may be configured to perform the following steps through a computer program:

S1: acquire a target image, wherein the target image includes an object to be recognized;

S2: detect the target image to obtain target pixel data, wherein the target pixel data represents the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized;

S3: correct the target image based on the target pixel data to obtain a correction result.

Optionally, for specific examples of this embodiment, refer to the examples described in the above embodiments and optional implementations; they are not repeated here.

According to an embodiment of the present disclosure, the present disclosure also provides a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are configured to perform, when run, the steps of any one of the above method embodiments.

Optionally, in this embodiment, the non-transitory computer-readable storage medium may be configured to store a computer program for performing the steps of the method embodiments described above.

Optionally, in this embodiment, the non-transitory computer-readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of these. More specific examples of readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of these.

According to an embodiment of the present disclosure, the present disclosure also provides a computer program product. Program code for implementing the image processing method of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, so that when the program code is executed by the processor or controller, the functions and operations specified in the flowcharts and/or block diagrams are carried out. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on a remote machine or server.

In the above embodiments of the present disclosure, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.

In the several embodiments provided in the present disclosure, it should be understood that the disclosed technical content can be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, units, or modules, and may be electrical or take other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the essence of the technical solutions of the present disclosure, or the part that contributes to the prior art, or all or part of the technical solutions, can be embodied as a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present disclosure. The storage medium includes any medium that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.

The above are only preferred embodiments of the present disclosure. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present disclosure, and these improvements and refinements shall also be regarded as falling within the protection scope of the present disclosure.

Claims

1. An image processing method, comprising:

acquiring a target image, wherein the target image includes an object to be recognized;

detecting the target image to obtain target pixel data, wherein the target pixel data is used to represent the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized; and

correcting the target image based on the target pixel data to obtain a correction result.

2. The method according to claim 1, wherein correcting the target image based on the target pixel data to obtain the correction result comprises:

determining the vertex coordinates of the object to be recognized and the sorting order of the vertex coordinates based on the target pixel data; and

correcting the target image based on the vertex coordinates and the sorting order of the vertex coordinates to obtain the correction result.
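As a non-limiting illustration of the determining step of claim 2, the following Python sketch recovers four ordered vertices from per-pixel predictions. The claims do not fix a data layout, so everything here is an assumption: the target pixel data is taken to be an (H, W, 8) array whose channels 2k and 2k+1 hold the (dx, dy) offset from a pixel to vertex k, the channel index is taken to encode the sorting order, and the name `decode_vertices` and the averaging of per-pixel votes are hypothetical choices.

```python
import numpy as np


def decode_vertices(offsets, mask):
    """Estimate four ordered vertices from per-pixel offset predictions.

    offsets: float array (H, W, 8); channels 2k and 2k+1 are assumed to hold
             the (dx, dy) offset from a pixel to vertex k.
    mask:    bool array (H, W); True for pixels inside the object.
    Returns a (4, 2) array of (x, y) vertices in channel order.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask selects no pixels")
    # Pixel coordinates as (x, y) pairs, shape (N, 2).
    pixels = np.stack([xs, ys], axis=1).astype(np.float64)
    # Each pixel votes for every vertex: pixel position plus predicted offset.
    votes = pixels[:, None, :] + offsets[ys, xs].reshape(-1, 4, 2)
    # Average the votes of all object pixels to get one estimate per vertex.
    return votes.mean(axis=0)
```

Under this assumed layout, the first row of the returned array would serve as the starting point referred to in claim 3.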

3. The method according to claim 2, wherein correcting the target image based on the vertex coordinates and the sorting order of the vertex coordinates to obtain the correction result comprises:

determining starting point coordinates among the vertex coordinates according to the sorting order of the vertex coordinates; and

correcting the object to be recognized based on the starting point coordinates and the vertex coordinates to obtain the correction result.
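As a non-limiting illustration of the correction of claim 3, the following Python sketch maps a quadrilateral, given as four vertices with the starting point first, onto an upright rectangle by a perspective transform. The claims do not name a specific transform, so this is a sketch under stated assumptions: the vertices are assumed to follow the order top-left, top-right, bottom-right, bottom-left of the upright object, the output size is chosen by the caller, sampling is nearest-neighbour, and the names `homography` and `rectify` are hypothetical.

```python
import numpy as np


def homography(src, dst):
    """Solve the 3x3 matrix (last entry fixed to 1) mapping four src points to dst."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=np.float64),
                        np.array(rhs, dtype=np.float64))
    return np.append(h, 1.0).reshape(3, 3)


def rectify(image, vertices, out_w, out_h):
    """Warp the quadrilateral `vertices` into an upright out_h x out_w image.

    vertices: four (x, y) points, starting point first; assumed order is
              top-left, top-right, bottom-right, bottom-left of the object.
    """
    target = [(0, 0), (out_w - 1, 0), (out_w - 1, out_h - 1), (0, out_h - 1)]
    # Inverse warping: map every output pixel back into the source image.
    h_inv = homography(target, vertices)
    us, vs = np.meshgrid(np.arange(out_w), np.arange(out_h))
    pts = np.stack([us.ravel(), vs.ravel(), np.ones(us.size)])
    sx, sy, sw = h_inv @ pts
    # Nearest-neighbour sampling, clamped to the image bounds.
    xs = np.clip(np.rint(sx / sw).astype(int), 0, image.shape[1] - 1)
    ys = np.clip(np.rint(sy / sw).astype(int), 0, image.shape[0] - 1)
    return image[ys, xs].reshape(out_h, out_w, *image.shape[2:])
```

Because the starting point fixes which vertex lands at the top-left corner of the output, an upside-down or rotated object comes out upright without a separate orientation step. A practical system could equally use a library routine such as OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective` in place of the hand-written solver.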

4. The method of claim 1, wherein the method further comprises:

acquiring an original sample, wherein the original sample includes a sample image and sample coordinates corresponding to the sample image, the sample coordinates being vertex coordinates of the object to be recognized in the sample image;

determining sample pixel data based on the sample image and the sample coordinates, wherein the sample pixel data is used to represent the positional relationship between the pixels in the sample image and the object to be recognized;

determining training data based on the sample pixel data and the sample image;

training an initial model based on the training data to obtain a detection model;

wherein detecting the target image to obtain the target pixel data comprises:

detecting the target image by using the detection model to obtain the target pixel data.

5. The method of claim 4, wherein the determining sample pixel data based on the sample image and the sample coordinates comprises:

acquiring a target area of the object to be recognized in the sample image; and

acquiring the difference between the pixels in the target area and the sample coordinates to determine the sample pixel data.
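As a non-limiting illustration of claim 5, the following Python sketch builds sample pixel data for one annotated object: it marks the target area enclosed by the sample coordinates and stores, for every pixel in that area, its coordinate difference to each vertex. Several details are assumptions not stated in the claims: the object is treated as a convex quadrilateral, the difference is taken as vertex minus pixel, pixels outside the target area receive zeros, and the name `sample_pixel_data` is hypothetical.

```python
import numpy as np


def sample_pixel_data(height, width, vertices):
    """Build per-pixel offset labels for one annotated object.

    vertices: (4, 2) array-like of (x, y) sample coordinates, listed in a
              consistent order around a convex quadrilateral.
    Returns (mask, offsets): mask is an (H, W) bool array marking the target
    area; offsets is (H, W, 8), holding (vertex - pixel) differences inside
    the target area and zeros elsewhere.
    """
    v = np.asarray(vertices, dtype=np.float64)
    ys, xs = np.mgrid[0:height, 0:width]
    # A pixel is inside a convex polygon when it lies on the same side of
    # every edge, i.e. all edge cross products share one sign.
    crosses = []
    for k in range(4):
        (x0, y0), (x1, y1) = v[k], v[(k + 1) % 4]
        crosses.append((x1 - x0) * (ys - y0) - (y1 - y0) * (xs - x0))
    crosses = np.stack(crosses)
    mask = np.all(crosses >= 0, axis=0) | np.all(crosses <= 0, axis=0)
    # Difference between each vertex and each pixel, shape (H, W, 4, 2).
    pixels = np.stack([xs, ys], axis=-1).astype(np.float64)
    offsets = v[None, None, :, :] - pixels[:, :, None, :]
    # Flatten to 8 channels and zero out pixels outside the target area.
    offsets = offsets.reshape(height, width, 8) * mask[..., None]
    return mask, offsets
```

The mask and offsets, paired with the sample image, would then form one training example for the detection model of claim 4.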

6. The method of claim 1, wherein the method further comprises:

recognizing the target image based on the correction result to obtain a recognition result, wherein the recognition result is used to represent text information of the object to be recognized in the target image.

7. An image processing device, comprising:

an acquisition module for acquiring a target image, wherein the target image includes an object to be recognized;

a detection module, configured to detect the target image to obtain target pixel data, wherein the target pixel data is used to represent the positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized; and

a correction module, configured to correct the target image based on the target pixel data to obtain a correction result.

8. The device of claim 7, wherein the correction module comprises:

a first determining unit, configured to determine the vertex coordinates of the object to be recognized and the sorting order of the vertex coordinates based on the target pixel data; and

a correction unit, configured to correct the target image based on the vertex coordinates and the sorting order of the vertex coordinates to obtain the correction result.

9. The device according to claim 8, wherein the correction unit comprises:

a determination subunit, configured to determine target coordinates among the vertex coordinates according to the sorting order of the vertex coordinates, wherein the target coordinates are the starting point coordinates among the vertex coordinates; and

a correction subunit, configured to correct the object to be recognized based on the target coordinates and the vertex coordinates to obtain the correction result.

10. The device according to claim 7, wherein the detection module comprises:

an obtaining unit, configured to obtain an original sample, wherein the original sample includes a sample image and sample coordinates corresponding to the sample image, the sample coordinates being vertex coordinates of the object to be recognized in the sample image;

a second determining unit, configured to determine sample pixel data based on the sample image and the sample coordinates, wherein the sample pixel data is used to represent the positional relationship between the pixels in the sample image and the object to be recognized;

wherein the second determining unit is further configured to determine training data based on the sample pixel data and the sample image;

and the second determining unit is further configured to train an initial model based on the training data to obtain a detection model;

wherein the detection module further comprises:

a detection unit, configured to detect the target image by using the detection model to obtain the target pixel data.

11. The device according to claim 10, wherein the second determining unit comprises:

an acquisition subunit, configured to acquire a target area of the object to be recognized in the sample image;

wherein the acquisition subunit is further configured to acquire the difference between the pixels in the target area and the sample coordinates to determine the sample pixel data.

12. The device of claim 7, wherein the device further comprises:

a recognition module, configured to recognize the target image based on the correction result to obtain a recognition result, wherein the recognition result is used to represent text information of the object to be recognized in the target image.

13. An electronic device comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.

14. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-6.

15. A computer program product comprising a computer program which, when executed by a processor, implements the method of any of claims 1-6.