Disclosure of Invention
In order to solve, or at least partially solve, the above technical problems, the present application provides a text image stitching method, a text image stitching apparatus, an electronic device, and a storage medium.
According to an aspect of the embodiment of the present application, there is provided a text image stitching method, including:
acquiring an initial text image to be stitched, and performing grayscale processing on the initial text image to obtain a grayscale image corresponding to the initial text image;
acquiring a target gradient image corresponding to the initial text image and a target projection image corresponding to the grayscale image;
performing correlation matching on the grayscale image, the target gradient image, and the target projection image with the initial text image respectively, to obtain target pixel points in the grayscale image, the target gradient image, and the target projection image that are respectively matched with the initial text image;
and performing stitching based on the target pixel points to obtain a target text image.
Further, the obtaining the target gradient image corresponding to the initial text image includes:
acquiring a first gradient image of the initial text image in the ordinate direction and a second gradient image of the initial text image in the abscissa direction;
the first gradient image and the second gradient image are determined as the target gradient image.
Further, the obtaining the target projection image corresponding to the grayscale image includes:
determining a projection image of the grayscale image in the abscissa direction as the target projection image.
Further, the performing correlation matching on the grayscale image, the target gradient image, and the target projection image with the initial text image, to obtain target pixel points in the grayscale image, the target gradient image, and the target projection image that are respectively matched with the initial text image, includes:
acquiring a first local image set extracted from the grayscale image, the target gradient image, and the target projection image according to a plurality of different preset lengths, wherein the first local image set includes the local images extracted from the grayscale image, the target gradient image, and the target projection image at each preset length;
acquiring a second local image set extracted from the initial text image according to the plurality of different preset lengths, wherein the second local image set includes the local images extracted from the initial text image at each preset length;
and comparing the correlation between the local images in the first local image set and the local images in the second local image set at each preset length to obtain the target pixel points.
Further, the acquiring the first local image set extracted from the grayscale image, the target gradient image, and the target projection image according to a plurality of different preset lengths includes:
selecting a first local image from each of the grayscale image, the target gradient image, and the target projection image according to a first preset length;
calculating a sum value between the first preset length and a plurality of step sizes arranged in a gradient manner, and determining the sum value as a second preset length;
selecting a second local image from each of the grayscale image, the target gradient image, and the target projection image according to the second preset length;
and obtaining the first local image set based on the first local images and the second local images.
Further, the acquiring the second local image set extracted from the initial text image according to a plurality of different preset lengths includes:
selecting a third local image from the initial text image according to the first preset length;
calculating a sum value between the first preset length and a plurality of step sizes arranged in a gradient manner, and determining the sum value as a second preset length;
selecting a fourth local image from the initial text image according to the second preset length;
and obtaining the second local image set based on the third local image and the fourth local image.
Further, the comparing the correlation between the local images in the first local image set and the local images in the second local image set at each preset length to obtain the target pixel points includes:
comparing the correlation between the local images in the first local image set and the local images in the second local image set at each preset length to obtain candidate pixel points in each image at each preset length;
acquiring confidence levels respectively corresponding to the grayscale image, the target gradient image, the target projection image, and the initial text image;
and calculating frequency data corresponding to each candidate pixel point based on the confidence levels, and determining the candidate pixel point with the highest frequency data as the target pixel point.
According to another aspect of the embodiments of the present application, there is also provided a text image stitching apparatus, including:
a first acquisition module, configured to acquire an initial text image to be stitched, and perform grayscale processing on the initial text image to obtain a grayscale image corresponding to the initial text image;
a second acquisition module, configured to acquire a target gradient image corresponding to the initial text image, and acquire a projection image corresponding to the grayscale image;
a processing module, configured to perform image correlation matching on the grayscale image, the target gradient image, and the projection image with the initial text image respectively, to obtain target pixel points in the grayscale image, the target gradient image, and the projection image that are respectively matched with the initial text image;
and a stitching module, configured to perform stitching based on the target pixel points to obtain a target text image.
According to another aspect of the embodiments of the present application, there is also provided a storage medium including a stored program that, when run, performs the steps of the above method.
According to another aspect of the embodiments of the present application, there is also provided an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with each other through the communication bus; the memory is configured to store a computer program, and the processor is configured to execute the steps of the above method by running the program stored in the memory.
Embodiments of the present application also provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of the above method.
Compared with the prior art, the technical solution provided by the embodiments of the present application has the following advantages: the method performs correlation matching of the grayscale image, the gradient images, and the projection image with the initial text image respectively, and performs image stitching based on the matched target pixel points. Because the correlation matching exploits the different features carried by the various images, a sufficient number of matching pixel points can be captured accurately, enabling accurate stitching against complex backgrounds. Meanwhile, compared with conventional image stitching methods, the method requires little computation and achieves high stitching efficiency.
Detailed Description
For the purpose of making the objects, technical solutions, and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are some, but not all, embodiments of the present application; the illustrative embodiments and their descriptions are used to explain the present application and do not constitute undue limitations on it. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort fall within the scope of the present application.
It should be noted that, in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another similar entity or action, and do not necessarily require or imply any actual such relationship or order between those entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element preceded by "comprising a ..." does not exclude the presence of additional like elements in the process, method, article, or apparatus that comprises the element.
The embodiments of the present application provide a text image stitching method, a text image stitching apparatus, an electronic device, and a storage medium. The method provided by the embodiments can be applied to any electronic device that requires it, for example a server or a terminal; this is not specifically limited here, and for convenience of description the device is hereinafter referred to simply as the electronic device.
According to an aspect of the embodiments of the present application, an embodiment of a text image stitching method is provided. Fig. 1 is a flowchart of the method provided by the embodiment of the present application; as shown in Fig. 1, the method includes:
Step S11: acquiring an initial text image to be stitched, and performing grayscale processing on the initial text image to obtain a grayscale image corresponding to the initial text image.
The method provided by the embodiment of the present application is applied to an intelligent terminal, which may be a device such as a smartphone, a notebook computer, or a tablet computer. Specifically, the intelligent terminal may obtain the initial text image to be stitched when a requester device sends the initial text image to the intelligent terminal according to its own processing requirements, or when a user uploads the initial text image directly on the intelligent terminal.
In the embodiment of the present application, after obtaining the initial text image, the intelligent terminal performs grayscale processing on it to obtain the corresponding grayscale image, denoted as grayscale image P1.
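As an illustration of this graying step, the following is a minimal sketch assuming ITU-R BT.601 luminance weights; the embodiment does not specify the exact graying formula, so the weights here are an assumption:

```python
import numpy as np

def to_grayscale(image_rgb):
    """Convert an H x W x 3 RGB text image to the grayscale image P1.

    Assumes ITU-R BT.601 luminance weights (0.299, 0.587, 0.114);
    the embodiment does not fix a particular graying formula.
    """
    weights = np.array([0.299, 0.587, 0.114])
    # Weighted sum over the channel axis yields an H x W grayscale array.
    return image_rgb.astype(np.float64) @ weights
```

Usage: `p1 = to_grayscale(initial_text_image)` produces the grayscale image used by the subsequent steps.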
Step S12: acquiring a target gradient image corresponding to the initial text image and a target projection image corresponding to the grayscale image.
In the embodiment of the present application, acquiring the target gradient image corresponding to the initial text image includes the following steps A1-A2:
Step A1: acquiring a first gradient image of the initial text image in the ordinate direction and a second gradient image of the initial text image in the abscissa direction.
Step A2: determining the first gradient image and the second gradient image as the target gradient images.
In the embodiment of the present application, the initial text image can be regarded as a two-dimensional discrete function. The first gradient image in the ordinate direction is obtained by differentiating this function along the ordinate of the pixel points in the initial text image; similarly, differentiating along the abscissa of the pixel points yields the second gradient image in the abscissa direction. The first gradient image and the second gradient image are taken as the target gradient images; the first gradient image is denoted as P2 and the second gradient image as P3.
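Treating the image as a two-dimensional discrete function, the two gradient images can be sketched with central differences. `numpy.gradient` is assumed here purely for illustration, since the embodiment does not fix a particular discrete gradient operator:

```python
import numpy as np

def target_gradient_images(text_image):
    """Compute the first gradient image P2 (ordinate direction) and the
    second gradient image P3 (abscissa direction) of a text image.

    The image is treated as a 2-D discrete function; central differences
    via numpy.gradient are an assumed realization, not mandated by the
    embodiment.
    """
    f = text_image.astype(np.float64)
    # numpy.gradient returns derivatives along axis 0 (rows/ordinate)
    # and axis 1 (columns/abscissa) for a 2-D array.
    grad_y, grad_x = np.gradient(f)
    return grad_y, grad_x  # P2, P3
```

For a vertical ramp image the ordinate gradient is constant and the abscissa gradient vanishes, which is a quick sanity check on the axis convention.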
In the embodiment of the present application, acquiring the target projection image corresponding to the grayscale image includes determining the projection image of the grayscale image in the abscissa direction as the target projection image, denoted as P4.
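A common realization of such a projection, assumed here for illustration only, is the column-wise sum of pixel intensities (a vertical projection profile):

```python
import numpy as np

def target_projection_image(gray_image):
    """Project the grayscale image P1 onto the abscissa (column)
    direction, yielding the projection image P4.

    Column-wise intensity summation is an assumed realization; the
    embodiment only requires a projection in the abscissa direction.
    """
    # Sum over axis 0 collapses rows, leaving one value per column.
    return gray_image.astype(np.float64).sum(axis=0)
```

The result is a 1-D profile whose shape depends only on the image width, which is what makes it robust to small vertical text deformation.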
The grayscale image retains the information of the original image well; the gradient image in the abscissa direction highlights features in the abscissa direction, and the gradient image in the ordinate direction highlights features in the ordinate direction, removing noise while emphasizing the text features, which makes image matching more accurate. The projection image reflects the statistical characteristics of the image pixels and is robust to text deformation, so it reduces matching errors caused by text deformation due to picture distortion and reduces matching misjudgments caused by multiple repeated overlapping regions. Therefore, by acquiring these four kinds of images, the embodiment of the present application provides a reliable basis for the subsequent correlation matching.
Step S13: performing correlation matching on the grayscale image, the target gradient image, and the target projection image with the initial text image respectively, to obtain target pixel points in the grayscale image, the target gradient image, and the target projection image that are respectively matched with the initial text image.
In the embodiment of the present application, step S13 includes the following steps B1-B3:
Step B1: acquiring a first local image set extracted from the grayscale image, the target gradient image, and the target projection image according to a plurality of different preset lengths, wherein the first local image set includes the local images extracted from the grayscale image, the target gradient image, and the target projection image at each preset length.
In the embodiment of the present application, step B1, acquiring the first local image set extracted from the grayscale image, the target gradient image, and the target projection image according to a plurality of different preset lengths, includes the following steps B101-B104:
Step B101: selecting a first local image from each of the grayscale image, the target gradient image, and the target projection image according to a first preset length.
Step B102: calculating the sum values between the first preset length and a plurality of step sizes arranged in a gradient manner, and determining the sum values as second preset lengths.
Step B103: selecting a second local image from each of the grayscale image, the target gradient image, and the target projection image according to the second preset lengths.
Step B104: obtaining the first local image set based on the first local images and the second local images.
In the embodiment of the present application, first local images of length t are first selected from the left edges of the grayscale image P1, the first gradient image P2, the second gradient image P3, and the target projection image P4; the first local images include a local image Q1 of the grayscale image P1, a local image Q2 of the first gradient image, a local image Q3 of the second gradient image, and a local image Q4 of the target projection image.
Next, the sum values between the first preset length and the plurality of step sizes arranged in a gradient manner are calculated, such as t+s, t+2s, and so on; that is, the second preset lengths include t+s, t+2s, and so forth. Local images of length t+s and local images of length t+2s are then selected from the left edges of the grayscale image P1, the first gradient image P2, the second gradient image P3, and the target projection image P4, and these local images are taken as the second local images. The first local image set is obtained based on the first local images and the second local images.
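The multi-length strip extraction just described can be sketched as follows. This is a minimal illustration that assumes the local images are vertical strips taken from the left edge; the parameter names t (first preset length) and s (step size) follow the example above:

```python
import numpy as np

def left_edge_strips(image, t, s, n_scales):
    """Extract local images of widths t, t+s, t+2s, ... from the left
    edge of an image, sketching steps B101-B104.

    Applying this to each of P1, P2, P3, and P4 yields the first local
    image set; applying it to the initial text image yields the second.
    """
    strips = []
    for k in range(n_scales):
        width = t + k * s  # first preset length plus k step sizes
        strips.append(image[:, :width])  # strip from the left edge
    return strips
```

Usage: `left_edge_strips(p1, t=50, s=10, n_scales=3)` would return strips of widths 50, 60, and 70 pixels (the concrete values of t and s are hypothetical).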
Step B2: acquiring a second local image set extracted from the initial text image according to the plurality of different preset lengths, wherein the second local image set includes the local images extracted from the initial text image at each preset length.
In the embodiment of the present application, step B2, acquiring the second local image set extracted from the initial text image according to a plurality of different preset lengths, includes the following steps B201-B204:
Step B201: selecting a third local image from the initial text image according to the first preset length.
Step B202: calculating the sum values between the first preset length and the plurality of step sizes arranged in a gradient manner, and determining the sum values as second preset lengths.
Step B203: selecting a fourth local image from the initial text image according to the second preset lengths.
Step B204: obtaining the second local image set based on the third local image and the fourth local images.
In the embodiment of the present application, a third local image of length t is first selected from the left edge of the initial text image. Next, the sum values between the first preset length and the plurality of step sizes arranged in a gradient manner are calculated, such as t+s, t+2s, and so on; the second preset lengths include t+s and t+2s. A local image of length t+s and a local image of length t+2s are then selected from the left edge of the initial text image and taken as the fourth local images. The second local image set is obtained based on the third local image and the fourth local images.
Step B3: comparing the correlation between the local images in the first local image set and the local images in the second local image set at each preset length to obtain the target pixel points.
In the embodiment of the present application, step B3 includes the following steps B301-B303:
Step B301: comparing the correlation between the local images in the first local image set and the local images in the second local image set at each preset length to obtain candidate pixel points in each image at each preset length.
Step B302: acquiring the confidence levels respectively corresponding to the grayscale image, the target gradient image, the target projection image, and the initial text image.
Step B303: calculating the frequency data corresponding to each candidate pixel point based on the confidence levels, and determining the candidate pixel point with the highest frequency data as the target pixel point.
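The correlation comparison in step B301 requires a concrete correlation measure, which the embodiment does not fix. The sketch below assumes normalized cross-correlation (NCC) over sliding windows; the function name and its parameters are illustrative only:

```python
import numpy as np

def best_match_point(strip, template):
    """Slide a template local image across a larger strip and return the
    column offset with the highest normalized cross-correlation.

    NCC is an assumed correlation measure; the embodiment only requires
    "correlation matching". Returns (best offset, best score).
    """
    h, w = template.shape
    t = (template - template.mean()).ravel()
    t_norm = np.linalg.norm(t)
    best_off, best_score = 0, -np.inf
    for off in range(strip.shape[1] - w + 1):
        window = strip[:h, off:off + w]
        v = (window - window.mean()).ravel()
        denom = np.linalg.norm(v) * t_norm
        # NCC score in [-1, 1]; guard against flat (zero-variance) windows.
        score = float(v @ t) / denom if denom > 0 else 0.0
        if score > best_score:
            best_off, best_score = off, score
    return best_off, best_score
```

Running this once per image pair and per preset length produces the candidate pixel points described in the example below.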
As an example, the first local images in the first local image set are each compared for correlation with the corresponding third local image in the second local image set, and the pixel point with the greatest correlation in each of the local image Q1 of the grayscale image P1, the local image Q2 of the first gradient image, the local image Q3 of the second gradient image, and the local image Q4 of the target projection image is taken as a first candidate pixel point. The first candidate pixel points include a pixel point A1(XA1, YA1) in the local image Q1, a pixel point B1(XB1, YB1) in the local image Q2, a pixel point C1(XC1, YC1) in the local image Q3, and a pixel point D1(XD1, YD1) in the local image Q4.
Next, the local images of preset length t+s in the first local image set are each compared for correlation with the local image of preset length t+s in the second local image set, giving second candidate pixel points including a pixel point A2(XA2, YA2), a pixel point B2(XB2, YB2), a pixel point C2(XC2, YC2), and a pixel point D2(XD2, YD2).
The above process is repeated to obtain (XA3, YA3), (XB3, YB3), (XC3, YC3), (XD3, YD3), (XA4, YA4), (XB4, YB4), (XC4, YC4), (XD4, YD4), and so on.
For the resulting candidate pixel points (XA1, YA1), (XB1, YB1), (XC1, YC1), (XD1, YD1), (XA2, YA2), (XB2, YB2), (XC2, YC2), (XD2, YD2), and so on, the A, B, C, and D pixel points come respectively from matching against the grayscale image, the first gradient image, the second gradient image, and the projection image. Because the image quality of these four images differs, their confidence levels are set to a, b, c, and d respectively, and the weighted frequency of occurrence of each candidate pixel point is calculated according to these confidence levels. The point with the highest weighted frequency is the globally optimal matching point, i.e. the target pixel point; if there are few or no repeated matching points, the globally optimal point (target pixel point) is instead computed by weighted clustering.
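The confidence-weighted frequency selection described above can be sketched as follows. The labels 'A' through 'D' for the four image sources are hypothetical, and the weighted-clustering fallback for the no-repeat case is omitted from this sketch:

```python
from collections import Counter

def weighted_mode_point(candidates, confidences):
    """Pick the globally optimal matching point from candidate pixel points.

    `candidates` maps a source label ('A' grayscale, 'B'/'C' the two
    gradient images, 'D' projection image; labels are illustrative) to
    its list of candidate (x, y) points. `confidences` maps the same
    labels to the weights a, b, c, d. Each occurrence of a point is
    counted with the confidence of its source image, and the point with
    the highest weighted frequency wins.
    """
    freq = Counter()
    for label, points in candidates.items():
        for point in points:
            freq[point] += confidences[label]  # weighted occurrence count
    return max(freq, key=freq.get)
```

A point seen by several high-confidence images therefore outvotes a point seen only once, which is the intended tie-breaking behavior.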
The embodiment of the present application thus uses the grayscale image, the gradient image in the abscissa direction, the gradient image in the ordinate direction, and the projection image, and, according to the different confidence levels of these images, finds the best matching point by weighted frequency or by clustering to realize the stitching, which guarantees the accuracy and robustness of the result at a low computational cost.
Step S14: performing stitching based on the target pixel points to obtain a target text image.
In the embodiment of the present application, after the target pixel points are obtained, the text images can be stitched according to these optimal matching points.
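Under the assumption that the target pixel point determines the number of overlapping columns between two adjacent images, step S14 can be sketched as a simple overlap-and-concatenate. This is an illustration of the idea, not the embodiment's exact stitching procedure:

```python
import numpy as np

def stitch_at(left_image, right_image, overlap_cols):
    """Stitch two text images once the matched target pixel point fixes
    the overlap width.

    `overlap_cols` is assumed to be the number of overlapping columns
    implied by the target pixel point; the overlapping region is kept
    from the left image and the remainder appended from the right.
    """
    return np.hstack([left_image, right_image[:, overlap_cols:]])
```

Usage: if the target pixel point indicates that the first two columns of the right image repeat the last two columns of the left image, `stitch_at(left, right, overlap_cols=2)` yields the merged target text image.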
According to the method provided by the embodiment of the present application, the grayscale image, the gradient images, and the projection image are each matched for correlation with the initial text image, and image stitching is performed based on the matched target pixel points. Because the correlation matching exploits the different features carried by the various images, a sufficient number of matching pixel points can be captured accurately, enabling accurate stitching against complex backgrounds. Meanwhile, compared with conventional image stitching methods, the method requires little computation and achieves high stitching efficiency.
Fig. 4 is a block diagram of a text image stitching device according to an embodiment of the present application, where the text image stitching device may be implemented as part or all of an electronic device through software, hardware, or a combination of both. As shown in fig. 4, the apparatus includes:
the first acquisition module 41 is configured to acquire an initial text image to be stitched, and perform grayscale processing on the initial text image to obtain a grayscale image corresponding to the initial text image;
the second acquisition module 42 is configured to acquire a target gradient image corresponding to the initial text image, and acquire a projection image corresponding to the grayscale image;
the processing module 43 is configured to perform image correlation matching on the grayscale image, the target gradient image, and the projection image with the initial text image respectively, to obtain target pixel points in the grayscale image, the target gradient image, and the projection image that are respectively matched with the initial text image;
and the stitching module 44 is configured to perform stitching based on the target pixel points to obtain a target text image.
In the embodiment of the present application, the second acquisition module 42 is configured to acquire a first gradient image of the initial text image in the ordinate direction and a second gradient image of the initial text image in the abscissa direction, and to determine the first gradient image and the second gradient image as the target gradient images.
In the embodiment of the present application, the second acquisition module 42 is further configured to determine the projection image of the grayscale image in the abscissa direction as the target projection image.
In the embodiment of the present application, the processing module 43 is configured to: acquire a first local image set extracted from the grayscale image, the target gradient image, and the target projection image according to a plurality of different preset lengths, where the first local image set includes the local images extracted from the grayscale image, the target gradient image, and the target projection image at each preset length; acquire a second local image set extracted from the initial text image according to the plurality of different preset lengths, where the second local image set includes the local images extracted from the initial text image at each preset length; and compare the correlation between the local images in the first local image set and the local images in the second local image set at each preset length to obtain the target pixel points.
In the embodiment of the present application, the processing module 43 is configured to: select a first local image from each of the grayscale image, the target gradient image, and the target projection image according to a first preset length; calculate the sum value between the first preset length and a plurality of step sizes arranged in a gradient manner, and determine the sum value as a second preset length; select a second local image from each of the grayscale image, the target gradient image, and the target projection image according to the second preset length; and obtain the first local image set based on the first local images and the second local images.
In the embodiment of the present application, the processing module 43 is configured to: select a third local image from the initial text image according to the first preset length; calculate the sum value between the first preset length and a plurality of step sizes arranged in a gradient manner, and determine the sum value as a second preset length; select a fourth local image from the initial text image according to the second preset length; and obtain the second local image set based on the third local image and the fourth local image.
In the embodiment of the present application, the processing module 43 is configured to: compare the correlation between the local images in the first local image set and the local images in the second local image set at each preset length to obtain candidate pixel points in each image at each preset length; acquire the confidence levels respectively corresponding to the grayscale image, the target gradient image, the target projection image, and the initial text image; calculate the frequency data corresponding to each candidate pixel point based on the confidence levels; and determine the candidate pixel point with the highest frequency data as the target pixel point.
The embodiment of the application also provides an electronic device, as shown in fig. 5, where the electronic device may include a processor 1501, a communication interface 1502, a memory 1503 and a communication bus 1504, where the processor 1501, the communication interface 1502 and the memory 1503 complete communication with each other through the communication bus 1504.
A memory 1503 for storing a computer program;
the processor 1501 is configured to implement the steps of the above method embodiments when executing the computer program stored in the memory 1503.
The communication bus mentioned for the above terminal may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the terminal and other devices.
The memory may include a Random Access Memory (RAM), or may include a non-volatile memory, for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU) or a Network Processor (NP); it may also be a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
In yet another embodiment of the present application, a computer readable storage medium is provided, where instructions are stored, which when executed on a computer, cause the computer to perform the method for stitching text images according to any one of the above embodiments.
In yet another embodiment of the present application, a computer program product containing instructions that, when executed on a computer, cause the computer to perform the method for stitching text images according to any of the above embodiments is also provided.
In the above embodiments, the implementation may be realized in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another by wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device, such as a server or data center, integrating one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), and so on.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application are included in the protection scope of the present application.
The foregoing is only a specific embodiment of the application to enable those skilled in the art to understand or practice the application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.