CN117952860A - Mobile digital publishing method and system - Google Patents
Mobile digital publishing method and system
- Publication number
- CN117952860A CN117952860A CN202410355328.9A CN202410355328A CN117952860A CN 117952860 A CN117952860 A CN 117952860A CN 202410355328 A CN202410355328 A CN 202410355328A CN 117952860 A CN117952860 A CN 117952860A
- Authority
- CN
- China
- Prior art keywords
- suspected
- image
- probability
- processed
- gray
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/1444—Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/16—Image preprocessing
- G06V30/164—Noise filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/18—Extraction of features or characteristics of the image
- G06V30/1801—Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/19007—Matching; Proximity measures
- G06V30/19093—Proximity measures, i.e. similarity or distance measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
- G06T2207/10008—Still image; Photographic image from scanner, fax or copier
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention relates to the technical field of image data processing, in particular to a mobile digital publishing method and system, comprising the following steps: acquiring an initial image; acquiring a text region and an image to be processed from the initial image; obtaining the noise probability of each pixel point from the gray differences between pixel points of the image to be processed; acquiring first edges, suspected attachment areas and characteristic distances; obtaining the gradient halation probability of each suspected attachment area from the gradient amplitudes and characteristic distances of the pixel points on its first edge; obtaining the gray level halation probability of each suspected attachment area from the gray values of its pixel points; obtaining the halation probability from the gradient halation probability and the gray level halation probability; obtaining a replacement coefficient from the halation probability and the noise probability; obtaining a final image from the replacement coefficients; and obtaining a complete enhanced scan image from the final image. By denoising the scanned images produced when traditional publications are converted into mobile digital publications, the invention improves the accuracy of the scanning result.
Description
Technical Field
The invention relates to the technical field of image data processing, in particular to a mobile digital publishing method and system.
Background
With the continuous popularization of the Internet and intelligent digital devices, publications that exist as traditional paper documents and paintings are gradually being replaced by mobile digital publishing. After scanning and digitization, the information in traditional paper documents and paintings can be transmitted efficiently over the Internet without depending on the physical circulation of paper media, so that social and personal information is updated and spread faster and the cost of physical distribution is reduced.
Converting traditional paper documents and paintings into digital publications requires scanning technology: after the paper documents and paintings are scanned by a scanning device, the text in the documents is extracted with OCR technology, scanning deformation and distortion are corrected, and the processed document and image information is transmitted to the Internet for distribution. To capture as much of the information contained in the publications as possible, the scanning device uses a high resolution, so the scanning process is easily affected by halation and noise, which makes the scanning result inaccurate and reduces the quality of the mobile digital publication.
Disclosure of Invention
The invention provides a mobile digital publishing method and a system, which are used for solving the existing problems.
The invention relates to a mobile digital publishing method and a system, which adopt the following technical scheme:
One embodiment of the present invention provides a mobile digital publishing method comprising the steps of:
Acquiring an initial image of a scanned paper document; according to the initial image, acquiring a text region and the other regions by means of optical character recognition, and recording the other regions as the image to be processed;
in the image to be processed, obtaining the noise probability of each pixel point according to the gray difference between the pixel point and its adjacent pixel points;
recording the boundary of the text region as a strong edge, and recording the connected domain formed by any one strong edge as a strong edge connected domain; performing edge detection on the image to be processed to obtain a plurality of first edges and the suspected attachment areas of the strong edge connected domains; recording the nearest distance from each edge pixel point of a first edge in a suspected attachment area to the boundary of the strong edge connected domain adjacent to that area as a characteristic distance; obtaining the gradient halation probability of each suspected attachment area according to the gradient amplitudes and the characteristic distances of all the pixel points on the first edge of the suspected attachment area;
obtaining the gray level halation probability of each suspected attachment area according to the gray values of all the pixel points of the suspected attachment area;
obtaining the halation probability of each pixel point in a suspected attachment area according to the gradient halation probability and the gray level halation probability of the suspected attachment area;
obtaining the replacement coefficient of each pixel point in the image to be processed according to the halation probability of the pixel points in the suspected attachment areas and the noise probability of each pixel point in the image to be processed; obtaining the final image after denoising of the image to be processed according to the replacement coefficient of each pixel point;
and merging the final image after denoising of the image to be processed with the text region to obtain the complete enhanced scan image.
Further, in the image to be processed, the noise probability of each pixel point is obtained according to the gray difference between the pixel point and its adjacent pixel points, and is specifically calculated as:

$$Z_i=\mathrm{Norm}\left(\left|g_i-\frac{1}{n_i}\sum_{j=1}^{n_i}g_j\right|\right)$$

wherein $Z_i$ is the noise probability of the $i$-th pixel point in the image to be processed, $g_i$ is the gray value of the $i$-th pixel point, $g_j$ is the gray value of the $j$-th pixel point within the 8-neighborhood of the $i$-th pixel point, $n_i$ is the number of pixel points of the image to be processed within that 8-neighborhood, $\mathrm{Norm}(\cdot)$ is a linear normalization function, and $|\cdot|$ is the absolute value function.
Further, the edge detection is performed on the image to be processed to obtain a plurality of first edges and suspected attachment areas of the strong edge connected domain, and the method comprises the following specific steps:
each edge line segment obtained by edge detection of the image to be processed is marked as a first edge;
In the image to be processed, a closed area formed by any one first edge and the boundary of the image to be processed is marked as an initial suspected attachment area; the $m$-th initial suspected attachment area adjacent to the $u$-th strong edge connected domain is marked as the $m$-th suspected attachment area $D_{u,m}$ of the $u$-th strong edge connected domain.
Further, the gradient halation probability of each suspected attachment area is obtained according to the gradient amplitudes and the characteristic distances of all the pixel points on the first edge of the suspected attachment area, and is specifically calculated as:

$$G_{u,m}=\exp\left(-\frac{1}{N_{u,m}^{2}}\sum_{k=1}^{N_{u,m}}\sum_{j=1}^{N_{u,m}}\left|\frac{T_{k}}{d_{k}}-\frac{T_{j}}{d_{j}}\right|\right)$$

wherein $G_{u,m}$ is the gradient halation probability of $D_{u,m}$, $\exp(\cdot)$ is an exponential function with the natural constant as base, $N_{u,m}$ is the number of edge pixel points of the first edge of $D_{u,m}$, $T_k$ and $d_k$ are the gradient amplitude and the characteristic distance of the $k$-th edge pixel point of that first edge, $T_j$ and $d_j$ are the gradient amplitude and the characteristic distance of the $j$-th edge pixel point of that first edge, and $|\cdot|$ is the absolute value function;
$D_{u,m}$ is the $m$-th suspected attachment area of the $u$-th strong edge connected domain.
Further, the gray level halation probability of each suspected attachment area is obtained according to the gray level values of all the pixel points of each suspected attachment area, which comprises the following specific steps:
the product of the average value of the gray values of all the pixels in each suspected attachment area and the kurtosis of the gray values of all the pixels is recorded as a first gray characteristic of each suspected attachment area;
and obtaining the gray scale halation probability of each suspected attachment region according to the difference of the first gray scale characteristics of all the suspected attachment regions corresponding to each strong edge connected region.
Further, the gray level halation probability of each suspected attachment area is obtained according to the differences between the first gray features of all the suspected attachment areas corresponding to each strong edge connected domain, and is specifically calculated as:

$$H_{u,m}=\exp\left(-\frac{1}{C_{u}}\sum_{c=1}^{C_{u}}\left|\mu_{u,m}K_{u,m}-\mu_{u,c}K_{u,c}\right|\right)$$

wherein $H_{u,m}$ is the gray level halation probability of $D_{u,m}$, $\mu_{u,m}$ is the gray average value of all pixel points in $D_{u,m}$, $K_{u,m}$ is the kurtosis of the gray values of all pixel points in $D_{u,m}$, $\mu_{u,c}$ and $K_{u,c}$ are the gray average value and the kurtosis of the gray values of all pixel points in the $c$-th suspected attachment area of the $u$-th strong edge connected domain other than $D_{u,m}$, $C_u$ is the number of suspected attachment areas of the $u$-th strong edge connected domain other than $D_{u,m}$, $\exp(\cdot)$ is an exponential function with the natural constant as base, and $|\cdot|$ is the absolute value function;
$D_{u,m}$ is the $m$-th suspected attachment area of the $u$-th strong edge connected domain;
$\mu_{u,m}K_{u,m}$ is the first gray feature of $D_{u,m}$;
$\mu_{u,c}K_{u,c}$ is the first gray feature of the $c$-th suspected attachment area of the $u$-th strong edge connected domain other than $D_{u,m}$.
Further, the method for obtaining the halation probability of each pixel point in the suspected attachment area according to the gradient halation probability and the gray halation probability of each suspected attachment area comprises the following specific steps:
the gray level halation probability of $D_{u,m}$ is taken as the gray level halation probability of the $b$-th pixel point in $D_{u,m}$;
the gradient halation probability of $D_{u,m}$ is taken as the gradient halation probability of the $b$-th pixel point in $D_{u,m}$;
the product of the gradient halation probability and the gray level halation probability of the $b$-th pixel point in $D_{u,m}$ is recorded as the halation probability of the $b$-th pixel point in $D_{u,m}$;
$D_{u,m}$ is the $m$-th suspected attachment area of the $u$-th strong edge connected domain.
Further, the replacement coefficient of each pixel point in the image to be processed is obtained according to the halation probability of the pixel points in the suspected attachment areas and the noise probability of each pixel point in the image to be processed, in the following specific steps:
in the image to be processed, the halation probability of the pixel points that are not in any suspected attachment area is set to the preset value 0;
in the image to be processed, the maximum of the noise probability and the halation probability of each pixel point is recorded as the replacement coefficient of that pixel point.
Further, the obtaining the final image after denoising the image to be processed according to the replacement coefficient of each pixel point in the image to be processed comprises the following specific steps:
in the image to be processed, if the replacement coefficient $R_i$ of the $i$-th pixel point is greater than or equal to a preset replacement threshold $R_0$, the gray value of the $i$-th pixel point is replaced with the gray value of the pixel point that is nearest to it in the image to be processed and whose replacement coefficient is smaller than the preset replacement threshold;
if the replacement coefficient $R_i$ of the $i$-th pixel point is smaller than the preset replacement threshold $R_0$, the gray value of the $i$-th pixel point is not replaced; all the pixel points with replaced gray values and the pixel points with unreplaced gray values are combined to form the final image after denoising of the image to be processed.
The invention also provides a mobile digital publishing system, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the computer program stored in the memory to realize the steps of the mobile digital publishing method.
The technical scheme of the invention has the following beneficial effects: the noise probability of each pixel point is obtained from the gray difference between the pixel point and its adjacent pixel points, which reduces the influence of noise on the scanning process of the mobile digital publication and improves the accuracy of the scanning result; the halation probability of each pixel point in a suspected attachment area is obtained from the gradient halation probability and the gray level halation probability of that area, so that the degree to which the halation phenomenon affects each pixel point is judged, the influence of halation pixel points on the scanning process is reduced, and the accuracy of the scanning result is further improved; the replacement coefficient of each pixel point in the image to be processed is obtained from the halation probability of the pixel points in the suspected attachment areas and the noise probability of the pixel points in the image to be processed, which provides the basis for gray value replacement and makes the scanning result more accurate. The invention obtains the final image through the replacement coefficients and combines it with the text region to obtain an accurate and complete enhanced scan image, which reduces noise interference in the scanning process, improves the accuracy of the scanning result, and ensures the quality of the mobile digital publication.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of steps of a mobile digital publishing method of the present invention;
FIG. 2 is a schematic diagram of the text halation effect for a thick font according to the present embodiment;
FIG. 3 is a schematic diagram of the text halation effect for a fine font according to the present embodiment.
Detailed Description
In order to further describe the technical means and effects adopted by the present invention to achieve the preset purpose, the following detailed description refers to the specific implementation, structure, characteristics and effects of a mobile digital publishing method and system according to the present invention with reference to the accompanying drawings and preferred embodiments. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The following specifically describes a specific scheme of a mobile digital publishing method and system provided by the invention with reference to the accompanying drawings.
Referring to fig. 1, a flowchart of steps of a mobile digital publishing method according to an embodiment of the present invention is shown, the method includes the steps of:
S001, acquiring an initial image of a scanned paper document; and obtaining a text region and other regions by utilizing an optical character recognition technology according to the initial image, and recording the other regions as images to be processed.
In this embodiment, the digitally scanned document is analyzed and the interfering pixel points are smoothed out, so as to reduce the processing workload in subsequent processing and transmission. Documents exhibiting the halation phenomenon after digital scanning are shown in fig. 2 and fig. 3.
In this embodiment, the document after the digital scanning is analyzed, so that it is first necessary to collect the scanned image of the paper document and perform the preprocessing operation.
Specifically, a large-format scanner is used to scan the paper document to be digitized and obtain an initial image. OCR technology is used to extract the text region in the initial image; the pixel points of the text region are marked 1, the other regions of the initial image are marked 0, and the initial image is converted to grayscale to obtain a gray image. The gray image formed by the pixel points marked 0 is recorded as the image to be processed. Graying and OCR (Optical Character Recognition) are well-known techniques and are not described in detail here.
So far, the image to be processed after preprocessing is obtained.
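For illustration only, the following is a minimal Python sketch of this preprocessing step. It assumes OpenCV and pytesseract as the imaging and OCR tooling, which the embodiment does not prescribe; the `min_conf` parameter and function names are hypothetical.

```python
import cv2
import numpy as np
import pytesseract

def preprocess(initial_bgr, min_conf=50):
    """Return the gray image, a text mask (1 = text region) and the image to be processed."""
    gray = cv2.cvtColor(initial_bgr, cv2.COLOR_BGR2GRAY)
    text_mask = np.zeros(gray.shape, dtype=np.uint8)
    # OCR word boxes stand in for the text regions extracted in this embodiment.
    data = pytesseract.image_to_data(gray, output_type=pytesseract.Output.DICT)
    for x, y, w, h, conf in zip(data["left"], data["top"], data["width"],
                                data["height"], data["conf"]):
        if float(conf) >= min_conf:
            text_mask[y:y + h, x:x + w] = 1      # pixels of the text region are marked 1
    to_process = gray.copy()                     # pixels marked 0 form the image to be processed
    return gray, text_mask, to_process
```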
Step S002, in the image to be processed, according to the gray level difference between each pixel point and the adjacent pixel points, the noise probability of each pixel point is obtained.
It should be noted that toner may scatter during copying, so free noise can exist on the surface of the paper; in addition, a paper document stored for a long time is eroded by water vapor in the air and oxidized by oxygen, which makes its texture more obvious, and different degrees of oxidation in different areas lead to different gray values on the paper surface. Therefore, this embodiment obtains the noise probability of each pixel point from the gray value difference between the pixel point and the other pixel points in its neighborhood.
Specifically, the noise probability of the $i$-th pixel point in the image to be processed is calculated as:

$$Z_i=\mathrm{Norm}\left(\left|g_i-\frac{1}{n_i}\sum_{j=1}^{n_i}g_j\right|\right)$$

wherein $Z_i$ is the noise probability of the $i$-th pixel point in the image to be processed, $g_i$ is the gray value of the $i$-th pixel point, $g_j$ is the gray value of the $j$-th pixel point within the 8-neighborhood of the $i$-th pixel point, $n_i$ is the number of pixel points of the image to be processed within that 8-neighborhood, $\mathrm{Norm}(\cdot)$ is a linear normalization function, and $|\cdot|$ is the absolute value function.
In the formula, $\left|g_i-\frac{1}{n_i}\sum_{j=1}^{n_i}g_j\right|$ represents the difference between the gray value of the $i$-th pixel point and the gray mean of the pixel points marked 0 within its 8-neighborhood; the larger this difference, the more prominent the $i$-th pixel point is in the image to be processed, the more likely it is a noise pixel point, and the larger its noise probability.
According to the method, the noise probability of each pixel point in the image to be processed is obtained.
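As an illustrative sketch of this step, the following Python function (assuming NumPy, which the embodiment does not prescribe) computes the reconstructed noise probability over the pixels of the image to be processed.

```python
import numpy as np

def noise_probability(gray, text_mask):
    """Normalized absolute difference between each pixel and the mean of its
    8-neighborhood, restricted to pixels of the image to be processed (mask == 0)."""
    g = gray.astype(np.float64)
    h, w = g.shape
    diff = np.zeros_like(g)
    for y in range(h):
        for x in range(w):
            if text_mask[y, x] == 1:
                continue
            ys, ye = max(0, y - 1), min(h, y + 2)
            xs, xe = max(0, x - 1), min(w, x + 2)
            nb = [g[v, u] for v in range(ys, ye) for u in range(xs, xe)
                  if (v, u) != (y, x) and text_mask[v, u] == 0]
            if nb:
                diff[y, x] = abs(g[y, x] - np.mean(nb))
    rng = diff.max() - diff.min()                    # linear normalization to [0, 1]
    return (diff - diff.min()) / rng if rng > 0 else np.zeros_like(diff)
```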
Step S003, the boundary of the text area is marked as a strong edge, and the connected domain formed by any one strong edge is marked as a strong edge connected domain; edge detection is performed on the image to be processed to obtain a plurality of first edges and the suspected attachment areas of the strong edge connected domains; the nearest distance from each edge pixel point of a first edge in a suspected attachment area to the boundary of the strong edge connected domain adjacent to that area is recorded as a characteristic distance; and the gradient halation probability of each suspected attachment area is obtained according to the gradient amplitudes and the characteristic distances of all the pixel points on the first edge of the suspected attachment area.
It should be noted that when a paper document is corroded by water vapor during storage, a defect similar to dye bleeding on cloth forms around the text: the printing ink diffuses into the area surrounding the text, and the diffused area has a gray value lower than that of the text and is attached to the text area. Therefore, this embodiment obtains all the first edges of the image to be processed through an edge detection algorithm, extracts the regions enclosed by the strong edge of each text character and the nearby first edges, and obtains the halation probability of each pixel point in those regions according to the gray values of the pixel points and the gradient amplitudes of the first edges.
Specifically, each edge line segment obtained by applying the Canny edge detection algorithm to the image to be processed is recorded as a first edge, and the gradient amplitude of each edge pixel point of the first edges is obtained with the Sobel operator; the Canny algorithm and the Sobel operator are known techniques and are not described in detail here. Each closed boundary of the text region marked 1 in the initial image is recorded as a strong edge.
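A minimal sketch of this edge step, assuming OpenCV/NumPy (not prescribed by the embodiment), is given below; the Canny thresholds (50, 150) are illustrative values, and the distance transform is used here as one way to approximate the characteristic distance to the nearest strong-edge boundary.

```python
import cv2
import numpy as np

def edge_features(to_process, text_mask):
    first_edges = cv2.Canny(to_process, 50, 150)                  # first edges of the image to be processed
    gx = cv2.Sobel(to_process, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(to_process, cv2.CV_64F, 0, 1, ksize=3)
    grad_mag = np.sqrt(gx ** 2 + gy ** 2)                         # gradient amplitude per pixel
    # Each connected component of the text mask is one strong-edge connected domain.
    n_labels, labels = cv2.connectedComponents(text_mask.astype(np.uint8))
    # Distance from every non-text pixel to the nearest text pixel, used as the characteristic distance.
    dist_to_text = cv2.distanceTransform((text_mask == 0).astype(np.uint8), cv2.DIST_L2, 3)
    return first_edges, grad_mag, labels, n_labels, dist_to_text
```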
It should be noted that, because the halation results from diffusion of the text ink and the strong edge is the edge of the text, the halation area is attached to the strong edge, and the halation probability of each connected domain is obtained according to the gray values of the pixel points in all the connected domains associated with each strong edge.
Further, the connected domain formed by any one strong edge is marked as a strong edge connected domain; in the image to be processed, a closed area formed by any one first edge and the boundary of the image to be processed is marked as an initial suspected attachment area. It should be noted that if no closed area can be formed, no suspected attachment area exists. In the image to be processed, the $m$-th initial suspected attachment area adjacent to the $u$-th strong edge connected domain is marked as the $m$-th suspected attachment area $D_{u,m}$ of the $u$-th strong edge connected domain.
It should be noted that, because halation is a diffusion process, the gray value of the halation area decreases with distance from the text, the gradient amplitude of the edge pixel points farther away is smaller, and the gray values close to the strong edge connected domain deviate more from the normal gray value of the paper.
The nearest distance from each edge pixel point of the first edge in a suspected attachment area to the boundary of the strong edge connected domain adjacent to that area is recorded as a characteristic distance. The gradient halation probability of the $m$-th suspected attachment area $D_{u,m}$ of the $u$-th strong edge connected domain is calculated as:

$$G_{u,m}=\exp\left(-\frac{1}{N_{u,m}^{2}}\sum_{k=1}^{N_{u,m}}\sum_{j=1}^{N_{u,m}}\left|\frac{T_{k}}{d_{k}}-\frac{T_{j}}{d_{j}}\right|\right)$$

wherein $G_{u,m}$ is the gradient halation probability of $D_{u,m}$, $\exp(\cdot)$ is an exponential function with the natural constant as base, $N_{u,m}$ is the number of edge pixel points of the first edge of $D_{u,m}$, $T_k$ and $d_k$ are the gradient amplitude and the characteristic distance of the $k$-th edge pixel point of that first edge, $T_j$ and $d_j$ are the gradient amplitude and the characteristic distance of the $j$-th edge pixel point of that first edge, and $D_{u,m}$ is the $m$-th suspected attachment area of the $u$-th strong edge connected domain.
In the formula, the term inside the exponential measures the halation deviation degree of $D_{u,m}$; the smaller its value, the more $D_{u,m}$ satisfies the characteristics of halation, and the more likely it is that the text produces halation there.
So far, the gradient halation probability of the suspected attachment area is obtained.
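For illustration, the following Python sketch (assuming NumPy) implements the gradient halation probability as reconstructed above, i.e. a pairwise consistency measure of gradient amplitude over characteristic distance along the region's first edge; the `eps` guard is an added assumption to avoid division by zero.

```python
import numpy as np

def gradient_halation_probability(grad_mags, char_dists, eps=1e-6):
    """grad_mags, char_dists: values for the edge pixel points of one region's first edge."""
    t = np.asarray(grad_mags, dtype=np.float64)
    d = np.asarray(char_dists, dtype=np.float64) + eps
    ratio = t / d
    n = len(ratio)
    if n < 2:
        return 0.0
    deviation = np.mean(np.abs(ratio[:, None] - ratio[None, :]))  # pairwise halation deviation
    return float(np.exp(-deviation))
```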
Step S004, according to the gray values of all the pixel points of each suspected attachment area, the gray level halation probability of each suspected attachment area is obtained.
It should be noted that, since the edge of a text character may bleed in more than one direction after being corroded, each strong edge connected domain may have a plurality of suspected attachment areas; the gray values of the pixel points in these areas are similar and show the same gradient change, so the gray level halation probability of each suspected attachment area is obtained from the gray differences among the suspected attachment areas associated with the same strong edge connected domain.
Specifically, the calculation of the kurtosis of the gray values of a suspected attachment area is a known technique and is not described in detail here. The gray level halation probability of the $m$-th suspected attachment area $D_{u,m}$ of the $u$-th strong edge connected domain is calculated as:

$$H_{u,m}=\exp\left(-\frac{1}{C_{u}}\sum_{c=1}^{C_{u}}\left|\mu_{u,m}K_{u,m}-\mu_{u,c}K_{u,c}\right|\right)$$

wherein $H_{u,m}$ is the gray level halation probability of $D_{u,m}$, $\mu_{u,m}$ is the gray average value of all pixel points in $D_{u,m}$, $K_{u,m}$ is the kurtosis of the gray values of all pixel points in $D_{u,m}$, $\mu_{u,c}$ and $K_{u,c}$ are the gray average value and the kurtosis of the gray values of all pixel points in the $c$-th suspected attachment area of the $u$-th strong edge connected domain other than $D_{u,m}$, $C_u$ is the number of suspected attachment areas of the $u$-th strong edge connected domain other than $D_{u,m}$, $\exp(\cdot)$ is an exponential function with the natural constant as base, and $|\cdot|$ is the absolute value function; $D_{u,m}$ is the $m$-th suspected attachment area of the $u$-th strong edge connected domain.
Here $\mu_{u,m}K_{u,m}$ is the first gray feature of $D_{u,m}$, and $\mu_{u,c}K_{u,c}$ is the first gray feature of the $c$-th other suspected attachment area.
It should be further noted that the kurtosis of the gray values describes how strongly the gray values of a single suspected attachment area aggregate around its gray mean. The closer the gray mean of $D_{u,m}$ is to the gray means of the other suspected attachment areas of the $u$-th strong edge connected domain, the more similar their gray distributions are and the more likely gray level halation has occurred; a larger kurtosis indicates a higher degree of aggregation, and similar kurtosis values across these areas indicate similar gray value distributions, which again makes it more likely that the text produces a halation defect.
Thus, the gray level halation probability of the suspected attachment area is obtained.
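As an illustrative sketch (assuming NumPy and SciPy, neither prescribed by the embodiment), the first gray feature is computed as mean times kurtosis and compared against the other suspected attachment areas of the same strong edge connected domain; the non-Fisher (Pearson) kurtosis is an assumption, since the embodiment does not specify the convention.

```python
import numpy as np
from scipy.stats import kurtosis

def first_gray_feature(region_pixels):
    vals = np.asarray(region_pixels, dtype=np.float64)
    return vals.mean() * kurtosis(vals, fisher=False)     # mean * kurtosis of the region's gray values

def gray_halation_probability(this_region, other_regions):
    """this_region: gray values of one suspected attachment area;
    other_regions: list of gray-value arrays of the other areas of the same strong edge connected domain."""
    f = first_gray_feature(this_region)
    if not other_regions:
        return 0.0
    diffs = [abs(f - first_gray_feature(r)) for r in other_regions]
    return float(np.exp(-np.mean(diffs)))
```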
Step S005, the halation probability of each pixel point in the suspected attachment area is obtained according to the gradient halation probability and the gray level halation probability of each suspected attachment area.
Further, the halation probability of the $b$-th pixel point in the $m$-th suspected attachment area $D_{u,m}$ of the $u$-th strong edge connected domain is calculated from the probabilities of the area it belongs to:

$$P_{u,m,b}=G_{u,m}\times H_{u,m}$$

wherein $P_{u,m,b}$ is the halation probability of the $b$-th pixel point in $D_{u,m}$, $H_{u,m}$ is the gray level halation probability of $D_{u,m}$, taken as the gray level halation probability of each of its pixel points, $G_{u,m}$ is the gradient halation probability of $D_{u,m}$, taken as the gradient halation probability of each of its pixel points, and $D_{u,m}$ is the $m$-th suspected attachment area of the $u$-th strong edge connected domain.
Further, in the image to be processed, the halation probability of the pixel points that are not in any suspected attachment area is set to 0, and these are combined with the halation probabilities of the pixel points in the suspected attachment areas to obtain the halation probability of every pixel point in the image to be processed; the halation probability of the $i$-th pixel point in the image to be processed is recorded as $P_i$. Since the non-suspected-attachment areas exhibit no halation phenomenon, the halation probability of their pixel points is 0.
So far, the halation probability of each pixel point in the image to be processed is obtained.
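A short sketch of assembling the per-pixel halation probability map follows (Python/NumPy assumed); every pixel of a suspected attachment area takes the product of that area's gradient and gray level halation probabilities, and pixels outside any such area stay 0.

```python
import numpy as np

def halation_probability_map(shape, regions):
    """regions: list of (pixel_coords, grad_prob, gray_prob) per suspected attachment area,
    where pixel_coords is an (N, 2) array of (row, col) indices."""
    p = np.zeros(shape, dtype=np.float64)
    for coords, g_prob, gray_prob in regions:
        p[coords[:, 0], coords[:, 1]] = g_prob * gray_prob
    return p
```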
Step S006, the replacement coefficient of each pixel point in the image to be processed is obtained according to the halation probability of each pixel point in the suspected attachment areas and the noise probability of each pixel point in the image to be processed; the final image after denoising of the image to be processed is obtained according to the replacement coefficient of each pixel point.
Specifically, the replacement coefficient of the $i$-th pixel point in the image to be processed is calculated as:

$$R_i=\max\left(Z_i,\,P_i\right)$$

wherein $R_i$ is the replacement coefficient of the $i$-th pixel point in the image to be processed, $Z_i$ is the noise probability of the $i$-th pixel point, $P_i$ is the halation probability of the $i$-th pixel point, and $\max(\cdot)$ is the maximizing function.
It should be further noted that when the replacement coefficient $R_i$ takes the value of $Z_i$, the $i$-th pixel point is more likely to be a noise pixel point; when $R_i$ takes the value of $P_i$, the $i$-th pixel point is more likely to belong to a halation defect.
Specifically, in the image to be processed, if the replacement coefficient $R_i$ of the $i$-th pixel point is greater than or equal to a preset replacement threshold $R_0$, the gray value of the $i$-th pixel point is replaced with the gray value of the pixel point that is nearest to it in the image to be processed and whose replacement coefficient is smaller than the preset replacement threshold; if there are several pixel points at the same nearest distance whose replacement coefficients are smaller than the preset replacement threshold, any one of them is selected and its gray value is used for the replacement. If the replacement coefficient $R_i$ of the $i$-th pixel point is smaller than the preset replacement threshold $R_0$, the gray value of the $i$-th pixel point is not replaced. The preset replacement threshold described here is only an example; other values may be set in other embodiments, and the present embodiment is not limited thereto. All the pixel points with replaced gray values and the pixel points with unreplaced gray values are combined to form the final image after denoising of the image to be processed.
And obtaining a final image after denoising the image to be processed.
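An illustrative Python sketch of this replacement step follows, assuming NumPy and SciPy (not prescribed by the embodiment); the threshold value 0.5 is only a placeholder, since the embodiment leaves the preset replacement threshold configurable.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def denoise(to_process, noise_prob, halation_prob, threshold=0.5):
    # Replacement coefficient: maximum of noise and halation probabilities per pixel.
    replace_coeff = np.maximum(noise_prob, halation_prob)
    bad = replace_coeff >= threshold
    # For every pixel at or above the threshold, find the nearest pixel below it.
    _, (rows, cols) = distance_transform_edt(bad, return_indices=True)
    result = to_process.copy()
    result[bad] = to_process[rows[bad], cols[bad]]       # replace with the nearest "good" gray value
    return result
```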
And step S007, merging the final image after denoising the image to be processed with the text area to obtain a complete enhanced scanning image.
Specifically, the final image after denoising the image to be processed corresponds to the area marked 0 in the initial image; it is combined with the text area marked 1 in the initial image to obtain the complete enhanced scan image, which is transmitted to the digital publishing system and stored in the database of the publishing system, thereby completing the digitized scan.
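A short sketch of the merge (Python/NumPy assumed): the denoised non-text region and the original text region are recombined through the text mask to form the complete enhanced scan image.

```python
import numpy as np

def merge(final_image, gray, text_mask):
    enhanced = final_image.copy()
    enhanced[text_mask == 1] = gray[text_mask == 1]   # restore the text pixels unchanged
    return enhanced
```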
The invention also provides a mobile digital publishing system comprising a memory, a processor and a computer program stored on the memory and operable on the processor, the processor executing the computer program stored in the memory to implement the steps of a mobile digital publishing method as described above.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.
Claims (10)
1. A mobile digital publishing method, comprising the steps of:
Acquiring an initial image of a scanned paper document; according to the initial image, acquiring a text region and the other regions by means of optical character recognition, and recording the other regions as the image to be processed;
in the image to be processed, obtaining the noise probability of each pixel point according to the gray difference between the pixel point and its adjacent pixel points;
recording the boundary of the text region as a strong edge, and recording the connected domain formed by any one strong edge as a strong edge connected domain; performing edge detection on the image to be processed to obtain a plurality of first edges and the suspected attachment areas of the strong edge connected domains; recording the nearest distance from each edge pixel point of a first edge in a suspected attachment area to the boundary of the strong edge connected domain adjacent to that area as a characteristic distance; obtaining the gradient halation probability of each suspected attachment area according to the gradient amplitudes and the characteristic distances of all the pixel points on the first edge of the suspected attachment area;
obtaining the gray level halation probability of each suspected attachment area according to the gray values of all the pixel points of the suspected attachment area;
obtaining the halation probability of each pixel point in a suspected attachment area according to the gradient halation probability and the gray level halation probability of the suspected attachment area;
obtaining the replacement coefficient of each pixel point in the image to be processed according to the halation probability of the pixel points in the suspected attachment areas and the noise probability of each pixel point in the image to be processed; obtaining the final image after denoising of the image to be processed according to the replacement coefficient of each pixel point;
and merging the final image after denoising of the image to be processed with the text region to obtain the complete enhanced scan image.
2. The method for mobile digital publishing according to claim 1, wherein the specific calculation method for obtaining the noise probability of each pixel point according to the gray level difference between each pixel point and the adjacent pixel points in the image to be processed is as follows:
$$Z_i=\mathrm{Norm}\left(\left|g_i-\frac{1}{n_i}\sum_{j=1}^{n_i}g_j\right|\right)$$

wherein $Z_i$ is the noise probability of the $i$-th pixel point in the image to be processed, $g_i$ is the gray value of the $i$-th pixel point, $g_j$ is the gray value of the $j$-th pixel point within the 8-neighborhood of the $i$-th pixel point, $n_i$ is the number of pixel points of the image to be processed within that 8-neighborhood, $\mathrm{Norm}(\cdot)$ is a linear normalization function, and $|\cdot|$ is the absolute value function.
3. The method for mobile digital publishing according to claim 1, wherein the step of performing edge detection on the image to be processed to obtain a plurality of first edges and suspected attachment areas of the strong edge connected domain comprises the following specific steps:
each edge line segment obtained by edge detection of the image to be processed is marked as a first edge;
In the image to be processed, a closed area formed by any one first edge and the boundary of the image to be processed is marked as an initial suspected attachment area; the $m$-th initial suspected attachment area adjacent to the $u$-th strong edge connected domain is marked as the $m$-th suspected attachment area $D_{u,m}$ of the $u$-th strong edge connected domain.
4. The mobile digital publishing method according to claim 1, wherein the obtaining the gradient halation probability of each suspected attachment area according to the gradient amplitudes and the characteristic distances of all the pixel points on the first edge of the suspected attachment area comprises the following specific calculation:

$$G_{u,m}=\exp\left(-\frac{1}{N_{u,m}^{2}}\sum_{k=1}^{N_{u,m}}\sum_{j=1}^{N_{u,m}}\left|\frac{T_{k}}{d_{k}}-\frac{T_{j}}{d_{j}}\right|\right)$$

wherein $G_{u,m}$ is the gradient halation probability of $D_{u,m}$, $\exp(\cdot)$ is an exponential function with the natural constant as base, $N_{u,m}$ is the number of edge pixel points of the first edge of $D_{u,m}$, $T_k$ and $d_k$ are the gradient amplitude and the characteristic distance of the $k$-th edge pixel point of that first edge, $T_j$ and $d_j$ are the gradient amplitude and the characteristic distance of the $j$-th edge pixel point of that first edge, and $|\cdot|$ is the absolute value function;
$D_{u,m}$ is the $m$-th suspected attachment area of the $u$-th strong edge connected domain.
5. The method for mobile digital publishing according to claim 1, wherein the step of obtaining gray level halation probability of each suspected attachment area according to gray level values of all pixels of each suspected attachment area comprises the following specific steps:
the product of the average value of the gray values of all the pixels in each suspected attachment area and the kurtosis of the gray values of all the pixels is recorded as a first gray characteristic of each suspected attachment area;
and obtaining the gray scale halation probability of each suspected attachment region according to the difference of the first gray scale characteristics of all the suspected attachment regions corresponding to each strong edge connected region.
6. The mobile digital publishing method of claim 5, wherein the obtaining gray level halation probability of each suspected attachment area according to the difference of the first gray level characteristics of all the suspected attachment areas corresponding to each strong edge connected domain comprises the following specific calculation modes:
$$H_{u,m}=\exp\left(-\frac{1}{C_{u}}\sum_{c=1}^{C_{u}}\left|\mu_{u,m}K_{u,m}-\mu_{u,c}K_{u,c}\right|\right)$$

wherein $H_{u,m}$ is the gray level halation probability of $D_{u,m}$, $\mu_{u,m}$ is the gray average value of all pixel points in $D_{u,m}$, $K_{u,m}$ is the kurtosis of the gray values of all pixel points in $D_{u,m}$, $\mu_{u,c}$ and $K_{u,c}$ are the gray average value and the kurtosis of the gray values of all pixel points in the $c$-th suspected attachment area of the $u$-th strong edge connected domain other than $D_{u,m}$, $C_u$ is the number of suspected attachment areas of the $u$-th strong edge connected domain other than $D_{u,m}$, $\exp(\cdot)$ is an exponential function with the natural constant as base, and $|\cdot|$ is the absolute value function;
$D_{u,m}$ is the $m$-th suspected attachment area of the $u$-th strong edge connected domain;
$\mu_{u,m}K_{u,m}$ is the first gray feature of $D_{u,m}$;
$\mu_{u,c}K_{u,c}$ is the first gray feature of the $c$-th suspected attachment area of the $u$-th strong edge connected domain other than $D_{u,m}$.
7. The mobile digital publishing method according to claim 1, wherein the obtaining the halation probability of each pixel point in the suspected attachment area according to the gradient halation probability and the gray level halation probability of each suspected attachment area comprises the following specific steps:
the gray level halation probability of $D_{u,m}$ is taken as the gray level halation probability of the $b$-th pixel point in $D_{u,m}$;
the gradient halation probability of $D_{u,m}$ is taken as the gradient halation probability of the $b$-th pixel point in $D_{u,m}$;
the product of the gradient halation probability and the gray level halation probability of the $b$-th pixel point in $D_{u,m}$ is recorded as the halation probability of the $b$-th pixel point in $D_{u,m}$;
$D_{u,m}$ is the $m$-th suspected attachment area of the $u$-th strong edge connected domain.
8. The mobile digital publishing method according to claim 1, wherein the obtaining the replacement coefficient of each pixel point in the image to be processed according to the halation probability of each pixel point in the suspected attachment area and the noise probability of each pixel point in the image to be processed comprises the following specific steps:
in the image to be processed, the halation probability of the pixel points that are not in any suspected attachment area is set to the preset value 0;
in the image to be processed, the maximum of the noise probability and the halation probability of each pixel point is recorded as the replacement coefficient of that pixel point.
9. The method for mobile digital publishing according to claim 1, wherein the obtaining the final image after denoising the image to be processed according to the replacement coefficient of each pixel point in the image to be processed comprises the following specific steps:
in the image to be processed, if the replacement coefficient $R_i$ of the $i$-th pixel point is greater than or equal to a preset replacement threshold $R_0$, the gray value of the $i$-th pixel point is replaced with the gray value of the pixel point that is nearest to it in the image to be processed and whose replacement coefficient is smaller than the preset replacement threshold;
if the replacement coefficient $R_i$ of the $i$-th pixel point is smaller than the preset replacement threshold $R_0$, the gray value of the $i$-th pixel point is not replaced; all the pixel points with replaced gray values and the pixel points with unreplaced gray values are combined to form the final image after denoising of the image to be processed.
10. A mobile digital publishing system comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the computer program when executed by the processor performs the steps of a mobile digital publishing method as claimed in any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410355328.9A CN117952860B (en) | 2024-03-27 | 2024-03-27 | Mobile digital publishing method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410355328.9A CN117952860B (en) | 2024-03-27 | 2024-03-27 | Mobile digital publishing method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117952860A true CN117952860A (en) | 2024-04-30 |
CN117952860B CN117952860B (en) | 2024-06-21 |
Family
ID=90792618
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410355328.9A Active CN117952860B (en) | 2024-03-27 | 2024-03-27 | Mobile digital publishing method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117952860B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114062961A (en) * | 2021-11-17 | 2022-02-18 | 吉林大学 | OCC-based multi-feature demodulation method for automatic driving vehicle |
CN114332895A (en) * | 2021-12-27 | 2022-04-12 | 上海浦东发展银行股份有限公司 | Text image synthesis method, apparatus, device, storage medium and program product |
CN115330925A (en) * | 2022-08-19 | 2022-11-11 | 北京字跳网络技术有限公司 | Image rendering method and device, electronic equipment and storage medium |
CN115633259A (en) * | 2022-11-15 | 2023-01-20 | 深圳市泰迅数码有限公司 | Automatic regulation and control method and system for intelligent camera based on artificial intelligence |
CN115719359A (en) * | 2022-11-29 | 2023-02-28 | 中国石油大学(华东) | Rolling shutter image processing method based on visible light communication |
CN116310360A (en) * | 2023-05-18 | 2023-06-23 | 实德电气集团有限公司 | Reactor surface defect detection method |
CN116958124A (en) * | 2023-09-12 | 2023-10-27 | 地立(苏州)智能装备有限公司 | Automatic packagine machine anomaly monitoring system |
CN117082690A (en) * | 2023-10-17 | 2023-11-17 | 深圳市帝狼光电有限公司 | Control method and system of intelligent table lamp |
CN117252872A (en) * | 2023-11-16 | 2023-12-19 | 深圳市华海电子科技有限公司 | Mobile phone screen defect visual detection method and system |
CN117408890A (en) * | 2023-12-14 | 2024-01-16 | 武汉泽塔云科技股份有限公司 | Video image transmission quality enhancement method and system |
CN117437600A (en) * | 2023-12-20 | 2024-01-23 | 山东海纳智能装备科技股份有限公司 | Coal flow monitoring system based on image recognition technology |
- 2024-03-27: CN application CN202410355328.9A filed; published as CN117952860B (status: Active)
Non-Patent Citations (1)
Title |
---|
Jiang Zetao; Zhao Rongchun; Li Ming: "Correlation-based adaptive threshold image sequence tracking method", Journal of Northwestern Polytechnical University, No. 06, 30 December 2005 (2005-12-30) *
Also Published As
Publication number | Publication date |
---|---|
CN117952860B (en) | 2024-06-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111275129B (en) | Image data augmentation policy selection method and system | |
US20070253040A1 (en) | Color scanning to enhance bitonal image | |
JP4118749B2 (en) | Image processing apparatus, image processing program, and storage medium | |
US20030063802A1 (en) | Image processing method, apparatus and system | |
CN105374015A (en) | Binary method for low-quality document image based on local contract and estimation of stroke width | |
JP5934762B2 (en) | Document modification detection method by character comparison using character shape characteristics, computer program, recording medium, and information processing apparatus | |
CN112183038A (en) | Form identification and typing method, computer equipment and computer readable storage medium | |
CN114926839B (en) | Image identification method based on RPA and AI and electronic equipment | |
CN113888536B (en) | Printed matter double image detection method and system based on computer vision | |
US8194941B2 (en) | Character noise eliminating apparatus, character noise eliminating method, and character noise eliminating program | |
CN111046760B (en) | Handwriting identification method based on domain countermeasure network | |
CN109784342A (en) | A kind of OCR recognition methods and terminal based on deep learning model | |
US8456711B2 (en) | SUSAN-based corner sharpening | |
CN113591831A (en) | Font identification method and system based on deep learning and storage medium | |
CN115995080B (en) | Archive intelligent management system based on OCR (optical character recognition) | |
CN114708601A (en) | Handwritten character erasing method based on deep learning | |
CN117952860A (en) | Mobile digital publishing method and system | |
CN113421256B (en) | Dot matrix text line character projection segmentation method and device | |
Esmaile et al. | Optical character recognition using active contour segmentation | |
Lin et al. | High-Accuracy Skew Estimation of Document Images. | |
CN118172788B (en) | OCR intelligent recognition and management system for BCG vaccine inoculation record | |
Konya et al. | Adaptive methods for robust document image understanding | |
CN118334674B (en) | Automatic identification method and system for document shooting image | |
CN112785508B (en) | Method and device for denoising electronic document picture | |
CN118711200B (en) | Mobile auxiliary examination paper reading method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
PE01 | Entry into force of the registration of the contract for pledge of patent right |
Denomination of invention: A mobile digital publishing method and system Granted publication date: 20240621 Pledgee: Rizhao Bank Co.,Ltd. Jining Liangshan Branch Pledgor: Shandong zhengheda Education Technology Co.,Ltd. Registration number: Y2024980058052 |
|
PE01 | Entry into force of the registration of the contract for pledge of patent right |