CN103136171B

CN103136171B - E-book making method based on Shockwave Flash (SWF)

Info

Publication number: CN103136171B
Application number: CN201110396507.XA
Authority: CN
Inventors: 康凯
Original assignee: Sichuan Wenxuan Education Science & Technology Co Ltd; MAINBO EDUCATION TECHNOLOGY Co Ltd
Current assignee: Mainbo Education Technology Co ltd; Sichuan Winshare Education Science & Technology Co ltd
Priority date: 2011-12-02
Filing date: 2011-12-02
Publication date: 2017-04-12
Anticipated expiration: 2031-12-02
Also published as: CN103136171A

Abstract

The invention discloses a method for making an electronic book based on SWF and relates to the technical field of information data processing. The method first classifies the content of each page of the PDF document to be processed, then sets conversion parameters for each page type, and finally converts the PDF file into a SWF file according to those parameters and adds a data header to form the final electronic book. Because different conversion strategies are applied to different content types, the conversion to SWF gives better results. In addition, after conversion, unsatisfactory pages can be identified by manual inspection and processed separately to further improve the overall result. The method maintains high document definition without reducing color bit depth while achieving a smaller file size.

Description

A Method of Making Electronic Books Based on SWF

Technical Field

The invention relates to the technical field of digital information processing, and in particular to a method for making high-fidelity SWF-based electronic books suitable for network transmission. It is especially suited to situations with demanding requirements on both visual quality and file size.

Background Art

E-books are very widely used. They can be copied to a local machine via removable storage media such as USB flash drives, portable hard disks, or optical discs, but in many cases they are downloaded over a network. In a distributed system, a book may need to be transferred several times between servers before it is delivered to the final client. Reducing the size of e-books is therefore extremely important.

With existing technology, the size of an e-book is often reduced at the cost of a large loss in definition and color. In many situations, however, a loss of definition is undesirable. For electronic textbooks used by primary and secondary school students, for example, it is desirable to keep the textbook's high definition without significantly degrading its colors. At present, the e-books issued by publishing houses, especially electronic textbooks for primary and secondary schools, are mostly in PDF format, characterized by vivid colors, clear characters, and large file size. For a system that needs to transmit e-books over a network, such large e-books are clearly unsuitable.

Adobe's SWF (Shockwave Flash) files use vector technology, which under certain conditions can significantly reduce the size of an e-book, so converting PDF to SWF is a good approach. Many software packages on the market support converting PDF to SWF, such as gpdf2swf, PDFZilla, FlashPaper2, and Macromedia FlashPaper, but these do not support being called from a program and can only be operated by hand. It is therefore usually difficult to integrate them into one's own e-book production software. To perform PDF-to-SWF conversion inside a book production system, there are two options: 1) Since both PDF and SWF are openly documented formats and a number of open-source parsing libraries exist, such as XPdfLib, one can parse the PDF file oneself and generate output in the SWF file format. 2) Call a component or background program, such as PDF2SWF. Either way, reasonable conversion parameters must be set for different PDF content to obtain the best result. In addition, most existing conversion tools apply a single uniform strategy and are therefore poorly targeted.

In current electronic documents, text usually takes one of several forms: 1) a vector font; 2) a dot-matrix font; 3) characters converted to bitmaps for display.

The vector font is also called a font vector; this document uses the term font vector throughout. Font vectors are the most widely used form today. Each glyph is described by mathematical curves, including the key points on the glyph outline and the derivative information of the connecting lines; the font rendering engine reads these vectors and performs the necessary calculations to render the glyph. The advantage of this kind of font is that glyphs can be scaled to any size without distortion or discoloration. Compared with dot-matrix fonts, font vectors take up less data and do not deform when scaled.

A dot-matrix font divides each character into 16×16 or 24×24 dots and uses the on or off state of each dot to represent the character's shape. Its advantage is fast display, since no calculation is needed as with vector fonts; its biggest drawback is that it cannot be enlarged, because jagged edges appear as soon as the text is magnified. Today dot-matrix fonts mainly play an auxiliary role and are rarely used.

When characters are converted to bitmaps for display, the result no longer has anything to do with characters. E-books produced by scanning paper books or photographing them with a digital camera belong to this type.

In addition, in some specific settings, such as decorative lettering used in advertising, text is often designed as a vector graphic; in essence it is not a character but an image.

In current e-books, images usually take one of two forms: 1) vector graphics; 2) bitmaps.

Vector graphics are also called polygon images, vector images, or drawn images; this document refers to them as polygon images or polygons. A polygon image represents a picture with geometric primitives defined by mathematical equations, such as points, lines, or polygons. The advantages of vector graphics are that the file is small and that the image is not distorted by enlargement, reduction, or rotation; the disadvantages are that it is hard to render realistic images with rich color gradation and that drawing is less efficient than for bitmaps.

A bitmap, also called a raster image or pixel image, is simply an image whose smallest unit is the pixel; its display is produced by the arrangement of a pixel array, and it is distorted when scaled. Its advantage is that it can produce colorful, complex, and varied images; its disadvantage is large size.

From the above it can be seen that, to reduce the size of an e-book, text is best represented with a vector font (font vector) and images are best represented as vector graphics. In practice, however, the final size also depends on the content. Vector graphics suit simple shapes, so for simple graphics the vector data is much smaller than the bitmap data. But if the original image is a bitmap containing complex shapes and many colors, the converted vector graphic will be larger than the original bitmap. It is therefore inappropriate to vectorize everything in an e-book that contains complex images. For complex bitmaps, conversion to a compressed format such as JPEG often gives a smaller size. Where necessary, some complex bitmaps can be converted to vectors and then smoothed manually with an editing tool such as Flash to obtain a good balance of color, visual quality, and size; for some complex content, however, forced smoothing departs noticeably from the original and loses visual appeal. Reducing the size of e-books is thus a complicated process that must be designed for each situation and, where necessary, supplemented by manual work to obtain the best result.

Summary of the Invention

In view of the defects in the prior art, the object of the present invention is to provide a method for making SWF-based electronic books that applies different conversion settings to different PDF content when converting to SWF, so that the converted file retains high definition while reaching a smaller size, giving a better conversion result.

To achieve the above object, the technical solution adopted by the present invention is:

A method for making an electronic book based on SWF, comprising the following steps:

(1) Open the PDF file to be processed, analyze the content of each page, and classify each page into one of five types: text as the main content with images as auxiliary content; text that has been converted to images as the main content with images as auxiliary content; images as the main content; text as the main content with an image as the background; and comprehensive mixed text and images;

(2) Set the parameters for converting PDF to SWF according to the content type of each page of the PDF file;

(3) Convert the PDF file into a SWF file according to the set conversion parameters;

(4) Compress the converted SWF file and add a file header to form the final electronic book.

Further, in the SWF-based e-book making method described above, the parameters for converting PDF to SWF in step (2) are set as follows:

When the page content is of the type with text as the main content and images as auxiliary content, the text is kept as font vectors and the images are converted to polygon images;

When the page content is of the type with text converted to images as the main content and images as auxiliary content, polygon images are kept as polygon images and other images are converted to JPEG format;

When the page content is of the type with images as the main content, polygon images are kept as polygon images and other images are converted to JPEG format;

When the page content is of the type with text as the main content and an image as the background, the images are converted to JPEG format and the text is kept as font vectors;

When the page content is of the comprehensive mixed text and image type, polygon images are kept as polygon images and other images are converted to JPEG format.

Further, in the SWF-based e-book making method described above, when the parameters are set in step (2) for page content of the type with text converted to images as the main content and images as auxiliary content, the images converted to JPEG format are set to medium definition; medium definition means a JPEG quality parameter in the range 70 to 80.

Further, in the SWF-based e-book making method described above, when the parameters are set in step (2) for page content of the type with images as the main content, the images converted to JPEG format are set to high definition; high definition means a JPEG quality parameter of 95 to 100.

Further, in the SWF-based e-book making method described above, when the parameters are set in step (2) for page content of the type with text as the main content and an image as the background, the images converted to JPEG format are set to low definition; low definition means a JPEG quality parameter of 60 to 65.

Further, in the SWF-based e-book making method described above, when the parameters are set in step (2) for page content of the comprehensive mixed text and image type, the images converted to JPEG format are set to medium definition.

Further, in the SWF-based e-book making method described above, after the PDF file has been converted into a SWF file in step (3), the converted file is inspected manually and pages that do not meet the requirements are processed separately.

Still further, in the SWF-based e-book making method described above, the converted file is compressed in step (4) using the Zip algorithm.

Still further, in the SWF-based e-book making method described above, the compressed file is encrypted with a key.

The effect of the present invention is as follows. The method first distinguishes the different kinds of content in the PDF and applies different conversion settings to each when converting to SWF, so as to obtain the best result. In addition, the e-book production tool uses compression techniques such as Zip to further reduce the size of the book, and encrypts with a key for better confidentiality. The resulting file maintains high document definition without reducing color bit depth while reaching a smaller size, and is especially suited to situations with demanding requirements on both visual quality and file size.

Description of the Drawings

Figure 1 is a flowchart of the SWF-based e-book making method of the present invention;

Figure 2 is a one-page PDF file of the type with text as the main content and an image as the background, used in the example;

Figure 3 is a partial view of the SWF file generated by converting the file in Figure 2 with default parameters;

Figure 4 is a partial view of the SWF file generated by converting the file in Figure 2 with the method of the present invention;

Figure 5 is a partially enlarged view of Figure 4.

Detailed Description

The present invention is described in further detail below with reference to the drawings and specific embodiments.

Figure 1 shows the flowchart of the SWF-based e-book making method of the present invention. As the figure shows, the method mainly comprises the following steps:

Step S11: classify the content of each page of the PDF file to be processed;

Open the PDF file to be processed, analyze the content of each page, and classify each page into one of five types: text (embedded/externally linked) as the main content with images as auxiliary content; text that has been converted to images as the main content with images as auxiliary content; images as the main content; text as the main content with an image as the background; and comprehensive mixed text and images.

Step S12: set the conversion parameters according to the content type of each PDF page;

After the pages have been classified in step S11, the parameters for converting PDF to SWF are set for each page according to its content type, as follows:

1) Text (embedded/externally linked) as the main content, images as auxiliary content;

The text is kept as font vectors and the images are converted to polygon images.

2) Text that has been converted to images as the main content, images as auxiliary content;

Polygon images are kept as polygon images, other images are converted to JPEG format, and the JPEG images are set to medium definition.

3) Images as the main content, with little text;

Polygon images are kept as polygon images, other images are converted to JPEG format, and the JPEG images are set to high definition.

4) Text as the main content, an image as the background;

Images are converted to JPEG format at low definition, and the text is kept as font vectors.

5) Comprehensive mixed text and images.

Polygon images are kept as polygon images, other images are converted to JPEG format, and the JPEG images are set to medium definition.

The five types in this embodiment are the most common ones. Users may of course classify PDF content differently according to their needs. For the five common types, the medium, high, and low definition settings refer to different image quality parameters. JPEG is a lossy compressed image format whose compression level is usually expressed by a quality parameter (quality for short), also called the Q factor or compression factor. The quality value ranges from 1 to 100; the smaller the value, the higher the compression and the greater the loss of pixel quality. Typical JPEG images have a quality of 60 to 80. During conversion to JPEG, the quality can be set according to the situation: a lower quality when the image is merely an unimportant background, and a higher quality when the image is the main, important content. In this embodiment, medium definition means a JPEG quality of 70 to 80, high definition means 95 to 100, and low definition means 60 to 65.
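The per-type settings and quality ranges described in step S12 can be written down as a lookup table. The following is a minimal sketch only: the names `PageType`, `ConversionSettings`, `SETTINGS`, and `midpoint_quality` are hypothetical and do not come from the patent, and picking the midpoint of a quality range is an arbitrary illustrative choice. Where the description does not say how text is handled for a page type, the sketch records `None` rather than guessing.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class PageType(Enum):
    """The five page types of step S11 (names are hypothetical)."""
    TEXT_WITH_AUX_IMAGES = 1         # text main, images auxiliary
    IMAGED_TEXT_WITH_AUX_IMAGES = 2  # text already converted to images
    IMAGE_MAIN = 3                   # images main, little text
    TEXT_OVER_BACKGROUND = 4         # text main, image as background
    MIXED = 5                        # comprehensive mixed text and images


@dataclass(frozen=True)
class ConversionSettings:
    image_target: str                         # "polygon" or "jpeg" for non-polygon images
    jpeg_quality: Optional[Tuple[int, int]]   # inclusive quality range; None if no JPEG is produced
    text_as_font_vector: Optional[bool] = None  # None: not stated in the description


# Quality ranges as given in the description.
LOW, MEDIUM, HIGH = (60, 65), (70, 80), (95, 100)

SETTINGS = {
    PageType.TEXT_WITH_AUX_IMAGES: ConversionSettings("polygon", None, True),
    PageType.IMAGED_TEXT_WITH_AUX_IMAGES: ConversionSettings("jpeg", MEDIUM),
    PageType.IMAGE_MAIN: ConversionSettings("jpeg", HIGH),
    PageType.TEXT_OVER_BACKGROUND: ConversionSettings("jpeg", LOW, True),
    PageType.MIXED: ConversionSettings("jpeg", MEDIUM),
}


def midpoint_quality(page_type: PageType) -> Optional[int]:
    """Pick one quality value inside the range (midpoint is an assumption)."""
    quality_range = SETTINGS[page_type].jpeg_quality
    if quality_range is None:
        return None
    low, high = quality_range
    return (low + high) // 2
```

For a type 4 page, `SETTINGS[PageType.TEXT_OVER_BACKGROUND]` gives JPEG output in the 60 to 65 range with text kept as font vectors, which matches the setting used in the worked example later in the description.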

Step S13: convert the PDF file into a SWF file according to the set conversion parameters;

The PDF file is converted into a SWF file using the conversion parameters set for each content type in step S12. Converting PDF to SWF is existing technology, and users can choose whatever conversion software they need. SWF vectorization of a PDF file, that is, conversion of the PDF file to a SWF file, can be carried out with gpdf2swf or other existing software; in addition, since both the PDF and SWF file formats are public, one can also write one's own program to read and write the formats and perform the conversion.
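As one possible way to drive an external converter per page, the sketch below builds a command line for the `pdf2swf` tool from SWFTools. This is an assumption, not the patent's implementation: the description only names such tools as examples, and the options used here (`-o` for the output file, `-p` for the page range, `-j` for JPEG quality) are given as recalled from that tool's help text and should be checked against the installed version. The function name `build_pdf2swf_command` is hypothetical, and the command is only built, not executed.

```python
from typing import List, Optional


def build_pdf2swf_command(
    pdf_path: str,
    swf_path: str,
    page: int,
    jpeg_quality: Optional[int] = None,
) -> List[str]:
    """Build an argument list that converts one PDF page to SWF.

    Assumes the SWFTools pdf2swf options -o, -p and -j; verify them
    with the tool's own help output before relying on this.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    cmd = ["pdf2swf", pdf_path, "-o", swf_path, "-p", str(page)]
    if jpeg_quality is not None:
        # The description gives the JPEG quality scale as 1 to 100.
        if not 1 <= jpeg_quality <= 100:
            raise ValueError("JPEG quality must be between 1 and 100")
        cmd += ["-j", str(jpeg_quality)]
    return cmd


# Example: a type 4 page (text main, background image) at low definition.
command = build_pdf2swf_command("book.pdf", "page_0007.swf", page=7, jpeg_quality=62)
```

The resulting list can be passed to `subprocess.run(command, check=True)`. Converting page by page keeps each page's settings independent, at the cost of one process launch per page.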

Where necessary, a manual check is carried out after conversion and unsatisfactory pages are processed separately. Combining automatic classification by the program with manual inspection gives better results.

The table below shows, for a practical application of the method, the parameter settings used when converting each type of PDF page content to SWF, together with the file sizes before and after conversion. As the table shows, files converted with the method of the invention are much smaller than the original documents, and for the last three types the method also gives clearly smaller files than conversion with default parameters.

Table 1

Step S14: add a data header to the SWF file to form the final e-book.

The converted SWF file is compressed and a file header is added, forming the final e-book. In practice, after the converted SWF has been compressed, for example with the Zip algorithm, the compressed data can be encrypted with a key for better confidentiality, and a data header is then added to the file to form the final e-book.
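Step S14 can be illustrated with a small container format. The patent does not specify the header layout or the cipher, so everything below is an assumption: the `EBK1` signature, the 13-byte header, and the function names are hypothetical, and `zlib` (DEFLATE, the compression method commonly used inside Zip archives) stands in for "the Zip algorithm". Encryption is left as an optional caller-supplied function rather than implemented here.

```python
import struct
import zlib
from typing import Callable, Optional

MAGIC = b"EBK1"                    # hypothetical 4-byte signature
HEADER = struct.Struct(">4sBII")   # magic, flags, original length, payload length
FLAG_ENCRYPTED = 0x01

Transform = Callable[[bytes], bytes]


def pack_ebook(swf_bytes: bytes, encrypt: Optional[Transform] = None) -> bytes:
    """Compress the SWF data, optionally encrypt it, and prepend a header."""
    payload = zlib.compress(swf_bytes, 9)
    flags = 0
    if encrypt is not None:
        payload = encrypt(payload)
        flags |= FLAG_ENCRYPTED
    return HEADER.pack(MAGIC, flags, len(swf_bytes), len(payload)) + payload


def unpack_ebook(blob: bytes, decrypt: Optional[Transform] = None) -> bytes:
    """Reverse pack_ebook: check the header, decrypt if needed, decompress."""
    magic, flags, original_len, payload_len = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("not an e-book container")
    payload = blob[HEADER.size:HEADER.size + payload_len]
    if flags & FLAG_ENCRYPTED:
        if decrypt is None:
            raise ValueError("payload is encrypted; a decrypt function is required")
        payload = decrypt(payload)
    data = zlib.decompress(payload)
    if len(data) != original_len:
        raise ValueError("length mismatch after decompression")
    return data
```

Compressing before encrypting follows the order given in the description; it also matters in practice, because well-encrypted data is effectively incompressible. A real deployment would pass an authenticated cipher from a vetted cryptography library as `encrypt` and `decrypt`.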

When e-books are produced with the method described above, the poor targeting of prior-art PDF-to-SWF conversion is overcome, in particular the serious conflict among document definition, size, and color, in which emphasizing one property usually comes at a large cost to the others. The method of the present invention sets different conversion parameters for different content types, keeping the file size small while maintaining high document definition and color fidelity.

For a better understanding of the technical solution, the method of the present invention is described in further detail below with reference to a specific example.

Example

This example uses PDF content of the fourth type (text as the main content, an image as the background).

As shown in Figure 2, the original PDF page is very clear and is 683k in size. If the page is converted with the default configuration, part of the result is as shown in Figure 3: the converted image is also very clear, but at 121k the file is still fairly large, which is not ideal when the size needs to be reduced.

The PDF content is then converted using the method of the present invention.

The method first determines that text is the main content of the page and that the image portion is minor, so it adopts the setting "convert images to JPEG format at very low definition and keep text as font vectors". With these parameters set, the PDF file is converted to a SWF file using existing technology; part of the result is shown in Figure 4. In the resulting SWF file the image portion is rather blurred but the text is very clear, and the converted file is 45K. In this case the reader does not care about the auxiliary image, so some blurring of the image is acceptable. The text, however, is very important and is the focus of reading, so the fonts should be represented as vectors to obtain sharp text; the resulting size is also very satisfactory.
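The size figures quoted in this example can be turned into relative reductions with a few lines of arithmetic. The sketch below uses only the three sizes stated above (683k, 121k, 45K); the variable and function names are illustrative.

```python
original_kb = 683   # original PDF page (Figure 2)
default_kb = 121    # SWF converted with default parameters (Figure 3)
tuned_kb = 45       # SWF converted with the type 4 settings (Figure 4)


def reduction(before: float, after: float) -> float:
    """Fractional size reduction when going from `before` to `after`."""
    return 1 - after / before


print(f"default SWF vs PDF:       {reduction(original_kb, default_kb):.1%}")
print(f"tuned SWF vs PDF:         {reduction(original_kb, tuned_kb):.1%}")
print(f"tuned SWF vs default SWF: {reduction(default_kb, tuned_kb):.1%}")
```

On these figures, the default conversion is about 82% smaller than the PDF page, the tuned conversion about 93% smaller, and the tuned file about 63% smaller than the default SWF.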

Figure 5 shows part of Figure 4 after enlargement. The image portion has become even more blurred, but since it is unimportant in terms of content, this blurring is acceptable. The main text, having been converted to vectors, remains sharp even at high magnification.

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If such changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.

Claims

1. A method for making electronic books based on SWF, comprising the following steps:

(1) Open the PDF file to be processed, analyze the content of each page, and classify each page into one of five types: text as the main content with images as auxiliary content; text that has been converted to images as the main content with images as auxiliary content; images as the main content; text as the main content with an image as the background; and comprehensive mixed text and images;

(2) Set the parameters for converting PDF to SWF according to the content type of each page of the PDF file; the parameters are set as follows:

When the page content is of the type with text as the main content and images as auxiliary content, the text is kept as font vectors and the images are converted to polygon images;

When the page content is of the type with text converted to images as the main content and images as auxiliary content, polygon images are kept as polygon images and other images are converted to JPEG format;

When the page content is of the type with images as the main content, polygon images are kept as polygon images and other images are converted to JPEG format;

When the page content is of the type with text as the main content and an image as the background, the images are converted to JPEG format and the text is kept as font vectors;

When the page content is of the comprehensive mixed text and image type, polygon images are kept as polygon images and other images are converted to JPEG format;

(3) Convert the PDF file into a SWF file according to the conversion parameters set;

(4) Compressing the converted SWF file and adding a file header to form the final electronic book.

2. The method for making electronic books based on SWF as claimed in claim 1, characterized in that: in step (2), when the page content is of the type with text converted to images as the main content and images as auxiliary content, the images converted to JPEG format are set to medium definition; medium definition means a JPEG quality parameter in the range 70 to 80.

3. The method for making electronic books based on SWF as claimed in claim 1, characterized in that: in step (2), when the page content is of the type with images as the main content, the images converted to JPEG format are set to high definition; high definition means a JPEG quality parameter of 95 to 100.

4. The method for making electronic books based on SWF as claimed in claim 1, characterized in that: in step (2), when the page content is of the type with text as the main content and an image as the background, the images converted to JPEG format are set to low definition; low definition means a JPEG quality parameter of 60 to 65.

5. The method for making electronic books based on SWF as claimed in claim 1, characterized in that: in step (2), when the page content is of the comprehensive mixed text and image type, the images converted to JPEG format are set to medium definition; medium definition means a JPEG quality parameter in the range 70 to 80.

6. The method for making electronic books based on SWF as claimed in claim 1, characterized in that: in step (3), after the PDF file has been converted into a SWF file, the converted file is inspected manually and pages that do not meet the requirements are processed separately.

7. The method for making electronic books based on SWF as claimed in claim 1, characterized in that: in step (4), the converted file is compressed using the Zip algorithm.

8. The method for making electronic books based on SWF as claimed in claim 7, characterized in that: the compressed file is encrypted with a key.