Summary of the invention
The present invention provides an embedded multi-format electronic document annotation method that does not depend on any specific electronic file format or on special hardware. It can be used in a traditional GUI and also in the more natural pen-based user interface (PUI). It lets users annotate freely while reading various kinds of electronic documents, and it combines the annotation data with the original document data so that they are saved as a single unit.
The embedded multi-format electronic document annotation method comprises the following steps:
(1) The document container operation module reads the document content data and the annotation data separately from the document container, and both are displayed.
A1. The user selects the document to open through the user interface; different types of documents can be selected. The user interface obtains the document path and calls the draw method in the document operation module.
A2. The document operation module calls the document format parsing engine group and the annotation engine, passing in the document path.
A3. Using the document path, the document container operation module interface opens the document container, reads the document header to obtain the document type, and reads the document data index and the annotation data index. It then retrieves the document data and the annotation data according to these indexes and returns them, by function call, to the document format parsing engine group and the annotation engine respectively.
A4. The document format engine group calls the format engine matching the document type to render the document data and obtain the document page display bitmap.
A5. The annotation engine reads the annotation data into an in-memory data structure and, based on the bitmap returned by the document format engine, converts the coordinates of the annotation points held in memory.
Let the coordinates of an annotation point in memory be (x, y), the current page zoom ratio be r, and the translation of the point (x, y) be (Δx, Δy); the point is then converted by formula (1) into the point coordinates (x', y') in the document display area;
A6. The user interface draws and displays the document page display bitmap together with the coordinate-converted annotation data.
(2) The document data and annotation data that were read are operated on to implement annotating, zooming and translating the document, and erasing annotations.
Operating on the read document data and annotation data to implement the annotation function comprises the following steps:
B1. The user sets the annotation operating mode through the user interface, selects a stroke color and thickness, and then draws strokes on the interface with a stylus or mouse. The user interface obtains the point coordinates of the annotation stroke and calls the annotate method in the document operation module, passing parameters such as stroke color, thickness and point coordinates.
B2. The annotate method in the document operation module interface calls the annotation engine, passing in parameters such as stroke color, thickness and point coordinates.
B3. The annotation engine immediately draws the user's current annotation stroke through the user interface.
B4. The annotation engine converts the incoming stroke point coordinates according to the current display size and translation position of the document page, and updates the annotation stroke linked list in memory;
Let the coordinates of an annotation point obtained by the user interface be (x, y), the current page zoom ratio be r, and the translation of the annotation point (x, y) be (Δx, Δy); the point is then converted by formula (2) into the annotation point coordinates (x', y');
B5. The method provided by the user interface is called to redraw the annotation strokes of the current page held in memory.
Operating on the read document data and annotation data to implement the annotation erase function comprises the following steps:
C1. The user sets the erase operating mode through the user interface and erases annotations on the interface with a stylus or mouse. The user interface obtains the coordinates of the point the user clicked and calls the erase-annotation method in the document operation module, passing the point coordinates.
C2. The erase-annotation method in the document operation module interface calls the annotation engine, passing in the coordinates.
C3. The annotation engine converts the incoming point coordinates according to the current display size and translation position of the document page, obtaining the point coordinates z;
Let the coordinates of the erase point obtained by the user interface be (x, y), the current page zoom ratio be r, and the translation be (Δx, Δy); the point is then converted by formula (3) into the annotation point coordinates z = (x', y');
C4. The annotation stroke linked list of the current page in memory is searched sequentially, and the distance between z and the coordinates w of every point in each stroke is calculated. When the distance between w and z is less than a predetermined value, that annotation stroke is deleted from the linked list. The predetermined value can be set in the software implementation according to actual needs;
C5. After the annotation strokes of the current page in memory have been coordinate-converted, the method provided by the user interface is called to redraw them.
Operating on the read document data and annotation data to implement the document zoom function comprises the following steps:
D1. The user selects a zoom-out or zoom-in operation through the user interface. The user interface responds by calling the zoom method in the document operation module and passing in the default zoom ratio. This default can be set in the software implementation according to actual needs, for example to 1.25.
D2. The document operation module calls the document format parsing engine group and the annotation engine, passing in the default zoom ratio.
D3. The document format engine group calls the format engine matching the document type, passes in the default zoom ratio, and re-renders the current page to obtain a new document page display bitmap.
D4. Based on the new bitmap returned by the document format engine, the annotation engine again converts the coordinates of the annotation points held in memory.
Let the coordinates of an annotation point in memory be (x, y), the current page zoom ratio be r, and the translation of the annotation point (x, y) be (Δx, Δy); the point is then converted by formula (4) into the point coordinates (x', y') in the document display area;
D5. The user interface redraws the new document page display bitmap and the coordinate-converted annotation data;
Operating on the read document data and annotation data to implement the document translation function comprises the following steps:
E1. The user translates the document page through the user interface with a stylus or mouse. The user interface obtains the translation values in the x and y directions and calls the translate method in the document operation module interface, passing in the translation values;
E2. The translate method of the document operation module interface calls the user interface to translate the document data;
E3. The translate method of the document operation module interface calls the annotation engine, passing in the translation values;
E4. The annotation engine again converts the coordinates of the annotation points held in memory according to the translation values and the coordinate conversion rule;
Let the coordinates of an annotation point in memory be (x, y), the current page zoom ratio be r, and the translation of the annotation point (x, y) be (Δx, Δy); the point is then converted by formula (5) into the point coordinates (x', y') in the document display area;
E5. The user interface redraws the coordinate-converted annotation data.
(3) The document data and annotation data resulting from these operations are saved into the document container by the document container operation module.
An embedded multi-format electronic document annotation system is made up of the following modules:
(1) User interface: mainly responsible for capturing and analyzing the user's interaction with the annotation system and for feeding back the results;
Interactions on different kinds of interfaces are abstracted by analogy: annotating on a reading device is likened to writing on paper. This yields a general description of annotation input events, in which the basic annotation input actions are defined as pen move, pen down and pen up.
(2) Document operation module: defines a series of document operations, including basic operations such as page drawing, zooming, translation, page turning, annotating and erasing annotations. It specifies the interface for reading documents and provides the basis for the concrete implementations of the document format engine group and the annotation engine. Each basic document operation can be described by an operation primitive; the operation primitives comprise:
Drawing primitive: doc_draw;
Zoom primitive: doc_zoom;
Translation primitive: doc_drag;
Page turning primitive: doc_new_page;
Annotate primitive: doc_annotate;
Erase-annotation primitive: doc_del_annotation;
(3) Document and annotation parsing module: consists of two parts, the document format parsing engine group and the annotation engine, which are independent of each other;
The document format engine group covers the common document formats and selects the corresponding format engine to render and display the document data according to the format of the document the user chooses to open.
The annotation data in the annotation engine is organized using the following structures:
Point (Point) and stroke (Stroke). The point structure contains the coordinate values of a point; a stroke consists of a series of point structures and also contains an optional color value and stroke thickness value. On this basis, annotation data is organized by document page: all annotation strokes on a page are stored as a linked list, and the annotations for the whole document index these stroke linked lists by page;
The annotation engine implements rendering, coordinate conversion and erasing of annotation data.
(4) Document container operation module: defines how the document container is parsed. Through it, the original document content data and the annotation data are obtained independently and handed to the respective upper-layer parsing engines to be parsed and finally displayed to the user;
(5) Document container: combines the document data and the annotation data at the storage level. The document container consists of a document header, document data, annotation data and a document trailer;
1) The document header contains the document type information and indexes that make it easy to locate the original document data and the annotation data;
2) The document data is the original data content of the electronic document;
3) The annotation data is the data produced by the user's annotation operations on the document. Annotation data is organized with the annotation stroke as its basic unit: the strokes of each document page are stored by page number, and each stroke consists of the series of point coordinates collected while annotating;
4) The document trailer marks the end of the document.
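The container layout described above can be illustrated with a minimal sketch. The text does not specify a byte-level format, so everything concrete here is an assumption made for illustration: the magic value, the field widths, the trailer marker, and the function names pack_container and unpack_container are all hypothetical.

```python
import struct

# Hypothetical header layout (not specified by the source):
# magic, document type, then an (offset, length) index pair for the
# document data and another for the annotation data.
HEADER = struct.Struct("<4s8sIIII")
MAGIC = b"ADOC"      # assumed magic value
TRAILER = b"EOF!"    # assumed end-of-document marker


def pack_container(doc_type: str, doc_data: bytes, ann_data: bytes) -> bytes:
    """Combine original document data and annotation data into one blob."""
    doc_off = HEADER.size
    ann_off = doc_off + len(doc_data)
    header = HEADER.pack(MAGIC, doc_type.encode("ascii"),
                         doc_off, len(doc_data), ann_off, len(ann_data))
    return header + doc_data + ann_data + TRAILER


def unpack_container(blob: bytes):
    """Return (doc_type, doc_data, ann_data) using the indexes in the header."""
    magic, raw_type, doc_off, doc_len, ann_off, ann_len = HEADER.unpack_from(blob, 0)
    if magic != MAGIC or not blob.endswith(TRAILER):
        raise ValueError("not a valid container")
    doc_type = raw_type.rstrip(b"\x00").decode("ascii")
    return (doc_type,
            blob[doc_off:doc_off + doc_len],
            blob[ann_off:ann_off + ann_len])
```

Because the header carries separate indexes, the original document bytes are stored untouched and can be handed to a format engine without the annotation data being parsed at all.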
The embedded multi-format electronic document annotation method of the present invention stores and displays annotation data and document data separately and is applicable to reading electronic documents in various formats. The user can annotate freely anywhere in the document, using handwriting of various colors and thicknesses, which makes reading electronic documents more convenient and strengthens the user's initiative during reading. Annotation data can be erased at will, is stored together with the electronic document, and is reproduced the next time the document is opened. The user can choose either the traditional mouse and keyboard interaction or the more convenient and intelligent pen-based interaction; the method places no special requirements on the interaction hardware and therefore has a wide range of application.
Embodiment
The electronic document annotation method and system for embedded platforms of the present invention are described in detail below through embodiments, with reference to the accompanying drawings.
As shown in Figure 1, the embedded multi-format electronic document annotation method comprises the following steps:
(1) The user interface (UI) reads the document content data and the annotation data separately from the document container through the document container operation module and displays them.
A1. The user selects the document to open through the user interface; different types of documents can be selected. The user interface obtains the document path and calls the draw method in the document operation module.
A2. The document operation module calls the document format parsing engine group and the annotation engine, passing in the document path.
A3. Using the document path, the document container operation module interface opens the document container, reads the document header to obtain the document type, and reads the document data index and the annotation data index. It then retrieves the document data and the annotation data according to these indexes and returns them, by function call, to the document format parsing engine group and the annotation engine respectively.
A4. The document format engine group calls the format engine matching the document type to render the document data and obtain the document page display bitmap.
A5. The annotation engine reads the annotation data into an in-memory data structure and, based on the bitmap returned by the document format engine, converts the coordinates of the annotation points held in memory.
As shown in Figure 2, region A is the entire document page and region B is the document display area. Let the coordinates of an annotation point in memory be (x, y), the current page zoom ratio be r, and the translation of the point (x, y) be (Δx, Δy); the point is then converted by formula (1) into the point coordinates (x', y') in the document display area;
An annotation point (x0, y0) in the document display area must be converted into the document page coordinate system before it can be assigned and stored in the annotation data structure.
A6. The user interface draws and displays the document page display bitmap together with the coordinate-converted annotation data.
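Step A4, in which the engine group picks a format engine by document type, can be sketched as a simple registry. This is only an illustration under stated assumptions: the names ENGINES, register_engine, render_page and render_txt are hypothetical, and the "bitmap" here is a placeholder byte string rather than real rasterized output.

```python
from typing import Callable, Dict

# Hypothetical engine signature: (document bytes, page number, zoom ratio) -> bitmap bytes
Engine = Callable[[bytes, int, float], bytes]

ENGINES: Dict[str, Engine] = {}


def register_engine(doc_type: str):
    """Decorator that adds a format engine to the engine group."""
    def decorator(fn: Engine) -> Engine:
        ENGINES[doc_type] = fn
        return fn
    return decorator


@register_engine("txt")
def render_txt(data: bytes, page_num: int, zoom_ratio: float) -> bytes:
    # Placeholder rendering; a real engine would rasterize the requested page.
    return b"BITMAP:" + data


def render_page(doc_type: str, data: bytes, page_num: int = 1,
                zoom_ratio: float = 1.0) -> bytes:
    """Dispatch to the engine registered for the document type from the header."""
    try:
        engine = ENGINES[doc_type]
    except KeyError:
        raise ValueError(f"no format engine for {doc_type!r}") from None
    return engine(data, page_num, zoom_ratio)
```

Keeping the engines behind one lookup is what lets the annotation engine stay independent of any particular document format.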
(2) The document data and annotation data that were read are operated on to implement annotating, zooming and translating the document, and erasing annotations.
As shown in Figure 5, operating on the read document data and annotation data to implement the annotation function comprises the following steps:
B1. The user sets the annotation operating mode through the user interface, selects a stroke color and thickness, and then draws strokes on the interface with a stylus or mouse. The user interface obtains the point coordinates of the annotation stroke and calls the annotate method in the document operation module, passing parameters such as stroke color, thickness and point coordinates.
B2. The annotate method in the document operation module interface calls the annotation engine, passing in parameters such as stroke color, thickness and point coordinates.
B3. The annotation engine immediately draws the user's current annotation stroke through the user interface.
B4. The annotation engine converts the incoming stroke point coordinates according to the current display size and translation position of the document page, and updates the annotation stroke linked list in memory;
Let the coordinates of an annotation point obtained by the user interface be (x, y), the current page zoom ratio be r, and the translation of the annotation point (x, y) be (Δx, Δy); the point is then converted by formula (2) into the annotation point coordinates (x', y');
B5. The method provided by the user interface is called to redraw the annotation strokes of the current page held in memory.
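The annotation steps above can be sketched as a small state machine driven by the three basic pen actions (pen down, pen move, pen up). The class name StrokeRecorder and its fields are hypothetical, and the sketch assumes the incoming coordinates have already been converted to the document page coordinate system, so the conversion of step B4 is left out.

```python
class StrokeRecorder:
    """Collects pen events into strokes, stored per document page."""

    def __init__(self, color="black", width=1):
        self.color = color
        self.width = width
        self.pages = {}        # page number -> list of committed strokes
        self._current = None   # points of the stroke in progress, or None

    def pen_down(self, x, y):
        # Pen down starts a new stroke.
        self._current = [(x, y)]

    def pen_move(self, x, y):
        # Moves are recorded only while the pen is down; hover moves are ignored.
        if self._current is not None:
            self._current.append((x, y))

    def pen_up(self, page_num):
        # Pen up commits the stroke to the list for the given page.
        if self._current is None:
            return None
        stroke = {"color": self.color, "width": self.width,
                  "points": self._current}
        self.pages.setdefault(page_num, []).append(stroke)
        self._current = None
        return stroke
```

A mouse can drive the same three methods (button press, drag, button release), which is how one recorder can serve both GUI and pen-based interfaces.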
As shown in Figure 6, operating on the read document data and annotation data to implement the annotation erase function comprises the following steps:
C1. The user sets the erase operating mode through the user interface and erases annotations on the interface with a stylus or mouse. The user interface obtains the coordinates of the point the user clicked and calls the erase-annotation method in the document operation module, passing the point coordinates.
C2. The erase-annotation method in the document operation module interface calls the annotation engine, passing in the coordinates.
C3. The annotation engine converts the incoming point coordinates according to the current display size and translation position of the document page, obtaining the point coordinates z;
Let the coordinates of the erase point obtained by the user interface be (x, y), the current page zoom ratio be r, and the translation be (Δx, Δy); the point is then converted by formula (3) into the annotation point coordinates z = (x', y');
C4. The annotation stroke linked list of the current page in memory is searched sequentially, and the distance between z and the coordinates w of every point in each stroke is calculated. When the distance between w and z is less than a predetermined value, that annotation stroke is deleted from the linked list;
C5. After the annotation strokes of the current page in memory have been coordinate-converted, the method provided by the user interface is called to redraw them.
Operating on the read document data and annotation data to implement the document zoom function comprises the following steps:
D1. The user selects a zoom-out or zoom-in operation through the user interface. The user interface responds by calling the zoom method in the document operation module and passing in the default zoom ratio.
D2. The document operation module calls the document format parsing engine group and the annotation engine, passing in the default zoom ratio.
D3. The document format engine group calls the format engine matching the document type, passes in the default zoom ratio, and re-renders the current page to obtain a new document page display bitmap.
D4. Based on the new bitmap returned by the document format engine, the annotation engine again converts the coordinates of the annotation points held in memory.
Let the coordinates of an annotation point in memory be (x, y), the current page zoom ratio be r, and the translation of the annotation point (x, y) be (Δx, Δy); the point is then converted by formula (4) into the point coordinates (x', y') in the document display area;
D5. The user interface redraws the new document page display bitmap and the coordinate-converted annotation data.
Operating on the read document data and annotation data to implement the document translation function comprises the following steps:
E1. The user translates the document page through the user interface with a stylus or mouse. The user interface obtains the translation values in the x and y directions and calls the translate method in the document operation module interface, passing in the translation values;
E2. The translate method of the document operation module interface calls the user interface to translate the document data;
E3. The translate method of the document operation module interface calls the annotation engine, passing in the translation values;
E4. The annotation engine again converts the coordinates of the annotation points held in memory according to the translation values and the coordinate conversion rule;
Let the coordinates of an annotation point in memory be (x, y), the current page zoom ratio be r, and the translation of the annotation point (x, y) be (Δx, Δy); the point is then converted by formula (5) into the point coordinates (x', y') in the document display area;
E5. The user interface redraws the coordinate-converted annotation data.
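The zoom and translation steps both end by re-projecting the stored annotation points into the display area. A minimal sketch of that shared view state follows. Formulas (4) and (5) are not reproduced in this text, so the mapping used here, x' = r·x − Δx and y' = r·y − Δy, is an assumption chosen for illustration; the class name View and its methods are hypothetical as well.

```python
from dataclasses import dataclass


@dataclass
class View:
    zoom: float = 1.0   # r, the current page zoom ratio
    dx: float = 0.0     # Δx, accumulated translation in display units
    dy: float = 0.0     # Δy

    def zoom_by(self, factor: float) -> None:
        # D1-D4: apply a zoom step (the source mentions 1.25 as an example default).
        self.zoom *= factor

    def drag(self, delta_x: float, delta_y: float) -> None:
        # E1-E4: accumulate the translation values reported by the user interface.
        self.dx += delta_x
        self.dy += delta_y

    def project(self, points):
        # Assumed mapping from page coordinates to display coordinates.
        return [(self.zoom * x - self.dx, self.zoom * y - self.dy)
                for x, y in points]
```

Because the strokes themselves stay in page coordinates, zooming or translating never modifies the stored annotation data; only the projection changes before each redraw.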
(3) The document data and annotation data resulting from these operations are saved into the document container by the document container operation module.
An embedded multi-format electronic document annotation system is made up of the following modules:
(1) User interface: mainly responsible for capturing and analyzing the user's interaction with the annotation system and for feeding back the results;
Interactions on different kinds of interfaces are abstracted by analogy: annotating on a reading device is likened to writing on paper, which yields a general description of annotation input events. On a traditional embedded platform a stylus can replace the mouse for interface interaction; when the stylus moves, touches down or lifts on the touch screen, the coordinates of the points of the handwriting can be obtained through the API provided by the operating system.
(2) Document operation module: defines a series of document operations, including basic operations such as page drawing, zooming, translation, page turning, annotating and erasing annotations. It specifies the interface for reading documents and provides the basis for the concrete implementations of the document format engine group and the annotation engine. Each basic document operation can be described by an operation primitive; the operation primitives comprise:
Drawing primitive: doc_draw;
Input: page_num (page number), zoom_ratio (zoom ratio);
Output: doc_page_bitmap (bitmap of the electronic document page), annotation_data (annotations on the electronic document page);
To display an electronic document page, the bitmap data of the page must first be obtained from the format engine, and the annotation engine performs the related annotation calculations to obtain the annotation data and its attributes; the results are passed to the user interface layer to be drawn. Electronic documents are organized in pages, so the page number to display must be specified explicitly. The zoom ratio defaults to 1, meaning no zoom: the original size is kept.
Zoom primitive: doc_zoom;
Input: zoom_ratio;
Output: the bitmap of the electronic document page and the annotations on the current page;
Zooming affects the display quality. If the electronic document is stored in a vector format, the format engine must render a new bitmap; otherwise the display quality of the document degrades. The annotation data must also be recalculated.
Translation primitive: doc_drag;
Input: delta_x (movement in the x direction), delta_y (movement in the y direction);
Output: none;
The user's display area may be smaller than the zoomed-in document page, so the user moves the displayed part of the document interactively to reveal the parts that were hidden. Because the original bitmap of the document page was already obtained during drawing, translation only requires a call to the user interface.
Page turning primitive: doc_new_page;
Input: page_num, zoom_ratio;
Output: doc_page_bitmap, annotation_data;
Page turning can also be reduced to a drawing operation, so the inputs and outputs of the page-turn primitive are the same as those of the drawing primitive; it is listed separately only as a basic operation in its own right.
Annotate primitive: doc_annotate;
Input: x, y;
Output: none;
The annotate operation obtains, through the user interface, the coordinate data of the stroke the user draws, so its input is the coordinates of a stroke point in the display area. The annotation engine then performs the related annotation calculations and updates the current annotation data. While annotating, the drawing primitive can be used to redraw the display area so that the result is shown immediately.
Erase-annotation primitive: doc_del_annotation;
Input: x, y;
Output: none;
Erasing an annotation means that, starting from the current annotations, the user selects an annotation through the user interface and erases it. After the current coordinates are obtained, the annotation engine updates the current annotation data, and the drawing primitive is then used to redraw the display area so that the display refreshes in real time.
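The six primitives listed above can be summarized as an abstract interface. The primitive names and their inputs and outputs come from the text; expressing them as a Python abstract base class, the class name DocumentOperations, and the type hints are assumptions made for this sketch.

```python
from abc import ABC, abstractmethod
from typing import Any, Tuple


class DocumentOperations(ABC):
    """Interface that a concrete document operation module would implement."""

    @abstractmethod
    def doc_draw(self, page_num: int, zoom_ratio: float = 1.0) -> Tuple[Any, Any]:
        """Return (doc_page_bitmap, annotation_data) for the given page."""

    @abstractmethod
    def doc_zoom(self, zoom_ratio: float) -> Tuple[Any, Any]:
        """Return the re-rendered bitmap and the annotations on the current page."""

    @abstractmethod
    def doc_drag(self, delta_x: int, delta_y: int) -> None:
        """Translate the displayed part of the page; no output."""

    def doc_new_page(self, page_num: int, zoom_ratio: float = 1.0) -> Tuple[Any, Any]:
        # Page turning reduces to drawing, so it delegates to doc_draw.
        return self.doc_draw(page_num, zoom_ratio)

    @abstractmethod
    def doc_annotate(self, x: int, y: int) -> None:
        """Add a stroke point given in display-area coordinates; no output."""

    @abstractmethod
    def doc_del_annotation(self, x: int, y: int) -> None:
        """Erase the annotation selected at a display-area point; no output."""
```

Making doc_new_page a concrete method that calls doc_draw mirrors the remark that page turning has the same inputs and outputs as drawing.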
(3) Document and annotation parsing module: consists of two parts, the document format parsing engine group and the annotation engine, which are independent of each other;
The document format engine group covers the common document formats and selects the corresponding format engine to render and display the document data according to the format of the document the user chooses to open.
As shown in Figure 4, the annotation data in the annotation engine is organized using the following structures:
Point (Point) and stroke (Stroke). The point structure contains the coordinate values of a point; a stroke consists of a series of point structures and also contains an optional color value and stroke thickness value. The points of a stroke can be obtained by calling the operating system API. Because an electronic document uses the page as its basic unit, the annotation data of each page is recorded in a per-page annotation stroke linked list, and a hash table maintains all annotation data for the whole document.
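The organization just described might look like the following sketch. The Point and Stroke names come from the text; the field names, the default values, and the AnnotationStore class are assumptions. A Python dict stands in for the hash table and a Python list stands in for the per-page linked list.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Point:
    x: float
    y: float


@dataclass
class Stroke:
    points: List[Point] = field(default_factory=list)
    color: str = "black"   # optional color value (default is an assumption)
    width: int = 1         # optional stroke thickness (default is an assumption)


class AnnotationStore:
    """Hash table from page number to that page's list of strokes."""

    def __init__(self) -> None:
        self._pages: Dict[int, List[Stroke]] = {}

    def add(self, page_num: int, stroke: Stroke) -> None:
        self._pages.setdefault(page_num, []).append(stroke)

    def strokes(self, page_num: int) -> List[Stroke]:
        # Pages without annotations simply yield an empty list.
        return self._pages.get(page_num, [])
```

Indexing by page keeps redraw and erase work proportional to the strokes on the current page rather than to the whole document.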
The annotation point coordinates obtained through the API provided by the operating system cannot be used directly; an annotation coordinate calculation is required. The user interface takes the upper-left corner of the document display area (that is, the user interaction area) as its origin, with the positive X direction to the right and the positive Y direction downward; this can be called the display coordinate system. Because a limited user interface generally cannot show the whole content of an electronic document page, the coordinates of a stroke point obtained through the interface provided by the user interface development tool cannot be directly assigned and stored in the annotation data structures. For the user's annotations to be reproducible, the coordinates of the points in a stroke should be expressed relative to the electronic document page. The document page coordinate system is therefore defined first: its origin is the upper-left corner of the document, with the positive X direction to the right and the positive Y direction downward. The coordinate conversion is the mutual conversion between coordinates in the display coordinate system and coordinates in the document page coordinate system.
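The two-way conversion between the display coordinate system and the document page coordinate system can be sketched as a pair of inverse functions. The conversion formulas themselves are not reproduced in this text, so the mapping below (scale by the zoom ratio r, then subtract the translation) is an assumption for illustration only, and the function names are hypothetical.

```python
def page_to_display(x, y, r, dx, dy):
    """Assumed mapping: page point -> display point (x' = r*x - dx, y' = r*y - dy)."""
    return (r * x - dx, r * y - dy)


def display_to_page(x, y, r, dx, dy):
    """Inverse of the assumed mapping: display point -> page point."""
    if r <= 0:
        raise ValueError("zoom ratio must be positive")
    return ((x + dx) / r, (y + dy) / r)
```

For example, with r = 2.0 and a translation of (40, 20), the page point (100, 50) lands at display point (160.0, 80.0), and converting that display point back yields (100.0, 50.0). Whatever the actual formulas are, they need this round-trip property so that an annotation stored in page coordinates reappears where the user drew it.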
Besides annotating a document, the user can also erase annotations. In erase mode, the user selects an annotation stroke through the user interface and that stroke is erased. Erasing is implemented as follows. First, the coordinates of the click are obtained; they must likewise be converted into the document page coordinate system. Then the annotation stroke list of the current page is searched for a matching stroke, and the matching stroke is deleted. This search is the critical part: the stroke list of the current page is traversed in order, and the distance between every point in each stroke and the point obtained in the previous step is calculated. Allowing for the imprecision of user interaction, a stroke is considered selected, and is deleted, as soon as this distance is within the predetermined value. The predetermined value is adjustable and is set according to actual needs. A larger value reduces the amount of computation but may make selection imprecise; a smaller value increases precision but also increases the amount of computation. If the value is too small, a stroke may fail to be erased even after it is clicked repeatedly, so the value must be chosen with the actual conditions in mind.
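The hit test described above can be sketched in a few lines. The function name erase_at is hypothetical, strokes are represented simply as lists of (x, y) tuples already in page coordinates, and the choice to delete only the first matching stroke is an assumption, since the text does not say what happens when several strokes are within range.

```python
import math


def erase_at(strokes, z, threshold):
    """Remove and return the first stroke with any point closer than
    `threshold` to the erase point z; return None if nothing is hit."""
    zx, zy = z
    for i, stroke in enumerate(strokes):
        # Compare every point w of the stroke against z.
        if any(math.hypot(px - zx, py - zy) < threshold for px, py in stroke):
            return strokes.pop(i)
    return None
```

For example, with strokes [[(0, 0), (1, 1)], [(10, 10), (11, 11)]], the erase point (10.5, 10.0) and a threshold of 1.0, the second stroke is removed because its point (10, 10) lies 0.5 away. With a threshold of 0.1 nothing would be removed, which is the "too small" failure mode the text warns about.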
The annotation engine implements rendering, coordinate conversion and erasing of annotation data.
(4) Document container operation module: defines how the document container is parsed. Through it, the original document content data and the annotation data are obtained independently and handed to the respective upper-layer parsing engines to be parsed and finally displayed to the user.
(5) As shown in Figure 3, the document container combines the document data and the annotation data at the storage level. The document container consists of a document header, a document body and a document trailer.
1) The document header is the document structure information area; it contains the document type information, the original document index and the annotation data index.
2) The document body is the document content information area; it contains the document data area and the annotation data area. The document data is the original data content of the electronic document. The annotation data is the data produced by the user's annotation operations on the document. Annotation data is organized with the annotation stroke as its basic unit: the strokes of each document page are stored by page number, and each stroke consists of the series of point coordinates collected while annotating.
The original electronic document data does not need to be modified, nor does its storage format need to be understood. For annotation data, following the definitions above, a concrete implementation can store it in XML, a format that separates content from presentation. The following set of XML tags is used to represent stored annotation data:
<point>x,y</point>
<stroke color="..." pen="...">
<point>...</point>
<point>...</point>
...
</stroke>
<ANNOTATION pagenum="...">
<stroke>...</stroke>
...
</ANNOTATION>
3) The document trailer marks the end of the document.
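The XML tag set given for the annotation data area can be written and read back with the Python standard library, as in the sketch below. The tag and attribute names (ANNOTATION, pagenum, stroke, color, pen, point) come from the text; the function names, the dictionary shape used for a stroke, and the restriction to integer coordinates are assumptions.

```python
import xml.etree.ElementTree as ET


def page_to_xml(page_num, strokes):
    """Serialize one page's strokes using the ANNOTATION/stroke/point tags."""
    root = ET.Element("ANNOTATION", pagenum=str(page_num))
    for s in strokes:
        el = ET.SubElement(root, "stroke", color=s["color"], pen=str(s["pen"]))
        for x, y in s["points"]:
            ET.SubElement(el, "point").text = f"{x},{y}"
    return ET.tostring(root, encoding="unicode")


def page_from_xml(text):
    """Parse the XML back into (page_num, strokes); assumes integer coordinates."""
    root = ET.fromstring(text)
    strokes = []
    for el in root.findall("stroke"):
        points = [tuple(int(v) for v in p.text.split(","))
                  for p in el.findall("point")]
        strokes.append({"color": el.get("color"),
                        "pen": int(el.get("pen")),
                        "points": points})
    return int(root.get("pagenum")), strokes
```

A round trip through page_to_xml and page_from_xml returns the same page number and strokes, which is the property needed for annotations to reappear when the document is next opened.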