CN114971539A

CN114971539A - Simulated manual operation method based on image matching

Info

Publication number: CN114971539A
Application number: CN202210523708.XA
Authority: CN
Inventors: 吴清亮; 张新鹏; 李晓龙; 钱振兴; 秦川
Original assignee: Southeast Digital Economic Development Research Institute
Current assignee: Southeast Digital Economic Development Research Institute
Priority date: 2022-05-13
Filing date: 2022-05-13
Publication date: 2022-08-30

Abstract

The invention discloses a method for simulating manual operation based on image matching, comprising the following steps. Step 1: select a positioning image of the operation interface; the positioning image is a screenshot of a component, and its feature elements are unique. Step 2: provide an operation event sequence; the operation event sequence comprises operation behaviors on the positioning image. Step 3: repeat step 1 and step 2 to form a task script. Step 4: run the task script to complete the office task automatically. Because the operation area of each step is located by image matching, changes in the position of the operation interface do not affect the locating.

Description

A Method of Simulating Manual Operation Based on Image Matching

Technical Field

The invention relates to the technical field of office automation, and in particular to a method for simulating manual operation based on image matching.

Background

With the development of modern information technology, office automation, which combines office work with computer technology, is an emerging and comprehensive technology. It not only automates the handling of office tasks but can also greatly improve the efficiency with which individuals and teams handle them. Governments, enterprises, and self-employed individuals alike may face a large amount of procedural, repetitive document work. Doing this work by hand is tedious and error-prone. Automation software can simulate manual operation so that a computer handles such work in place of a person, freeing people from tedious repetitive labor. Machine operation can also greatly reduce the error rate compared with manual operation.

A mainstream approach to simulating manual operation is to record, in code, the events that the operating system generates during manual operation (mouse click events, keyboard input events, and so on), and then to send the recorded events to the operating system in order when the manual operation is later simulated. This approach generally records the operation area in which each event occurred, so a change in the operation area may cause the simulated behavior to fail. Another problem is that, in scenarios where consecutive steps have timing requirements, the person performing the demonstration must carefully consider factors such as the time interval between steps; otherwise a simulated event may be generated before the object to be operated has appeared, which ultimately causes the task to fail.

Summary of the Invention

To solve the problem that computer simulation of manual operation is affected by changes in the operation area and by the time difference between consecutive steps, an embodiment of the invention provides a method for simulating manual operation based on image matching. The technical solution is as follows:

In one aspect, a method for simulating manual operation based on image matching is provided, comprising:

Step 1: select a positioning image of the operation interface; the positioning image is a screenshot of a component; the feature elements of the positioning image are unique;

Step 2: provide an operation event sequence; the operation event sequence includes operation behaviors on the positioning image;

Step 3: repeat step 1 and step 2 to form a task script;

Step 4: run the task script to complete the office task automatically.

Further, in step 1, selecting the positioning image of the operation interface specifically comprises:

capturing an image of the operation interface window;

then using an image comparison algorithm to determine whether the window image contains multiple locations that match the features of the positioning image; if it does, the feature elements of the positioning image cannot locate the component uniquely, and a different positioning element must be selected.

Further, in step 2, providing the operation event sequence specifically comprises:

first selecting the operation event, then providing the input content.

Further, in step 3, forming the task script specifically comprises:

decomposing the preset task into operation steps, performing step 1 and step 2 for each of them in the order of decomposition, and saving the task script after all of them have been processed.

The beneficial effects of the technical solution provided by the embodiments of the invention are as follows:

The method for simulating manual operation based on image matching provided by the invention identifies each operation step by image matching, and then determines from the instruction file how to simulate the manual operation for that step. Because the operation area of each step is located by image matching, changes in the position of the operation interface do not affect the locating. In addition, if the software does not detect the positioning image of a step, it considers that the current step has not yet started and enters a waiting mode to wait for the next step to begin; it does not simulate operations on an object that has not yet appeared, which keeps the task continuous and stable.
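The first benefit, independence from where the component sits on screen, can be illustrated with a small sketch. It is only an illustration under simplifying assumptions: images are tiny grids of grayscale integers, matching is exact, and the helper names `paste` and `locate` are hypothetical rather than taken from the patent.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # grayscale pixel rows (assumed representation)


def paste(width: int, height: int, patch: Image, x: int, y: int) -> Image:
    """Return a blank width x height screen with `patch` drawn at (x, y)."""
    screen = [[0] * width for _ in range(height)]
    for dy, row in enumerate(patch):
        screen[y + dy][x:x + len(row)] = row
    return screen


def locate(screen: Image, template: Image) -> Optional[Tuple[int, int]]:
    """Return the (x, y) top-left corner of the first exact match, or None."""
    th, tw = len(template), len(template[0])
    for y in range(len(screen) - th + 1):
        for x in range(len(screen[0]) - tw + 1):
            if all(screen[y + dy][x:x + tw] == template[dy] for dy in range(th)):
                return (x, y)
    return None


button = [[1, 2], [3, 4]]                       # positioning image of a component
recorded_screen = paste(6, 4, button, 1, 1)     # layout when the task was authored
moved_screen = paste(6, 4, button, 3, 2)        # same component after the window moved

# Coordinate replay: reuse the position seen at authoring time.
recorded_xy = locate(recorded_screen, button)
replayed_pixel = moved_screen[recorded_xy[1]][recorded_xy[0]]  # lands on background

# Image matching: search the current screen again.
matched_xy = locate(moved_screen, button)
```

In this sketch the recorded coordinates point at empty background once the component has moved, while a fresh search of the current screen still finds it.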

Brief Description of the Drawings

To describe the technical solutions in the embodiments of the invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic diagram of a method for simulating manual operation based on image matching according to an embodiment of the invention.

Detailed Description

To make the objectives, technical solutions, and advantages of the invention clearer, the embodiments of the invention are described in further detail below with reference to the drawings.

This embodiment provides a method for simulating manual operation based on image matching, comprising:

Step 1: select a positioning image A. The positioning image A is usually a screenshot of a component such as a button or a text box, and its image features should be as unique and distinctive as possible.

Specifically, the positioning check first captures an image of the entire operation interface window, called S, with the positioning element called A, and then uses an image comparison algorithm to determine whether S contains multiple locations that match the features of A. If it does, the feature element A cannot be located uniquely, and a different positioning element must be selected.
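The positioning check can be sketched as follows. The patent does not name a specific image comparison algorithm, so this sketch assumes a brute-force pixel comparison over small grayscale grids; the names `find_matches`, `is_unique_anchor`, and the `tolerance` parameter are hypothetical. A real implementation would more likely rely on a library routine such as OpenCV template matching.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixel rows (assumed representation)


def find_matches(screen: Image, template: Image, tolerance: int = 0) -> List[Tuple[int, int]]:
    """Return the (x, y) top-left corner of every region of `screen` matching `template`."""
    sh, sw = len(screen), len(screen[0])
    th, tw = len(template), len(template[0])
    matches = []
    for y in range(sh - th + 1):
        for x in range(sw - tw + 1):
            # A region matches when every pixel is within `tolerance` of the template.
            if all(
                abs(screen[y + dy][x + dx] - template[dy][dx]) <= tolerance
                for dy in range(th)
                for dx in range(tw)
            ):
                matches.append((x, y))
    return matches


def is_unique_anchor(screen: Image, template: Image) -> bool:
    """A positioning image A passes the check only if it matches exactly one place in S."""
    return len(find_matches(screen, template)) == 1


S = [
    [0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0],
    [0, 7, 6, 0, 0],
    [0, 0, 0, 5, 5],
]
A = [[9, 8], [7, 6]]   # distinctive component: one match
ambiguous = [[0, 0]]   # plain background: many matches, must be reselected
```

Here `is_unique_anchor(S, A)` holds, while `is_unique_anchor(S, ambiguous)` does not, which corresponds to the case where the user is asked to choose a different positioning element.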

Step 2: provide an operation event sequence B. The event sequence B contains the operation behaviors to perform on the positioning image A, such as clicking a button or typing into a text box.

Specifically, first select the operation event, then provide the input content; a keyboard input event, for example, may require the text to be typed.
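One way to model "select the event, then provide its input content" is a small record type that rejects an incomplete event. This is a sketch under assumptions: the event kinds `click` and `type_text` and the class name `OperationEvent` are hypothetical labels, not names from the patent.

```python
from dataclasses import dataclass
from typing import Optional

KNOWN_KINDS = {"click", "type_text"}   # hypothetical event kinds
KINDS_NEEDING_TEXT = {"type_text"}     # kinds that require input content


@dataclass(frozen=True)
class OperationEvent:
    kind: str                    # selected first
    text: Optional[str] = None   # input content, provided second when required

    def __post_init__(self) -> None:
        if self.kind not in KNOWN_KINDS:
            raise ValueError(f"unknown event kind: {self.kind!r}")
        if self.kind in KINDS_NEEDING_TEXT and not self.text:
            raise ValueError("a text input event needs the text to type")


click = OperationEvent("click")
typing_event = OperationEvent("type_text", text="hello")
```

Validating at construction time means an event sequence B can never contain a keyboard input event without its text.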

Step 3: form a task script by repeating steps 1 and 2. The content of the task script is the series of operations performed to complete a preset task.

Specifically, decompose the task into operation steps, perform step 1 and step 2 for each of them in the order of decomposition, and save the task script file after all of them have been processed.

In this embodiment, accessing a web page is taken as an example. Accessing a web page requires:

1) clicking the browser;

2) entering the URL in the address bar;

3) clicking the access button.

The corresponding script is:

1) Select the browser icon on the desktop as the positioning image A1 and a mouse click event as the content of the event sequence B1. As part of the task script, this step is called T1.

2) Select the address bar image as the positioning image A2 and a text input event together with the URL as the content of the event sequence B2. As part of the task script, this step is called T2.

3) Select the access button image as the positioning image A3 and a mouse click event as the content of the event sequence B3. As part of the task script, this step is called T3.

T1, T2, and T3, in order, make up the task script T for accessing the web page.
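The script T can be pictured as plain data that is saved to and reloaded from a file. The patent does not specify the format of the task script file, so JSON is an assumption here, and the image file names, the event kinds, and the URL are hypothetical placeholders.

```python
import json
from typing import Dict, List

# Each step pairs a positioning image (referenced by a hypothetical file name)
# with its event sequence.
T1 = {"anchor": "browser_icon.png", "events": [{"kind": "click"}]}
T2 = {
    "anchor": "address_bar.png",
    "events": [{"kind": "type_text", "text": "https://example.com"}],
}
T3 = {"anchor": "access_button.png", "events": [{"kind": "click"}]}


def dump_script(steps: List[Dict]) -> str:
    """Serialize the ordered steps so they can be written to a task file."""
    return json.dumps({"steps": steps}, indent=2)


def load_script(text: str) -> List[Dict]:
    """Read the ordered steps back from the serialized form."""
    return json.loads(text)["steps"]


T = [T1, T2, T3]                       # order matters: T1, then T2, then T3
restored = load_script(dump_script(T))
```

A list preserves the decomposition order, so the reloaded script replays the steps in the same sequence in which they were authored.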

Step 4: when the task script T is executed, the desktop is restored to its initial image P0, and the operation sequence in T is executed as follows:

1) Execute T1. Take the positioning image A1 out of T1 and search the image P0 for a match. When a region whose image features match those of A1 is found, determine the position coordinates (x1, y1) of that region in the operating system and simulate a mouse move event to move the mouse to (x1, y1). Then input the event sequence B1 at that position through the interface provided by the operating system. When all of this has finished, the execution of T1 is complete.

2) After T1 has been executed, the desktop shows a new image P1. Execute T2: take the positioning image A2 out of T2, search the image P1 for a match, and input the event sequence B2 once A2 has been found. When all of this has finished, the execution of T2 is complete.

3) After T2 has been executed, the desktop shows a new image P2. Execute T3: take the positioning image A3 out of T3, search the image P2 for a match, and input the event sequence B3 once A3 has been found. When all of this has finished, the execution of T3 is complete.

Once T1, T2, and T3 have all been executed, the task of the task script T, that is, accessing a given web page, is complete.
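The execution loop described above can be sketched as follows. Screen capture and event dispatch are passed in as callables so that the sketch runs without a real desktop; `FakeDesktop`, `locate`, and `run_script` are hypothetical names, images are small grayscale grids, and matching is exact. On a real system the `send` callable would first move the mouse to the matched position and then call the operating system's input interface.

```python
from typing import Callable, Dict, List, Optional, Tuple

Image = List[List[int]]
Point = Tuple[int, int]


def locate(screen: Image, template: Image) -> Optional[Point]:
    """Return the (x, y) top-left corner of the first exact match, or None."""
    th, tw = len(template), len(template[0])
    for y in range(len(screen) - th + 1):
        for x in range(len(screen[0]) - tw + 1):
            if all(screen[y + dy][x:x + tw] == template[dy] for dy in range(th)):
                return (x, y)
    return None


def run_script(steps: List[Dict], capture: Callable[[], Image],
               send: Callable[[Point, str], None]) -> None:
    """For each step: capture the current screen, find the anchor, dispatch its events."""
    for step in steps:
        pos = locate(capture(), step["anchor"])
        if pos is None:
            raise LookupError("positioning image not found on the current screen")
        for event in step["events"]:
            send(pos, event)


class FakeDesktop:
    """Stand-in desktop: every dispatched event advances to the next screen image."""

    def __init__(self, screens: List[Image]) -> None:
        self.screens, self.index, self.log = screens, 0, []

    def capture(self) -> Image:
        return self.screens[self.index]

    def send(self, pos: Point, event: str) -> None:
        self.log.append((pos, event))
        self.index = min(self.index + 1, len(self.screens) - 1)


P0 = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]]  # browser icon visible
P1 = [[2, 2, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]  # address bar visible
P2 = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 3, 3]]  # access button visible
script = [
    {"anchor": [[1, 1]], "events": ["click"]},      # T1
    {"anchor": [[2, 2]], "events": ["type_url"]},   # T2
    {"anchor": [[3, 3]], "events": ["click"]},      # T3
]
desktop = FakeDesktop([P0, P1, P2])
run_script(script, desktop.capture, desktop.send)
```

After the run, `desktop.log` holds one entry per step, each recorded at the coordinates where its positioning image was found on the screen current at that moment.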

For the specific operation, see FIG. 1:

1. The user opens the software, creates a new task file, decides on the office task to be automated, and decomposes the task into its individual steps.

2. Go to the single-step operation interface and use the software to select a positioning image. The software checks the reliability of the positioning so that it can identify the step when the task file is executed. If the check passes, the positioning image is held temporarily in memory; if it fails, the user is prompted to select again.

3. The user is prompted to select an operation event, and the operation event and content provided by the user are stored in memory.

4. The user is asked whether to continue writing the script. If yes, go to step 2; if no, the user has finished writing all of the script content, and the software persists everything held in memory to the task file.

5. The user runs the task file with the software, and the software automatically completes the office task that the task file describes, using the positioning image and event content saved in the task file for each step.

Because the operation area of each step is located by image matching, changes in the position of the operation interface do not affect the locating. In addition, if the software does not detect the positioning image of a step, it considers that the current step has not yet started and enters a waiting mode to wait for the next step to begin; it does not simulate operations on an object that has not yet appeared, which keeps the task continuous and stable.
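The waiting mode can be sketched as a polling loop that searches again until the positioning image appears. The patent does not say how long the software waits or how often it checks, so the `timeout` and `interval` parameters are assumptions, as is the name `wait_for_anchor`. The `find` callable stands for "capture the screen and search for the positioning image".

```python
import time
from typing import Callable, Optional, Tuple

Point = Tuple[int, int]


def wait_for_anchor(
    find: Callable[[], Optional[Point]],
    timeout: float = 10.0,                 # assumed upper bound, not from the patent
    interval: float = 0.2,                 # assumed polling period, not from the patent
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Point:
    """Poll `find` until it returns a position; no event is sent while waiting."""
    deadline = clock() + timeout
    while True:
        pos = find()
        if pos is not None:
            return pos
        if clock() >= deadline:
            raise TimeoutError("positioning image did not appear in time")
        sleep(interval)


# Simulated searches: the component appears on the third capture.
results = iter([None, None, (40, 12)])
attempts = []


def find() -> Optional[Point]:
    attempts.append(1)
    return next(results)


position = wait_for_anchor(find, sleep=lambda _seconds: None)
```

Because events are dispatched only after `wait_for_anchor` returns, a step never acts on a component that has not yet been drawn, which is the timing problem the background section attributes to fixed-interval replay.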

The above are only preferred embodiments of the invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims

1. A method for simulating manual operation based on image matching, characterized by comprising:

Step 1: selecting a positioning image of the operation interface; the positioning image is a screenshot of a component; the feature elements of the positioning image are unique;

Step 2: providing an operation event sequence; the operation event sequence includes operation behaviors on the positioning image;

Step 3: repeating step 1 and step 2 to form a task script;

Step 4: running the task script to complete the office task automatically.

2. The method for simulating manual operation based on image matching according to claim 1, characterized in that, in step 1, selecting the positioning image of the operation interface specifically comprises:

capturing an image of the operation interface window;

then using an image comparison algorithm to determine whether the window image contains multiple locations that match the features of the positioning image; if it does, the feature elements of the positioning image cannot locate the component uniquely, and a different positioning element must be selected.

3. The method for simulating manual operation based on image matching according to claim 2, characterized in that, in step 2, providing the operation event sequence specifically comprises:

first selecting the operation event, then providing the input content.

4. The method for simulating manual operation based on image matching according to claim 3, characterized in that, in step 3, forming the task script specifically comprises:

decomposing the preset task into operation steps, performing step 1 and step 2 for each of them in the order of decomposition, and saving the task script after all of them have been processed.