CN120256010B - Service carrier automatic registration login and fund account acquisition method and system - Google Patents

Service carrier automatic registration login and fund account acquisition method and system

Info

Publication number
CN120256010B
Authority
CN
China
Prior art keywords
page
frame
coordinates
button
browser
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202510747789.5A
Other languages
Chinese (zh)
Other versions
CN120256010A (en
Inventor
张兆心
陆韬丞
程亚楠
郑申奥
赵东
孙凡荀
甘珺昂
李瑞恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Tianhe Cyberspace Security Technology Research Institute Co ltd
Harbin Institute of Technology Weihai
Original Assignee
Shandong Tianhe Cyberspace Security Technology Research Institute Co ltd
Harbin Institute of Technology Weihai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Tianhe Cyberspace Security Technology Research Institute Co ltd and Harbin Institute of Technology Weihai
Priority to CN202510747789.5A
Publication of CN120256010A
Application granted
Publication of CN120256010B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F3/0483 Interaction with page-structured environments, e.g. book metaphor
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/0485 Scrolling or panning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/086 Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/48 Indexing scheme relating to G06F9/48
    • G06F2209/484 Precedence
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computer Security & Cryptography (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physiology (AREA)
  • Computer Hardware Design (AREA)
  • Multimedia (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • User Interface Of Digital Computer (AREA)
  • Image Analysis (AREA)

Abstract

The present invention provides a method and system for automated registration and login on a service carrier and acquisition of a fund account, and relates to the field of data processing technology. The method comprises: for each element in an interactive task queue that is located within a frame, converting its coordinates into coordinates relative to the frame, and controlling the browser to switch to the corresponding frame; traversing the interactive task queue and performing the following operations in sequence: performing a simulated click on button elements, injecting virtual identity information into input box elements, and calling the corresponding recognition model to complete verification for verification code elements; performing page scrolling and submission detection, in which, after the interactive task queue operations are completed, it is detected whether a visible submit button exists on the current page and, if not, the page is scrolled and elements are re-identified until a visible submit button is found; and, after submission, if the browser jumps to a recharge page, extracting payment account information through a regular expression and storing it in a database in association with the virtual identity information. The invention can improve the robustness of automated operations on dynamic pages.

Description

Service carrier automatic registration login and fund account acquisition method and system
Technical Field
The invention relates to the technical field of data processing, in particular to a method and a system for automatically registering and logging in a service carrier and acquiring a fund account.
Background
In the field of automated web page interaction, existing solutions still leave room for improvement when handling dynamic web page environments and composite interaction scenarios, mainly in the following respects:
Page structure adaptation limitations: current mainstream automation tools (e.g., script engines based on DOM parsing) typically rely on static element location paths (e.g., XPath or CSS selectors). Common changes to the target page, such as iterations of the interface layout, dynamically generated element identifiers, or adaptive repositioning of interactive components, easily cause location failures and interrupt the flow.
For page elements that use frame nesting (such as <iframe>), existing solutions have no coordinate system conversion mechanism, so element positions inside a subframe are hard to calculate accurately, and cross-frame operations either lack automatic context switching or suffer from compatibility problems.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method for automated registration and login on a service carrier and acquisition of a fund account that improves the robustness of automated operations on dynamic pages.
To solve the above technical problem, the technical scheme of the invention is as follows:
A method for automated registration and login on a service carrier and acquisition of a fund account comprises the following steps:
Step S1: accessing the target website domain name through a headless browser and, after loading is completed, capturing the current visible area to generate a page image;
Step S2: inputting the page image into a pre-trained target detection model, and identifying the category, position coordinates and confidence of the UI elements in the page;
Step S3: for elements in the interactive task queue that are located within a frame, converting their coordinates into coordinates relative to the frame, and controlling the browser to switch to the corresponding frame;
Step S4: traversing the interactive task queue and performing the following operations in sequence: performing a simulated click on button elements; injecting virtual identity information into input box elements; and calling the corresponding recognition model to complete verification for verification code elements;
Step S5: page scrolling and submission detection: after the interactive task queue operations are completed, detecting whether a visible submit button exists on the current page; if not, scrolling the page and re-executing steps S1 to S2 until a visible submit button is identified; if so, triggering the submit operation;
Step S6: after submission, if the browser jumps to a recharge page, extracting payment account information through a regular expression and storing it in a database in association with the virtual identity information.
A system for automated registration and login on a service carrier and acquisition of a fund account comprises:
an acquisition module, used for accessing the target website domain name through a headless browser and, after loading is completed, capturing the current visible area to generate a page image;
a generation module, used for inputting the page image into a pre-trained target detection model and identifying the category, position coordinates and confidence of the UI elements in the page;
a conversion module, used for converting the coordinates of elements in the interactive task queue that are located within a frame into coordinates relative to the frame, and controlling the browser to switch to the corresponding frame;
a verification module, used for traversing the interactive task queue and performing the following operations in sequence: performing a simulated click on button elements; injecting virtual identity information into input box elements; and calling the corresponding recognition model to complete verification for verification code elements;
a processing module, used for page scrolling and submission detection: after the interactive task queue operations are completed, detecting whether a visible submit button exists on the current page, scrolling and re-identifying until a visible submit button is identified, and triggering the submit operation once it is visible; and, after submission, if the browser jumps to a recharge page, extracting payment account information through a regular expression and storing it in a database in association with the virtual identity information.
The scheme of the invention provides at least the following beneficial effects:
A vision-driven UI recognition mechanism analyzes the page image directly through the target detection model, removing the dependence on the DOM structure. It copes effectively with random changes in page layout, dynamically generated element identifiers and adaptive repositioning of components, and addresses the flow interruptions that traditional scripts suffer when their location paths fail.
A dynamic coordinate system conversion chain is established: nested frame structures are detected automatically, relative coordinates are calculated in real time and the browser context is switched automatically, which reduces the error rate of cross-frame element operations.
A scroll-and-redetect closed loop triggers scrolling based on the state of the visible area and automatically re-captures the screenshot and re-identifies elements after each scroll. A locking rule is established for elements that have already been operated on, so that 100% of the operation components at the bottom of the page can be triggered and page state conflicts are avoided.
A hierarchical parsing and standardized storage flow adapts regular expressions to fragmented text, associates the virtual identity with the payment account automatically, and parses unstructured data (such as two-dimensional codes) in the background, which improves the completeness of the extracted key account information.
Drawings
Fig. 1 is a flow chart of a method for automated registration and login on a service carrier and acquisition of a fund account according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a system for automated registration and login on a service carrier and acquisition of a fund account according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As shown in Fig. 1, an embodiment of the present invention proposes a method for automated registration and login on a service carrier and acquisition of a fund account, the method comprising the following steps:
Step S1: accessing the target website domain name through a headless browser and, after loading is completed, capturing the current visible area to generate a page image;
Step S2: inputting the page image into a pre-trained target detection model, and identifying the category, position coordinates and confidence of the UI elements in the page;
Step S3: for elements in the interactive task queue that are located within a frame, converting their coordinates into coordinates relative to the frame, and controlling the browser to switch to the corresponding frame;
Step S4: traversing the interactive task queue and performing the following operations in sequence: performing a simulated click on button elements; injecting virtual identity information into input box elements; and calling the corresponding recognition model to complete verification for verification code elements;
Step S5: page scrolling and submission detection: after the interactive task queue operations are completed, detecting whether a visible submit button exists on the current page; if not, scrolling the page and re-executing steps S1 to S2 until a visible submit button is identified; if so, triggering the submit operation;
Step S6: after submission, if the browser jumps to a recharge page, extracting payment account information through a regular expression and storing it in a database in association with the virtual identity information.
In the embodiment of the invention, the following effects are achieved:
Full-flow automated execution: manual operation is simulated through a headless browser and page elements are identified with the target detection model, so that the whole flow from visiting the website and filling in information to submitting the form is automated. This reduces the cost of manual intervention and is particularly suitable for scenarios such as batch registration and multi-account management.
Intelligent ordering of task priorities: UI elements are prioritized based on preset rules (for example, fields to be filled in come first), which ensures that interactive tasks are executed in a logical order, avoids misoperation caused by a disordered element loading sequence, and improves flow stability.
Coordinate conversion for in-frame elements: for nested frame scenarios (such as iframe), element coordinates are converted into coordinates relative to the frame and the context is switched, which resolves the technical difficulty of cross-frame interaction and ensures that operations land precisely on the target element.
Verification code recognition: a dedicated recognition model (such as OCR or a machine learning model) is called for verification codes, which overcomes the limitations of traditional automation tools in verification code scenarios and improves the ability to process complex pages.
Dynamic page scrolling detection: the visibility of the submit button is detected automatically, and the visible area is refreshed by scrolling the page and re-identifying elements. This adapts to long pages and dynamically loaded content and avoids flow interruptions caused by hidden elements.
Multi-page jumps and data association: if the browser jumps to a recharge page after submission, payment account information is extracted through a regular expression and stored bound to the virtual identity, which achieves automated collection and structured storage of cross-page data.
Model-driven extensibility: the target detection model and the verification code recognition model can be optimized continuously through training to adapt to UI changes across different websites, which reduces the risk of tool failure caused by website redesigns and improves cross-platform adaptability.
Data security and compliance: interactions are simulated with injected virtual identity information rather than real data, which reduces the risk of leaking real data while supporting needs such as automated testing and data collection in compliant scenarios.
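The in-frame coordinate conversion mentioned above comes down to subtracting each enclosing frame's origin from the page-level coordinates. The following is a minimal sketch of that arithmetic only; the function name `to_frame_coords` and the representation of a frame as a `(left, top, width, height)` box relative to its parent are assumptions made for illustration and do not come from the patent text.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (left, top, width, height), relative to the parent


def to_frame_coords(point: Tuple[float, float], frame_chain: Sequence[Box]) -> Tuple[float, float]:
    """Convert a page-level point into coordinates relative to the innermost frame.

    frame_chain lists the enclosing frames from outermost to innermost
    (hypothetical input; a real tool would obtain these boxes from the browser).
    """
    x, y = point
    for left, top, width, height in frame_chain:
        # Shift the origin to the top-left corner of this frame.
        x -= left
        y -= top
        if not (0 <= x <= width and 0 <= y <= height):
            raise ValueError("point lies outside the frame")
    return x, y


if __name__ == "__main__":
    # A point at (350, 420) on the page, inside a frame whose box starts at (100, 200).
    print(to_frame_coords((350, 420), [(100, 200, 600, 400)]))
```

For nested frames the offsets accumulate, so the same point inside a second frame at `(50, 20)` within the first resolves to `(200, 200)`.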
In a preferred embodiment of the present invention, step S1, accessing the target website domain name through a headless browser and, after loading is completed, capturing the current visible area to generate a page image, comprises:
Step S11: after page loading is completed, detecting whether a popup element occluding the key area exists in the current visible area; if so, automatically calculating the coordinate position of the popup close button and triggering a simulated click to expose the underlying page elements;
Step S12: based on the page state after the popup has been cleared, identifying the position of the registration button in the page, calculating the coordinates of its center point, executing a coordinate-positioned click through the browser driver, and triggering the page to jump to the registration page;
Step S13: after the registration page has loaded, controlling the browser to scroll vertically, where the initial scroll position is the top of the page, the scroll step is set by dividing the page into equal parts of one screen height, and the page is scrolled down step by step to the bottom;
Step S14: after the registration form has been submitted, monitoring the new page URL that the browser jumps to, judging the page to be a recharge page when the URL contains feature keywords, and directly capturing the current visible area image after the page has loaded.
In the embodiment of the invention, these steps may be implemented in a specific application as follows, for example:
In step S11, the usual position of core interactive elements such as the registration button (for example, the lower middle part of the page) is defined in advance as the coordinate range of the key area (for example, 30% to 70% of the page height and 20% to 80% of the page width). The coordinate range of each popup element in the screenshot is analyzed to determine whether it overlaps the key area; if it does, the popup is judged to be occluding the page. The upper-right corner region of the popup (usually where the close button sits) is then calculated from the popup's coordinates (left, top, width, height): the popup's top-left corner plus (width - 20 pixels, height - 20 pixels) is taken as the center point of the close button, and a click event is sent to that coordinate through the browser driver to close the popup.
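The occlusion test in step S11 is a rectangle-overlap check against the key area. A minimal sketch follows, assuming boxes are `(left, top, width, height)` tuples in screenshot pixels; all function names are hypothetical. Note one further assumption: the text gives the close-button offset as (width - 20, height - 20) from the top-left corner, whereas this sketch takes a 20-pixel inset from the top-right corner, which matches the stated upper-right location of the close button.

```python
def key_area(page_w, page_h):
    """Key area from the example ranges: 20%-80% of the width, 30%-70% of the height."""
    return (0.2 * page_w, 0.3 * page_h, 0.6 * page_w, 0.4 * page_h)


def overlaps(a, b):
    """Return True when two (left, top, width, height) boxes intersect."""
    a_left, a_top, a_w, a_h = a
    b_left, b_top, b_w, b_h = b
    return (a_left < b_left + b_w and b_left < a_left + a_w
            and a_top < b_top + b_h and b_top < a_top + a_h)


def close_button_point(popup, inset=20):
    """Assumed close-button center: `inset` pixels inside the popup's top-right corner."""
    left, top, width, height = popup
    return (left + width - inset, top + inset)


if __name__ == "__main__":
    popup = (600, 300, 700, 400)
    if overlaps(popup, key_area(1920, 1080)):
        print("popup occludes the key area; click at", close_button_point(popup))
```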
In step S12, the target detection model outputs the bounding box coordinates (left, top, width, height) of the registration button. The center point of the button is calculated as (left + width/2, top + height/2) and mapped to the browser viewport coordinate system, and the browser driver moves the mouse to that coordinate and triggers a click event, so that the registration entry is located accurately and the page jump is triggered.
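The center-point formula in step S12 can be written directly. The sketch below also shows one possible mapping into viewport coordinates, under the assumption that the screenshot is larger than the viewport by a device pixel ratio; that ratio and both function names are illustrative and not taken from the patent text.

```python
def bbox_center(left, top, width, height):
    """Center of a bounding box: (left + width/2, top + height/2)."""
    return (left + width / 2, top + height / 2)


def to_viewport(point, device_pixel_ratio=1.0):
    """Map screenshot pixels to viewport (CSS) pixels.

    Assumption: the screenshot is scaled by device_pixel_ratio relative to the viewport.
    """
    x, y = point
    return (x / device_pixel_ratio, y / device_pixel_ratio)


if __name__ == "__main__":
    center = bbox_center(100, 200, 80, 40)
    print(center, to_viewport(center, device_pixel_ratio=2.0))
```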
In step S13, the vertical height of the browser viewport (H, in pixels) is obtained, and the total page height (available through document.body.scrollHeight) is divided into equal steps of one viewport height; that is, the step size is H and each scroll moves one screen. Starting from the top of the page (scroll position 0), the page is scrolled down H pixels at a time until the scroll position reaches or exceeds the total page height, which ensures that all content is loaded. After scrolling to the bottom of the page, the screenshot interface is called to capture the complete page image, including all form input boxes and the submit button.
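The scroll schedule in step S13 can be computed ahead of time from the two heights. This is a minimal sketch with a hypothetical function name; it lists the scroll offsets that start a new screen and stops once the next offset would reach or exceed the total page height.

```python
def scroll_positions(total_height, viewport_height):
    """Scroll offsets 0, H, 2H, ... that fall inside the page, one screen per step."""
    if viewport_height <= 0:
        raise ValueError("viewport_height must be positive")
    positions = [0]  # scrolling always starts at the top of the page
    y = viewport_height
    while y < total_height:
        positions.append(y)
        y += viewport_height
    return positions


if __name__ == "__main__":
    # A 2500-pixel page viewed through a 1080-pixel viewport.
    print(scroll_positions(2500, 1080))
```

A driver would scroll to each offset in turn (for example with a script call such as `window.scrollTo(0, y)`) and wait for content to load before moving on.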
In step S14, feature keywords for the recharge page (such as "recharge", "pay" and "fund") are preset. The path or parameter part of the new page URL is parsed to determine whether it contains any keyword. If it does, the page is judged to be a recharge page and the screenshot operation is triggered: the image in the current browser viewport is captured directly for subsequent account information extraction. Otherwise, monitoring continues. This automatically identifies the jump stage of the registration flow, locates the recharge page accurately and captures the key information page in time, which lays the foundation for extracting payment account information through regular expressions and improves the efficiency of cross-page data collection.
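The keyword check in step S14 inspects only the path and parameter part of the URL. A minimal sketch using the standard library is shown below; the keyword tuple reuses the examples from the text, while the function name and the case-insensitive matching are assumptions.

```python
from urllib.parse import urlsplit

RECHARGE_KEYWORDS = ("recharge", "pay", "fund")  # example keywords from the text


def is_recharge_url(url, keywords=RECHARGE_KEYWORDS):
    """Return True when the URL path or query string contains any feature keyword."""
    parts = urlsplit(url)
    # Only the path and the query are inspected; the host name is ignored.
    haystack = (parts.path + "?" + parts.query).lower()
    return any(keyword in haystack for keyword in keywords)


if __name__ == "__main__":
    print(is_recharge_url("https://example.com/user/Recharge?step=1"))
    print(is_recharge_url("https://example.com/register"))
```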
In a preferred embodiment of the present invention, step S2, inputting the page image into a pre-trained target detection model, identifying the category, position coordinates and confidence of the UI elements in the page, and sorting the recognition results based on preset priority rules to generate an interactive task queue arranged in descending order of priority, comprises:
Step S21: inputting the page images generated in step S13 and step S14 into the pre-trained target detection model, and outputting a set of recognition results for the UI elements, each recognition result comprising a category label, position coordinates and a confidence;
Step S22: assigning a priority to each recognition result output in step S21 according to the Euclidean distance between the element center point and the page center, obtaining weights corresponding respectively to the category label, the position coordinates and the confidence;
Step S23: arranging the priority-labeled element set output in step S22 in descending order of priority, and arranging elements of equal priority in descending order of confidence, to generate a structured task queue;
Step S24: traversing the structured task queue of step S23, calculating the absolute pixel coordinates of each element in the browser window from its normalized coordinates, and converting the absolute pixel coordinates into an XPath path through a DOM position back-tracing algorithm.
In the embodiment of the invention, these steps may be implemented in a specific application as follows, for example:
In step S21, the full-page image or visible-area image generated in steps S13 and S14 is scaled to the model input size (e.g. 640×480 pixels) while preserving the aspect ratio to avoid distortion. The pre-trained target detection model (e.g. YOLO or Faster R-CNN) performs convolution on the image, identifies the bounding box coordinates (left, top, width, height) of all visible UI elements (e.g. input boxes, buttons and labels) and matches them to predefined category labels (e.g. "input", "button", "captcha"). The model generates a confidence score (between 0 and 1) for each recognition result, representing the probability that the element was identified correctly; for example, a confidence of 0.8 or higher is regarded as a reliable identification.
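Scaling to the model input size without distortion means applying a single scale factor to both axes and padding the remainder. The sketch below computes that factor and the padding for the 640×480 example, then filters detections at the 0.8 confidence example; the function names, the centered padding and the dictionary layout of a detection are assumptions for illustration.

```python
def fit_size(src_w, src_h, dst_w=640, dst_h=480):
    """Scale (src_w, src_h) to fit inside (dst_w, dst_h) while keeping the aspect ratio.

    Returns the scale factor, the resized dimensions and the padding needed on the
    left/top to center the image (centered padding is an assumption).
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x, pad_y = (dst_w - new_w) // 2, (dst_h - new_h) // 2
    return scale, (new_w, new_h), (pad_x, pad_y)


def reliable(detections, threshold=0.8):
    """Keep detections whose confidence meets the reliability threshold."""
    return [d for d in detections if d["confidence"] >= threshold]


if __name__ == "__main__":
    scale, size, pad = fit_size(1920, 1080)
    print(size, pad)
    sample = [{"label": "input", "confidence": 0.93}, {"label": "button", "confidence": 0.42}]
    print(reliable(sample))
```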
In step S22, the page center is preset as (page width/2, page height/2), and the Euclidean distance between the center point of each element's bounding box (left + width/2, top + height/2) and the page center is calculated. The shorter the distance, the higher the base priority value (for example, an element whose distance is no more than 1/4 of the page height is treated as lying in the "high-priority area"). Category priority weights are preset (for example, input box +30, button +20, prompt text +10), with key operation elements (such as the registration button and the submit button) weighted higher. An element with a confidence of 0.9 or higher receives an additional priority score (e.g. +10), while an element with a confidence of 0.5 or lower is marked "to be confirmed" and penalized (e.g. -20). The distance, category and confidence scores are summed to give a total priority score, which is divided into five levels (level 1 highest, level 5 lowest), for example:
Level 1 (score ≥ 80): required input boxes and the submit button;
Level 3 (score 40-60): options and ordinary prompt labels;
Level 5 (score < 20): advertising elements and non-critical decorative patterns.
This method combines element position, functional importance and recognition reliability. It ensures that core interactive elements (such as the registration form input boxes) are processed first, keeps secondary elements from interfering with the flow, adapts to different page layouts (such as the differences between mobile and PC pages) by adjusting element priority automatically, and improves the flexibility of the flow.
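The scoring rule above can be sketched as a sum of three terms followed by a level lookup. The category weights, confidence adjustments and the level 1, 3 and 5 boundaries come from the examples in the text. The numeric distance scores (40, 20, 0) and the boundaries assumed for levels 2 and 4 are not given in the text and are placeholders chosen only so that the sketch runs.

```python
import math

CATEGORY_WEIGHT = {"input": 30, "button": 20, "text": 10}  # example weights from the text


def distance_score(center, page_w, page_h):
    """Base score from distance to the page center (the values 40/20/0 are assumed)."""
    d = math.dist(center, (page_w / 2, page_h / 2))
    if d <= page_h / 4:  # "high-priority area" from the text
        return 40
    if d <= page_h / 2:  # assumed intermediate band
        return 20
    return 0


def confidence_score(confidence):
    """+10 at confidence >= 0.9, -20 at confidence <= 0.5, otherwise 0."""
    if confidence >= 0.9:
        return 10
    if confidence <= 0.5:
        return -20
    return 0


def priority_level(score):
    """Map a total score to levels 1-5 (boundaries for levels 2 and 4 are assumed)."""
    if score >= 80:
        return 1
    if score >= 60:
        return 2
    if score >= 40:
        return 3
    if score >= 20:
        return 4
    return 5


def score_element(element, page_w, page_h):
    left, top, width, height = element["box"]
    center = (left + width / 2, top + height / 2)
    total = (distance_score(center, page_w, page_h)
             + CATEGORY_WEIGHT.get(element["label"], 0)
             + confidence_score(element["confidence"]))
    return total, priority_level(total)


if __name__ == "__main__":
    field = {"label": "input", "box": (860, 500, 200, 40), "confidence": 0.95}
    print(score_element(field, 1920, 1080))
```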
In step S23, the primary sort (descending priority) orders elements by the priority level from step S22, from level 1 to level 5, so that higher-level elements (such as required fields) come first. The secondary sort (descending confidence) orders elements within the same level from high to low confidence (for example, of two level-2 input boxes, the one with confidence 0.9 is placed before the one with confidence 0.8). The sorted elements are converted into JSON task items containing the operation type (click/input/verification), element coordinates, priority label and confidence, producing an executable queue structure (e.g. task 1, task 2, task N).
Through this double sort by priority and confidence, the invention ensures that the automated flow follows the principles of importance first and reliability first, and reduces logic errors caused by a disordered element sequence. The structured queue is also convenient for debugging and logging, and supports advanced functions such as pausing midway and resuming from a breakpoint.
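The double sort can be expressed as a single sort on a two-part key: level ascending (level 1 is highest priority), then confidence descending. The sketch below also serializes the result to JSON. The field names and the label-to-operation mapping are assumptions; the text specifies only that each task item carries an operation type, coordinates, a priority label and a confidence.

```python
import json

# Assumed mapping from category label to operation type (click/input/verification).
ACTIONS = {"button": "click", "input": "input", "captcha": "verification"}


def build_queue(elements):
    """Sort by priority level, then by confidence (high to low), and emit task items."""
    ordered = sorted(elements, key=lambda e: (e["level"], -e["confidence"]))
    return [
        {
            "task": index,
            "action": ACTIONS.get(e["label"], "click"),
            "box": list(e["box"]),
            "level": e["level"],
            "confidence": e["confidence"],
        }
        for index, e in enumerate(ordered, start=1)
    ]


if __name__ == "__main__":
    elements = [
        {"label": "input", "box": (10, 10, 100, 30), "level": 2, "confidence": 0.8},
        {"label": "input", "box": (10, 60, 100, 30), "level": 2, "confidence": 0.9},
        {"label": "button", "box": (10, 110, 100, 30), "level": 1, "confidence": 0.7},
    ]
    print(json.dumps(build_queue(elements), indent=2))
```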
Step S24, converting normalized coordinates to absolute pixels:
If the element coordinates are normalized values (in the interval 0-1; for example, a center point coordinate x = 0.5 represents the horizontal center of the page), they are multiplied by the actual size of the browser window (for example, with a window width of 1920 pixels, x = 0.5 × 1920 = 960 pixels) to obtain absolute pixel coordinates (X, Y). The DOM tree of the page is obtained through the browser driver, and the element is searched for by traversing from the root node (html) according to the following logic:
The topmost visible element is located from the absolute pixel coordinates (X, Y) by the elementFromPoint(X, Y) method; its parent nodes are traced upward recursively, and tag names and attributes are concatenated (such as //input[@id='username']) to generate a unique XPath path; redundant levels are removed (such as skipping div container layers) and the shortest valid path is retained (such as //form[@class='register-form']/input[1]).
Through bidirectional mapping between pixel coordinates and the DOM structure, the method solves the problem of matching visually visible elements with the underlying code elements, and ensures that simulated operations act precisely on the target control (for example, avoiding clicks on occluded or hidden buttons). The absolute pixel coordinate conversion mechanism adapts to different screen sizes (such as mobile phone, tablet and desktop), and the XPath path supports dynamic changes in page structure, reducing the risk of positioning failure caused by page redesign.
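The coordinate conversion of step S24 and the element lookup can be sketched as follows. The helper names are hypothetical; `driver` is assumed to be a Selenium WebDriver (or any object exposing `execute_script` in the same way), and the XPath construction step is left out of this sketch.

```python
def to_absolute_pixels(x_norm, y_norm, window_width, window_height):
    """Scale normalized (0-1) coordinates to absolute pixel coordinates."""
    return round(x_norm * window_width), round(y_norm * window_height)


def topmost_element_at(driver, x, y):
    """Return the topmost element at viewport point (x, y).

    Assumption: `driver` behaves like a Selenium WebDriver, whose
    execute_script passes extra arguments to the page as `arguments[...]`.
    """
    script = "return document.elementFromPoint(arguments[0], arguments[1]);"
    return driver.execute_script(script, x, y)


x, y = to_absolute_pixels(0.5, 0.25, 1920, 1080)
print(x, y)  # 960 270
```

Note that `document.elementFromPoint` takes coordinates relative to the viewport, so a scrolled page would need the scroll offset subtracted before the call.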
In a preferred embodiment of the present invention, the pre-trained object detection model comprises:
Extracting a target website domain name from an abused domain name library, accessing the domain name through an automatic script and executing page loading state verification;
Based on the complete page screenshot, adopting a hierarchical clustering algorithm to group images with similar visual characteristics so as to obtain a de-duplicated image;
labeling category labels and position information of 17 types of UI elements on the image after the duplication removal to generate a labeling file conforming to a target detection format;
Integrating the annotation files and dividing them into a training set, a validation set and a test set according to a preset ratio;
Based on the training set, the validation set and the test set, the following automatic optimization procedure is performed:
Initializing a parameter population in which each group of parameters comprises a learning rate, anchor box sizes and a network structure configuration; performing model training for each parameter combination and calculating fitness from the validation set precision and loss value; iteratively generating a new population through selection, crossover and mutation operations until the fitness converges, to obtain optimized parameters; and performing full-network fine-tuning with the optimized parameters, combined with incremental sample training, to obtain the pre-trained target detection model.
In an embodiment of the invention, the above steps may be implemented in practice through the following specific schemes, for example:
Active domain names are screened from the abused domain name library (for example, those with DNS resolution records in the last 30 days), excluding domain names already marked as inaccessible. An HTTP request is sent by an automated script (such as the requests library of Python), the response status code is checked (for example, 200 indicates success), and the response content is parsed to verify that it contains a complete HTML structure (such as the presence of a <!DOCTYPE html> declaration and an <html> root tag). This ensures that the page data fed into model training are valid, normally accessible samples, avoiding wasted computing resources; abnormal pages are filtered automatically by the structural verification rules, improving the reliability of data acquisition.
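The structural check on a response can be sketched as a small pure function. This is an illustrative reading of the rule above; the function name is hypothetical, and fetching the page (for example with the requests library) is left to the caller.

```python
def is_valid_page(status_code, body):
    """Accept a response only if it succeeded and looks like a complete HTML page."""
    text = body.lower()
    return (
        status_code == 200
        and "<!doctype html" in text  # document type declaration present
        and "<html" in text           # root tag present
    )


print(is_valid_page(200, "<!DOCTYPE html><html><body>Sign up</body></html>"))  # True
print(is_valid_page(200, "<p>partial fragment</p>"))                           # False
```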
Common popup close-button features are preset (such as an X icon in the upper right corner, with coordinates located within the top 10%-20% and the right 10%-20% of the page), and the button is located by an image matching technique (such as template matching) and clicked. The registration button is located based on page text features (such as 'registration', 'SignUp') or visual features (such as an orange button or a larger font), its center point coordinates are calculated, and a click is simulated to trigger the registration flow. The page is then scrolled step by step (1/2 of the window height each time) until the content at the bottom of the page no longer updates (for example, scrollHeight remains unchanged after continued scrolling), and a complete image containing the submit button at the bottom of the page is captured. By simulating manual operation, the invention removes interfering elements (popups) and triggers the key interaction (registration), ensuring that the screenshot contains the complete registration form structure; the scroll-loading mechanism captures form elements on long pages or loaded asynchronously, avoiding data loss.
Visual features (such as color histograms, histograms of oriented gradients (HOG) and CSS style features) are extracted from each screenshot to generate feature vectors of fixed dimension (such as 512 dimensions). Cosine similarity between images is calculated (similarity > 0.8 is regarded as similar) to construct a tree-shaped cluster structure. For each cluster, the center image is selected as the representative sample and the other similar images in the cluster are removed, retaining about 30%-50% of the original images and ensuring that similar pages keep only 1-2 representative samples. Clustering thus removes repeated or highly similar page images (such as registration pages with different domain names but identical templates), reducing the amount of training data while preserving feature diversity.
Training efficiency is also improved: the reduced data volume after de-duplication shortens model training time and avoids repeated learning of redundant information.
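The de-duplication idea can be illustrated with a simplified sketch. It swaps the hierarchical clustering named in the text for a greedy threshold grouping, so it only approximates the described method; the function names are hypothetical and the feature vectors are assumed to have been extracted already.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def deduplicate(features, threshold=0.8):
    """Group similar vectors greedily and return one representative index per group."""
    clusters = []  # each cluster is a list of indices into `features`
    for i, vec in enumerate(features):
        for cluster in clusters:
            # Simplification: compare only against the first member of a cluster.
            if cosine(vec, features[cluster[0]]) > threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    representatives = []
    for cluster in clusters:
        # "Center" image: the member most similar, in total, to its cluster.
        center = max(
            cluster,
            key=lambda i: sum(cosine(features[i], features[j]) for j in cluster),
        )
        representatives.append(center)
    return representatives


features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(len(deduplicate(features)))  # 2
```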
The de-duplicated images are manually annotated with 17 classes of UI elements (e.g., input boxes, radio buttons, drop-down menus, captcha boxes, etc.); bounding boxes are drawn and associated with class labels using an annotation tool (e.g., LabelMe, RectLabel), and the annotation data are converted into the format required by the object detection model (e.g., the .txt format of YOLO, in which each line contains a class index and normalized coordinates).
Data set partitioning:
The invention randomly divides the data into a training set, a validation set and a test set in the ratio 7:2:1, ensuring that the distribution of each element class is balanced across the sets (for example, the proportion of input boxes in the training set is consistent with that in the whole data set). Manual annotation ensures element positioning accuracy (pixel-level bounding boxes) and class accuracy, providing reliable supervision signals for the model, and the staged data sets support performance verification during training (parameter tuning on the validation set) and the final generalization test (on the test set).
Model parameter optimization flow based on a genetic algorithm:
Parameter population initialization:
100 groups of initial parameter combinations are generated. The learning rate range is set to 0.001-0.1, the anchor box sizes are preset with reference to the COCO data set (such as (10, 13), (16, 30), etc.), and the network structure is selected from YOLOv-spp or Faster R-CNN variants. Each group of parameters is used to train the model for 50 rounds, and the mAP (mean average precision) and loss value on the validation set are recorded. The fitness formula is: fitness = mAP - 0.5 × loss value; the higher the value, the better the performance.
Genetic operations:
Selection: the parameter combinations whose fitness ranks in the top 20% are retained as parents;
Crossover: parent parameters randomly exchange some dimensions (such as crossing the learning rate and the anchor box size);
Mutation: random perturbation (such as a ±10% fluctuation of the learning rate) is added to the parameters after crossover.
Termination: iteration stops when the fitness improvement over 5 consecutive generations is less than 1%, and the optimal parameter combination is selected.
Model fine-tuning: the model is trained with the optimized parameters, and newly collected incremental samples (such as 1000 new screenshots every week) are gradually added for online learning.
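The selection, crossover, mutation and termination rules above can be sketched as a small genetic loop. This is a toy under stated assumptions: `evaluate` stands in for the train-and-score step, `anchor_scale` is a hypothetical single-number stand-in for the anchor box sizes, the network structure choice is omitted, and mutation is applied to every dimension rather than to the learning rate alone.

```python
import random


def evolve(evaluate, pop_size=100, patience=5, min_gain=0.01,
           max_generations=200, seed=0):
    """Search (learning rate, anchor scale) pairs with a simple genetic loop."""
    rng = random.Random(seed)
    population = [
        {"lr": rng.uniform(0.001, 0.1), "anchor_scale": rng.uniform(0.5, 2.0)}
        for _ in range(pop_size)
    ]
    best, best_fit, stale = None, float("-inf"), 0
    for _ in range(max_generations):
        # In a real run, evaluate() would train a model and score it.
        scored = sorted(((evaluate(p), p) for p in population),
                        key=lambda pair: pair[0], reverse=True)
        top_fit, top = scored[0]
        # Termination: count consecutive generations improving by < min_gain.
        if best is not None and top_fit - best_fit < min_gain * abs(best_fit):
            stale += 1
        else:
            stale = 0
        if top_fit > best_fit:
            best, best_fit = dict(top), top_fit
        if stale >= patience:
            break
        # Selection: keep the top 20% as parents.
        parents = [p for _, p in scored[: max(2, pop_size // 5)]]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: each dimension is taken from one of the two parents.
            child = {k: rng.choice((a[k], b[k])) for k in a}
            # Mutation: perturb every dimension by up to +/-10%.
            children.append({k: v * rng.uniform(0.9, 1.1) for k, v in child.items()})
        population = parents + children
    return best, best_fit


def toy_fitness(params):
    """Toy stand-in for 'train and validate': peaks at lr=0.01, anchor_scale=1."""
    return 1.0 / (1.0 + 100 * abs(params["lr"] - 0.01) + abs(params["anchor_scale"] - 1.0))


best, score = evolve(toy_fitness)
print(best, round(score, 3))
```

Because the parents are carried over unchanged, the best fitness never decreases between generations, which keeps the "less than 1% improvement" test meaningful.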
The invention replaces manual trial-and-error parameter tuning by quickly searching for the optimal parameter combination with a bio-inspired algorithm, improving model training efficiency (parameter tuning time is shortened by more than 70%). The incremental sample training mechanism lets the model keep learning new page design patterns (such as novel verification code styles), delaying model obsolescence caused by website redesign.
In a preferred embodiment of the present invention, calculating fitness using validation set accuracy and loss value comprises:
Based on the model trained with the current parameter combination, loading its prediction results on the validation set;
Counting matches between prediction boxes and ground-truth boxes, namely, for each image of the validation set, calculating the degree of overlap between the prediction boxes of the target detection model and the annotation boxes;
For each of the 17 classes of UI elements, counting the proportion of correctly detected elements relative to the total number of annotations of that class, and taking the arithmetic mean of the precision values of all classes to obtain the average precision index of the validation set;
Extracting the CIoU loss value of each sample from the validation log of the target detection model, and taking the arithmetic mean of the CIoU loss values of all samples to obtain the CIoU loss mean;
Defining a precision weight coefficient and a loss weight coefficient, weighting the average precision index as a positive correlation and the CIoU loss mean as a negative correlation, to generate a fitness score.
In an embodiment of the invention, the above steps may be implemented in practice through the following specific schemes, for example:
Each image in the validation set is forward-propagated through the target detection model trained with the current parameter combination, which outputs the coordinates (x1, y1, x2, y2), category label and confidence score of each prediction box. The prediction results are stored in a structured format (e.g., JSON) keyed by image ID, and each prediction box contains [category, confidence, coordinates] information. The ground-truth annotation box information of the corresponding image is read from the validation set annotation file, ensuring that prediction results and ground-truth labels correspond one to one by image ID. Storing predictions and ground truth in a unified format provides a basis for the subsequent matching calculation and avoids evaluation errors caused by inconsistent data structures; the saved intermediate results also support later detailed analysis (such as tracing back false-detection and missed-detection cases).
All prediction boxes and ground-truth annotation boxes of each image in the validation set are traversed.
IoU calculation: for each prediction box, its intersection over union (IoU) with every ground-truth box is calculated:
Intersection area: the number of pixels in the overlapping region of the prediction box and the ground-truth box;
Union area: prediction box area + ground-truth box area - intersection area;
IoU = intersection area / union area.
Match decision: if the IoU between a prediction box and a ground-truth box is greater than or equal to 0.5 (the threshold is adjustable) and their categories are the same, one valid match is counted.
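The IoU calculation and match decision above translate directly into code. The sketch below assumes boxes are given as (x1, y1, x2, y2) tuples, as in the prediction output described earlier; the dictionary keys `category` and `box` are hypothetical.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes have area 0."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    union = box_area(a) + box_area(b) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(prediction, truth, threshold=0.5):
    """A valid match needs the same category and IoU at or above the threshold."""
    return (
        prediction["category"] == truth["category"]
        and iou(prediction["box"], truth["box"]) >= threshold
    )


pred = {"category": "button", "box": (0, 0, 10, 10)}
truth = {"category": "button", "box": (0, 0, 10, 5)}
print(iou(pred["box"], truth["box"]), is_match(pred, truth))  # 0.5 True
```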
The invention uses IoU as the standard metric, which directly reflects the spatial overlap between the prediction box and the real target and avoids the limitations of judging by coordinate error alone. Traversing all possible box pairs ensures that every real target has the opportunity to be matched correctly, improving evaluation accuracy in multi-target scenes.
The per-class precision is calculated as follows:
Category-level statistics: the following are counted separately for each of the 17 classes of UI elements (such as input box, button and verification code):
Number of correct detections: the number of prediction boxes whose IoU with a ground-truth box of the same category is greater than or equal to 0.5;
Total number of annotations: the total number of ground-truth annotation boxes of that category in the validation set.
Class precision calculation: precision of each class = number of correct detections / total number of annotations × 100%.
Average precision calculation: the arithmetic mean of the 17 class precision values gives the average precision of the validation set (mAP@0.5). The method distinguishes the detection performance on different UI elements and identifies the strengths and weaknesses of the model (such as persistent missed detection of a certain class of element). It also avoids evaluation bias caused by uneven class distribution (such as a large class dominating the result) and ensures that all classes contribute equally to the final index.
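The per-class statistic and its arithmetic mean can be sketched as below. The function name and the two example classes are illustrative; the counts are made-up inputs, not results from the patent.

```python
def mean_class_precision(correct, total):
    """Per-class ratio of correct detections to annotations, plus the class mean.

    `correct` and `total` map a class name to a count; classes with no
    annotations are skipped so they cannot distort the mean.
    """
    per_class = {
        name: correct.get(name, 0) / count
        for name, count in total.items()
        if count > 0
    }
    mean = sum(per_class.values()) / len(per_class) if per_class else 0.0
    return per_class, mean


per_class, mean = mean_class_precision(
    correct={"input_box": 8, "button": 3},
    total={"input_box": 10, "button": 4},
)
print(per_class, mean)
```

Averaging the per-class values, rather than pooling all detections, is what gives a rare class the same weight as a frequent one.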
The CIoU (Complete IoU) loss value of each sample is extracted from the model validation log; the CIoU loss jointly considers the overlap rate, center point distance and aspect ratio of the prediction box and the ground-truth box. Extreme loss values (such as values beyond 3 standard deviations) are removed to prevent outliers from affecting the overall mean, and the CIoU loss values of all remaining valid samples are summed and averaged to obtain the CIoU loss mean. Because the CIoU loss optimizes overlap rate, positional accuracy and shape consistency together, it reflects prediction box quality more comprehensively than the traditional IoU; outlier filtering reduces noise so that the loss mean better represents the true performance of the model and avoids misjudgment caused by fluctuations in individual samples.
The weighted fitness score is generated as follows:
Coefficient setting:
The precision weight coefficient (such as 0.7) reflects the importance of the average precision;
The loss weight coefficient (such as 0.3) reflects the importance of the CIoU loss.
Normalization:
The average precision index is scaled to the (0, 1) interval (if the original precision is 85%, the normalized value is 0.85);
The CIoU loss mean is mapped to the [0, 1] interval by 1/(1 + loss mean) (the smaller the loss, the larger the mapped value).
Weighted summation: fitness score = precision weight coefficient × normalized precision + loss weight coefficient × normalized loss mapping value.
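A worked sketch of the weighted score, including the outlier filtering mentioned earlier, may help. It assumes that "beyond 3 standard deviations" means deviating from the mean by more than three population standard deviations; the function names are hypothetical and the default weights are the example values 0.7 and 0.3.

```python
import statistics


def filter_outliers(losses, k=3.0):
    """Drop values further than k population standard deviations from the mean."""
    if len(losses) < 2:
        return list(losses)
    mean = statistics.fmean(losses)
    std = statistics.pstdev(losses)
    if std == 0:
        return list(losses)
    return [value for value in losses if abs(value - mean) <= k * std]


def fitness_score(precision_percent, ciou_losses, w_precision=0.7, w_loss=0.3):
    """Weighted sum of normalized precision and the mapped CIoU loss mean."""
    normalized_precision = precision_percent / 100.0
    loss_mean = statistics.fmean(filter_outliers(ciou_losses))
    loss_mapping = 1.0 / (1.0 + loss_mean)  # smaller loss -> larger value
    return w_precision * normalized_precision + w_loss * loss_mapping


# 0.7 * 0.85 + 0.3 * (1 / 1.25) = 0.595 + 0.24
print(round(fitness_score(85, [0.25, 0.25, 0.25]), 3))  # 0.835
```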
The invention considers the accuracy (precision) and the positioning quality (loss) of the model at the same time, avoids one-sided optimization caused by a single index (such as sacrificing the positioning precision only by pursuing a high recall), and flexibly adapts to different service requirements by adjusting the weight coefficient (such as improving the loss weight for scenes with extremely high positioning precision requirements).
In a preferred embodiment of the present invention, step S3, for an element in the frame in the interaction task queue, converts its coordinate into a coordinate relative to the frame, and controls the browser to switch to the corresponding frame, includes:
S31, analyzing XPath path characteristics generated in S24, if the path contains an HTML label or a frame hierarchical structure, judging that the corresponding element is positioned in the frame, and extracting a frame identifier (ID or Name attribute) of the frame in a parent page;
Step S32, relative coordinate conversion: reading, through a browser interface, the pixel coordinates (X_frame, Y_frame) of the upper left corner of the frame in the parent page, and the width W_frame and height H_frame of the frame;
calculating the relative position of the element within the frame:
relative abscissa = (element absolute abscissa - X_frame) / W_frame;
relative ordinate = (element absolute ordinate - Y_frame) / H_frame;
outputting the normalized relative coordinates;
step S33, based on the frame identifier extracted in S31, locating the DOM node of the corresponding frame in the browser, and switching the WebDriver operation context to the inside of the frame.
In an embodiment of the invention, the above steps may be implemented in practice through the following specific schemes, for example:
In the above step S31, the XPath paths are parsed: the XPath path of each element in the task queue (such as //html/body/iframe[@id='register']/div/input) is traversed and checked for iframe or frame tags. If an iframe tag exists in the path, the id or name attribute value of the tag (such as register) is extracted as the unique identifier of the frame. If the path contains nested frames (such as parent frame → child frame), the identifier of each level is extracted recursively to generate a frame hierarchy chain (such as ["parent_frame", "child_frame"]).
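Extracting the frame hierarchy chain from an XPath string can be sketched with a regular expression. This is a simplified illustration: it assumes predicates of the form `[@id='...']` or `[@name='...']` as in the examples above, and it splits on `/`, so attribute values containing a slash are not handled.

```python
import re

# Matches a path step that is an iframe/frame tag, capturing an id or name if present.
FRAME_STEP = re.compile(r"^(?:iframe|frame)\b(?:\[@(?:id|name)='([^']*)'\])?")


def frame_chain(xpath):
    """Return the id/name of every frame step in the path, outermost first.

    A frame step without an id or name predicate contributes None.
    """
    chain = []
    for step in xpath.strip("/").split("/"):
        match = FRAME_STEP.match(step)
        if match:
            chain.append(match.group(1))
    return chain


print(frame_chain("//html/body/iframe[@id='register']/div/input"))  # ['register']
print(frame_chain("//form[@class='register-form']/input[1]"))      # []
```

An empty chain means the element sits in the main document, which is the test used later to choose between absolute and frame-relative coordinates.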
The method needs no manual annotation: the frame hierarchy in the page is discovered automatically from XPath syntax features, solving the cross-frame positioning problem that traditional automation tools handle poorly. Using the id/name as the positioning basis is more stable than using coordinates and reduces the risk of positioning failure caused by page layout changes.
In the above step S32, the upper left corner coordinates (X_frame, Y_frame) of the target frame in the parent page and the width (W_frame) and height (H_frame) of the frame are obtained through the browser driver, the absolute pixel coordinates (x_abs, y_abs) of the element are read from the task queue, and the offset of the element relative to the upper left corner of the frame is calculated:
Horizontal offset = x_abs - X_frame;
Vertical offset = y_abs - Y_frame.
Normalization:
The offsets are divided by the frame size to obtain normalized relative coordinates:
Relative abscissa = horizontal offset / W_frame;
Relative ordinate = vertical offset / H_frame.
The normalized coordinates are independent of the specific screen size, so the same set of coordinates can be reused at different resolutions, improving the compatibility of the automation script.
The approach also suits dynamic frames: even if the frame position changes because of responsive page design, the relative coordinates still locate the element accurately as long as the internal structure of the frame is unchanged, reducing maintenance cost.
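The relative coordinate conversion of step S32 amounts to a subtraction and a division. A minimal sketch, with a hypothetical function name and the frame rectangle passed as (X_frame, Y_frame, W_frame, H_frame):

```python
def to_frame_relative(x_abs, y_abs, frame_rect):
    """Convert absolute page pixels to coordinates normalized to a frame.

    frame_rect is (X_frame, Y_frame, W_frame, H_frame): the frame's upper
    left corner in the parent page, then its width and height.
    """
    x_frame, y_frame, w_frame, h_frame = frame_rect
    if w_frame <= 0 or h_frame <= 0:
        raise ValueError("frame width and height must be positive")
    return (x_abs - x_frame) / w_frame, (y_abs - y_frame) / h_frame


# An element at (500, 400) inside a 600x400 frame whose corner is at (200, 100).
print(to_frame_relative(500, 400, (200, 100, 600, 400)))  # (0.5, 0.75)
```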
In the above step S33, the frame identifier extracted in S31 (e.g. id="register") is used to locate the frame DOM node through the switch_to.frame() method of the browser driver (e.g. Selenium), and the context switch is performed so that the scope of all subsequent operations (e.g. clicking and inputting) is limited to the target frame. If there are multiple layers of nested frames, they are switched in sequence along the hierarchy chain (e.g. first to the parent frame, then to the child frame). After switching, an element query through the driver (e.g. find_element_by_xpath()) verifies that elements in the frame can be located correctly, confirming that the switch succeeded.
By context switching, the invention enables the automation tool to operate the elements in the frame like operating the common page, breaks through the limitation that the traditional tool can only operate the top page, ensures that all operations aiming at the elements in the frame are executed in correct contexts, and avoids operation failure or misoperation caused by context confusion.
In a preferred embodiment of the present invention, step S4 traverses the interaction task queue, and sequentially performs the following operations, including performing a simulated click operation on a button element, injecting virtual identity information into an input frame element, and calling a corresponding recognition model for a verification code element to complete verification, including:
S41, sequentially reading task items from the head of the structured task queue generated in S23, and determining an execution sequence according to priority labels in the task items;
step S42, for the current task item:
if the element is located in a frame, the relative coordinates calculated in step S32 are adopted, and if the element is located in the main document, the absolute coordinates converted in step S24 are adopted, and the operation coordinates within the browser viewport are calculated dynamically:
main document: the absolute pixel coordinates are used directly;
within a frame: the relative coordinates are multiplied by the actual size of the frame and the frame offset is added;
Step S43, executing the corresponding operation according to element category: for a button, performing a click operation and waiting 500ms for the page to respond after clicking; for a form, performing an input operation; for a verification code, branching according to the verification code type (such as a sliding verification code);
and step S44, capturing the visible area of the current page again after each task item operation is completed, calling the target detection model to verify the operation effect, and, if the operation failed, recording the positioning information of the current element and reinserting the task at the tail of the queue.
In an embodiment of the invention, the above steps may be implemented in practice through the following specific schemes, for example:
In step S41, the sorted structured task queue (ordered by descending priority, then by descending confidence within the same priority) is obtained from step S23, and task items are extracted one by one in queue order. Each task item contains an element type (button/input box/verification code), coordinate information and a priority label. If a high-priority task (such as a required input box) exists in the queue, it is processed first, and low-priority tasks (such as optional fields) are executed after the high-priority tasks are completed. The invention ensures that core operations (such as clicking the registration button) are executed first and avoids flow interruption caused by processing secondary elements; when a high-priority task fails, the flow can be terminated quickly, reducing resources consumed by invalid operations.
In the step S42, the XPath path of the task item is checked for an iframe tag to determine whether the element is located in a frame: if it is in a frame, the relative coordinates of S32 are used; if it is in the main document, the absolute coordinates of S24 are used.
Coordinate mapping:
Main document elements: the absolute pixel coordinates (x_abs, y_abs) are used directly as the operation point;
Elements within a frame:
The real-time size (W_frame, H_frame) and upper left corner offset (X_frame, Y_frame) of the current frame are acquired;
The operation coordinates are calculated as X = relative abscissa × W_frame + X_frame and Y = relative ordinate × H_frame + Y_frame. If the calculated coordinates fall outside the current viewport, the page is scrolled to bring the element into the visible area. The method handles the positioning of main-document elements and in-frame elements uniformly, without special logic for each context, and because the coordinates are calculated in real time, elements can be operated accurately even if the page size changes through interaction during operation.
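The mapping back to an operation point, and the viewport check that decides whether scrolling is needed, can be sketched as follows. The function names are hypothetical, and the frame rectangle is assumed to be re-read from the browser just before the call.

```python
def to_operation_point(rel_x, rel_y, frame_rect):
    """Map frame-relative coordinates back to a point in the page.

    frame_rect is (X_frame, Y_frame, W_frame, H_frame), read at run time.
    """
    x_frame, y_frame, w_frame, h_frame = frame_rect
    return rel_x * w_frame + x_frame, rel_y * h_frame + y_frame


def needs_scroll(point, viewport_width, viewport_height):
    """True when the point lies outside the currently visible viewport."""
    x, y = point
    return not (0 <= x < viewport_width and 0 <= y < viewport_height)


point = to_operation_point(0.5, 0.75, (200, 100, 600, 400))
print(point, needs_scroll(point, 1920, 1080))  # (500.0, 400.0) False
```

Applying this to the output of the S32 conversion with an unchanged frame rectangle returns the original absolute coordinates, which is a convenient sanity check.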
In step S43, the operation is executed by element category. Button click: a browser driver (e.g. Selenium) sends a mouse click event to the calculated coordinates, then waits 500ms after clicking to give the page time to respond (e.g. loading new content or displaying a prompt box). Form input: the input box element is located, its original content is cleared, and virtual identity information (e.g. a randomly generated name, mobile phone number and email address) is injected according to preset rules. Verification codes are handled by type: for a picture containing characters, an OCR model is called to recognize the characters, which are entered into the corresponding input box; for a sliding verification code, the positions of the slider and the notch are identified, the sliding distance is calculated, and the slider is dragged along a simulated human trajectory (accelerating first, then decelerating); for a verification code that requires clicking pictures, the prompt text is recognized and the corresponding picture area is clicked.
The invention simplifies complex UI interaction into a unified operation interface through classification processing, reduces development cost, and customizes a solution for different types of verification codes so that an automation flow can process more than 80% of common verification mechanisms.
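The "accelerate, then decelerate" drag trajectory for a sliding verification code can be illustrated with an easing curve. The smoothstep curve used here is an assumption chosen for simplicity, not the curve used by the invention; the function name and step count are likewise hypothetical.

```python
def slider_track(distance, steps=20):
    """Split a drag of `distance` pixels into per-step moves.

    Cumulative positions follow the smoothstep curve 3t^2 - 2t^3, which
    starts slowly, speeds up in the middle and slows down at the end.
    """
    positions = []
    for i in range(1, steps + 1):
        t = i / steps
        eased = 3 * t * t - 2 * t * t * t
        positions.append(round(distance * eased))
    # Convert cumulative positions into the offset moved at each step.
    return [b - a for a, b in zip([0] + positions, positions)]


moves = slider_track(120)
print(moves, sum(moves))
```

Each entry would be passed to the browser driver as one small horizontal mouse move, typically with a short pause between moves.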
In the above step S44, after the current task item is completed, the visible area of the page is captured to generate a new image, which is input into the target detection model to identify state changes of the operated element (for example, whether the button has changed to a "clicked" style, or whether the input box has been filled). The attributes of the element before and after the operation (for example, the disabled state of a button or the value attribute of an input box) are compared. If verification fails (for example, the input box is still empty), the positioning information of the element, such as its XPath and coordinates, is recorded, and the task is reinserted at the tail of the queue and marked as requiring retry (at most 3 retries). By verifying the operation effect in real time through model recognition, the invention avoids the hidden error of an operation that appears successful but did not take effect, and automatically retrying failed tasks improves the success rate of the flow (tests show that the overall pass rate can be improved by more than 25%).
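The verify-and-retry loop of step S44 maps naturally onto a double-ended queue. In this sketch `execute` and `verify` are caller-supplied placeholders for the browser operation and the model-based check, `task_id` is a hypothetical field, and the limit is read as three retries after the first attempt.

```python
from collections import deque


def run_queue(tasks, execute, verify, max_retries=3):
    """Run tasks in order; re-queue failed ones at the tail, up to max_retries."""
    queue = deque(tasks)
    retries = {}   # task_id -> number of retries scheduled so far
    failed = []    # tasks that still failed after all retries
    while queue:
        task = queue.popleft()
        execute(task)
        if verify(task):
            continue
        count = retries.get(task["task_id"], 0) + 1
        retries[task["task_id"]] = count
        if count <= max_retries:
            queue.append(task)   # reinsert at the tail of the queue
        else:
            failed.append(task)
    return failed


calls = []
failed = run_queue(
    [{"task_id": 1}, {"task_id": 2}],
    execute=lambda task: calls.append(task["task_id"]),
    verify=lambda task: task["task_id"] == 1,   # task 2 never verifies
)
print(calls, failed)
```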
In a preferred embodiment of the present invention, step S5, detecting whether a visible commit button exists on a current page after completion of the operation of the interactive task queue, if not, scrolling the page and re-executing steps S1 to S2 until the visible commit button is identified, and if so, triggering the commit operation, including:
step S51, based on the latest screenshot of the page after the operation of step S44 is completed, a target detection model is called to identify whether a 'submit button' class element exists in the current visible area;
Step S52, when the detection result of step S51 is that no submit button is visible:
Acquiring the total height of the current page and the height of the browser window, setting the single scroll amount to 80% of the window height, and executing the scrolling operation:
recording the current position of the scroll bar as the initial position, triggering the browser to scroll down by one scroll amount, and waiting 500ms for the page to redraw;
status update and looping:
re-executing step S1 to capture a new visible area image, re-executing step S2 to generate a new interaction task queue, returning to step S51 for submit button detection, and looping until either of the following conditions is met:
a visible submit button is detected, or the accumulated scroll amount exceeds the total height of the page;
Step S53, in the process of re-identifying each time of scrolling, an operated element mark library is established, and the interaction task queue newly generated in the step S2 is filtered;
step S54, locating button coordinates when step S51 detects a visible submit button.
In an embodiment of the invention, the above step S51 may be implemented, for example, by inputting the latest page screenshot generated in step S44 into the target detection model, classifying all elements in the screenshot with the model, screening out the elements labeled "submit button" (such as 'register', 'submit now', and the like), checking whether the button coordinates are within the current window range, and excluding occluded elements (such as a button covered by a popup or floating layer).
The above step S52 may be implemented, for example, by obtaining the total height of the page (e.g., document.body.scrollHeight) and the window height, and setting the single scroll amount to 80% of the window height (to avoid missing elements through excessive scrolling).
Scrolling is performed:
The current position of the scroll bar is recorded as the initial position, the browser is triggered to scroll down by one scroll amount (for example, to the current position + window height × 80%), and a 500ms wait ensures that the page has finished redrawing and dynamic loading. After each scroll, step S1 (screenshot) and step S2 (task queue generation) are re-executed and step S51 is called again to detect the submit button, until a termination condition is met: a visible submit button is detected, or the accumulated scroll amount exceeds the total height of the page (indicating that all content has been traversed).
According to the invention, through the self-adaptive rolling strategy, even if the submit button is positioned at the bottom of a long page or in a dynamic loading area, the submit button can be detected, the reasonable rolling step length and waiting time balance the detection efficiency and the page loading integrity, and the average detection time is shortened by 30%.
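The scroll-and-detect loop can be sketched with the browser interactions injected as callables, so the control flow is visible without a live browser. `scroll_by` and `detect_submit` are hypothetical placeholders: in practice the former would scroll the page and the latter would re-run steps S1, S2 and S51.

```python
import time


def find_submit_by_scrolling(page_height, window_height, scroll_by,
                             detect_submit, wait_s=0.5):
    """Scroll in steps of 80% of the window until a submit button is seen.

    Returns True when detect_submit() succeeds, False once the accumulated
    scroll amount exceeds the total page height.
    """
    step = max(1, int(window_height * 0.8))
    scrolled = 0
    while True:
        if detect_submit():
            return True
        if scrolled > page_height:
            return False
        scroll_by(step)
        scrolled += step
        time.sleep(wait_s)  # let the page redraw and load dynamic content


# Simulated page: the button becomes visible once we have scrolled 1600 px.
position = {"y": 0}


def fake_scroll(pixels):
    position["y"] += pixels


found = find_submit_by_scrolling(
    3000, 1000, fake_scroll, lambda: position["y"] >= 1600, wait_s=0
)
print(found, position["y"])  # True 1600
```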
In step S53, when step S4 is first executed, an empty operated-element mark library (such as a set processed_elements) is established, and each time a task item is completed, the XPath or unique identifier of the element is added to the mark library. When a new task queue is generated after each scroll, every element in the queue is traversed, and any element whose identifier exists in the mark library is removed from the queue. The invention thereby prevents repeated operations on an already filled input box or an already clicked button, reducing invalid interaction and avoiding page state confusion caused by repeated operations (such as duplicate requests triggered by clicking the submit button repeatedly).
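The mark library is essentially a set used as a filter. A minimal sketch, assuming each task item carries its element's XPath under a hypothetical `xpath` key:

```python
processed_elements = set()  # XPaths of elements that were already operated on


def mark_done(task):
    """Record a completed task's element so it is not operated on again."""
    processed_elements.add(task["xpath"])


def filter_new_tasks(queue):
    """Keep only the tasks whose element has not been processed yet."""
    return [task for task in queue if task["xpath"] not in processed_elements]


mark_done({"xpath": "//input[@id='username']", "operation": "input"})
new_queue = [
    {"xpath": "//input[@id='username']", "operation": "input"},
    {"xpath": "//input[@id='phone']", "operation": "input"},
]
print(filter_new_tasks(new_queue))
```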
In the step S54, the bounding box coordinates (left, top, width, height) of the submit button are extracted from the output of the target detection model, and the center point of the button is calculated as (left + width/2, top + height/2). If the button is located in a frame, the center point coordinates are mapped to the browser window using the relative coordinate conversion method of S32; if it is in the main document, the absolute coordinates are used directly. Because the center point is calculated from the bounding box output by the model, positioning is more accurate than with traditional text-based or CSS selectors, and the success rate is increased to 98%; even if the website changes the class or ID of the button, it can still be located accurately as long as its visual style is unchanged, reducing maintenance cost by 50%.
In a preferred embodiment of the present invention, step S6, if the user jumps to the recharging page after submitting, extracts payment account information through a regular expression, and stores the payment account information in association with the virtual identity information in a database, including:
step S61, based on the page jump result triggered in the step S54, monitoring whether the URL of the new page contains a preset keyword, intercepting the complete HTML source code of the current page, and performing a double verification mechanism to obtain a recharging page passing verification;
and step S62, performing hierarchical parsing on the recharging page that passed verification in step S61, standardizing the data and storing it in association, and, when a page does not pass the verification in step S61, packaging and transmitting it to a data construction module.
In an embodiment of the present invention, the above steps may be implemented in a manner, for example,
In step S61, characteristic keywords of the recharge page (e.g. "recharge", "payaccount", "fund-transfer") are preset, and the path or parameter portion of the new page's URL is parsed to determine whether it contains any of the keywords. The object detection model is then called to identify whether recharge-related UI elements (e.g. "payment account" or "account-opening bank" text labels, or an account input box) exist in the page, and the HTML source code is checked for a payment-related form structure (e.g. <form action="/pay">) or a specific JS file reference (e.g. a payment SDK script). If the double verification passes, the complete HTML source code is captured and redundant content such as comments and blank lines is removed. If it fails (e.g. the page jumped to an error page or an advertisement page), the erroneous URL is recorded and subsequent operations are terminated.
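The URL part of the check in S61 can be illustrated with the standard library. The sketch below is an assumption-laden illustration: the text does not say how the individual checks are combined, so the rule in passes_double_check (URL keyword plus at least one piece of page evidence) is hypothetical, as are the function names.

```python
from urllib.parse import urlparse

# Keywords quoted in the description; a real deployment would configure these.
RECHARGE_KEYWORDS = ("recharge", "payaccount", "fund-transfer")


def url_has_keyword(url: str, keywords=RECHARGE_KEYWORDS) -> bool:
    """Check only the path and query string, not the host name."""
    parts = urlparse(url)
    haystack = (parts.path + "?" + parts.query).lower()
    return any(keyword in haystack for keyword in keywords)


def passes_double_check(url: str, ui_evidence: bool, html_evidence: bool) -> bool:
    """Hypothetical combination rule: URL keyword AND (UI or HTML evidence)."""
    return url_has_keyword(url) and (ui_evidence or html_evidence)


ok = passes_double_check("https://example.com/user/recharge?step=1", True, False)
```

Restricting the match to the path and query avoids false positives from a keyword that appears only in the domain name.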
In step S62, the text content inside HTML tags (e.g. <span id="account">123456789</span>) is matched through regular expressions, and account-related fields are extracted. The target data is located by combining contextual semantics (e.g. keywords such as "bank account" and "payment account"), and interfering information (e.g. false accounts in advertisements) is excluded. Format cleaning is then performed on the extracted account information:
Bank account numbers have spaces and special symbols removed and are unified into pure digits; account-opening bank names are mapped to standard names (e.g. "Industrial and Commercial Bank of China" is uniformly abbreviated as "ICBC"). The virtual identity information of the current operation (e.g. the generated random name and mobile phone number) is obtained from the task queue, and a one-to-one association between account information and identity information is established through the task ID. The associated data is written into a structured database (e.g. MySQL), with fields including virtual name, mobile phone number, payment account, account-opening bank, association timestamp, and the like. If the page does not pass the recharge-page verification, the original URL, screenshot and source code are packaged and transmitted to the data construction module for expanding the training set or optimizing the verification rules.
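The format-cleaning rules described above (digits only, standard bank names) can be sketched as two small helpers. The alias table and function names are hypothetical, and the sample number is made up purely for illustration.

```python
import re

# Hypothetical alias table; only the example from the description is listed.
BANK_ALIASES = {
    "Industrial and Commercial Bank of China": "ICBC",
}


def clean_account(raw: str) -> str:
    """Remove spaces and special symbols, keeping only the digits."""
    return re.sub(r"\D", "", raw)


def standard_bank_name(name: str) -> str:
    """Map a bank name to its standard form, leaving unknown names unchanged."""
    stripped = name.strip()
    return BANK_ALIASES.get(stripped, stripped)


account = clean_account("6222 0212-3456 789")
bank = standard_bank_name("  Industrial and Commercial Bank of China ")
```

Unknown bank names are passed through unchanged rather than dropped, so that unmapped names can still be reviewed later.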
Through hierarchical parsing and standardization, the invention converts unstructured page data into directly usable service data that meets the data format requirements of downstream systems. Data that fails verification is fed back for model training, forming an identification-verification-optimization data closed loop that continuously improves the system's ability to recognize new types of recharge pages. The virtual identity and the account information are strongly associated, which provides complete context data for subsequent services such as batch recharging and fund management and reduces the cost of manual data matching.
A system for automated business carrier registration and login and fund account acquisition, comprising:
an acquisition module, used for accessing the target website domain name through a headless browser and, after loading is complete, taking a screenshot of the current visible area to generate a page image;
a generation module, used for inputting the page image into a pre-trained object detection model and identifying the category, position coordinates and confidence of the UI elements in the page;
a conversion module, used for converting the coordinates of elements located in a frame in the interactive task queue into coordinates relative to the frame, and controlling the browser to switch to the corresponding frame;
a verification module, used for traversing the interactive task queue and sequentially performing the following operations: performing a simulated click operation on button elements; injecting virtual identity information into input box elements; and calling the corresponding recognition model for verification code elements to complete verification;
a processing module, used for page scrolling and submission detection: after the interactive task queue operations are completed, detecting whether a visible submit button exists on the current page until a visible submit button is identified, and triggering the submit operation if it is visible; and, if the page jumps to a recharge page after submission, extracting payment account information through a regular expression and storing it in a database in association with the virtual identity information.
Example 1: Data set preparation and model training
In this embodiment, a structured image dataset covering 17 types of UI elements is constructed by taking screenshots of websites in the target field, screening them, deduplicating them by clustering, and labeling them manually, which provides high-quality training samples for the object detection model. A YOLOv series model is selected as the UI recognition framework. The training process combines a batch incremental training strategy, CIoU bounding-box loss optimization and a multi-scale feature fusion structure, and a genetic algorithm is introduced to tune the hyperparameters and the model structure automatically, so as to improve detection accuracy, training efficiency and model generalization. The specific steps are as follows:
Step 1, target website collection, screening and access verification: about 20,000 suspicious business website domain names are randomly extracted from the domain name database. The system accesses each domain name with an automated script and performs preliminary screening according to page characteristics, removing the following types of sites: 1) sites whose pages cannot be loaded or return a 404 error; 2) sites whose content is empty or unrelated to the business; and 3) sites that use non-standard structures and cannot be captured in a screenshot.
Step 2, automated screenshot process: for the screened websites, the system takes screenshots in the following order to ensure that the pages on the key registration path are captured: 1) close the home page popup (if any); 2) click the registration button after the home page has loaded; 3) after the registration page has fully loaded, scroll to the bottom of the page and capture a page containing all form fields and the "submit" button; and 4) after registration, jump to the business operation page and capture the page containing the operation mode information.
Step 3, image deduplication and clustering-based redundancy removal: to improve training efficiency, the screenshot images are deduplicated using a Ward-based hierarchical clustering method; repeated samples are removed and the cluster-center images are retained to ensure the diversity and representativeness of the data.
Step 4, manual labeling: the 17 types of UI elements in the screenshot images are labeled manually with the Yolo_mark tool, and the label format conforms to the txt format required by the YOLOv series. Each file contains the element class number and its normalized coordinates in the image.
Step 5, data organization and division: after the data set is constructed, it is divided into a training set, a verification set and a test set in the ratio 8:1:1, ensuring a balanced sample distribution across the three types of pages (popup, home page and registration page).
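The 8:1:1 division in Step 5 can be sketched as follows. This is a minimal illustration with hypothetical names; balancing the three page types could be approximated by calling the function once per page type and concatenating the results, which is an assumption rather than something the text specifies.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(items: Sequence[str],
                  ratios: Tuple[int, int, int] = (8, 1, 1),
                  seed: int = 0) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle deterministically, then cut into train / validation / test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_val = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


images = [f"img_{i:03d}.png" for i in range(100)]
train_set, val_set, test_set = split_dataset(images)
```

Fixing the seed keeps the split reproducible between training runs.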
Step 6, genetic-algorithm parameter optimization: before formal training, the system introduces a genetic algorithm to perform a global search and automatic optimization of the key training parameters of the YOLOv series model, so as to improve the performance and generalization of the model on the web page UI element recognition task.
(1) Initialization: a fitness function is defined, with the performance index of the model on the verification set as the evaluation criterion. Fitness measures the quality of an individual and is the core basis on which the genetic algorithm performs selection and evolution. The parameter space to be optimized includes the learning rate, batch size, anchor box size, multi-scale feature fusion structure configuration (FPN levels), attention mechanism insertion position, and so on.
(2) Population generation and evolution: the system initially generates a number of individuals (each individual is one combination of parameters). In each generation, every individual is trained for a small number of epochs (e.g. 10 to 20) and its fitness value is calculated. The standard genetic operations (selection, crossover, mutation) are then performed.
(3) Iterative optimization: the algorithm iterates generation by generation according to the change in fitness until a stopping condition is met (e.g. the fitness improvement falls below a threshold or the maximum number of generations is reached), and the parameters of the individual with the highest fitness are finally retained for formal training.
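The selection, crossover and mutation loop of Step 6 can be sketched in a few lines. Everything specific here is an assumption made for illustration: the two-parameter search space, truncation selection with elitism, the fixed generation count, and above all toy_fitness, which stands in for the short training run and validation-set evaluation described in the text.

```python
import random
from typing import Callable, Dict

Individual = Dict[str, float]
# Hypothetical search space: learning rate and batch size only.
SEARCH_SPACE = {"lr": (1e-4, 1e-1), "batch": (8, 64)}


def random_individual(rng: random.Random) -> Individual:
    return {"lr": rng.uniform(*SEARCH_SPACE["lr"]),
            "batch": rng.randint(*SEARCH_SPACE["batch"])}


def crossover(a: Individual, b: Individual, rng: random.Random) -> Individual:
    """Uniform crossover: each gene is taken from one parent at random."""
    return {key: (a[key] if rng.random() < 0.5 else b[key]) for key in a}


def mutate(ind: Individual, rng: random.Random, rate: float = 0.2) -> Individual:
    """Resample each gene with probability `rate`."""
    child = dict(ind)
    if rng.random() < rate:
        child["lr"] = rng.uniform(*SEARCH_SPACE["lr"])
    if rng.random() < rate:
        child["batch"] = rng.randint(*SEARCH_SPACE["batch"])
    return child


def evolve(fitness: Callable[[Individual], float], pop_size: int = 12,
           generations: int = 15, seed: int = 0) -> Individual:
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # selection; the best half survives
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)


def toy_fitness(ind: Individual) -> float:
    """Stand-in for 'train briefly, score on the validation set'."""
    return -abs(ind["lr"] - 0.01) - abs(ind["batch"] - 32) / 100


best = evolve(toy_fitness)
```

Because the best half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. An early-stopping test on the fitness improvement, as the text describes, could replace the fixed generation count.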
Step 7, three-stage model training based on the optimized parameters: on the basis of the genetic algorithm optimization result, the model is trained in three stages:
Stage 1: freezing the backbone network and training only the detection head to stabilize the basic recognition capability;
Stage 2: unfreezing part of the backbone layers and introducing a multi-scale training mechanism to improve the detection accuracy for UI components of different sizes;
Stage 3: fine-tuning the whole network and further optimizing model performance by combining strategies such as incremental samples and hard-example mining.
Example 2: Automatic registration flow
In this embodiment, through tight coupling of model recognition and browser execution, the system automatically analyzes the web page structure and executes simulated user behavior according to a priority mechanism, overcoming the fixed-path limitation of traditional scripts. The specific steps are as follows:
Step 1, initial access and page screenshot: the system calls Selenium WebDriver through the automation control module, starts a headless browser and accesses the domain name of the target service website. After the page has loaded, a screenshot of the current visible area is taken automatically and passed to the object detection model trained in Example 1.
Step 2, UI element identification and priority ordering: the model identifies the 17 types of UI elements in the page and outputs category labels, normalized position coordinates and confidence values. The system assigns the elements to five priority levels according to a preset priority mechanism (IMMEDIATE, HIGH, MODERATE, LOW, LAST):
Immediate priority: "close popup" buttons and popups that cover critical areas;
High priority: jump-type components such as the registration button and the login button;
Moderate priority: form input components (username and password input boxes, etc.);
Low priority: verification code components (graphic verification code, sliding verification code);
Last priority: components such as the submit button and contact customer service, which are not interacted with immediately.
After the priorities are assigned, an interactive task queue is generated in descending order of priority, containing information such as element category, coordinates, confidence and XPath.
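The five-level ordering can be sketched as a sort with a composite key. The category names and the dictionary layout of a detection are hypothetical; the tie-break on confidence follows step S23 of the claims, and unknown categories are assumed to fall into the LAST level.

```python
from enum import IntEnum
from typing import Dict, List


class Priority(IntEnum):
    IMMEDIATE = 5
    HIGH = 4
    MODERATE = 3
    LOW = 2
    LAST = 1


# Hypothetical category names for a few of the 17 UI element types.
CATEGORY_PRIORITY = {
    "close_popup": Priority.IMMEDIATE,
    "register_button": Priority.HIGH,
    "login_button": Priority.HIGH,
    "username_input": Priority.MODERATE,
    "password_input": Priority.MODERATE,
    "image_captcha": Priority.LOW,
    "slider_captcha": Priority.LOW,
    "submit_button": Priority.LAST,
    "contact_support": Priority.LAST,
}


def build_task_queue(detections: List[Dict]) -> List[Dict]:
    """Sort by priority (descending), then by confidence (descending)."""
    return sorted(
        detections,
        key=lambda d: (-CATEGORY_PRIORITY.get(d["category"], Priority.LAST),
                       -d["confidence"]),
    )


detections = [
    {"category": "submit_button", "confidence": 0.99},
    {"category": "password_input", "confidence": 0.80},
    {"category": "close_popup", "confidence": 0.70},
    {"category": "username_input", "confidence": 0.95},
]
queue = build_task_queue(detections)
order = [d["category"] for d in queue]
```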
Step 3, coordinate conversion for elements inside frames: if an element is located inside an iframe or frame, the system converts its coordinates into a position relative to the frame, switches WebDriver to the corresponding frame, and converts the coordinates into an XPath using a custom script to ensure operation accuracy.
Step 4, simulated operation execution: the task queue is traversed and the operation corresponding to each category is executed:
Click operation: simulating clicks on buttons such as register, close and submit;
Input operation: entering the compliant identity information generated by the virtual information pool into the form fields;
Verification code processing:
Graphic verification code: recognized with an OCR model and entered;
Sliding verification code: the sliding distance is calculated through image processing and the drag is simulated.
After each operation is completed, a new screenshot is taken automatically, recognition is repeated, and the flow proceeds to the next step.
Step 5, page scrolling and submission detection: after the interactive operations are completed, the system analyzes the page state:
If a "submit button" is identified in the visible area, the submission is executed directly;
If the button is not identified or is located outside the window, the page is scrolled automatically and recognition is repeated until the button is detected;
An interaction locking mechanism is established to mark the elements that have already been operated on and avoid repeated triggering.
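The scroll-and-redetect loop can be planned ahead as a list of scroll offsets. The sketch below uses the step of 80% of the window height mentioned in claim 6; clamping the last offset to the bottom of the page is an assumption, and the function name is hypothetical.

```python
from typing import List


def scroll_positions(page_height: int, viewport_height: int,
                     step_ratio: float = 0.8) -> List[int]:
    """Scroll offsets to visit, in steps of step_ratio x viewport height."""
    step = int(viewport_height * step_ratio)
    if step <= 0:
        raise ValueError("viewport_height and step_ratio must give a positive step")
    max_offset = max(page_height - viewport_height, 0)  # bottom of the page
    positions: List[int] = []
    offset = 0
    while offset < max_offset:
        offset = min(offset + step, max_offset)  # never scroll past the bottom
        positions.append(offset)
    return positions


# A 3000 px page in a 1000 px window is covered in three scrolls.
offsets = scroll_positions(3000, 1000)
```

In a real run, the loop would stop early as soon as the detector reports a visible submit button at the current offset. The 20% overlap between consecutive views reduces the chance that a button is cut in half at a view boundary.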
Step 6, registration status judgment: after submission, the system determines whether the page has jumped to a business operation page:
On failure: the state and an error screenshot are recorded, and the task is marked as failed and passed to the data module;
On success: the information extraction process is entered.
Example 3: Service information extraction
After the business operation page has been entered successfully, the system parses its content by matching regular expressions and key fields, extracts the information and stores it in standardized form. The specific steps are as follows:
Step 1, page parsing and information extraction: the screenshot or the HTML source code is parsed, and the operation mode and account information are identified by matching regular expressions and keywords, including:
payment account and account-opening bank information, third-party payment identifiers, electronic payment accounts and other compliant service fields.
Step 2, information classification and standardization: format cleaning (e.g. removing spaces and mapping to standard names) is performed on the extracted information, which is then classified according to labels such as type, source and time.
Step 3, data association and storage: the virtual identity information used for registration is associated with the extracted service operation information and stored in a back-end database to support subsequent service analysis.
Example 4: Exception and feedback mechanism
Through task state monitoring and failure sample feedback, exception identification and data reflow are realized, supporting model optimization and system iteration. The specific steps are as follows:
Step 1, exception identification: if any of the following exceptions is encountered in the registration flow, the operation is interrupted and the reason is recorded:
element identification failure or misjudgment, page jump abnormality, verification code processing failure, and logic interruption caused by repeated operation.
Step 2, failure sample feedback: failed tasks are fed back to the data module, and the screenshot and failure classification are stored automatically for incremental model training and system optimization.
Step 3, locking mechanism and maximum number of attempts: to prevent infinite loops, a maximum number of operations is set for each type of UI element (by default no more than 2, except for the close button, for which up to 5 are allowed).
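The attempt limit in Step 3 can be sketched as a small counter keyed by element. The class name, the category label "close_popup" and the choice to count per element identifier are assumptions; only the limits (2 by default, 5 for the close button) come from the text.

```python
from collections import Counter

DEFAULT_MAX_ATTEMPTS = 2
MAX_ATTEMPTS_BY_CATEGORY = {"close_popup": 5}  # hypothetical category name


class AttemptLimiter:
    """Counts operations per element and refuses once the limit is reached."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def allow(self, element_id: str, category: str) -> bool:
        limit = MAX_ATTEMPTS_BY_CATEGORY.get(category, DEFAULT_MAX_ATTEMPTS)
        if self._counts[element_id] >= limit:
            return False
        self._counts[element_id] += 1
        return True


limiter = AttemptLimiter()
button_results = [limiter.allow("//button[@id='reg']", "register_button")
                  for _ in range(3)]
popup_results = [limiter.allow("//div[@class='x']", "close_popup")
                 for _ in range(6)]
```

Calling allow before each operation guarantees that a misdetected element cannot keep the flow in an endless retry loop.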
These embodiments take a service automation flow as the scenario and present the technical scheme through a compliance-oriented description. The main features are as follows:
Model generalization is improved through clustering deduplication, hierarchical labeling and genetic-algorithm optimization; full-process automation is achieved through priority-based task scheduling, cross-frame operation and dynamic scroll detection; and the stability and adaptability of the system are continuously improved through the reflow of failure samples.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations are intended to be comprehended within the scope of the present invention.

Claims (9)

1. A method for automated business carrier registration and login and fund account acquisition, characterized in that the method comprises:
Step S1: accessing the target website domain name through a headless browser and, after loading is complete, taking a screenshot of the current visible area to generate a page image;
Step S2: inputting the page image into a pre-trained object detection model to identify the category, position coordinates and confidence of the UI elements in the page; sorting the recognition results based on preset priority rules to generate an interactive task queue arranged in descending order of priority; wherein the pre-trained object detection model is obtained by:
extracting target website domain names from an abused-domain-name database, accessing the domain names through an automated script and verifying the page loading status; for websites that pass verification, automatically triggering popup closing, registration button clicking and page scrolling, and capturing full page screenshots containing the key interactive elements; based on the full page screenshots, grouping images with similar visual features using a hierarchical clustering algorithm to obtain deduplicated images; annotating the deduplicated images with the category labels and position information of 17 types of UI elements to generate annotation files conforming to the object detection format; integrating the annotation files and dividing them into a training set, a validation set and a test set according to a preset ratio; and, based on the training set, validation set and test set, performing the following automatic optimization process:
initializing a parameter population, each parameter set including the learning rate, anchor box size and network structure configuration; training the model for each parameter combination and calculating the fitness using the validation set accuracy and loss value; iteratively generating new populations through selection, crossover and mutation operations until the fitness converges, to obtain optimized parameters; and, using the optimized parameters, fine-tuning the entire network and incorporating incremental sample training to obtain the pre-trained object detection model;
Step S3: for elements in the interactive task queue that are located within a frame, converting their coordinates into coordinates relative to the frame, and controlling the browser to switch to the corresponding frame;
Step S4: traversing the interactive task queue and performing the following operations in sequence: performing a simulated click operation on button-type elements; injecting virtual identity information into input-box-type elements; and calling the corresponding recognition model to complete verification for verification-code-type elements;
Step S5: page scrolling and submission detection: after the interactive task queue operations are completed, detecting whether a visible submit button exists on the current page; if not, scrolling the page and re-executing steps S1 to S2 until a visible submit button is identified; if so, triggering the submit operation;
Step S6: if the page jumps to a recharge page after submission, extracting payment account information through a regular expression and storing it in a database in association with the virtual identity information.
2. The method for automated business carrier registration and login and fund account acquisition according to claim 1, characterized in that step S1, accessing the target website domain name through a headless browser and, after loading is complete, taking a screenshot of the current visible area to generate a page image, comprises:
Step S11: after the page has loaded, detecting whether a popup element blocking a key area exists in the current visible area; if so, automatically calculating the coordinate position of the popup close button and triggering a simulated click operation to expose the underlying page elements;
Step S12: based on the page state with the popup cleared, identifying the position of the registration button in the page; calculating the coordinates of the center point of the registration button and performing a coordinate-positioned click through the browser driver to trigger a jump to the registration page;
Step S13: after the registration page has loaded, controlling the browser to perform a vertical scrolling operation: the initial scroll position is the top of the page, the scroll step is set by dividing by the screen height, and the page is scrolled down step by step until the bottom of the page is reached; after scrolling to the bottom, taking a screenshot to capture a complete image including all form input boxes and the submit button at the bottom of the page;
Step S14: after the registration form submission operation is completed, monitoring the URL of the new page to which the browser jumps; when the page URL contains a characteristic keyword, determining that it is a recharge page and, after the page has loaded, directly capturing an image of the current visible area.
3. The method for automated business carrier registration and login and fund account acquisition according to claim 2, characterized in that step S2, inputting the page image into a pre-trained object detection model to identify the category, position coordinates and confidence of the UI elements in the page, and sorting the recognition results based on preset priority rules to generate an interactive task queue arranged in descending order of priority, comprises:
Step S21: inputting the page images generated in step S13 and step S14 into the pre-trained object detection model and outputting a set of recognition results for each UI element, each recognition result including a category label, position coordinates and confidence;
Step S22: assigning a priority to each recognition result output in step S21 according to the Euclidean distance between the center point of the element and the center of the page, to obtain the weights corresponding to the category label, the position coordinates and the confidence respectively; obtaining a priority score from these weights, and dividing the elements into five levels from high to low according to the score, to obtain a set of elements with priority labels;
Step S23: arranging the set of elements with priority labels output in step S22 in descending order of priority, with elements of the same priority arranged in descending order of confidence, to generate a structured task queue;
Step S24: traversing the structured task queue of step S23 and, for the normalized coordinates of each element, calculating its absolute pixel coordinates in the browser window; converting the absolute pixel coordinates into an XPath path through a DOM position inversion algorithm.
4. The method for automated business carrier registration and login and fund account acquisition according to claim 3, characterized in that step S3, for elements in the interactive task queue that are located within a frame, converting their coordinates into coordinates relative to the frame and controlling the browser to switch to the corresponding frame, comprises:
Step S31: analyzing the features of the XPath path generated in S24; if the path contains HTML tags or a frame hierarchy, determining that the corresponding element is located within a frame, and extracting the frame identifier of the frame in the parent page, i.e. the ID or Name attribute;
Step S32: relative coordinate conversion calculation: reading, through the browser interface, the pixel coordinates (X_frame, Y_frame) of the upper left corner of the frame in the parent page, and reading the frame's own width W_frame and height H_frame;
calculating the relative position of the element within the frame:
relative horizontal coordinate = (element absolute horizontal coordinate - X_frame) ÷ W_frame;
relative vertical coordinate = (element absolute vertical coordinate - Y_frame) ÷ H_frame;
outputting the normalized relative coordinates;
Step S33: based on the frame identifier extracted in S31, locating the DOM node of the corresponding frame in the browser and switching the WebDriver operation context to the inside of the frame.
5. The method for automated business carrier registration and login and fund account acquisition according to claim 4, characterized in that step S4, traversing the interactive task queue and performing the following operations in sequence: performing a simulated click operation on button-type elements; injecting virtual identity information into input-box-type elements; and calling the corresponding recognition model to complete verification for verification-code-type elements, comprises:
Step S41: reading task items sequentially starting from the head of the structured task queue generated in S23, and determining the execution order according to the priority labels in the task items;
Step S42: for the current task item:
if the element is located within a frame, using the relative coordinates calculated in step S32; if it is located in the main document, using the absolute coordinates converted in step S24; dynamically calculating the operation coordinates in the browser viewport:
main document: using the absolute pixel coordinates directly;
within a frame: relative coordinates × actual frame size + frame offset;
Step S43: performing the operation corresponding to the element category: a button click operation, with a 500 ms wait for page response after the click; a form input operation; and branch processing according to the verification code type, including sliding verification codes;
Step S44: after each task item operation is completed, re-capturing the visible area of the current page and calling the object detection model to verify the effect of the operation; if the operation failed, recording the positioning information of the current element and reinserting the task at the tail of the queue.
6. The method for automated business carrier registration and login and fund account acquisition according to claim 5, characterized in that step S5, page scrolling and submission detection, in which, after the interactive task queue operations are completed, it is detected whether a visible submit button exists on the current page, the page is scrolled and steps S1 to S2 are re-executed if not until a visible submit button is identified, and the submit operation is triggered if so, comprises:
Step S51: based on the latest page screenshot after the S44 operation is completed, calling the object detection model to identify whether a "submit button" type element exists in the current visible area;
Step S52: when the detection result of step S51 is not visible, executing:
obtaining the total height of the current page and the height of the browser window, and setting the single scroll amount to 80% of the window height; performing the scrolling operation:
recording the current scroll bar position as the initial position, triggering the browser to scroll down by one scroll amount, and waiting 500 ms for the page to redraw;
state update and loop:
re-executing step S1 to capture an image of the new visible area; re-executing step S2 to generate a new interactive task queue; returning to step S51 for submit button detection, and looping until either of the following conditions is met:
a visible submit button is detected; or the cumulative scroll amount exceeds the total height of the page;
Step S53: during each scroll-and-re-identify pass, establishing a library of operated-element marks and filtering the interactive task queue newly generated in S2;
Step S54: when a visible submit button is detected in step S51, locating the button coordinates.
7. The method for automated business carrier registration and login and fund account acquisition according to claim 6, characterized in that step S6, if the page jumps to a recharge page after submission, extracting payment account information through a regular expression and storing it in a database in association with the virtual identity information, comprises:
Step S61: based on the page jump result triggered by step S54, monitoring whether the URL of the new page contains a preset keyword, capturing the complete HTML source code of the current page, and applying a double verification mechanism to obtain a verified recharge page;
Step S62: performing hierarchical parsing on the recharge page verified in step S61, standardizing the data and storing it in association; when the page does not pass the recharge-page verification of step S61, packaging and transmitting it to the data construction module.
8. The method for automated business carrier registration and login and fund account acquisition according to claim 7, characterized in that calculating the fitness using the validation set accuracy and loss value comprises:
based on the model training output of the current parameter combination, loading its prediction results on the validation set;
counting the matches between predicted boxes and ground-truth boxes, i.e. for each image in the validation set, calculating the overlap between the boxes predicted by the object detection model and the annotated boxes;
for each of the 17 types of UI elements, calculating the proportion of correctly detected elements to the total number of annotations of that category, and taking the arithmetic mean of the accuracy values of all categories to obtain the validation set average accuracy index;
extracting the CIoU loss values of all samples from the validation log of the object detection model and taking their arithmetic mean to obtain the mean CIoU loss;
defining an accuracy weight coefficient and a loss weight coefficient, weighting the average accuracy index positively and the mean CIoU loss negatively, to generate a fitness score.
9. A system for automated business carrier registration and login and fund account acquisition, characterized in that the system is used to execute the method according to any one of claims 1 to 8, and comprises:
an acquisition module, used for accessing the target website domain name through a headless browser and, after loading is complete, taking a screenshot of the current visible area to generate a page image;
a generation module, used for inputting the page image into a pre-trained object detection model to identify the category, position coordinates and confidence of the UI elements in the page, and sorting the recognition results based on preset priority rules to generate an interactive task queue arranged in descending order of priority;
a conversion module, used for converting the coordinates of elements in the interactive task queue that are located within a frame into coordinates relative to the frame, and controlling the browser to switch to the corresponding frame;
a verification module, used for traversing the interactive task queue and performing the following operations in sequence: performing a simulated click operation on button-type elements; injecting virtual identity information into input-box-type elements; and calling the corresponding recognition model to complete verification for verification-code-type elements;
a processing module, used for page scrolling and submission detection: after the interactive task queue operations are completed, detecting whether a visible submit button exists on the current page until a visible submit button is identified, and triggering the submit operation if it is visible; and, if the page jumps to a recharge page after submission, extracting payment account information through a regular expression and storing it in a database in association with the virtual identity information.
If it is visible, the submission operation is triggered; after submission, if the page jumps to the recharge page, the payment account information is extracted through regular expressions and associated with the virtual identity information and stored in the database.
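The fitness calculation of claim 8 can be illustrated with a minimal Python sketch. The claim states only that the mean accuracy is weighted positively and the mean CIoU loss negatively; the linear form `w_acc * accuracy - w_loss * loss`, the default weights, and all function and parameter names below are assumptions made for illustration, not the patented formula.

```python
from typing import Mapping, Sequence


def mean_class_accuracy(correct: Mapping[str, int], total: Mapping[str, int]) -> float:
    """Arithmetic mean, over classes, of correct detections / annotated instances."""
    ratios = [correct.get(cls, 0) / n for cls, n in total.items() if n > 0]
    if not ratios:
        raise ValueError("no annotated classes in the validation set")
    return sum(ratios) / len(ratios)


def fitness(
    correct: Mapping[str, int],
    total: Mapping[str, int],
    ciou_losses: Sequence[float],
    w_acc: float = 1.0,   # accuracy weight coefficient (assumed value)
    w_loss: float = 1.0,  # loss weight coefficient (assumed value)
) -> float:
    """Assumed linear combination: accuracy counts for, mean CIoU loss counts against."""
    if not ciou_losses:
        raise ValueError("no CIoU loss values supplied")
    mean_loss = sum(ciou_losses) / len(ciou_losses)
    return w_acc * mean_class_accuracy(correct, total) - w_loss * mean_loss
```

In the claimed method the per-class counts would cover all 17 UI element classes and the loss values would come from the validation log; here both are passed in directly so the sketch stays independent of any particular detection framework.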
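The scroll-and-detect loop of steps S51 to S54 in claim 6 might be sketched as follows. The callbacks `detect_submit` and `scroll_by` are hypothetical stand-ins for the object detection model and the browser driver, which the claims do not name. Re-capturing the screenshot (S1) is assumed to happen inside `detect_submit`, while rebuilding and filtering the task queue (S2, S53) is left out for brevity.

```python
import time
from typing import Callable, Optional, Tuple

Box = Tuple[float, float, float, float]  # x, y, width, height of a detected button


def find_submit_button(
    detect_submit: Callable[[], Optional[Box]],  # hypothetical: screenshot + detection
    scroll_by: Callable[[float], None],          # hypothetical: scroll down by N pixels
    page_height: float,
    viewport_height: float,
    redraw_wait_s: float = 0.5,                  # the 500 ms redraw wait of step S52
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Box]:
    """Scroll in steps of 80% of the viewport until a submit button is visible.

    Returns the button box, or None once the cumulative scroll amount exceeds
    the total page height without a detection.
    """
    if viewport_height <= 0:
        raise ValueError("viewport_height must be positive")
    step = 0.8 * viewport_height
    scrolled = 0.0
    while True:
        box = detect_submit()           # S51: look for a visible submit button
        if box is not None:
            return box                  # S54: coordinates of the button
        if scrolled > page_height:
            return None                 # stop condition: page exhausted
        scroll_by(step)                 # S52: scroll down one scroll amount
        scrolled += step
        sleep(redraw_wait_s)            # wait for the page to redraw
```

Passing `sleep` as a parameter is a design choice for this sketch only: it lets the loop be exercised against a simulated page without real delays.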
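The generation and conversion modules of claim 9 can be pictured with a short Python sketch. The class names, the priority values in `PRIORITY`, the tie-break on confidence, and the `frame_origin` field are all hypothetical; the claims say only that a preset priority rule orders the queue in descending priority and that coordinates of elements inside a frame are converted to frame-relative coordinates.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Detection:
    cls: str            # UI element class reported by the detector
    x: float            # page-level coordinates of the element
    y: float
    confidence: float
    # Top-left corner of the containing frame in page coordinates, if any (assumed field).
    frame_origin: Optional[Tuple[float, float]] = None


# Hypothetical priority rule: a larger number is handled earlier.
PRIORITY: Dict[str, int] = {"input": 3, "captcha": 2, "button": 1}


def build_task_queue(detections: List[Detection],
                     priority: Dict[str, int] = PRIORITY) -> List[Detection]:
    """Sort detections by descending priority; ties broken by confidence (assumption)."""
    return sorted(detections,
                  key=lambda d: (priority.get(d.cls, 0), d.confidence),
                  reverse=True)


def to_frame_coords(d: Detection) -> Tuple[float, float]:
    """Convert page coordinates to coordinates relative to the containing frame."""
    if d.frame_origin is None:
        return (d.x, d.y)
    fx, fy = d.frame_origin
    return (d.x - fx, d.y - fy)
```

For example, an input box detected at page position (320, 450) inside a frame whose top-left corner is at (300, 400) would be addressed at (20, 50) after the browser switches to that frame.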
CN202510747789.5A 2025-06-06 2025-06-06 Service carrier automatic registration login and fund account acquisition method and system Active CN120256010B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510747789.5A CN120256010B (en) 2025-06-06 2025-06-06 Service carrier automatic registration login and fund account acquisition method and system

Publications (2)

Publication Number Publication Date
CN120256010A CN120256010A (en) 2025-07-04
CN120256010B true CN120256010B (en) 2025-08-12

Family

ID=96191743

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510747789.5A Active CN120256010B (en) 2025-06-06 2025-06-06 Service carrier automatic registration login and fund account acquisition method and system

Country Status (1)

Country Link
CN (1) CN120256010B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111460356A (en) * 2020-04-23 2020-07-28 北京信安世纪科技股份有限公司 Automatic login method, device, medium and equipment
CN114743187A (en) * 2022-03-31 2022-07-12 迈容智能科技(上海)有限公司 Automatic login method, system, equipment and storage medium for bank security control

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120004973A1 (en) * 2009-01-14 2012-01-05 Signature Systems Llc Reward exchange system with automatic login and registration
US8938787B2 (en) * 2010-11-29 2015-01-20 Biocatch Ltd. System, device, and method of detecting identity of a user of a mobile electronic device
US10542123B2 (en) * 2016-05-23 2020-01-21 Usabilla B.V. System and method for generating and monitoring feedback of a published webpage as implemented on a remote client
CN106897357B (en) * 2017-01-04 2023-07-18 北京京拍档科技股份有限公司 Method for intelligent crawling network information with verification function
CN115905767B (en) * 2023-01-07 2023-06-02 珠海金智维信息科技有限公司 Webpage login method and system based on fixed candidate frame target detection algorithm
CN119597388A (en) * 2024-11-22 2025-03-11 深圳前海环融联易信息科技服务有限公司 Automatic closing method, device, equipment and medium for popup window of automatic task page

Also Published As

Publication number Publication date
CN120256010A (en) 2025-07-04

Similar Documents

Publication Publication Date Title
AU2019355933B2 (en) Software testing
CN120011247B (en) Method and system for dynamic generation of automated test scripts based on multimodal AI recognition
AU2022204589B2 (en) Multiple input machine learning framework for anomaly detection
US8176067B1 (en) Fixed phrase detection for search
US9020879B2 (en) Intelligent data agent for a knowledge management system
US12332770B2 (en) Automated locating of GUI elements during testing using multidimensional indices
CA2919878A1 (en) Refining search query results
EP4302227A1 (en) System and method for automated document analysis
CN115905767B (en) Webpage login method and system based on fixed candidate frame target detection algorithm
US20250292207A1 (en) Category classification of records of e-procurement transactions
US20250036555A1 (en) System and method for automated testing of user interfaces in software applications
Yang et al. UIS-hunter: Detecting UI design smells in Android apps
CN120179132A (en) Catalog data review interactive interface control method, system, device and storage medium
US20140114903A1 (en) Knowledge Management Engine for a Knowledge Management System
CN120256010B (en) Service carrier automatic registration login and fund account acquisition method and system
US12360654B1 (en) Techniques for targeted data extraction from unstructured sets of documents
US20250218206A1 (en) Ai-generated datasets for ai model training and validation
US12111876B2 (en) Automatic high-speed display control method for web content
CN121209855B (en) An automated script generation method
US20250335219A1 (en) On-screen application object detection
US9405779B2 (en) Search engine for a knowledge management system
CN121209855A (en) An automated script generation method
CN121070331A (en) Form construction method, system and storage medium based on drag operation
CN119048255A (en) Information content display method, device, equipment, medium and product
CN121479039A (en) Data crawling method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant