CN118916031B - Multi-terminal code mapping translation method for Internet medical platform - Google Patents

Info

Publication number
CN118916031B
CN118916031B CN202411402188.2A CN202411402188A CN118916031B CN 118916031 B CN118916031 B CN 118916031B CN 202411402188 A CN202411402188 A CN 202411402188A CN 118916031 B CN118916031 B CN 118916031B
Authority
CN
China
Prior art keywords
code
file
node
sequence
key code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202411402188.2A
Other languages
Chinese (zh)
Other versions
CN118916031A (en
Inventor
曹兴兵
高飞
王超
林涛
毛夏薇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Nali Shuzhi Health Technology Co ltd
Original Assignee
Zhejiang Nali Shuzhi Health Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Nali Shuzhi Health Technology Co ltd
Priority to CN202411402188.2A
Publication of CN118916031A
Application granted
Publication of CN118916031B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/73 Program documentation
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/40 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management of medical equipment or devices, e.g. scheduling maintenance or upgrades

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Epidemiology (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Library & Information Science (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of terminal code and discloses a multi-terminal code mapping translation method for an Internet medical platform. The method comprises: acquiring a source code file of the medical platform that is to be translated, the source code file comprising n code segments, performing lexical analysis on the source code file, and generating an abstract syntax tree; traversing the abstract syntax tree, extracting key code segments, and calculating an information entropy value and a context correlation weight for each key code segment; sorting the key code segments by their information entropy values and context correlation weights to obtain an ordered code segment sequence; mapping and translating the ordered code segment sequence to generate a target code file; and distributing the target code file to different terminal medical devices, each of which reverse-maps and runs the target code file. The method greatly reduces the workload of manual intervention and coding and improves translation efficiency and accuracy.

Description

Multi-terminal code mapping translation method for Internet medical platform
Technical Field
The invention relates to the technical field of terminal codes, in particular to a multi-terminal code mapping translation method for an Internet medical platform.
Background
Patent application publication CN104360850A discloses a service code processing method and device. The method receives a request, sent by an external application program, to obtain a code description, the request containing information associated with the code description. It then judges whether the code type has been loaded into a pre-generated code translation mapping table; if so, it obtains the code description corresponding to the code value from that table and sends it to the external application program. This reduces the memory overhead and redundancy of the application program and enables efficient, convenient translation of service codes and synchronous sharing of the code translation mapping.
However, in the Internet medical field the hardware architectures and operating systems of terminal medical devices differ greatly. Traditional code translation methods therefore often require extensive manual adjustment for each platform, which is inefficient and error-prone, and the resulting control code runs with uneven efficiency on different devices, which seriously affects accuracy and safety. Traditional methods also have difficulty accurately evaluating the complexity and context correlation of code, so the translation result may be inefficient or semantically wrong. In addition, the traditional translation process lacks automation and intelligence and requires a large amount of manual intervention and coding, which is inefficient and prone to human error.
In view of the above, the present invention proposes a multi-terminal code mapping translation method for an internet medical platform to solve the above-mentioned problems.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a multi-terminal code mapping translation method for an Internet medical platform, comprising the following steps:
S1, acquiring a source code file of the medical platform that is to be translated, wherein the source code file comprises n code segments, performing lexical analysis on the source code file, and generating an abstract syntax tree;
S2, traversing the abstract syntax tree, extracting key code segments, and calculating an information entropy value and a context correlation weight for each key code segment;
S3, sorting the key code segments by their information entropy values and context correlation weights to obtain an ordered code segment sequence;
S4, mapping and translating the ordered code segment sequence to generate a target code file;
S5, distributing the target code file to different terminal medical devices, each of which reverse-maps and runs the target code file.
Further, generating the abstract syntax tree comprises:
Converting the source code file into a character stream, extracting lexical units (tokens) from the character stream to generate a lexical unit sequence, performing syntax analysis on the lexical unit sequence to construct a syntax tree, preprocessing the syntax tree, and reorganizing the hierarchical structure of the nodes in the syntax tree to obtain the abstract syntax tree.
Further, the method for generating the lexical unit sequence comprises:
Initializing an empty sequence container, reading the characters in the character stream one by one, and using a pointer to record the current read position in real time;
Defining lexical unit types and constructing a prefix tree for each type; starting from the current read position, trying to match the prefix tree of each lexical unit type in turn, and extracting the corresponding lexical unit if a prefix tree matches the characters at the current position;
Determining the type and value of the lexical unit from the matching prefix tree, recording the position of the lexical unit in the source code file, constructing a lexical unit instance from the type, value, and position information, appending the instance to the sequence container, and moving the pointer to the end position of the lexical unit just extracted; repeating until the whole character stream has been read, at which point the sequence container is the lexical unit sequence.
Further, the method for converting the source code file into the character stream comprises:
Defining a memory buffer with an initial size; opening the source code file using a file I/O operation to obtain a file stream object; reading the file stream object byte by byte to obtain m file blocks; storing the m file blocks in the memory buffer in sequence; and dynamically adjusting the size of the memory buffer from its current size to a new size (the sizing formula is not reproduced in this text);
Wherein the sizing formula is defined over the following quantities: the total size of the m file blocks; the size of the file blocks already read; the total memory size; the memory currently in use; a preset upper limit on the memory buffer size; two adjustment factors; two exponential factors; and an adjusting function of the I/O throughput, the CPU utilization, and the memory fragmentation degree;
The adjusting function (formula likewise not reproduced) weights its three inputs by a throughput adjustment coefficient, a utilization adjustment coefficient, and a memory fragmentation adjustment coefficient;
Preprocessing the file blocks in the memory buffer to obtain preprocessed file blocks, and converting the preprocessed file blocks into the character stream, that is, reading the characters in the preprocessed file blocks one by one to form a character sequence, which is the character stream.
Further, the method for preprocessing the file blocks in the memory buffer comprises:
Defining a five-tuple $M = (Q, \Sigma, \delta, q_0, F)$, wherein $Q$ is the state set, $\Sigma$ is the input alphabet, $\delta$ is the state transition function, $q_0$ is the initial state with $q_0 \in Q$, and $F$ is the set of termination states with $F \subseteq Q$;
The state transition function is $\delta(q, c) = q'$, wherein $q$ is the current state, $c$ is the input character, and $q'$ is the new state;
Defining the state set $Q$ and the input alphabet $\Sigma$ based on the m file blocks;
Reading characters $c$ one by one from the file blocks in the memory buffer; computing the new state $q' = \delta(q, c)$ from the current state $q$ and the character read; if the new state $q'$ belongs to $F$, executing the corresponding preprocessing operation; updating the current state to $q'$; and repeating until all characters in the file blocks of the memory buffer have been read, which yields the preprocessed file blocks.
Further, the method for performing syntax analysis on the lexical unit sequence comprises:
Defining grammar rules by analyzing the grammar of the programming language; initializing a syntax tree based on the defined grammar rules and creating a root node; extracting lexical units from the lexical unit sequence and matching them against the grammar rules; creating a new node as the left child of the root node and attaching the first matched lexical unit as a child of that node; matching an operator and creating a new node as the right child of the root node; continuing to match according to the production rules, creating a new node te as the right child of the root node's right child, and attaching the second matched lexical unit as the right child of te; matching an operator and creating a new node tw as the left child of te; continuing to match the third lexical unit according to the production rules and creating a new node as the right child of tw; and repeating until all lexical units in the sequence have been extracted, which completes construction of the syntax tree.
Further, the method for reorganizing the hierarchical structure of the nodes in the syntax tree comprises:
Defining the tree reconstruction rules as a group of pattern matching rules, wherein each pattern matching rule consists of a matching pattern and a reconstruction action;
There are m2 types of matching pattern, each defining the structure of a subtree to be reconstructed; the reconstruction actions comprise node lifting, node merging, and node splitting;
Assigning a priority weight to each pattern matching rule; for any two pattern matching rules that match two subtrees of the syntax tree, if the matched patterns are nested, calculating a nesting value between them (the formula is not reproduced in this text);
Depending on the nesting value, one of three cases applies (the conditions are not reproduced in this text): the first subtree is reconstructed first; the second subtree is reconstructed first; or the weights of the two rules are compared and the subtree matched by the higher-weight rule is reconstructed first;
Wherein the nesting value is defined over: two adjustment coefficients; the lowest common ancestor node of the two subtrees; the root node of the tree; the size difference between the two subtrees; the degree of overlap of the two subtrees; and the depth of each of the two subtrees in the syntax tree;
The reconstruction process comprises:
Constructing a priority queue Q and initially placing all pattern matching rules into it in descending order of priority; when the traversal reaches a node n', extracting the highest-priority pattern matching rule from Q and matching its pattern against the subtree rooted at n'; if the match succeeds, executing the reconstruction action to obtain the reconstructed subtree and adjusting the other pattern matching rules in Q according to the reconstruction action;
If the match fails, continuing with the pattern matching rule of the next priority until all pattern matching rules have been tried; and replacing the corresponding parts of the original syntax tree with all reconstructed subtrees to complete generation of the abstract syntax tree.
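The priority-driven rule matching described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the tuple-based node layout, the two rules (`fold_constants` as a node-merging action, `drop_add_zero` as a node-lifting action), and their weights are all hypothetical, and the step that adjusts the remaining rules in the queue after a successful rewrite is omitted.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical node layout: ("num", value), ("var", name) or ("bin", op, left, right).

@dataclass(order=True)
class Rule:
    neg_weight: int                            # negated so the heap pops the highest weight first
    name: str = field(compare=False)
    rewrite: Callable = field(compare=False)   # returns a new subtree, or None if no match

def fold_constants(node):
    """Node merging: a binary expression over two literals becomes one literal."""
    if node[0] == "bin" and node[2][0] == "num" and node[3][0] == "num":
        a, b = node[2][1], node[3][1]
        if node[1] == "+":
            return ("num", a + b)
        if node[1] == "*":
            return ("num", a * b)
    return None

def drop_add_zero(node):
    """Node lifting: x + 0 is replaced by x."""
    if node[0] == "bin" and node[1] == "+" and node[3] == ("num", 0):
        return node[2]
    return None

RULES = [Rule(-50, "drop_add_zero", drop_add_zero),
         Rule(-90, "fold_constants", fold_constants)]

def rebuild(node, rules=RULES):
    """Rewrite children first, then try the rules at this node in priority order."""
    if node[0] == "bin":
        node = ("bin", node[1], rebuild(node[2], rules), rebuild(node[3], rules))
    queue = list(rules)
    heapq.heapify(queue)
    while queue:
        rule = heapq.heappop(queue)            # highest-priority rule not yet tried
        result = rule.rewrite(node)
        if result is not None:                 # match succeeded: apply the action
            return result
    return node                                # no rule matched: keep the subtree
```

For example, `rebuild(("bin", "+", ("var", "x"), ("num", 0)))` returns `("var", "x")`, because the higher-weight constant-folding rule does not match and the next rule in the queue is tried.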
Further, the information entropy value and the context correlation weight of each key code segment are calculated as follows:
Representing each key code segment as a corresponding token sequence, wherein each token is a keyword, an identifier, or a literal;
Counting the frequency with which each unique token occurs in the key code segment and, based on those frequencies, calculating the information entropy value of the key code segment (the formula is not reproduced in this text);
Wherein the information entropy value also takes into account the code complexity and the data flow complexity of the key code segment, each scaled by an information adjustment coefficient;
Wherein the code complexity is computed from the loop complexity, the branch complexity, and the length of the key code segment, together with weight adjustment parameters;
Loop complexity refers to the total nesting depth of the loops in the key code segment;
Branch complexity refers to the total number of conditional branch statements in the key code segment;
Wherein the data flow complexity is computed from the scope complexity and the dependency complexity of the key code segment; scope complexity refers to the total nesting depth of the variable scopes in the key code segment;
Wherein the dependency complexity is computed from a weight coefficient and a dependency depth for each variable, a dependency adjustment parameter, and the complexity of each control dependency;
Converting the key code segment and its context into vector representations and calculating the cosine similarity between them as the semantic similarity;
Calculating the mean structural similarity between the key code segment and the other key code segments, and calculating the statistical similarity using an n-gram model;
Calculating the context correlation weight of the key code segment from the semantic similarity, the mean structural similarity, and the statistical similarity (the formula is not reproduced in this text),
Wherein the formula uses a first adjustment coefficient and a second adjustment coefficient.
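Two of the quantities named above have standard textbook forms and can be sketched in Python. The sketch is illustrative only: it shows plain Shannon entropy over token frequencies, without the code complexity and data flow complexity terms of the patent's formula, and it assumes a simple bag-of-tokens vector for the cosine similarity, since the text does not say how the vector representation is built. The function names are hypothetical.

```python
import math
from collections import Counter

def token_entropy(tokens):
    """Shannon entropy (in bits) of the token frequency distribution."""
    total = len(tokens)
    if total == 0:
        return 0.0
    counts = Counter(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity of two bag-of-tokens count vectors (an assumed representation)."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0
```

A segment whose tokens are all distinct scores higher entropy than one that repeats a few tokens, which matches the intuition that entropy reflects how much information the segment carries.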
Further, the method for calculating the mean structural similarity comprises:
Traversing the abstract syntax tree and extracting the subtree corresponding to each key code segment according to the position information of the extracted key code segment;
Defining the tree editing operations as node renaming, node deletion, and node insertion, and assigning a cost value to each of them;
For the subtree corresponding to the key code segment and any other subtree, calculating the minimum editing cost of converting the former into the latter, that is, the tree edit distance;
The minimum editing cost is calculated as follows:
Initializing a two-dimensional table whose rows and columns correspond to the nodes of the two subtrees, and calculating the value of each cell from the bottom up, each cell representing the minimum editing cost of converting the subtree rooted at the row node into the subtree rooted at the column node;
The mean structural similarity is then obtained from the tree edit distances (the formula is not reproduced in this text), wherein the formula uses the number of all subtrees and the node counts of the two subtrees being compared;
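A table-based tree edit distance of the kind described above can be sketched as follows. Several things here are assumptions rather than the patent's method: the sketch uses a simplified top-down variant (whole child subtrees are inserted or deleted, in the style of Selkow's algorithm) instead of a general bottom-up algorithm such as Zhang-Shasha, all operation costs default to 1, and the normalization `1 - distance / (|T1| + |T2|)` in `mean_structural_similarity` is an illustrative choice because the patent's formula is not available.

```python
from functools import lru_cache

# Hypothetical tree layout: (label, (child, child, ...)), using tuples so trees are hashable.

def size(tree):
    """Number of nodes in the tree."""
    return 1 + sum(size(child) for child in tree[1])

@lru_cache(maxsize=None)
def edit_distance(t1, t2, rename=1, delete=1, insert=1):
    """Minimum cost of turning t1 into t2 with rename / delete / insert operations."""
    root_cost = 0 if t1[0] == t2[0] else rename
    kids1, kids2 = t1[1], t2[1]
    # table[i][j]: minimum cost of turning the first i children of t1
    # into the first j children of t2.
    table = [[0] * (len(kids2) + 1) for _ in range(len(kids1) + 1)]
    for i in range(1, len(kids1) + 1):
        table[i][0] = table[i - 1][0] + delete * size(kids1[i - 1])
    for j in range(1, len(kids2) + 1):
        table[0][j] = table[0][j - 1] + insert * size(kids2[j - 1])
    for i in range(1, len(kids1) + 1):
        for j in range(1, len(kids2) + 1):
            table[i][j] = min(
                table[i - 1][j] + delete * size(kids1[i - 1]),
                table[i][j - 1] + insert * size(kids2[j - 1]),
                table[i - 1][j - 1]
                + edit_distance(kids1[i - 1], kids2[j - 1], rename, delete, insert),
            )
    return root_cost + table[-1][-1]

def mean_structural_similarity(target, others):
    """Average normalized similarity between target and every tree in others."""
    if not others:
        return 0.0
    sims = [1 - edit_distance(target, o) / (size(target) + size(o)) for o in others]
    return sum(sims) / len(sims)
```

With unit costs the distance is always smaller than the combined node count of the two trees, so each normalized similarity stays within (0, 1].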
The ordered code segment sequence is obtained as follows:
Computing a weighted sum of the information entropy value and the context correlation weight of each key code segment to obtain a comprehensive score for that segment, and ordering the key code segments by their comprehensive scores;
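The weighted-sum scoring can be sketched in a few lines of Python. The dictionary keys, the equal default weights, and the descending sort order are assumptions made for illustration; the text only states that the two values are combined by a weighted sum.

```python
def rank_segments(segments, w_entropy=0.5, w_context=0.5):
    """Order key code segments by a weighted sum of entropy and context weight."""
    def score(seg):
        return w_entropy * seg["entropy"] + w_context * seg["context_weight"]
    # Descending order (highest comprehensive score first) is an assumption.
    return sorted(segments, key=score, reverse=True)

segments = [
    {"name": "A", "entropy": 2.0, "context_weight": 0.2},
    {"name": "B", "entropy": 1.0, "context_weight": 0.9},
    {"name": "C", "entropy": 3.0, "context_weight": 0.5},
]
ordered = [seg["name"] for seg in rank_segments(segments)]
```

Because entropy and the context weight live on different scales, a real system would likely normalize them before summing; that step is left out here.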
The target code file is obtained as follows:
Traversing the ordered code segment sequence, mapping each key code segment to an equivalent code segment in the target language, and fusing the mapped target code segments according to the context information of the key code segments in the source code file to obtain the target code file;
The reverse mapping converts the target code file into machine code or intermediate code that the terminal medical device can execute directly.
A multi-terminal code mapping translation system for an Internet medical platform, used to implement the above multi-terminal code mapping translation method, comprises:
A source code acquisition and analysis module, configured to acquire a source code file of the medical platform that is to be translated, wherein the source code file comprises n code segments, perform lexical analysis on the source code file, and generate an abstract syntax tree;
A segment extraction and analysis module, configured to traverse the abstract syntax tree, extract key code segments, and calculate the information entropy value and the context correlation weight of each key code segment;
A segment sorting module, configured to sort the key code segments by their information entropy values and context correlation weights to obtain an ordered code segment sequence;
A code mapping translation module, configured to map and translate the ordered code segment sequence to generate a target code file;
A distribution mapping module, configured to distribute the target code file to different terminal medical devices, each of which reverse-maps and runs the target code file.
The multi-terminal code mapping translation method for an Internet medical platform according to the invention has the following technical effects and advantages:
Through in-depth lexical analysis and abstract syntax tree generation, the invention improves the quality and efficiency of code translation, so that different terminal medical devices can efficiently execute the translated and optimized target code, which improves the performance and reliability of the medical information system. The information entropy value and context correlation weight calculation mechanism yields better and more efficient target code, helps increase the running speed of the system, and preserves the semantic correctness of the code across heterogeneous environments, thereby reducing potential errors and safety hazards. The method enables flexible code conversion and optimization and significantly reduces development and maintenance costs. It automates the code translation process, greatly reduces the workload of manual intervention and coding, improves translation efficiency and accuracy, and effectively improves the response speed and user experience of the medical platform.
Drawings
FIG. 1 is a diagram of a multi-terminal code mapping translation method for an Internet medical platform according to the present invention;
Fig. 2 is a schematic diagram of a multi-terminal code mapping translation system for an internet medical platform according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
Referring to fig. 1, a multi-terminal code mapping translation method for an internet medical platform according to the present embodiment includes:
S1, acquiring a source code file of the medical platform that is to be translated, wherein the source code file comprises n code segments, performing lexical analysis on the source code file, and generating an abstract syntax tree;
S2, traversing the abstract syntax tree, extracting key code segments, and calculating an information entropy value and a context correlation weight for each key code segment;
S3, sorting the key code segments by their information entropy values and context correlation weights to obtain an ordered code segment sequence;
S4, mapping and translating the ordered code segment sequence to generate a target code file;
S5, distributing the target code file to different terminal medical devices, each of which reverse-maps and runs the target code file, thereby achieving cross-platform code translation and improving the running efficiency of the code in heterogeneous environments.
Further, the generating the abstract syntax tree includes:
The source code file is converted into a character stream (a sequence of characters in order), lexical units (tokens) are extracted from the character stream, and a lexical unit sequence is generated.
Specifically, an empty sequence container is initialized, the characters in the character stream are read one by one, and a pointer records the current read position in real time.
Lexical unit types are defined (such as keywords, identifiers, and numeric values), and a prefix tree is constructed for each type. A prefix tree organizes and stores a set of strings along their shared prefix paths; all possible strings of a lexical unit type (for example, the keyword list) are stored in one prefix tree.
Starting from the current read position, the prefix tree of each lexical unit type is tried in turn, and the corresponding lexical unit is extracted if a prefix tree matches the characters at the current position.
The type and value of the lexical unit are determined from the matching prefix tree, and the position of the lexical unit in the source code file (such as line number and column number) is recorded. A lexical unit instance is constructed from the type, value, and position information and appended to the sequence container, and the pointer is moved to the end position of the lexical unit just extracted. This repeats until the whole character stream has been read, at which point the sequence container is the lexical unit sequence.
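The prefix-tree matching loop described above can be sketched in Python. This is a minimal illustration, not the patent's implementation: the keyword and operator lists are hypothetical, and identifiers and numbers are scanned by character class rather than through a prefix tree, because their string sets cannot be enumerated; that fallback is an assumption of the sketch.

```python
from dataclasses import dataclass

class Trie:
    """Prefix tree over a fixed set of strings."""
    def __init__(self, words):
        self.root = {}
        for word in words:
            node = self.root
            for ch in word:
                node = node.setdefault(ch, {})
            node["$end"] = True            # multi-character key cannot clash with a single char

    def longest_match(self, text, pos):
        """Return the end index of the longest stored string starting at pos, or -1."""
        node, best, i = self.root, -1, pos
        while i < len(text) and text[i] in node:
            node = node[text[i]]
            i += 1
            if "$end" in node:
                best = i
        return best

@dataclass
class Token:
    type: str
    value: str
    line: int
    column: int

KEYWORDS = Trie(["if", "else", "while", "return"])          # hypothetical keyword list
OPERATORS = Trie(["+", "-", "*", "/", "=", "==", "(", ")"])  # hypothetical operator list

def is_word_char(ch):
    return ch.isalnum() or ch == "_"

def tokenize(text):
    tokens, pos, line, col = [], 0, 1, 1
    while pos < len(text):
        ch = text[pos]
        if ch == "\n":
            pos, line, col = pos + 1, line + 1, 1
            continue
        if ch.isspace():
            pos, col = pos + 1, col + 1
            continue
        end = KEYWORDS.longest_match(text, pos)
        # A keyword only counts if it is not the prefix of a longer identifier.
        if end != -1 and not (end < len(text) and is_word_char(text[end])):
            kind = "keyword"
        else:
            end = OPERATORS.longest_match(text, pos)
            if end != -1:
                kind = "operator"
            elif ch.isdigit():
                end = pos
                while end < len(text) and text[end].isdigit():
                    end += 1
                kind = "number"
            elif ch.isalpha() or ch == "_":
                end = pos
                while end < len(text) and is_word_char(text[end]):
                    end += 1
                kind = "identifier"
            else:
                raise SyntaxError(f"unexpected character {ch!r} at {line}:{col}")
        tokens.append(Token(kind, text[pos:end], line, col))
        col += end - pos                   # advance the pointer past the extracted token
        pos = end
    return tokens
```

Taking the longest match in each prefix tree is what lets `==` be read as one operator rather than two `=` tokens.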
The method for converting the source code file into the character stream comprises the following steps:
A memory buffer with an initial size is defined (the initial size may be set to a small value, for example 4 KB or 8 KB). The source code file is opened using a file I/O operation to obtain a file stream object, and the file stream object is read byte by byte to obtain m file blocks (the small blocks of data into which the source code file is divided). The m file blocks are stored in the memory buffer in sequence, and the size of the memory buffer is dynamically adjusted from its current size to a new size (the sizing formula is not reproduced in this text).
The sizing formula is defined over the following quantities: the total size of the m file blocks; the size of the file blocks already read; the total memory size (the total memory of the system); the memory currently in use; a preset upper limit on the memory buffer size; two adjustment factors, which control the proportional relationship between the new buffer size and, respectively, the remaining file size and the currently available memory; two exponential factors, which adjust how strongly the remaining file size and the currently available memory influence the buffer size; and an adjusting function of the I/O throughput, the CPU utilization, and the memory fragmentation degree.
The memory fragmentation degree is 1 minus the size of the largest contiguous free block in the available memory divided by the total available memory size.
The adjusting function (formula likewise not reproduced) weights its three inputs by a throughput adjustment coefficient, a utilization adjustment coefficient, and a memory fragmentation adjustment coefficient, which control how strongly each factor influences the buffer size.
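The memory fragmentation degree defined above is simple enough to state as code. The sketch below is only a worked instance of that definition; the function name and the representation of available memory as a list of free block sizes are assumptions.

```python
def fragmentation_degree(free_blocks):
    """1 - (largest contiguous free block / total available memory).

    free_blocks: sizes in bytes of the contiguous free blocks (assumed representation).
    Returns 0.0 when no memory is available.
    """
    total = sum(free_blocks)
    if total == 0:
        return 0.0
    return 1.0 - max(free_blocks) / total
```

For free blocks of 4096, 1024, 1024, and 2048 bytes the largest block is half of the 8192 available bytes, so the degree is 0.5; a single free block gives 0.0, meaning no fragmentation.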
Preprocessing the file blocks in the memory buffer area to obtain preprocessed file blocks, and converting the preprocessed file blocks into character streams.
The method for preprocessing the file blocks in the memory buffer comprises the following steps:
A five-tuple $M = (Q, \Sigma, \delta, q_0, F)$ is defined, wherein $Q$ is the state set, $\Sigma$ is the input alphabet, $\delta$ is the state transition function, $q_0$ is the initial state with $q_0 \in Q$, and $F$ is the set of termination states with $F \subseteq Q$.
The state transition function is $\delta(q, c) = q'$, wherein $q$ is the current state, $c$ is the input character, and $q'$ is the new state.
The state set $Q$ and the input alphabet $\Sigma$ are defined based on the m file blocks. Specifically, the state set and the input alphabet are both initialized as empty, the m file blocks are traversed, and the following operations are performed on each file block:
Each character w in the file block is scanned. If w appears for the first time, it is added to the input alphabet, and the state corresponding to w is added to the state set according to the type of w (such as letter, digit, or special character). For example, a letter may correspond to an identifier start state, and a digit may correspond to a numeric start state. The possible subsequent transitions of each new state are defined according to the lexical rules of the language. Once all file blocks have been traversed, the state set and the input alphabet are complete.
Characters $c$ are read one by one from the file blocks in the memory buffer, and the new state $q' = \delta(q, c)$ is computed from the current state $q$ and the character read. If the new state $q'$ belongs to $F$, the corresponding preprocessing operation is performed (deleting comments and deleting blank characters, where blank characters include spaces, tabs, and line feeds).
The current state is then updated to $q'$, and this repeats until all characters in the file blocks of the memory buffer have been read, which yields the preprocessed file blocks.
The preprocessed file blocks are converted into the character stream, that is, the characters in the preprocessed file blocks are read one by one to form a character sequence, which is the character stream.
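The state-machine preprocessing can be sketched as a small hand-written automaton. The sketch rests on several assumptions that are not in the source: it recognizes only `//` line comments, it collapses each run of whitespace to a single space instead of deleting it outright (so that adjacent tokens are not fused), it does not treat string literals specially, and the five state names are hypothetical.

```python
def step(state, ch):
    """Transition function: (state, character) -> (new state, text to emit)."""
    if state == "COMMENT":
        # A line comment ends at the newline, which then counts as whitespace.
        return ("SPACE", "") if ch == "\n" else ("COMMENT", "")
    # Text held back by the current state until the next character is known.
    pending = {"CODE": "", "SPACE": " ", "SLASH": "/", "SPACE_SLASH": " /"}[state]
    if ch == "/":
        if state in ("SLASH", "SPACE_SLASH"):
            return "COMMENT", ""                     # second '/' opens a line comment
        return ("SLASH" if state == "CODE" else "SPACE_SLASH"), ""
    if ch.isspace():
        if state == "SLASH":
            return "SPACE", "/"                      # the lone '/' was real code
        if state == "SPACE_SLASH":
            return "SPACE", " /"
        return "SPACE", ""                           # whitespace is emitted lazily
    return "CODE", pending + ch

def preprocess(block):
    """Strip // comments and collapse whitespace in one pass over the block."""
    state, out = "CODE", []
    for ch in block:
        state, emitted = step(state, ch)
        out.append(emitted)
    if state == "SLASH":                             # flush a trailing '/'
        out.append("/")
    elif state == "SPACE_SLASH":
        out.append(" /")
    return "".join(out).strip()
```

Emitting whitespace lazily, only when a code character follows, is what keeps a comment surrounded by spaces from leaving a double space behind.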
Syntax analysis is performed on the lexical unit sequence to construct a syntax tree that represents the syntactic structure of the source code.
Specifically, the grammar rules are defined by analysing the grammar of the programming language: its terminals, its non-terminals, and the production rules written for it.
Based on the defined grammar rules, a syntax tree is initialized and a root node is created. Lexical units are extracted from the lexical unit sequence and matched against the grammar rules, and a new node is created as the left child of the root node; the first matched lexical unit becomes a child of this newly created left child. An operator is then matched (terminals, non-terminals, and other operators), and a new node is created as the right child of the root node. Matching continues according to the production rules: a new node te is created as the right child of the root's right child, the second lexical unit is matched and becomes the right child of te, an operator is matched, and a new node tw is created as the left child of te. The third lexical unit is then matched according to the production rules, and a new node is created as the right child of tw. This repeats until all lexical units in the sequence have been extracted, at which point the syntax tree is complete.
During construction, each time a production rule is matched, a corresponding node is created in the syntax tree and each child node is linked to its parent; terminals (lexical units) become leaf nodes and non-terminals become internal nodes.
If no production rule can be matched, a syntax error is indicated. For an ambiguous grammar, several syntax trees may exist, and the ambiguity must be resolved using the precedence and associativity rules of the language.
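By way of illustration only, syntax-tree construction from a lexical unit sequence may be sketched with a small recursive-descent parser. The arithmetic grammar (Expr, Term, Factor) and the names `Node` and `parse` are assumptions chosen for the example; the disclosure does not fix a grammar or a parsing algorithm. Terminals become leaf nodes, non-terminals become internal nodes, and a `SyntaxError` is raised when no production rule matches.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                      # non-terminal name or terminal text
    children: list = field(default_factory=list)


def parse(tokens):
    """Build a syntax tree for: Expr -> Term (('+'|'-') Term)*,
    Term -> Factor (('*'|'/') Factor)*, Factor -> name | '(' Expr ')'."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        leaf = Node(tokens[pos])    # a terminal becomes a leaf node
        pos += 1
        return leaf

    def factor():
        tok = peek()
        if tok == "(":
            node = Node("Factor", [take(), expr()])
            if peek() != ")":
                raise SyntaxError("expected ')'")
            node.children.append(take())
            return node
        if tok is not None and tok.isalnum():
            return Node("Factor", [take()])
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        node = Node("Term", [factor()])
        while peek() in ("*", "/"):          # left-associative
            node = Node("Term", [node, take(), factor()])
        return node

    def expr():
        node = Node("Expr", [term()])
        while peek() in ("+", "-"):          # lower precedence than Term
            node = Node("Expr", [node, take(), term()])
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

Layering Term below Expr encodes operator precedence, and the `while` loops encode left associativity, which is one common way of resolving the ambiguity mentioned above.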
An abstract syntax tree is then generated from the syntax tree. First, the syntax tree is preprocessed: every node is traversed, and meaningless nodes such as comments and blank characters are removed.
Next, the hierarchical structure of the nodes is reorganized. The tree reconstruction rules are defined as a set of pattern matching rules that guide the adjustment of the node hierarchy of the abstract syntax tree; each pattern matching rule consists of a matching pattern and a reconstruction action.
There are m2 kinds of matching patterns, each defining the structure of a subtree to be reconstructed. For example, the matching pattern BinaryExpr (op, real) denotes a binary operation expression whose two operands are both literals.
The reconstruction actions include node lifting, node merging, and node splitting. Each pattern matching rule is assigned a priority weight; the higher the priority, the earlier the rule's reconstruction is performed.
Specifically, each pattern matching rule is assigned a base weight in the range 1 to 100. The base weight reflects the rule's basic importance in the reconstruction process and is adjusted in real time according to the complexity of the pattern.
For any two pattern matching rules, each matching a subtree of the syntax tree, if the patterns they match are nested, the nesting value between them is calculated.
Depending on the nesting value, one of three cases applies: in the first, the subtree matched by one rule is reconstructed first; in the second, the subtree matched by the other rule is reconstructed first; in the third, the weights of the two rules are compared, and the rule with the higher weight is reconstructed first.
In the nesting value formula, the adjustment coefficients control how strongly each factor influences the nesting depth, and each takes a value between 0 and 1. The remaining terms are: the nearest common ancestor node of the two matched subtrees; the root node of a matched subtree; the size difference between the two subtrees, expressed as the difference in node count or in subtree height; the degree of overlap between the two subtrees, expressed as the ratio of their overlapping nodes to their total number of nodes; and two node depths measured in the syntax tree (the root node has depth 0).
The reconstruction process comprises the following steps:
A priority queue Q is constructed, and all pattern matching rules are initially placed in Q in descending order of priority (derived from the nesting values). Each time a node n' is visited, the pattern matching rule with the highest priority is taken from Q and its matching pattern is tested against the subtree rooted at n'. If the match succeeds, the reconstruction action is executed to obtain the reconstructed subtree, and the other pattern matching rules in Q are adjusted according to that action (if the action breaks the matching pattern of another rule, that rule must be put back into the queue).
If the match fails, matching continues with the rule of the next priority until all pattern matching rules have been tried. Finally, the reconstructed subtrees replace their originals in the syntax tree, which completes the generation of the abstract syntax tree.
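By way of illustration only, the priority-queue-driven reconstruction may be sketched as follows. The two rules (`unwrap_paren`, `fold_literals`), their priorities, and the `Node` and `Rule` types are hypothetical. The sketch also simplifies the queue adjustment: after any successful reconstruction it simply re-queues every rule, instead of re-queuing only the rules whose patterns were affected.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Node:
    kind: str
    value: object = None
    children: list = field(default_factory=list)


@dataclass
class Rule:
    name: str
    priority: int
    # Returns the reconstructed subtree, or None when the pattern fails.
    rewrite: Callable[[Node], Optional[Node]]


def unwrap_paren(n: Node) -> Optional[Node]:
    """Node lifting: replace Paren(x) by x."""
    if n.kind == "Paren" and len(n.children) == 1:
        return n.children[0]
    return None


def fold_literals(n: Node) -> Optional[Node]:
    """Node merging: BinaryExpr('+', Literal, Literal) -> Literal."""
    if (n.kind == "BinaryExpr" and n.value == "+" and len(n.children) == 2
            and all(c.kind == "Literal" for c in n.children)):
        return Node("Literal", n.children[0].value + n.children[1].value)
    return None


def build_queue(rules):
    # heapq is a min-heap, so priorities are negated; the index breaks ties.
    queue = [(-r.priority, i, r) for i, r in enumerate(rules)]
    heapq.heapify(queue)
    return queue


def reconstruct(node: Node, rules) -> Node:
    """Rewrite children first, then try rules at this node by priority."""
    node.children = [reconstruct(c, rules) for c in node.children]
    queue = build_queue(rules)
    while queue:
        _, _, rule = heapq.heappop(queue)
        result = rule.rewrite(node)
        if result is not None:
            node = result
            queue = build_queue(rules)  # simplification: re-queue all rules
    return node
```

Termination relies on every rule strictly shrinking the subtree it matches; a rule set without that property would need an explicit iteration limit.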
Further, the method for extracting the key code segments comprises the following steps:
Each key code segment comprises the content of the code segment (as a string), the segment type (function definition, loop statement, conditional statement, and so on), the location of the segment in the source code (file name, line number, and column range), and the parent node information of the segment (used to determine its context).
Each node of the abstract syntax tree is traversed using a depth-first or breadth-first traversal. For each node, key parts of the code, such as function definitions, function calls, variable assignments, and control flow statements, are identified according to predefined rules; these key parts typically correspond to a particular node type or combination of node types in the abstract syntax tree.
For each identified key code segment, the corresponding source code text is extracted: the source location and range are traced back from the syntax tree node, and the matching text is taken from the source.
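By way of illustration only, key code segment extraction may be sketched using Python's standard `ast` module, with Python source standing in for the platform's source language (an assumption; the disclosure does not name one). The `KeySnippet` record, the `KEY_TYPES` selection, and the function `extract_key_snippets` are hypothetical.

```python
import ast
from dataclasses import dataclass

# Node types treated as "key" in this sketch (a hypothetical predefined rule).
KEY_TYPES = (ast.FunctionDef, ast.For, ast.While, ast.If, ast.Assign, ast.Call)


@dataclass
class KeySnippet:
    text: str          # content of the code segment
    kind: str          # segment type
    file: str          # file name
    lines: tuple       # (first line, last line)
    cols: tuple        # (start column, end column)
    parent_kind: str   # parent node type, used to determine context


def extract_key_snippets(source: str, filename: str = "<memory>"):
    """Depth-first, pre-order traversal that collects key code segments."""
    tree = ast.parse(source)
    found = []
    stack = [(tree, None)]
    while stack:
        node, parent = stack.pop()
        if isinstance(node, KEY_TYPES):
            found.append(KeySnippet(
                text=ast.get_source_segment(source, node),
                kind=type(node).__name__,
                file=filename,
                lines=(node.lineno, node.end_lineno),
                cols=(node.col_offset, node.end_col_offset),
                parent_kind=type(parent).__name__,
            ))
        # Push children in reverse so they are visited in source order.
        for child in reversed(list(ast.iter_child_nodes(node))):
            stack.append((child, node))
    return found
```

The location fields recorded on each AST node are what allow the source text to be traced back and extracted.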
The calculation mode of the information entropy value and the context correlation weight of each key code segment comprises the following steps:
Each key code segment is represented as a corresponding token sequence, in which each token is a keyword, an identifier, or a literal.
The frequency with which each unique token occurs in the key code segment is counted, and the information entropy value of the key code segment is calculated from these frequencies.
In the information entropy formula, the terms are the code segment complexity of the key code segment and its data stream complexity; the information adjustment coefficients control how strongly the code segment complexity and the data stream complexity influence the information entropy value.
In the code segment complexity formula, the terms are the loop complexity of the key code segment, its branch complexity, and its length (total number of tokens); the weight adjustment parameters control the influence of each variable on the code segment complexity.
The loop complexity is the sum of the nesting levels of the loops in the key code segment, and the branch complexity is the total number of conditional branch statements in the code segment.
In the data stream complexity formula, the terms are the scope complexity of the key code segment, its dependency complexity, and the number of variables it contains.
By way of explanation, in the abstract syntax subtree every identifier node that is a leaf corresponds to a variable. The abstract syntax subtree of the key code segment is traversed to find all leaf nodes, each of which is either a constant value or an identifier (variable name); for each identifier leaf, the variable name is obtained, and the variable names are counted to give the total number of variables.
The scope complexity is the sum of the nesting levels of the variable scopes in the key code segment, and the dependency complexity measures the complexity of the data dependencies between variables in the key code segment.
In the dependency complexity formula, each variable has a weight coefficient that reflects its importance in the key code segment and a dependency depth, namely the number of other variables on which it depends; the dependency adjustment parameter controls how strongly the dependency term influences the dependency complexity. Each control dependency in the key code segment (a conditional statement, loop statement, and so on) has its own complexity, evaluated as a weighted combination of its nesting depth and its number of branches.
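By way of illustration only, the raw quantities that feed these formulas may be computed as sketched below, again using Python source as a stand-in language. The entropy shown is the standard Shannon entropy of the token frequencies; how the disclosure combines it with the complexity terms and adjustment coefficients is not reproduced here. The function names are hypothetical.

```python
import ast
import io
import math
import tokenize
from collections import Counter


def token_entropy(source: str):
    """Return (Shannon entropy in bits, token count) over keywords,
    identifiers, and literals of the segment."""
    kinds = (tokenize.NAME, tokenize.NUMBER, tokenize.STRING)
    reader = io.StringIO(source).readline
    tokens = [t.string for t in tokenize.generate_tokens(reader)
              if t.type in kinds]
    if not tokens:
        return 0.0, 0
    total = len(tokens)
    entropy = -sum((c / total) * math.log2(c / total)
                   for c in Counter(tokens).values())
    return entropy, total


def structural_metrics(source: str):
    """Return (loop complexity, branch complexity, variable count)."""
    tree = ast.parse(source)
    loops = 0
    branches = 0

    def visit(node, depth):
        nonlocal loops, branches
        if isinstance(node, (ast.For, ast.While)):
            depth += 1
            loops += depth          # sum of loop nesting levels
        elif isinstance(node, ast.If):
            branches += 1           # count of conditional statements
        for child in ast.iter_child_nodes(node):
            visit(child, depth)

    visit(tree, 0)
    # Every identifier leaf is counted as a variable, as in the text.
    variables = len({n.id for n in ast.walk(tree) if isinstance(n, ast.Name)})
    return loops, branches, variables
```

For two loops nested one inside the other, the loop complexity is 1 + 2 = 3, which matches the "sum of nesting levels" definition above.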
The key code segment and its context are each converted into a vector representation, and the cosine similarity between the two vectors is taken as the semantic similarity.
The mean structural similarity between the key code segment and the other key code segments is calculated, and the statistical similarity is calculated using an n-gram model or another statistical model.
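By way of illustration only, the semantic and statistical similarities may be sketched as follows. The disclosure does not specify the vectorization or the n-gram comparison, so the sketch assumes a bag-of-tokens count vector for the cosine similarity and a bigram multiset overlap for the statistical similarity; both choices, and the function names, are assumptions.

```python
import math
from collections import Counter


def cosine_similarity(a_tokens, b_tokens) -> float:
    """Cosine similarity between bag-of-tokens count vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def ngrams(tokens, n: int = 2) -> Counter:
    """Multiset of consecutive n-token windows."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_similarity(a_tokens, b_tokens, n: int = 2) -> float:
    """Shared n-grams divided by the union of n-grams (multiset Jaccard)."""
    a, b = ngrams(a_tokens, n), ngrams(b_tokens, n)
    shared = sum((a & b).values())
    total = sum((a | b).values())
    return shared / total if total else 0.0
```

Both measures fall in the range 0 to 1 for count vectors, which makes them straightforward to combine with adjustment coefficients.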
The calculation method of the structural similarity mean value comprises the following steps:
The abstract syntax tree is traversed, and the subtree corresponding to each key code segment is extracted according to the location information of that segment.
The tree edit operations are defined as node renaming, node deletion, and node insertion, and each is assigned a cost value; the cost may be fixed, or it may be computed dynamically from factors such as node type and subtree size.
For the subtree corresponding to the key code segment and any other subtree, the minimum edit cost of converting the former into the latter, that is, the tree edit distance, is calculated.
The minimum edit cost is calculated as follows:
A two-dimensional table is initialized whose rows and columns correspond to the nodes of the two subtrees. The value of each cell is computed bottom-up and represents the minimum edit cost of converting the subtree rooted at the row node into the subtree rooted at the column node.
The structural similarity mean is then obtained from the tree edit distances; the formula uses the total number of subtrees and the node counts of the two subtrees being compared.
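By way of illustration only, the tree edit distance may be sketched as follows. Instead of the bottom-up two-dimensional table described above, the sketch uses the equivalent memoized recursion over forests, which is shorter to write. Unit costs and the tuple encoding of trees are assumptions; the structural similarity mean formula itself is not reproduced.

```python
from functools import lru_cache

# Fixed unit costs (assumed); the text allows dynamically computed costs.
RENAME_COST = 1
DELETE_COST = 1
INSERT_COST = 1


def tree(label, *children):
    """A tree is (label, (child, ...)); a forest is a tuple of trees."""
    return (label, tuple(children))


@lru_cache(maxsize=None)
def forest_distance(f1, f2) -> int:
    """Minimum cost of editing ordered forest f1 into ordered forest f2."""
    if not f1 and not f2:
        return 0
    if not f1:
        _, kids = f2[-1]
        return forest_distance(f1, f2[:-1] + kids) + INSERT_COST
    if not f2:
        _, kids = f1[-1]
        return forest_distance(f1[:-1] + kids, f2) + DELETE_COST
    (l1, k1), (l2, k2) = f1[-1], f2[-1]
    return min(
        # delete the rightmost root of f1 (its children move up)
        forest_distance(f1[:-1] + k1, f2) + DELETE_COST,
        # insert the rightmost root of f2
        forest_distance(f1, f2[:-1] + k2) + INSERT_COST,
        # match the two rightmost roots, renaming if labels differ
        forest_distance(k1, k2) + forest_distance(f1[:-1], f2[:-1])
        + (0 if l1 == l2 else RENAME_COST),
    )


def tree_edit_distance(t1, t2) -> int:
    return forest_distance((t1,), (t2,))
```

A distance of 0 means the two subtrees are structurally identical; larger values indicate that more renames, deletions, or insertions are needed.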
Based on the semantic similarity, the structural similarity, and the statistical similarity, the context correlation weight of the key code segment is calculated.
In this formula, the first adjustment coefficient controls how strongly the semantic similarity influences the context correlation weight, and the second adjustment coefficient controls how strongly the structural similarity and the statistical similarity influence it.
By adjusting the first and second adjustment coefficients, the relative influence of the semantic, structural, and statistical similarities on the context correlation weight can be tuned. Because the correlation between a code segment and its context is considered at the semantic, structural, and statistical levels, the degree of correlation can be evaluated more comprehensively, which gives a more accurate basis for calculating the context correlation weight.
The acquisition mode of the ordered code segment sequence comprises the following steps:
The information entropy value and the context correlation weight of each key code segment are combined by weighted summation to give a comprehensive score for each segment, and all key code segments are sorted in descending order of this score to obtain the ordered code segment sequence.
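By way of illustration only, the scoring and ordering step may be sketched as follows. The record type `ScoredSnippet`, the function `order_snippets`, and the default weights of 0.5 are hypothetical; the disclosure leaves the weights to the practitioner.

```python
from dataclasses import dataclass


@dataclass
class ScoredSnippet:
    name: str
    entropy: float          # information entropy value
    context_weight: float   # context correlation weight


def order_snippets(snippets, w_entropy: float = 0.5, w_context: float = 0.5):
    """Sort key code segments by weighted-sum score, highest first."""
    def score(s: ScoredSnippet) -> float:
        return w_entropy * s.entropy + w_context * s.context_weight

    return sorted(snippets, key=score, reverse=True)
```

For example, with equal weights a segment with entropy 3.0 and context weight 0.1 (score 1.55) is placed before one with entropy 2.0 and context weight 0.2 (score 1.1).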
The acquisition mode of the target code file comprises the following steps:
The ordered code segment sequence is traversed, and each key code segment is mapped to an equivalent code segment in the target language. The mapped target code segments are then fused according to the context information of the key code segments in the source code file (such as variable scopes and function call relationships) to obtain the target code file.
Specifically, each key code segment is mapped to an equivalent target-language code segment according to the grammar and API of the target language (for each medical device). This process must take into account the differences between languages, such as syntactic structure, keywords, data types, and built-in functions; key code segments that cannot be mapped directly are given an equivalent alternative implementation, or the necessary auxiliary code (such as type conversions or function wrappers) is inserted.
The specific method for fusing the mapped target code segments comprises the following steps:
Variables are declared and referenced correctly according to their scope; function declarations, parameter passing, and return value handling code are inserted according to the function call relationships; code such as class definitions and method overrides is generated according to the class inheritance relationships; and the code segments are organized according to control flow structures such as conditional statements and loop statements.
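By way of illustration only, mapping and fusion may be sketched with a toy translator from a small Python subset to JavaScript-style text. Both languages are assumptions made for the example; the disclosure names neither. Only assignments, `if` statements without `else`, numeric constants, and a few operators are handled, and anything else raises `NotImplementedError`. Variable declarations are hoisted to the top so that each variable is declared once in the enclosing scope.

```python
import ast

# Operator mapping from the source language to the target language.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Gt: ">", ast.Lt: "<", ast.Eq: "==="}


def expr(node) -> str:
    """Map one source expression to target-language text."""
    if isinstance(node, ast.Name):
        return node.id
    if (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return str(node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return f"({expr(node.left)} {OPS[type(node.op)]} {expr(node.right)})"
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and type(node.ops[0]) in OPS):
        op = OPS[type(node.ops[0])]
        return f"{expr(node.left)} {op} {expr(node.comparators[0])}"
    raise NotImplementedError(type(node).__name__)


def stmt(node, indent: str = "") -> list:
    """Map one source statement to a list of target-language lines."""
    if (isinstance(node, ast.Assign) and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)):
        return [f"{indent}{node.targets[0].id} = {expr(node.value)};"]
    if isinstance(node, ast.If) and not node.orelse:
        lines = [f"{indent}if ({expr(node.test)}) {{"]
        for child in node.body:
            lines += stmt(child, indent + "  ")
        lines.append(f"{indent}}}")
        return lines
    raise NotImplementedError(type(node).__name__)


def translate(source: str) -> str:
    """Map each statement, then fuse with hoisted variable declarations."""
    tree = ast.parse(source)
    names = sorted({t.id for n in ast.walk(tree) if isinstance(n, ast.Assign)
                    for t in n.targets if isinstance(t, ast.Name)})
    lines = [f"let {', '.join(names)};"] if names else []
    for node in tree.body:
        lines += stmt(node)
    return "\n".join(lines)
```

Hoisting the declarations is one simple way to respect scope: a variable first assigned inside a conditional block in the source remains visible after the block in the target.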
Further, reverse mapping consists of converting the target code file into machine code or intermediate code that the terminal medical device executes directly.
Different terminal medical devices reverse-map and execute the mapped target code file, achieving cross-platform code translation and execution. Because the hardware and software environments of different devices differ greatly, the reverse mapping and execution processes may need targeted optimization and adjustment to ensure that the code runs efficiently in each heterogeneous environment.
In this embodiment, deep lexical analysis and abstract syntax tree generation improve the quality and efficiency of code translation, so that different terminal medical devices can efficiently execute the translated and optimized target code, which improves the performance and reliability of the medical information system. The information entropy value and context correlation weight mechanism yields higher-quality, more efficient target code and increases the running speed of the system. The method ensures the semantic correctness of the code across heterogeneous environments, reduces potential errors and safety hazards, enables flexible code conversion and optimization, and significantly reduces development and maintenance costs. The translation process becomes considerably more automated and intelligent, the workload of manual intervention and coding is reduced, translation efficiency and accuracy are improved, and the response speed and user experience of the medical platform are effectively improved.
Example 2
Referring to Fig. 2, for details not described in this embodiment, see the description of Embodiment 1. This embodiment provides a multi-terminal code mapping translation system for an Internet medical platform, comprising:
a source code acquisition and analysis module, configured to acquire the source code file of the medical platform to be translated, the source code file comprising n code segments, to perform lexical analysis on the source code file, and to generate an abstract syntax tree;
a segment extraction and analysis module, configured to traverse the abstract syntax tree, extract key code segments, and calculate the information entropy value and the context correlation weight of each key code segment;
a segment sorting module, configured to sort the key code segments according to their information entropy values and context correlation weights to obtain an ordered code segment sequence;
a code mapping translation module, configured to map and translate the ordered code segment sequence to generate a target code file;
a distribution mapping module, configured to distribute the target code file to different terminal medical devices, the different terminal medical devices reverse-mapping and running the target code file. The modules are connected in a wired and/or wireless manner to transmit data among themselves.
Example 3
This embodiment discloses an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the multi-terminal code mapping translation method for an Internet medical platform described above.
Since the electronic device described in this embodiment is the device used to implement the multi-terminal code mapping translation method for an Internet medical platform of the embodiments of the present application, a person skilled in the art can, on the basis of that method, understand the specific implementation of the electronic device and its variations; how the electronic device implements the method is therefore not described in detail here. Any electronic device used by a person skilled in the art to implement the method of the embodiments of the present application falls within the scope of protection of the present application.
All of the above formulas are dimensionless and operate on numerical values. They were obtained by collecting a large amount of data and running software simulations so as to approximate the real situation as closely as possible; the preset parameters and thresholds in the formulas are set by those skilled in the art according to the actual situation.
The above is only a preferred embodiment of the present invention. The scope of protection of the present invention is not limited to the above examples; all technical solutions falling under the concept of the present invention belong to its scope of protection. It should be noted that modifications and adaptations made by those skilled in the art without departing from the principles of the present invention are also regarded as within the scope of protection of the present invention.

Claims (8)

1.一种用于互联网医疗平台的多终端代码映射转译方法,其特征在于,包括:S1、获取医疗平台待转译的源代码文件,源代码文件包括n段代码片段,对源代码文件进行词法分析,并生成得到抽象语法树;1. A multi-terminal code mapping and translation method for an Internet medical platform, characterized by comprising: S1, obtaining a source code file to be translated of a medical platform, the source code file comprising n code segments, performing lexical analysis on the source code file, and generating an abstract syntax tree; S2、遍历抽象语法树,提取关键代码片段,计算每个关键代码片段的信息熵值和上下文相关性权重;S2, traverse the abstract syntax tree, extract key code snippets, and calculate the information entropy value and context relevance weight of each key code snippet; S3、将每个关键代码片段按照对应的信息熵值和上下文相关性权重进行排序,得到有序代码片段序列;S3, sorting each key code snippet according to the corresponding information entropy value and context relevance weight to obtain an ordered code snippet sequence; S4、将有序代码片段序列进行映射转译,生成目标代码文件;S4, mapping and translating the ordered code fragment sequence to generate a target code file; S5、将目标代码文件分发至不同的终端医疗设备,不同的终端医疗设备对目标代码文件进行反向映射并运行;S5. 
Distribute the target code file to different terminal medical devices, and the different terminal medical devices reversely map and run the target code file; 所述每个关键代码片段的信息熵值和上下文相关性权重的计算方式包括:The information entropy value and context relevance weight of each key code snippet are calculated as follows: 将每个关键代码片段表示为对应的一个令牌序列,其中,每个令牌为关键字、标识符或字面量;Represent each key code snippet as a corresponding sequence of tokens, where each token is a keyword, identifier, or literal; 统计每个唯一的令牌在关键代码片段中出现的频率;基于频率计算得到关键代码片段的信息熵值Count each unique token In the key code snippet Frequency of occurrence ; Based on frequency Calculate the key code snippet Information entropy value ; ,其中,的代码片段复杂度,的数据流复杂度;为信息调节系数; ,in, for The complexity of the code snippet is for The data flow complexity of and is the information adjustment coefficient; ;其中,的循环复杂度,的分支复杂度,的长度;为权重调节参数; ;in, for The cyclomatic complexity of for The branch complexity of for Length; , and is the weight adjustment parameter; 循环复杂度指关键代码片段中嵌套循环的层数和;分支复杂度指代码片段中条件分支语句的数量和;Cyclomatic complexity refers to the sum of the number of nested loops in the key code snippet; branch complexity refers to the sum of the number of conditional branch statements in the code snippet; ;其中,的作用域复杂度,的依赖复杂度,中变量的数量;作用域复杂度指关键代码片段中变量作用域的嵌套层数和; ;in, for The scope complexity of for The dependency complexity, for The number of variables in the code; scope complexity refers to the sum of the nesting levels of variable scopes in key code snippets; ;其中,是第个变量的权重系数,为第个变量的依赖深度,为依赖调整参数;为第个控制依赖项的复杂度; ;in, It is The weight coefficient of the variable, For the The dependency depth of each variable, Adjust parameters for dependencies; For the A control over the complexity of dependencies; 将关键代码片段和其上下文转换为向量表示,并计算它们之间的余弦相似度作为语义相似度The key code snippet and its context are converted into vector representations, and the cosine similarity between them is calculated as the semantic similarity ; 计算关键代码片段与其他关键代码片段之间的结构相似度均值;使用n-gram模型来计算统计相似度Calculate key code snippets The mean 
structural similarity with other key code snippets ; Use the n-gram model to calculate statistical similarity ; 基于语义相似度、结构相似度均值和统计相似度,计算得到关键代码片段的上下文相关性权重Based on semantic similarity , mean structural similarity and statistical similarity , calculate the key code snippet The contextual relevance weight , ;其中,为第一调节系数;为第二调节系数; ;in, is the first adjustment coefficient; is the second adjustment coefficient; 所述结构相似度均值的计算方式包括:The calculation method of the structural similarity mean includes: 遍历抽象语法树,根据提取的关键代码片段的位置信息,提取出每个关键代码片段对应的子树;Traverse the abstract syntax tree and extract the subtree corresponding to each key code fragment according to the location information of the extracted key code fragment; 定义树编辑操作包括节点重命名、节点删除和节点插入,并为节点重命名、节点删除和节点插入各自赋予一个代价值;Define tree editing operations including node renaming, node deletion and node insertion, and assign a cost value to each of node renaming, node deletion and node insertion; 对于对应的子树和任一棵子树,计算将转换为的最小编辑代价,即树编辑距离for The corresponding subtree and any subtree , the calculation will Convert to The minimum edit cost of ; 最小编辑代价的计算方式包括:The calculation method of the minimum editing cost includes: 初始化一个二维表格,行和列分别对应中的节点,自底向上计算每个二维表格的单元格的值,表示将的子树转换为的子树的最小编辑代价;Initialize a two-dimensional table with rows and columns corresponding to and The nodes in the table calculate the value of each cell in the two-dimensional table from bottom to top, indicating that The subtree is converted to The minimum editing cost of the subtree of ; ;其中,为全部子树的数量,分别表示的节点数量;but ;in, is the number of all subtrees, and Respectively and The number of nodes; 所述有序代码片段序列的获取方式包括:The method of obtaining the ordered code snippet sequence includes: 将每个关键代码片段的信息熵值和上下文相关性权重进行加权求和,得到每个关键代码片段的综合评分值;根据综合评分值的大小对所有关键代码片段进行降序排列,得到有序代码片段序列;The information entropy value and context relevance weight of each key code snippet are weighted and summed to obtain a comprehensive score value of each key code snippet; all key code snippets are arranged in descending order according to the 
size of the comprehensive score value to obtain an ordered code snippet sequence; 所述目标代码文件的获取方式包括:The object code file is obtained by: 遍历有序代码片段序列,将每个关键代码片段映射到目标语言的等效代码片段,根据关键代码片段在源代码文件中的上下文信息,将映射后的目标代码片段进行融合,得到目标代码文件;Traversing the ordered code snippet sequence, mapping each key code snippet to an equivalent code snippet in the target language, and merging the mapped target code snippets according to the context information of the key code snippet in the source code file to obtain a target code file; 所述进行反向映射的方式为将目标代码文件转换为终端医疗设备直接执行的机器码或中间码。The reverse mapping method is to convert the target code file into machine code or intermediate code that is directly executed by the terminal medical device. 2.根据权利要求1所述的一种用于互联网医疗平台的多终端代码映射转译方法,其特征在于,所述生成得到抽象语法树的方式包括:2. According to a multi-terminal code mapping and translation method for an Internet medical platform according to claim 1, it is characterized in that the method of generating an abstract syntax tree includes: 将源代码文件转换为字符流,从字符流中提取词法单元,并生成词法单元序列;对词法单元序列进行语法分析,构建语法树,将语法树进行预处理,并重新组织语法树中节点的层级结构,得到抽象语法树。Convert the source code file into a character stream, extract lexical units from the character stream, and generate a lexical unit sequence; perform grammatical analysis on the lexical unit sequence, build a syntax tree, preprocess the syntax tree, and reorganize the hierarchical structure of the nodes in the syntax tree to obtain an abstract syntax tree. 3.根据权利要求2所述的一种用于互联网医疗平台的多终端代码映射转译方法,其特征在于,所述生成词法单元序列的方式包括:3. 
According to a multi-terminal code mapping and translation method for an Internet medical platform according to claim 2, it is characterized in that the method of generating a lexical unit sequence includes: 初始化一个空的序列壳,逐个读取字符流中的字符,并使用一个指针实时记录当前读取的位置;Initialize an empty sequence shell, read the characters in the character stream one by one, and use a pointer to record the current reading position in real time; 定义词法单元类型,为每一种词法单元类型构建对应的前缀树;从当前读取的位置开始,依次尝试匹配每一种词法单元类型对应的前缀树;若前缀树与当前的字符匹配成功,则提取出对应的词法单元;Define the lexical unit type and build a corresponding prefix tree for each lexical unit type; starting from the current read position, try to match the prefix tree corresponding to each lexical unit type in turn; if the prefix tree successfully matches the current character, extract the corresponding lexical unit; 根据对应的前缀树确定词法单元的类型和值,并记录词法单元在源代码文件中的位置信息,基于词法单元的类型、值和位置信息,构造出词法单元实例,并加入到序列壳中,将指针移动到本次提取的词法单元的结束位置,重复直到字符流读取完毕;最终的序列壳即为词法单元序列。Determine the type and value of the lexical unit according to the corresponding prefix tree, and record the position information of the lexical unit in the source code file. Based on the type, value and position information of the lexical unit, construct a lexical unit instance and add it to the sequence shell. Move the pointer to the end position of the lexical unit extracted this time, and repeat until the character stream is read; the final sequence shell is the lexical unit sequence. 4.根据权利要求3所述的一种用于互联网医疗平台的多终端代码映射转译方法,其特征在于,所述将源代码文件转换为字符流的方式包括:4. 
A multi-terminal code mapping and translation method for an Internet medical platform according to claim 3, characterized in that the method of converting the source code file into a character stream comprises: 初始定义一个初始的内存缓冲区,其大小为,使用文件I/O操作打开源代码文件,获取得到文件流对象,逐字节读取文件流对象,得到m个文件块,将m个文件块依次存储在内存缓冲区中,并动态调整内存缓冲区的大小Initially define an initial memory buffer with a size of , use file I/O operation to open the source code file, obtain the file stream object, read the file stream object byte by byte, obtain m file blocks, store the m file blocks in the memory buffer in sequence, and dynamically adjust the size of the memory buffer to ; ;其中,为m个文件块的总大小,为已读取的文件块的大小,为总内存大小,为当前已使用的内存,为预设的内存缓冲区大小的上限,为调节因子,为指数因子;为调节函数,为I/O吞吐量,为CPU利用率,为内存碎片化程度; ;in, is the total size of m file blocks, is the size of the file block that has been read, is the total memory size, is the currently used memory, The upper limit of the preset memory buffer size. and is the regulating factor, and is the exponential factor; is the adjustment function, is the I/O throughput, is the CPU utilization, is the degree of memory fragmentation; 调节函数;其中,为吞吐量调节系数,为利用率调节系数,为内存碎片化调节系数;Adjustment function ;in, is the throughput adjustment factor, is the utilization adjustment coefficient, is the memory fragmentation adjustment factor; 将内存缓冲区内的文件块进行预处理,得到预处理后的文件块,将预处理后的文件块转换为字符流;即逐个读取预处理后的文件块中的字符,并组成一个字符序列,即字符流。The file blocks in the memory buffer are preprocessed to obtain preprocessed file blocks, and the preprocessed file blocks are converted into character streams; that is, characters in the preprocessed file blocks are read one by one and form a character sequence, namely, a character stream. 5.根据权利要求4所述的一种用于互联网医疗平台的多终端代码映射转译方法,其特征在于,所述将内存缓冲区内的文件块进行预处理的方式包括:5. 
According to a multi-terminal code mapping and translation method for an Internet medical platform according to claim 4, it is characterized in that the method of preprocessing the file blocks in the memory buffer includes: 定义一个五元组,其中,为状态集合,为输入字母表,为状态转移函数,是初始状态,且为一组终止状态,且Define a quintuple ,in, is the state set, To input the alphabet, is the state transfer function, is the initial state, and ; is a set of terminal states, and ; 状态转移函数,其中,为当前状态,为输入的字符,为新状态;State transfer function ,in, is the current state, is the input character, For the new state; 基于m个文件块,定义状态集合和输入字母表Based on m file blocks, define the state set and input alphabet ; 从内存缓冲区的文件块中逐个读取字符,根据当前状态和读取的字符,通过状态转移函数计算得到新状态;若新状态属于,则执行相应的预处理操作;将当前状态更新为,重复直至读取完内存缓冲区的文件块中的所有字符;即得到预处理后的文件块。Read characters one by one from the file block in the memory buffer , according to the current state and the characters read , the new state is obtained by calculating the state transfer function If the new state belong , then perform the corresponding preprocessing operations; update the current state to , repeat until all characters in the file block in the memory buffer are read; that is, the preprocessed file block is obtained. 6.根据权利要求5所述的一种用于互联网医疗平台的多终端代码映射转译方法,其特征在于,所述对词法单元序列进行语法分析的方式包括:6. 
A multi-terminal code mapping and translation method for an Internet medical platform according to claim 5, characterized in that the method of performing grammatical analysis on the lexical unit sequence includes: 定义语法规则,即分析编程语言的语法;基于定义的语法规则,初始化一棵语法树,创建一个根节点,从词法单元序列中提取词法单元,并与语法规则进行匹配,并创建一个新节点作为根节点的左子节点;匹配的第一个词法单元,将其作为新创建的左子节点的子节点,匹配运算符,并创建一个新节点作为根节点的右子节点,根据编写产生式,继续匹配,创建一个新节点作为根节点的右子节点的右子节点te,匹配第二个词法单元,将其作为新创建的te的右子节点,并匹配运算符,创建一个新节点作为右子节点te的左子节点tw,根据编写产生式,继续匹配第三个词法单元,创建一个新节点作为tw的右子节点,重复直至词法单元序列中的词法单元提取结束;即完成语法树的构建。Define grammar rules, that is, analyze the grammar of the programming language; based on the defined grammar rules, initialize a grammar tree, create a root node, extract lexical units from the lexical unit sequence, match them with the grammar rules, and create a new node as the left child of the root node; match the first lexical unit, make it the child of the newly created left child node, match the operator, and create a new node as the right child of the root node; continue matching according to the written production rule, create a new node as the right child node te of the right child node of the root node, match the second lexical unit, make it the right child node of the newly created te, match the operator, create a new node as the left child node tw of the right child node te; continue matching the third lexical unit according to the written production rule, create a new node as the right child node of tw, and repeat until the extraction of lexical units in the lexical unit sequence is completed; that is, the construction of the grammar tree is completed. 7.根据权利要求6所述的一种用于互联网医疗平台的多终端代码映射转译方法,其特征在于,所述重新组织语法树中节点的层级结构的方式包括:7. 
A multi-terminal code mapping and translation method for an Internet medical platform according to claim 6, characterized in that the method of reorganizing the hierarchical structure of the nodes in the syntax tree comprises: 定义树重构规则为一组模式匹配规则,每个模式匹配规则均由匹配模式和重构动作组成;The tree reconstruction rules are defined as a set of pattern matching rules, each of which consists of a matching pattern and a reconstruction action; 匹配模式有m2种,定义了需要重构的子树的结构;重构动作包括节点提升、节点合并和节点拆分;There are m2 matching modes, which define the structure of the subtree that needs to be reconstructed; the reconstruction actions include node promotion, node merging, and node splitting; 为每个模式匹配规则指定一个优先级权重;对于任意两个模式匹配规则,它们分别匹配语法树中的子树;若它们匹配到的匹配模式存在嵌套关系,计算它们之间的嵌套值For each pattern matching rule Assign a priority weight ; For any two pattern matching rules and , which match the syntax tree Subtree in and ; If the matching patterns they match have a nested relationship, calculate the nested value between them ; ,则先重构;若,则先重构;若,则比较各自的权重,权重更高的先重构;like , then reconstruct first ;like , then reconstruct first ;like , then compare and Their respective weights, the one with higher weight will be reconstructed first; ;其中,是调节系数,的最近的公共祖先节点,的根节点,的大小差异,的重叠程度;为语法树的深度,为语法树的深度; ;in, , and is the adjustment coefficient, for and The nearest common ancestor node, for The root node of for and The size difference, for and the degree of overlap; For syntax tree middle The depth of For syntax tree middle Depth 重构的过程包括:The reconstruction process includes: 构建一个优先级队列Q,初始时将所有模式匹配规则按照优先级降序放入优先级队列Q,每遍历到一个节点n'时,从Q中取出最高的优先级对应的模式匹配规则,匹配在以n'为根的子树的匹配模式,若匹配成功,则执行重构动作,得到重构后的子树,并根据重构动作对Q中的其他模式匹配规则进行调整;Construct a priority queue Q. Initially, put all pattern matching rules into the priority queue Q in descending order of priority. Each time a node n' is traversed, take out the pattern matching rule corresponding to the highest priority from Q. , match the matching pattern in the subtree with n' as the root. 
If the match succeeds, the reconstruction action is performed to obtain the reconstructed subtree, and the other pattern matching rules in Q are adjusted according to the reconstruction action; if the match fails, matching continues with the pattern matching rule of the next priority until all pattern matching rules have been traversed; the original syntax tree is then replaced with all the reconstructed subtrees, which completes the generation of the abstract syntax tree.
8. A multi-terminal code mapping and translation system for an Internet medical platform, used to implement the multi-terminal code mapping and translation method for an Internet medical platform according to any one of claims 1 to 7, characterized in that it comprises:
a source code acquisition and analysis module, used to acquire the source code file of the medical platform to be translated, the source code file comprising n code fragments, to perform lexical analysis on the source code file, and to generate an abstract syntax tree;
a fragment extraction and analysis module, used to traverse the abstract syntax tree, extract key code fragments, and calculate the information entropy value and context relevance weight of each key code fragment;
a fragment sorting module, used to sort the key code fragments according to their information entropy values and context relevance weights to obtain an ordered code fragment sequence;
a code mapping and translation module, used to map and translate the ordered code fragment sequence to generate a target code file;
a distribution mapping module, used to distribute the target code file to different terminal medical devices, each of which reversely maps and runs the target code file.
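As an illustration only, the reconstruction process described in claim 7 might be sketched in Python as follows. The names `Node`, `Rule`, `restructure`, `show`, `flatten_lists`, and the two example rules are hypothetical and do not appear in the patent. A list sorted by descending weight stands in for the priority queue Q, and the post-order traversal is an assumption, since the claim does not state a traversal order. The nested-value comparison and the adjustment of the remaining rules in Q after a reconstruction are left out because the claim text does not give enough detail to implement them.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class Rule:
    name: str
    weight: float                      # priority weight of the rule
    matches: Callable[[Node], bool]    # matching pattern (hypothetical predicate form)
    rewrite: Callable[[Node], Node]    # reconstruction action


def restructure(root: Node, rules: List[Rule]) -> Node:
    # Stand-in for the priority queue Q: rules in descending order of weight.
    queue = sorted(rules, key=lambda r: r.weight, reverse=True)

    def visit(node: Node) -> Node:
        # Assumption: children are restructured before their parent (post-order).
        node.children = [visit(child) for child in node.children]
        # Try the rules from highest to lowest priority; apply the first match.
        for rule in queue:
            if rule.matches(node):
                return rule.rewrite(node)
        return node  # no rule matched: keep the subtree unchanged

    return visit(root)


def show(node: Node) -> str:
    # Render a tree as label(child, child, ...) for inspection.
    if not node.children:
        return node.label
    return f"{node.label}({', '.join(show(c) for c in node.children)})"


def flatten_lists(node: Node) -> Node:
    # Node merging: lift the children of nested "list" nodes into one list.
    merged: List[Node] = []
    for child in node.children:
        merged.extend(child.children if child.label == "list" else [child])
    return Node("list", merged)


RULES = [
    # Node promotion: a single-child "group" wrapper is replaced by its child.
    Rule("promote", 2.0,
         lambda n: n.label == "group" and len(n.children) == 1,
         lambda n: n.children[0]),
    # Node merging: a "list" that directly contains another "list".
    Rule("merge", 1.0,
         lambda n: n.label == "list" and any(c.label == "list" for c in n.children),
         flatten_lists),
]

tree = Node("expr", [
    Node("group", [Node("num")]),
    Node("list", [Node("list", [Node("a"), Node("b")]), Node("c")]),
])
print(show(restructure(tree, RULES)))  # expr(num, list(a, b, c))
```

When two rules match the same node, the one with the larger `weight` is applied, which mirrors the "higher weight is reconstructed first" tie-break in the claim. Node splitting is not shown.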
CN202411402188.2A 2024-10-09 2024-10-09 Multi-terminal code mapping translation method for Internet medical platform Active CN118916031B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411402188.2A CN118916031B (en) 2024-10-09 2024-10-09 Multi-terminal code mapping translation method for Internet medical platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411402188.2A CN118916031B (en) 2024-10-09 2024-10-09 Multi-terminal code mapping translation method for Internet medical platform

Publications (2)

Publication Number Publication Date
CN118916031A (en) 2024-11-08
CN118916031B (en) 2024-12-31

Family

ID=93309013

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411402188.2A Active CN118916031B (en) 2024-10-09 2024-10-09 Multi-terminal code mapping translation method for Internet medical platform

Country Status (1)

Country Link
CN (1) CN118916031B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111708539A (en) * 2020-06-17 2020-09-25 腾讯科技(深圳)有限公司 Application program code conversion method and device, electronic equipment and storage medium
US11392373B1 (en) * 2019-12-10 2022-07-19 Cerner Innovation, Inc. System and methods for code base transformations

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3430252B2 (en) * 2000-01-24 2003-07-28 独立行政法人産業技術総合研究所 Source code conversion method, recording medium recording source code conversion program, and source code conversion device
CN1516009A (en) * 2003-01-08 2004-07-28 深圳市中兴通讯股份有限公司上海第二 Efficient Optimization Method of Speech Codec Based on Digital Signal Processor
US10157055B2 (en) * 2016-09-29 2018-12-18 Microsoft Technology Licensing, Llc Code refactoring mechanism for asynchronous code optimization using topological sorting
CN113791757B (en) * 2021-07-14 2023-08-22 北京邮电大学 Software requirement and code mapping method and system
CN115016793A (en) * 2022-04-25 2022-09-06 中国平安人寿保险股份有限公司 Code generation method and device based on syntax tree, electronic equipment and storage medium
CN118113264A (en) * 2023-12-13 2024-05-31 天翼云科技有限公司 A SQL code hint method based on keyword backtracking and token sorting

Also Published As

Publication number Publication date
CN118916031A (en) 2024-11-08

Similar Documents

Publication Publication Date Title
CN108717470B (en) A Code Snippet Recommendation Method with High Accuracy
CN112035511A (en) Target data searching method based on medical knowledge graph and related equipment
CN111191002B (en) Neural code searching method and device based on hierarchical embedding
CN114924741B (en) A code completion method based on structural features and sequence features
CN1661593B (en) Method for translating computer language and translation system
CN111124487B (en) Code clone detection method and device and electronic equipment
CN115438709A (en) Code Similarity Detection Method Based on Code Attribute Graph
CN112306494A (en) Code classification and clustering method based on convolution and cyclic neural network
CN116719520B (en) Code generation method and device
CN112379917A (en) Browser compatibility improving method, device, equipment and storage medium
CN113918512A (en) System and method for constructing knowledge map of power grid operation rules
CN110597847A (en) SQL statement automatic generation method, device, equipment and readable storage medium
CN111831624A (en) Data table creating method and device, computer equipment and storage medium
CN116225933A (en) Program code checking method and checking device
CN118916031B (en) Multi-terminal code mapping translation method for Internet medical platform
Hu et al. Deep-autocoder: Learning to complete code precisely with induced code tokens
Xu et al. Tree2tree structural language modeling for compiler fuzzing
CN118484465A (en) A method and device for generating SQL statements from natural language statements
CN117828024A (en) Plug-in retrieval method, device, storage medium and equipment
CN117573084A (en) Code complement method based on layer-by-layer fusion abstract syntax tree
CN117390130A (en) A code search method based on multimodal representation
KR20050065015A (en) System and method for checking program plagiarism
CN114647418A (en) Software code recommendation method for tree serialization embedding
CN112988952A (en) Multi-level-length text vector retrieval method and device and electronic equipment
CN111797624A (en) An automatic extraction method of drug business card based on NPL

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant