CN118916031B

CN118916031B - Multi-terminal code mapping translation method for Internet medical platform

Info

Publication number: CN118916031B
Application number: CN202411402188.2A
Authority: CN
Inventors: 曹兴兵; 高飞; 王超; 林涛; 毛夏薇
Original assignee: Zhejiang Nali Shuzhi Health Technology Co ltd
Current assignee: Zhejiang Nali Shuzhi Health Technology Co ltd
Priority date: 2024-10-09
Filing date: 2024-10-09
Publication date: 2024-12-31
Anticipated expiration: 2044-10-09
Also published as: CN118916031A

Abstract

The present invention belongs to the technical field of terminal codes and discloses a multi-terminal code mapping and translation method for an Internet medical platform. The method comprises: obtaining a source code file to be translated from a medical platform, the source code file comprising n code segments, performing lexical analysis on the source code file, and generating an abstract syntax tree; traversing the abstract syntax tree, extracting key code segments, and calculating the information entropy value and context correlation weight of each key code segment; sorting the key code segments according to their information entropy values and context correlation weights to obtain an ordered code segment sequence; mapping and translating the ordered code segment sequence to generate a target code file; and distributing the target code file to different terminal medical devices, which reverse-map and run the target code file. The method thereby greatly reduces the workload of manual intervention and coding and improves translation efficiency and accuracy.

Description

Multi-terminal code mapping translation method for Internet medical platform

Technical Field

The invention relates to the technical field of terminal codes, in particular to a multi-terminal code mapping translation method for an Internet medical platform.

Background

The patent with the application publication number of CN104360850A discloses a service code processing method and a device, which comprise the steps of receiving a request for acquiring a code description sent by an external application program, requesting associated information containing the code description, judging whether the code type is loaded in a pre-generated code translation mapping relation table, acquiring the code description corresponding to the code value from the code translation mapping relation table if the code type is loaded in the pre-generated code translation mapping relation table, and sending the code description corresponding to the code value acquired from the code translation mapping relation table to the external application program, so that the overhead and redundancy of an application program memory can be reduced, and efficient and convenient translation of the service code and synchronous sharing of the code translation mapping relation are realized.

However, in the Internet medical field, the hardware architectures and operating systems of terminal medical devices differ greatly. Traditional code translation methods therefore often require a large amount of manual adjustment for each platform, which is inefficient and prone to introducing errors, and the translated control code runs with uneven efficiency on different devices, which seriously affects accuracy and safety. In addition, traditional code translation methods have difficulty accurately evaluating the complexity and the context correlation of the code, so the translation result may be inefficient or contain semantic errors. Finally, the traditional code translation process lacks automation and intelligence and requires a large amount of manual intervention and coding, which is inefficient and prone to human error.

In view of the above, the present invention proposes a multi-terminal code mapping translation method for an internet medical platform to solve the above-mentioned problems.

Disclosure of Invention

In order to overcome the defects in the prior art, the invention provides a multi-terminal code mapping translation method for an Internet medical platform, which comprises the following steps:

S1, acquiring a source code file to be translated of the medical platform, wherein the source code file comprises n code segments, performing lexical analysis on the source code file, and generating an abstract syntax tree;

S2, traversing the abstract syntax tree, extracting key code segments, and calculating an information entropy value and a context correlation weight of each key code segment;

S3, sorting the key code segments according to their information entropy values and context correlation weights to obtain an ordered code segment sequence;

S4, mapping and translating the ordered code segment sequence to generate an object code file;

S5, distributing the object code file to different terminal medical devices, the different terminal medical devices reverse-mapping and running the object code file.

Further, the generating the abstract syntax tree comprises:

The method comprises the steps of converting a source code file into a character stream, extracting lexical units from the character stream, generating a lexical unit sequence, carrying out grammar analysis on the lexical unit sequence, constructing a grammar tree, preprocessing the grammar tree, and reorganizing the hierarchical structure of nodes in the grammar tree to obtain an abstract grammar tree.

Further, the method for generating the lexical unit sequence comprises the following steps:

initializing an empty sequence shell, reading characters in a character stream one by one, and recording the current read position in real time by using a pointer;

Defining lexical unit types, constructing a corresponding prefix tree for each lexical unit type, sequentially trying to match the prefix tree corresponding to each lexical unit type from the current read position, and extracting the corresponding lexical unit if the prefix tree is successfully matched with the current character;

Determining the type and the value of the lexical unit according to the corresponding prefix tree, recording the position information of the lexical unit in the source code file, constructing a lexical unit instance based on the type, the value and the position information of the lexical unit, adding the lexical unit instance to the sequence shell, moving the pointer to the end position of the lexical unit just extracted, and repeating until the character stream has been read completely, the final sequence shell being the lexical unit sequence.

Further, the method for converting the source code file into the character stream comprises the following steps:

An initial memory buffer with a preset initial size is defined; the source code file is opened using a file I/O operation to obtain a file stream object; the file stream object is read byte by byte to obtain m file blocks; the m file blocks are stored in the memory buffer in sequence; and the size of the memory buffer is dynamically adjusted from its current size to a new size;

Wherein the adjustment is computed from the following quantities: the total size of the m file blocks; the size of the file blocks that have been read; the total memory size; the memory currently in use; a preset upper limit on the memory buffer size; two adjustment factors; two exponential factors; and an adjustment function whose arguments are the I/O throughput, the CPU utilization and the degree of memory fragmentation;

The adjustment function is parameterized by a throughput adjustment coefficient, a utilization adjustment coefficient and a memory fragmentation adjustment coefficient;

Preprocessing the file blocks in the memory buffer area to obtain preprocessed file blocks, and converting the preprocessed file blocks into character streams, namely reading characters in the preprocessed file blocks one by one and forming a character sequence, namely the character streams.

Further, the method for preprocessing the file blocks in the memory buffer area includes:

Defining a five-tuple $(S, \Sigma, \delta, s_0, F)$, wherein $S$ is the state set, $\Sigma$ is the input alphabet, $\delta$ is the state transition function, $s_0$ is the initial state, with $s_0 \in S$, and $F$ is the set of termination states, with $F \subseteq S$;

The state transition function is $\delta(s, c) = s'$, wherein $s$ is the current state, $c$ is the input character, and $s'$ is the new state;

Defining the state set $S$ and the input alphabet $\Sigma$ based on the m file blocks;

Reading characters $c$ one by one from the file blocks in the memory buffer, and calculating the new state $s'$ from the current state $s$ and the character read $c$ by means of the state transition function; if the new state $s'$ belongs to $F$, executing the corresponding preprocessing operation; updating the current state to $s'$; and repeating until all characters in the file blocks in the memory buffer have been read, to obtain the preprocessed file blocks.

Further, the method for analyzing the lexical unit sequence comprises the following steps:

Defining grammar rules, that is, analyzing the grammar of the programming language; initializing a syntax tree based on the defined grammar rules and creating a root node; extracting lexical units from the lexical unit sequence and matching them against the grammar rules; creating a new node as the left child node of the root node, the first matched lexical unit being a child node of the newly created left child node; matching an operator and creating a new node as the right child node of the root node; continuing to match according to the production rules, creating a new node te as the right child node of the right child node of the root node, the second matched lexical unit being the right child node of the newly created node te; matching an operator and creating a new node tw as the left child node of te; continuing to match a third lexical unit according to the production rules and creating a new node as the right child node of tw; and repeating until all lexical units in the lexical unit sequence have been extracted, thereby completing the construction of the syntax tree.

Further, the method for reorganizing the hierarchical structure of the nodes in the syntax tree includes:

Defining a tree reconstruction rule as a group of pattern matching rules, wherein each pattern matching rule consists of a matching pattern and a reconstruction action;

There are m2 types of matching patterns, which define the structure of the subtree to be reconstructed; the reconstruction actions comprise node lifting, node merging and node splitting;

Assigning a priority weight to each pattern matching rule; for any two pattern matching rules that respectively match two subtrees of the syntax tree, if there is a nesting relationship between the matched patterns, calculating the nesting value between them;

Depending on the nesting value, either the first subtree or the second subtree is reconstructed first; in the remaining case, the priority weights of the two pattern matching rules are compared and the subtree matched by the rule with the higher weight is reconstructed first;

Wherein the nesting value is calculated from: three adjustment coefficients; the lowest common ancestor node of the two subtrees; the root node of a subtree; the size difference between the two subtrees; the degree of overlap of the two subtrees; the depth of the common ancestor node in the syntax tree; and the depth of the root node in the syntax tree;

The reconstruction process comprises the following steps:

Constructing a priority queue Q and initially placing all pattern matching rules into Q in descending order of priority; when the traversal reaches a node n', taking the pattern matching rule with the highest priority out of Q and matching its matching pattern against the subtree rooted at n'; if the matching succeeds, executing the reconstruction action to obtain a reconstructed subtree, and adjusting the other pattern matching rules in Q according to the reconstruction action;

If the matching fails, continuing with the pattern matching rule of the next priority until all pattern matching rules have been tried; and replacing the corresponding parts of the original syntax tree with all the reconstructed subtrees to complete the generation of the abstract syntax tree.

Further, the calculation mode of the information entropy value and the context correlation weight of each key code segment comprises the following steps:

Representing each key code segment as a corresponding sequence of tokens, wherein each token is a keyword, an identifier or a literal;

Counting the frequency with which each unique token occurs in the key code segment, and calculating the information entropy value of the key code segment based on the frequencies;

Wherein the information entropy value further takes into account the code segment complexity and the data flow complexity of the key code segment, and two information adjustment coefficients control their influence;

Wherein the code segment complexity is determined from the loop complexity, the branch complexity and the length of the key code segment, together with three weight adjustment parameters;

The loop complexity refers to the sum of the nesting levels of the nested loops in the key code segment;

The branch complexity refers to the total number of conditional branch statements in the key code segment;

Wherein the data flow complexity is determined from, among other quantities, the scope complexity and the dependency complexity of the key code segment; the scope complexity refers to the sum of the nesting levels of the variable scopes in the key code segment;

Wherein the dependency complexity is determined from the weight coefficient and the dependency depth of each variable, a dependency adjustment parameter, and the complexity of each control dependency;

Converting the key code segment and its context into vector representations, and calculating the cosine similarity between them as the semantic similarity;

Calculating the mean structural similarity between the key code segment and the other key code segments, and calculating the statistical similarity using an n-gram model;

Calculating the context correlation weight of the key code segment based on the semantic similarity, the structural similarity mean and the statistical similarity;

Wherein a first adjustment coefficient and a second adjustment coefficient are used in the calculation.
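As a rough illustration of two of the similarity components named above, the following Python sketch computes a cosine similarity and an n-gram similarity between two token sequences. Several choices here are assumptions, not taken from the source: the vectors are plain bag-of-token counts (the source does not say how the vector representation is produced), the n-gram similarity is a Jaccard overlap of bigram sets, and the function names are hypothetical. How the components are combined into the context correlation weight is not shown.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between bag-of-token count vectors (assumed representation)."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def ngrams(tokens, n=2):
    """Set of all contiguous n-token windows in the sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_similarity(tokens_a, tokens_b, n=2):
    """Jaccard overlap of n-gram sets (one possible 'statistical similarity')."""
    a, b = ngrams(tokens_a, n), ngrams(tokens_b, n)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


fragment = ["total", "=", "total", "+", "i"]
context = ["total", "=", "0"]
semantic = cosine_similarity(fragment, context)
statistical = ngram_similarity(fragment, context)
```

A learned embedding could replace the count vectors without changing the cosine computation itself.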

Further, the calculating method of the structural similarity mean value includes:

Traversing the abstract syntax tree, and extracting the subtrees corresponding to each key code segment according to the position information of the extracted key code segment.

Defining the tree editing operations, which comprise node renaming, node deletion and node insertion, and assigning a cost value to each of them;

For the subtree corresponding to the key code segment and any other subtree, calculating the minimum editing cost of converting the former into the latter, i.e., the tree edit distance;

The calculation mode of the minimum editing cost comprises the following steps:

Initializing a two-dimensional table whose rows and columns correspond respectively to the nodes of the two subtrees, and calculating the value of each cell of the table from the bottom up, each cell representing the minimum editing cost of converting the subtree rooted at the corresponding node of the first subtree into the subtree rooted at the corresponding node of the second subtree;

The structural similarity mean is then obtained from the tree edit distances, the number of all subtrees, and the numbers of nodes of the two subtrees being compared;
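The tree edit distance described above can be sketched in Python with the standard recursive formulation over ordered forests, memoized rather than filled into an explicit two-dimensional table. This is an illustrative sketch only: the unit costs, the nested-tuple tree encoding `(label, (children...))` and the function names are assumptions, and the normalization into a structural similarity mean is not shown.

```python
from functools import lru_cache

# Hypothetical cost values for the three editing operations.
RENAME, DELETE, INSERT = 1, 1, 1


def size(forest):
    """Total number of nodes in a forest of (label, children) tuples."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Minimum cost of editing ordered forest f into ordered forest g."""
    if not f and not g:
        return 0
    if not f:
        return INSERT * size(g)
    if not g:
        return DELETE * size(f)
    (label_v, kids_v), (label_w, kids_w) = f[-1], g[-1]
    return min(
        # delete the rightmost root of f; its children take its place
        forest_distance(f[:-1] + kids_v, g) + DELETE,
        # insert the rightmost root of g
        forest_distance(f, g[:-1] + kids_w) + INSERT,
        # map the two rightmost roots onto each other
        forest_distance(kids_v, kids_w)
        + forest_distance(f[:-1], g[:-1])
        + (0 if label_v == label_w else RENAME),
    )


def tree_edit_distance(t1, t2):
    return forest_distance((t1,), (t2,))


a_plus_b = ("+", (("a", ()), ("b", ())))
a_plus_c = ("+", (("a", ()), ("c", ())))
distance = tree_edit_distance(a_plus_b, a_plus_c)  # one rename: b -> c
```

The memoized recursion is easy to verify but slow on large subtrees; a table-based algorithm such as Zhang-Shasha computes the same distance more efficiently.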

the acquisition mode of the ordered code segment sequence comprises the following steps:

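The information entropy value and the context correlation weight of each key code segment are weighted and summed to obtain a comprehensive score value of each key code segment, and the key code segments are sorted according to their comprehensive score values to obtain the ordered code segment sequence;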
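A minimal Python sketch of this scoring and ordering step follows. The weights `w_entropy` and `w_context`, the descending sort direction and the tuple layout of the input are assumptions made for illustration; the source only states that the two values are weighted and summed.

```python
def order_fragments(fragments, w_entropy=0.5, w_context=0.5):
    """Sort fragments by a weighted sum of entropy and context correlation weight.

    fragments: list of (name, entropy_value, context_weight) tuples.
    Returns the names ordered from highest to lowest score (assumed direction).
    """
    scored = [(w_entropy * h + w_context * c, name) for name, h, c in fragments]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored]


ordered = order_fragments([
    ("parse_record", 1.0, 0.2),
    ("send_dose", 3.0, 0.8),
    ("log_event", 2.0, 0.1),
])
```

Because Python's sort is stable, fragments with equal scores keep their original relative order.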

the acquisition mode of the target code file comprises the following steps:

Traversing the ordered code segment sequence, mapping each key code segment to an equivalent code segment of the target language, and fusing the mapped target code segments according to the context information of the key code segments in the source code file to obtain a target code file;

The reverse mapping is performed by converting the object code file into machine code or intermediate code directly executed by the terminal medical equipment.

A multi-terminal code mapping translation system for an Internet medical platform, used for implementing the above multi-terminal code mapping translation method for an Internet medical platform, comprises:

A source code acquisition and analysis module, used for acquiring a source code file to be translated of the medical platform, wherein the source code file comprises n code segments, performing lexical analysis on the source code file, and generating an abstract syntax tree;

A segment extraction and analysis module, used for traversing the abstract syntax tree, extracting key code segments, and calculating the information entropy value and the context correlation weight of each key code segment;

A segment sorting module, used for sorting the key code segments according to their information entropy values and context correlation weights to obtain an ordered code segment sequence;

A code mapping translation module, used for mapping and translating the ordered code segment sequence to generate an object code file;

A distribution mapping module, used for distributing the object code file to different terminal medical devices, which reverse-map and run the object code file.

The invention relates to a multi-terminal code mapping translation method for an Internet medical platform, which has the technical effects and advantages that:

Through in-depth lexical analysis and abstract syntax tree generation, the invention improves the quality and efficiency of code translation, so that different terminal medical devices can efficiently execute the translated and optimized object code, which improves the performance and reliability of the medical informatization system. By introducing a calculation mechanism for the information entropy value and the context correlation weight, the invention generates better and more efficient object code, which helps to improve the running speed of the system and ensures the semantic correctness of the code in various heterogeneous environments, thereby reducing potential errors and safety hazards. The invention realizes flexible code conversion and optimization and significantly reduces development and maintenance costs. It also automates the code translation process and makes it intelligent, which greatly reduces the workload of manual intervention and coding, improves translation efficiency and accuracy, and effectively improves the response speed and user experience of the medical platform.

Drawings

FIG. 1 is a diagram of a multi-terminal code mapping translation method for an Internet medical platform according to the present invention;

Fig. 2 is a schematic diagram of a multi-terminal code mapping translation system for an internet medical platform according to the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Example 1

Referring to fig. 1, a multi-terminal code mapping translation method for an internet medical platform according to the present embodiment includes:

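S1, acquiring a source code file to be translated of a medical platform, wherein the source code file comprises n code segments, performing lexical analysis on the source code file, and generating an abstract syntax tree;

S2, traversing the abstract syntax tree, extracting key code segments, and calculating an information entropy value and a context correlation weight of each key code segment;

S3, sorting the key code segments according to their information entropy values and context correlation weights to obtain an ordered code segment sequence;

S4, mapping and translating the ordered code segment sequence to generate an object code file;

S5, distributing the object code file to different terminal medical devices, the different terminal medical devices reverse-mapping and running the object code file, so that cross-platform code translation is realized and the running efficiency of the code in heterogeneous environments is improved.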

Further, the generating the abstract syntax tree includes:

The source code file is converted into a character stream (sequence of characters organized in sequence), lexical units are extracted from the character stream, and a sequence of lexical units is generated.

Specifically, an empty sequence shell is initialized, characters in the character stream are read one by one, and a pointer is used for recording the current read position in real time.

Lexical unit types (such as keywords, identifiers, numerical values and the like) are defined, and a corresponding prefix tree (a tree structure that organizes and stores a set of character strings along their prefix paths) is constructed for each lexical unit type, so that all possible character strings contained in a lexical unit type (such as the keyword list) are stored in one prefix tree.

Starting from the current read position, matching against the prefix tree of each lexical unit type is attempted in turn, and if a prefix tree successfully matches the current characters, the corresponding lexical unit is extracted.

The type and the value of the lexical unit are determined according to the corresponding prefix tree, and the position information (such as line number, column number and the like) of the lexical unit in the source code file is recorded. A lexical unit instance is constructed based on the type, the value and the position information of the lexical unit and is added to the sequence shell, and the pointer is moved to the end position of the lexical unit just extracted. This is repeated until the character stream has been read completely, and the final sequence shell is the lexical unit sequence.
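The lexical unit extraction loop described above can be sketched in Python as follows. This is a simplified illustration, not the patented implementation: the keyword and operator lists are hypothetical, prefix trees are used only for the closed sets (keywords and operators), and identifiers and numbers are scanned character by character because they form open sets.

```python
from dataclasses import dataclass


@dataclass
class Token:
    type: str
    value: str
    line: int
    column: int


class Trie:
    """Prefix tree holding the fixed strings of one lexical unit type."""

    def __init__(self, words):
        self.root = {}
        for word in words:
            node = self.root
            for ch in word:
                node = node.setdefault(ch, {})
            node[None] = True  # end-of-word marker

    def longest_match(self, text, start):
        """Length of the longest stored string that begins at text[start]."""
        node, best = self.root, 0
        for offset in range(start, len(text)):
            node = node.get(text[offset])
            if node is None:
                break
            if None in node:
                best = offset - start + 1
        return best


KEYWORDS = Trie(["if", "else", "while", "return"])  # hypothetical keyword list
OPERATORS = Trie(["==", "=", "+", "-", "*", "/", "(", ")", ";"])


def tokenize(text):
    tokens, pos, line, col = [], 0, 1, 1  # `tokens` plays the role of the sequence shell
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            line, col = (line + 1, 1) if ch == "\n" else (line, col + 1)
            pos += 1
            continue
        if ch.isalpha() or ch == "_":
            end = pos
            while end < len(text) and (text[end].isalnum() or text[end] == "_"):
                end += 1
            # keyword only if the keyword trie matches the whole word
            is_kw = KEYWORDS.longest_match(text, pos) == end - pos
            kind = "KEYWORD" if is_kw else "IDENTIFIER"
        elif ch.isdigit():
            end = pos
            while end < len(text) and text[end].isdigit():
                end += 1
            kind = "NUMBER"
        else:
            length = OPERATORS.longest_match(text, pos)
            if length == 0:
                raise SyntaxError(f"unexpected character {ch!r} at {line}:{col}")
            end, kind = pos + length, "OPERATOR"
        tokens.append(Token(kind, text[pos:end], line, col))
        col += end - pos
        pos = end  # move the pointer to the end of the extracted unit
    return tokens


tokens = tokenize("if x1 == 42")
```

Taking the longest match in the operator trie is what lets `==` win over `=` at the same position.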

The method for converting the source code file into the character stream comprises the following steps:

An initial memory buffer with a preset initial size is defined (the initial size may be set to a small value, e.g., 4 KB or 8 KB); the source code file is opened using a file I/O operation to obtain a file stream object; the file stream object is read byte by byte to obtain m file blocks (the small data blocks into which the source code file is divided); the m file blocks are stored in the memory buffer in sequence; and the size of the memory buffer is dynamically adjusted from its current size to a new size.

The adjustment is computed from the following quantities: the total size of the m file blocks; the size of the file blocks that have been read; the total memory size (the total memory of the system); the memory currently in use; a preset upper limit on the memory buffer size; two adjustment factors, which control the proportional relationship between the new buffer size and, respectively, the remaining size of the file and the currently available memory; two exponential factors, which adjust the degree to which the remaining size of the file and the currently available memory influence the buffer size; and an adjustment function whose arguments are the I/O throughput, the CPU utilization and the degree of memory fragmentation.

The degree of memory fragmentation is 1 minus the ratio of the size of the largest contiguous free block in the available memory to the total size of the available memory.
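As a small worked example of this definition, the following Python sketch computes the degree of fragmentation from a list of free-block sizes. The function name and the convention of returning 0.0 when no memory is free are assumptions made for illustration.

```python
def fragmentation_degree(free_blocks):
    """1 - (largest contiguous free block / total free memory).

    free_blocks: sizes in bytes of the contiguous free regions.
    """
    total_free = sum(free_blocks)
    if total_free == 0:
        return 0.0  # assumption: no free memory is treated as not fragmented
    return 1.0 - max(free_blocks) / total_free


# One 4 KB block is unfragmented; the same 4 KB split 1/1/2 is half fragmented.
single = fragmentation_degree([4096])
split = fragmentation_degree([1024, 1024, 2048])
```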

The adjustment function is parameterized by a throughput adjustment coefficient, a utilization adjustment coefficient and a memory fragmentation adjustment coefficient, which control the degree to which each factor influences the buffer size.

Preprocessing the file blocks in the memory buffer area to obtain preprocessed file blocks, and converting the preprocessed file blocks into character streams.

The method for preprocessing the file blocks in the memory buffer area comprises the following steps:

A five-tuple $(S, \Sigma, \delta, s_0, F)$ is defined, wherein $S$ is the state set, $\Sigma$ is the input alphabet, $\delta$ is the state transition function, $s_0$ is the initial state, with $s_0 \in S$, and $F$ is the set of termination states, with $F \subseteq S$.

The state transition function is $\delta(s, c) = s'$, wherein $s$ is the current state, $c$ is the input character, and $s'$ is the new state.

The state set $S$ and the input alphabet $\Sigma$ are defined based on the m file blocks. Specifically, the state set is initialized to the empty set and the input alphabet is initialized to empty; the m file blocks are traversed, and the following operations are executed on each file block:

Each character w in the file block is scanned. If the character w appears for the first time, it is added to the input alphabet, and the state corresponding to the character w is added to the state set according to the type of the character w (such as letter, digit, special character and the like); for example, if the character w is a letter, it may correspond to an identifier start state, and if the character w is a digit, it may correspond to a numeric value start state. The possible subsequent state transitions of each new state are defined according to the lexical rules of the language. After all file blocks have been traversed, the construction of the state set and the input alphabet is complete.

Characters $c$ are read one by one from the file blocks in the memory buffer, and the new state $s'$ is calculated from the current state $s$ and the character read $c$ by means of the state transition function. If the new state $s'$ belongs to $F$, the corresponding preprocessing operation is executed (deleting comments and deleting blank characters, the blank characters including spaces, tabs and line feeds).

The current state is updated to $s'$, and the process is repeated until all characters in the file blocks in the memory buffer have been read, to obtain the preprocessed file blocks.

And converting the preprocessed file blocks into character streams, namely reading characters in the preprocessed file blocks one by one, and forming a character sequence, namely the character streams.
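The state-machine preprocessing described above can be illustrated with the following Python sketch. It is a simplified variant under stated assumptions: it recognizes only C-style `//` and `/* */` comments, it does not treat string literals specially, the state names are hypothetical, and it collapses runs of blank characters to a single space instead of deleting them outright, so that adjacent tokens are not merged.

```python
CODE, SLASH, LINE, BLOCK, STAR = "CODE", "SLASH", "LINE", "BLOCK", "STAR"


def delta(state, ch):
    """State transition function: (current state, input character) -> new state."""
    if state == CODE:
        return SLASH if ch == "/" else CODE
    if state == SLASH:  # a "/" has been seen and is being held back
        if ch == "/":
            return LINE
        if ch == "*":
            return BLOCK
        return CODE
    if state == LINE:  # inside a // comment
        return CODE if ch == "\n" else LINE
    if state == BLOCK:  # inside a /* */ comment
        return STAR if ch == "*" else BLOCK
    # state == STAR: inside a block comment, just after "*"
    if ch == "/":
        return CODE
    return STAR if ch == "*" else BLOCK


def preprocess(block):
    out, state = [], CODE
    for ch in block:
        new_state = delta(state, ch)
        if state == CODE and new_state == CODE:
            out.append(ch)
        elif state == SLASH and new_state == CODE:
            out.append("/" + ch)  # the held "/" was an operator, not a comment
        elif state == LINE and new_state == CODE:
            out.append("\n")  # keep the line break that ended the comment
        # every other transition is inside a comment: emit nothing
        state = new_state
    if state == SLASH:
        out.append("/")
    return " ".join("".join(out).split())  # collapse blank characters


cleaned = preprocess("int x = 1; // counter\n/* block */ x = x / 2;")
```

Holding the `/` back for one character is what distinguishes a division operator from the start of a comment.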

Syntax analysis is performed on the lexical unit sequence to construct a syntax tree, which represents the grammatical structure of the source code.

Specifically, grammar rules are defined, that is, the grammar of the programming language (terminals, non-terminals and production rules) is analyzed.

A syntax tree is initialized based on the defined grammar rules and a root node is created. Lexical units are extracted from the lexical unit sequence and matched against the grammar rules. A new node is created as the left child node of the root node, and the first matched lexical unit becomes a child node of the newly created left child node. An operator is matched (terminal, non-terminal or other operator), and a new node is created as the right child node of the root node. Matching continues according to the production rules: a new node te is created as the right child node of the right child node of the root node, and the second matched lexical unit becomes the right child node of the newly created node te. An operator is matched, and a new node tw is created as the left child node of te. A third lexical unit is matched according to the production rules, and a new node is created as the right child node of tw. This is repeated until all lexical units in the lexical unit sequence have been extracted, which completes the construction of the syntax tree.

During construction, each time a production rule is matched, a corresponding node is created in the syntax tree and the child nodes are correctly connected to their parent node; terminals (lexical units) serve as leaf nodes, and non-terminals serve as internal nodes.

It should also be noted that if no production rule can be matched, a syntax error is indicated; for ambiguous grammars, multiple syntax trees may exist, and the ambiguity needs to be resolved according to the precedence and associativity rules of the language.
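To make the roles of production rules, terminals and non-terminals concrete, here is a small recursive-descent parser in Python for arithmetic expressions. It is an illustrative sketch, not the construction procedure of the embodiment: the three-rule grammar, the tuple encoding of nodes and the function names are all assumptions. Each matched production creates an internal node named after its non-terminal, and lexical units appear as leaves.

```python
def parse(tokens):
    """Build a syntax tree for:  Expr   -> Term (('+'|'-') Term)*
                                 Term   -> Factor (('*'|'/') Factor)*
                                 Factor -> NAME_OR_NUMBER | '(' Expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = take()
            node = ("Expr", node, op, term())  # left-associative
        return node

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = take()
            node = ("Term", node, op, factor())
        return node

    def factor():
        tok = take()
        if tok == "(":
            inner = expr()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return ("Factor", "(", inner, ")")
        if tok is None or not tok.isalnum():
            raise SyntaxError(f"unexpected token {tok!r}")
        return ("Factor", tok)

    tree = expr()
    if peek() is not None:  # no production matches the leftover input
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse(["1", "+", "2", "*", "3"])
```

Splitting the grammar into `Expr` and `Term` levels encodes operator precedence, and the `while` loops encode left associativity, which is one common way to resolve the ambiguity mentioned above.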

An abstract syntax tree is generated based on the syntax tree. Specifically, the syntax tree is preprocessed, that is, each node in the syntax tree is traversed and meaningless nodes such as comments and blank characters are removed.

The hierarchical structure of the nodes in the syntax tree is then reorganized. Specifically, a tree reconstruction rule is defined as a group of pattern matching rules for guiding the adjustment of the node hierarchy of the abstract syntax tree, wherein each pattern matching rule consists of a matching pattern and a reconstruction action.

There are m2 types of matching patterns, which define the structure of the subtree to be reconstructed; for example, the matching pattern BinaryExpr(op, Literal, Literal) represents a binary operation expression both of whose operands are literals.

The reconstruction actions include node lifting, node merging and node splitting. A priority weight is assigned to each pattern matching rule; the higher the priority, the earlier the reconstruction is performed.

Specifically, a base weight is assigned to each pattern matching rule, which may range from 1 to 100. The basic weight reflects the basic importance of the rule in the reconstruction process, and the basic weight is adjusted in real time according to the complexity of the mode.

For any two pattern matching rules that respectively match two subtrees of the syntax tree, if there is a nesting relationship between the matched patterns, the nesting value between them is calculated.

Depending on the nesting value, either the first subtree or the second subtree is reconstructed first; in the remaining case, the priority weights of the two pattern matching rules are compared and the subtree matched by the rule with the higher weight is reconstructed first.

The nesting value is calculated from the following quantities: three adjustment coefficients, which control the degree to which each factor influences the nesting depth and each take values in the range [0, 1]; the lowest common ancestor node of the two subtrees; the root node of a subtree; the size difference between the two subtrees, expressed as the difference in node count or in subtree height; the degree of overlap of the two subtrees, expressed as the proportion of their overlapping nodes to the total number of nodes; the depth of the common ancestor node in the syntax tree (the depth of the root node of the syntax tree being 0); and the depth of the subtree root node in the syntax tree.

The reconstruction process comprises the following steps:

A priority queue Q is constructed, and all pattern matching rules are initially placed into Q in descending order of priority (obtained based on the nesting values). When the traversal reaches a node n', the pattern matching rule with the highest priority is taken out of Q and its matching pattern is matched against the subtree rooted at n'. If the matching succeeds, the reconstruction action is executed to obtain a reconstructed subtree, and the other pattern matching rules in Q are adjusted according to the reconstruction action (if the reconstruction action breaks the matching pattern of another pattern matching rule, that rule needs to be put back into the queue).
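The priority-driven reconstruction can be sketched in Python with `heapq`. This is a simplified illustration under several assumptions: the `Node` class, the two rules (constant folding as an instance of node merging, parenthesis removal as an instance of node lifting) and their weights are hypothetical, the priorities are fixed rather than derived from nesting values, and the queue is simply refilled after every successful action instead of being adjusted selectively.

```python
import heapq
import operator
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    value: object = None
    children: list = field(default_factory=list)


OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def is_constant_binary(n):  # pattern: BinaryExpr(op, Literal, Literal)
    return (n.kind == "BinaryExpr" and n.value in OPS and len(n.children) == 2
            and all(c.kind == "Literal" for c in n.children))


def fold_constants(n):  # node merging: three nodes become one Literal
    left, right = n.children
    return Node("Literal", OPS[n.value](left.value, right.value))


def is_paren(n):
    return n.kind == "Paren" and len(n.children) == 1


def lift_child(n):  # node lifting: the only child replaces its parent
    return n.children[0]


# (priority weight, matching pattern, reconstruction action); weights are made up
RULES = [(90, is_constant_binary, fold_constants), (50, is_paren, lift_child)]


def fresh_queue():
    # heapq is a min-heap, so weights are negated to pop the highest first
    queue = [(-weight, index) for index, (weight, _, _) in enumerate(RULES)]
    heapq.heapify(queue)
    return queue


def rewrite(node):
    node.children = [rewrite(child) for child in node.children]  # bottom-up
    queue = fresh_queue()
    while queue:
        _, index = heapq.heappop(queue)
        _, matches, action = RULES[index]
        if matches(node):
            node = action(node)
            queue = fresh_queue()  # an action may enable other rules again
    return node


tree = Node("BinaryExpr", "+", [
    Node("Paren", None, [Node("BinaryExpr", "*", [Node("Literal", 2), Node("Literal", 3)])]),
    Node("Literal", 4),
])
result = rewrite(tree)  # (2 * 3) + 4 folds to a single literal
```

Both actions strictly reduce the number of nodes, which is why the retry loop is guaranteed to terminate in this sketch.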

Further, the method for extracting the key code segments comprises the following steps:

Each key code segment includes the content of the code segment (in the form of a character string), the type of the code segment (function definition, loop statement, conditional statement, etc.), the position information of the code segment in the source code (file name, line number and column number range), and the parent node information of the code segment (used for determining its context).

Traversing each node of the abstract syntax tree using a depth-first traversal or breadth-first traversal algorithm, and for each node, identifying key parts in the code, such as function definitions, function calls, variable assignments, control flow statements, etc., according to predefined rules, the key parts typically corresponding to a particular node type or node combination pattern in the abstract syntax tree.

For the identified key code snippets, the corresponding source code text is extracted (the corresponding source code location and scope can be traced back from the grammar tree node, and the corresponding code text is extracted).
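The extraction step can be sketched with Python's standard `ast` module, assuming for illustration that the source language is Python. The `KeySnippet` record and the `KEY_TYPES` rule table are hypothetical names chosen to mirror the fields listed above.

```python
import ast
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical "predefined rules": node types treated as key parts of the code.
KEY_TYPES = {
    ast.FunctionDef: "function definition",
    ast.For: "loop statement",
    ast.While: "loop statement",
    ast.If: "conditional statement",
    ast.Assign: "variable assignment",
    ast.Call: "function call",
}


@dataclass
class KeySnippet:
    content: str            # source text of the snippet
    kind: str               # snippet type
    filename: str
    start: Tuple[int, int]  # (line, column) where the snippet begins
    end: Tuple[int, int]    # (line, column) where the snippet ends
    parent_kind: str        # node type of the parent, used for context


def extract_key_snippets(source: str, filename: str = "<source>") -> List[KeySnippet]:
    tree = ast.parse(source)
    snippets: List[KeySnippet] = []

    def visit(node: ast.AST, parent: ast.AST) -> None:
        kind = KEY_TYPES.get(type(node))
        if kind is not None:
            snippets.append(KeySnippet(
                content=ast.get_source_segment(source, node),
                kind=kind,
                filename=filename,
                start=(node.lineno, node.col_offset),
                end=(node.end_lineno, node.end_col_offset),
                parent_kind=type(parent).__name__,
            ))
        for child in ast.iter_child_nodes(node):  # depth-first, pre-order
            visit(child, node)

    visit(tree, tree)
    return snippets
```

The traversal here is depth-first; a breadth-first variant would visit the same nodes in a different order.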

The calculation mode of the information entropy value and the context correlation weight of each key code segment comprises the following steps:

Each key code segment is represented as a corresponding sequence of tokens, where each token is a keyword, identifier, or literal.

The frequency with which each unique token occurs in the key code segment is counted, and the information entropy value of the key code segment is calculated based on these frequencies;

In the corresponding formula, the terms are the code segment complexity and the data flow complexity of the key code segment; the two information adjustment coefficients control the degree of influence of the code segment complexity and the data flow complexity on the information entropy value.
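The frequency-based part of this calculation can be sketched as a Shannon entropy over the token sequence. This is only an assumption about the base term: how the method combines it with the code segment complexity and the data flow complexity is not reproduced here. The function names are hypothetical, and Python's `tokenize` module stands in for the lexer.

```python
import io
import math
import tokenize
from collections import Counter
from typing import List

# Keep keywords and identifiers (both NAME in tokenize) plus numeric and string literals.
KEPT = {tokenize.NAME, tokenize.NUMBER, tokenize.STRING}


def token_sequence(source: str) -> List[str]:
    """Turn a snippet into its sequence of keyword, identifier and literal tokens."""
    readline = io.StringIO(source).readline
    return [tok.string for tok in tokenize.generate_tokens(readline) if tok.type in KEPT]


def shannon_entropy(tokens: List[str]) -> float:
    """Entropy in bits of the token frequency distribution (0.0 for an empty sequence)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())
```

A snippet that repeats a single token has entropy 0, while a snippet whose tokens are all distinct reaches the maximum for its length.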

In the corresponding formula, the terms are the loop complexity, the branch complexity and the length (total number of tokens) of the key code segment; the weight adjustment parameters control the influence of each of these variables on the code segment complexity.

The loop complexity is the sum of the nesting levels of the loops in the key code segment, and the branch complexity is the total number of conditional branch statements in the code segment.
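These two measures can be sketched over a Python syntax tree as follows. The sketch assumes that each loop contributes its own nesting level (1 for an outermost loop, 2 for a loop inside it, and so on) and that each `if` or `elif` counts as one conditional branch statement; both readings are interpretations rather than definitions taken from the text.

```python
import ast

LOOPS = (ast.For, ast.While)


def loop_complexity(node: ast.AST, depth: int = 0) -> int:
    """Sum of nesting levels over all loops below `node` (assumed interpretation)."""
    total = 0
    for child in ast.iter_child_nodes(node):
        if isinstance(child, LOOPS):
            total += depth + 1
            total += loop_complexity(child, depth + 1)
        else:
            total += loop_complexity(child, depth)
    return total


def branch_complexity(node: ast.AST) -> int:
    """Number of conditional statements; an `elif` is a nested ast.If, so it is counted too."""
    return sum(isinstance(n, ast.If) for n in ast.walk(node))
```

For example, a doubly nested `for` loop followed by a separate `while` loop gives a loop complexity of 1 + 2 + 1 = 4.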

In the corresponding formula, the terms are the scope complexity, the dependency complexity and the number of variables of the key code segment.

It should be noted that, in the abstract syntax subtree, every identifier node that is a leaf node corresponds to a variable. Specifically, the abstract syntax subtree corresponding to the key code segment is traversed to find all leaf nodes, each of which is either a constant value or an identifier (variable name); for each leaf node that is an identifier, its variable name is obtained, and the total number of variable names is counted.
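Counting the variables can be sketched as below, again assuming Python source and the standard `ast` module. In that module identifiers appear as `ast.Name` nodes, which the sketch treats as the identifier leaves; function parameters are `ast.arg` nodes and are not counted in this simplified version.

```python
import ast
from typing import Set


def variable_names(source: str) -> Set[str]:
    """Collect the distinct identifier names that occur in a snippet."""
    tree = ast.parse(source)
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}


def variable_count(source: str) -> int:
    return len(variable_names(source))
```

Constants such as `1` or `2` are `ast.Constant` nodes and are therefore skipped, which matches the distinction between constant leaves and identifier leaves.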

Scope complexity refers to the sum of the nesting layers of variable scopes in a key code segment, and dependency complexity refers to the complexity of data dependencies between variables in a key code segment.

In the corresponding formula, the terms are: the weight coefficient of each variable, which reflects the importance of that variable in the key code segment; the dependency depth of each variable, i.e. the number of other variables on which it depends; the dependency adjustment parameter, which controls the degree of influence of the dependency term on the dependency complexity; and the complexity of each control dependency. For each control dependency in the key code segment (e.g. a conditional statement or loop statement), its complexity is evaluated by weighting its number of nesting levels and its number of branches.

The key code segment and its context are converted into vector representations, and the cosine similarity between them is calculated as the semantic similarity.
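The cosine similarity step can be sketched as follows. The text does not say how the vectors are built, so this sketch assumes a simple bag-of-tokens count vector; an embedding model could be substituted without changing the similarity formula.

```python
import math
from collections import Counter
from typing import Sequence


def cosine_similarity(snippet_tokens: Sequence[str], context_tokens: Sequence[str]) -> float:
    """Cosine of the angle between two token-count vectors (0.0 if either is empty)."""
    a, b = Counter(snippet_tokens), Counter(context_tokens)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Identical token bags give a similarity of 1, and bags with no token in common give 0.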

The structural similarity mean between the key code segment and the other key code segments is calculated, and the statistical similarity is calculated using an n-gram model or another statistical model.
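One simple way to obtain an n-gram based statistical similarity is sketched below. It is a stand-in rather than the method's own model: instead of a trained n-gram language model it compares the sets of token n-grams of two snippets with the Jaccard index, and the default `n = 2` is an arbitrary choice.

```python
from typing import Sequence, Set, Tuple


def ngrams(tokens: Sequence[str], n: int = 2) -> Set[Tuple[str, ...]]:
    """All contiguous runs of n tokens, as a set."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_similarity(a: Sequence[str], b: Sequence[str], n: int = 2) -> float:
    """Jaccard overlap of the two n-gram sets (0.0 when both are empty)."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    union = grams_a | grams_b
    if not union:
        return 0.0
    return len(grams_a & grams_b) / len(union)
```

Two snippets that share only the bigram `("if", "x")` out of five distinct bigrams score 0.2.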

The calculation method of the structural similarity mean value comprises the following steps:

Tree editing operations are defined, comprising node renaming, node deletion and node insertion, and a cost value is assigned to each of them; the cost value can be fixed or can be calculated dynamically from factors such as node type and subtree size.

For the subtree corresponding to the key code segment and any other subtree, the minimum editing cost of converting the former into the latter, i.e. the tree edit distance, is calculated.

The calculation mode of the minimum editing cost comprises the following steps:

The structural similarity mean is then obtained; in the corresponding formula, the terms are the number of all subtrees and the node counts of the two subtrees being compared.
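The structural comparison can be sketched with a simplified top-down tree edit distance (in the style of Selkow's algorithm, which only inserts or deletes whole subtrees) rather than a general tree edit distance. Unit costs, the `Tree` type, and the normalization by the sum of the two node counts are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Tree:
    label: str
    children: Tuple["Tree", ...] = ()


def size(t: Tree) -> int:
    """Number of nodes in the tree."""
    return 1 + sum(size(c) for c in t.children)


def edit_distance(a: Tree, b: Tree) -> int:
    """Top-down edit distance: rename costs 1, inserting or deleting a subtree costs its size."""
    rename = 0 if a.label == b.label else 1
    rows, cols = len(a.children), len(b.children)
    # table[i][j]: cost of turning the first i children of a into the first j children of b
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        table[i][0] = table[i - 1][0] + size(a.children[i - 1])
    for j in range(1, cols + 1):
        table[0][j] = table[0][j - 1] + size(b.children[j - 1])
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            table[i][j] = min(
                table[i - 1][j] + size(a.children[i - 1]),   # delete a child subtree
                table[i][j - 1] + size(b.children[j - 1]),   # insert a child subtree
                table[i - 1][j - 1] + edit_distance(a.children[i - 1], b.children[j - 1]),
            )
    return rename + table[rows][cols]


def mean_structural_similarity(target: Tree, others: Sequence[Tree]) -> float:
    """Average of 1 - distance / (|target| + |other|); the normalization is an assumption."""
    sims = [1 - edit_distance(target, o) / (size(target) + size(o)) for o in others]
    return sum(sims) / len(sims) if sims else 0.0
```

With this normalization each pairwise similarity stays between 0 and 1, because the distance can never exceed the combined node count of the two trees.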

Based on the semantic similarity, the structural similarity mean and the statistical similarity, the context correlation weight of the key code segment is calculated;

Here, the first adjustment coefficient controls the degree of influence of the semantic similarity on the context correlation weight, and the second adjustment coefficient controls the degree of influence of the structural similarity and the statistical similarity on the context correlation weight.

By adjusting the first and second adjustment coefficients, the degrees of influence of the semantic, structural and statistical similarity on the context correlation weight can be tuned. Because the correlation between the code segment and its context is considered at the semantic, structural and statistical levels, the degree of correlation can be evaluated more comprehensively, which gives a more accurate context correlation weight.

A weighted sum of the information entropy value and the context correlation weight of each key code segment gives a comprehensive score for that segment, and all key code segments are arranged in descending order of comprehensive score to obtain the ordered code segment sequence.
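The scoring and ordering step can be sketched as below. The two weights are hypothetical defaults, since the text does not give their values, and each snippet is represented simply as a `(name, entropy, context_weight)` tuple.

```python
from typing import List, Sequence, Tuple

Snippet = Tuple[str, float, float]  # (name, information entropy value, context correlation weight)


def composite_score(entropy: float, context_weight: float,
                    w_entropy: float = 0.5, w_context: float = 0.5) -> float:
    """Weighted sum of the two measures; the weights are assumed, not taken from the source."""
    return w_entropy * entropy + w_context * context_weight


def order_snippets(snippets: Sequence[Snippet],
                   w_entropy: float = 0.5, w_context: float = 0.5) -> List[str]:
    """Return snippet names sorted by composite score, highest first."""
    ranked = sorted(
        snippets,
        key=lambda s: composite_score(s[1], s[2], w_entropy, w_context),
        reverse=True,
    )
    return [name for name, _, _ in ranked]
```

Python's `sorted` is stable, so snippets with equal scores keep their original relative order.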

The acquisition mode of the target code file comprises the following steps:

Traversing the ordered code segment sequence, mapping each key code segment to an equivalent code segment of the target language, and fusing the mapped target code segments according to the context information (such as variable scope, function call relation and the like) of the key code segments in the source code file to obtain the target code file.

Specifically, each key code segment is mapped to an equivalent code segment in the target language according to the grammar and API of the target language (for each medical device). This process must take into account the differences between languages, such as grammar structure, keywords, data types and built-in functions; for key code segments that cannot be mapped directly, an equivalent alternative implementation is used, or the necessary auxiliary code (such as type conversion or function wrapping) is inserted.
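A toy version of this mapping is sketched below, assuming Python as the source language and JavaScript as the target language; both choices, and the small set of supported constructs, are illustrative only. Floor division has no direct JavaScript operator, so it is mapped to an equivalent wrapped expression, which is the kind of alternative implementation described above.

```python
import ast

# Operators with a direct equivalent in the assumed target language (JavaScript).
DIRECT_BINOPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def to_target(node: ast.AST) -> str:
    """Map a small subset of Python syntax nodes to JavaScript text."""
    if (isinstance(node, ast.Assign) and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)):
        return f"let {node.targets[0].id} = {to_target(node.value)};"
    if isinstance(node, ast.BinOp) and type(node.op) in DIRECT_BINOPS:
        op = DIRECT_BINOPS[type(node.op)]
        return f"({to_target(node.left)} {op} {to_target(node.right)})"
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.FloorDiv):
        # No direct equivalent: wrap a plain division in an auxiliary function call.
        return f"Math.floor({to_target(node.left)} / {to_target(node.right)})"
    if isinstance(node, ast.Name):
        return node.id
    if (isinstance(node, ast.Constant) and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return repr(node.value)
    raise NotImplementedError(f"no mapping for {type(node).__name__}")


def translate(source: str) -> str:
    """Translate a single-statement Python snippet."""
    return to_target(ast.parse(source).body[0])
```

Unsupported constructs raise `NotImplementedError` instead of producing a guess, so gaps in the mapping table are visible rather than silently mistranslated.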

The specific method for fusing the mapped target code segments comprises the following steps:

Variables are declared and referenced correctly according to their scope; function declaration, parameter passing and return value handling code is inserted according to the function call relationships; code such as class definitions and method overrides is generated according to the class inheritance relationships; and code segments are organized according to control flow structures such as conditional statements and loop statements.

Further, the reverse mapping is performed by converting the object code file into a machine code or an intermediate code directly executed by the terminal medical device.

Different terminal medical devices can reversely map and execute the mapped object code file, achieving cross-platform code translation and operation. It should be noted that, because the hardware and software environments of different devices differ greatly, the reverse mapping and execution processes may need targeted optimization and adjustment to ensure efficient operation of the code in heterogeneous environments.

In this embodiment, deep lexical analysis and abstract syntax tree generation improve the quality and efficiency of code translation, so that different terminal medical devices can efficiently execute the translated and optimized target code, improving the performance and reliability of the medical informatization system. The introduced information entropy value and context correlation weight calculation mechanism produces higher-quality and more efficient target code, improves the running speed of the system, ensures the semantic correctness of the code in heterogeneous environments, and reduces potential errors and security risks. Flexible code conversion and optimization are achieved, and development and maintenance costs are significantly reduced. The code translation process is automated and made intelligent, which greatly reduces the workload of manual intervention and coding, improves translation efficiency and accuracy, and effectively improves the response speed and user experience of the medical platform.

Example 2

Referring to Fig. 2, for parts of this embodiment not described in detail, see the description of Embodiment 1. This embodiment provides a multi-terminal code mapping translation system for an Internet medical platform, which includes:

the source code acquisition and analysis module is used for acquiring a source code file to be translated of the medical platform, wherein the source code file comprises n sections of code fragments, performing lexical analysis on the source code file and generating an abstract syntax tree;

The distribution mapping module is used for distributing the object code file to different terminal medical devices, which reversely map and run the object code file. The modules are connected in a wired and/or wireless manner to realize data transmission among them.

Example 3

This embodiment discloses an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the multi-terminal code mapping translation method for an Internet medical platform when executing the computer program.

Since the electronic device described in this embodiment is an electronic device for implementing a multi-terminal code mapping translation method for an internet medical platform according to the embodiment of the present application, based on the multi-terminal code mapping translation method for an internet medical platform described in the embodiment of the present application, those skilled in the art can understand the specific implementation of the electronic device and various modifications thereof, so how to implement the method in the embodiment of the present application in this electronic device will not be described in detail herein. It is within the scope of the present application to provide an electronic device for implementing a multi-terminal code mapping translation method for an internet medical platform according to the embodiments of the present application.

The above formulas are all dimensionless formulas used for numerical calculation. They were obtained by collecting a large amount of data and performing software simulation so as to approximate the real situation as closely as possible, and the preset parameters and thresholds in the formulas are set by those skilled in the art according to the actual situation.

The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above examples, and all technical solutions belonging to the concept of the present invention belong to the protection scope of the present invention. It should be noted that modifications and adaptations to those skilled in the art without departing from the principles of the present invention are intended to be comprehended within the scope of the present invention.

Claims

1. A multi-terminal code mapping and translation method for an Internet medical platform, characterized by comprising: S1, obtaining a source code file to be translated of a medical platform, the source code file comprising n code segments, performing lexical analysis on the source code file, and generating an abstract syntax tree;

S2, traverse the abstract syntax tree, extract key code snippets, and calculate the information entropy value and context relevance weight of each key code snippet;

S3, sorting each key code snippet according to the corresponding information entropy value and context relevance weight to obtain an ordered code snippet sequence;

S4, mapping and translating the ordered code fragment sequence to generate a target code file;

S5. Distribute the target code file to different terminal medical devices, and the different terminal medical devices reversely map and run the target code file;

The information entropy value and context relevance weight of each key code snippet are calculated as follows:

Represent each key code snippet as a corresponding sequence of tokens, where each token is a keyword, identifier, or literal;

Count the frequency of occurrence of each unique token in the key code snippet; based on the frequency, calculate the information entropy value of the key code snippet;

in the corresponding formula, the terms are the code snippet complexity and the data flow complexity of the key code snippet, and the information adjustment coefficients;

in the corresponding formula, the terms are the loop complexity, the branch complexity and the length of the key code snippet, and the weight adjustment parameters;

Loop complexity refers to the sum of the nesting levels of the loops in the key code snippet; branch complexity refers to the sum of the number of conditional branch statements in the code snippet;

in the corresponding formula, the terms are the scope complexity, the dependency complexity and the number of variables of the key code snippet; scope complexity refers to the sum of the nesting levels of variable scopes in the key code snippet;

in the corresponding formula, the terms are the weight coefficient of each variable, the dependency depth of each variable, the dependency adjustment parameter, and the complexity of each control dependency;

The key code snippet and its context are converted into vector representations, and the cosine similarity between them is calculated as the semantic similarity;

Calculate the mean structural similarity between the key code snippet and the other key code snippets; use the n-gram model to calculate the statistical similarity;

Based on the semantic similarity, the mean structural similarity and the statistical similarity, calculate the contextual relevance weight of the key code snippet,

in the corresponding formula, the terms are the first adjustment coefficient and the second adjustment coefficient;

The calculation method of the structural similarity mean includes:

Traverse the abstract syntax tree and extract the subtree corresponding to each key code fragment according to the location information of the extracted key code fragment;

Define tree editing operations including node renaming, node deletion and node insertion, and assign a cost value to each of node renaming, node deletion and node insertion;

For the subtree corresponding to the key code snippet and any other subtree, calculate the minimum edit cost of converting the former into the latter;

The calculation method of the minimum editing cost includes:

Initialize a two-dimensional table whose rows and columns correspond to the nodes of the two subtrees, and calculate the value of each cell in the table from bottom to top, each cell indicating the minimum editing cost of converting a subtree of the first tree into a subtree of the second;

then, in the corresponding formula, the terms are the number of all subtrees and the numbers of nodes of the two subtrees respectively;

The method of obtaining the ordered code snippet sequence includes:

The information entropy value and context relevance weight of each key code snippet are weighted and summed to obtain a comprehensive score value of each key code snippet; all key code snippets are arranged in descending order according to the size of the comprehensive score value to obtain an ordered code snippet sequence;

The object code file is obtained by:

Traversing the ordered code snippet sequence, mapping each key code snippet to an equivalent code snippet in the target language, and merging the mapped target code snippets according to the context information of the key code snippet in the source code file to obtain a target code file;

The reverse mapping method is to convert the target code file into machine code or intermediate code that is directly executed by the terminal medical device.

2. The multi-terminal code mapping and translation method for an Internet medical platform according to claim 1, characterized in that the method of generating an abstract syntax tree includes:

Convert the source code file into a character stream, extract lexical units from the character stream, and generate a lexical unit sequence; perform grammatical analysis on the lexical unit sequence, build a syntax tree, preprocess the syntax tree, and reorganize the hierarchical structure of the nodes in the syntax tree to obtain an abstract syntax tree.

3. The multi-terminal code mapping and translation method for an Internet medical platform according to claim 2, characterized in that the method of generating a lexical unit sequence includes:

Initialize an empty sequence shell, read the characters in the character stream one by one, and use a pointer to record the current reading position in real time;

Define the lexical unit type and build a corresponding prefix tree for each lexical unit type; starting from the current read position, try to match the prefix tree corresponding to each lexical unit type in turn; if the prefix tree successfully matches the current character, extract the corresponding lexical unit;

Determine the type and value of the lexical unit according to the corresponding prefix tree, and record the position information of the lexical unit in the source code file. Based on the type, value and position information of the lexical unit, construct a lexical unit instance and add it to the sequence shell. Move the pointer to the end position of the lexical unit extracted this time, and repeat until the character stream is read; the final sequence shell is the lexical unit sequence.

4. A multi-terminal code mapping and translation method for an Internet medical platform according to claim 3, characterized in that the method of converting the source code file into a character stream comprises:

Define an initial memory buffer of an initial size, use a file I/O operation to open the source code file and obtain a file stream object, read the file stream object byte by byte to obtain m file blocks, store the m file blocks in the memory buffer in sequence, and dynamically adjust the size of the memory buffer;

in the corresponding formula, the terms are: the total size of the m file blocks; the size of the file blocks that have been read; the total memory size; the currently used memory; the preset upper limit of the memory buffer size; the regulating factors; the exponential factors; the adjustment function; the I/O throughput; the CPU utilization; and the degree of memory fragmentation;

in the adjustment function, the terms are the throughput adjustment factor, the utilization adjustment coefficient and the memory fragmentation adjustment factor;

The file blocks in the memory buffer are preprocessed to obtain preprocessed file blocks, and the preprocessed file blocks are converted into character streams; that is, characters in the preprocessed file blocks are read one by one and form a character sequence, namely, a character stream.

5. The multi-terminal code mapping and translation method for an Internet medical platform according to claim 4, characterized in that the method of preprocessing the file blocks in the memory buffer includes:

Define a quintuple $(Q, \Sigma, \delta, q_0, F)$, where $Q$ is the state set, $\Sigma$ is the input alphabet, $\delta$ is the state transfer function, $q_0$ is the initial state, and $q_0 \in Q$; $F$ is the set of terminal states, and $F \subseteq Q$;

State transfer function $\delta(q, a) = q'$, where $q$ is the current state, $a$ is the input character, and $q'$ is the new state;

Based on the m file blocks, define the state set $Q$ and the input alphabet $\Sigma$;

Read characters one by one from the file block in the memory buffer; according to the current state $q$ and the character read $a$, obtain the new state $q'$ by calculating the state transfer function; if the new state $q'$ belongs to $F$, perform the corresponding preprocessing operations; update the current state to $q'$, and repeat until all characters in the file block in the memory buffer are read; that is, the preprocessed file block is obtained.

6. A multi-terminal code mapping and translation method for an Internet medical platform according to claim 5, characterized in that the method of performing grammatical analysis on the lexical unit sequence includes:

Define grammar rules, that is, analyze the grammar of the programming language; based on the defined grammar rules, initialize a grammar tree, create a root node, extract lexical units from the lexical unit sequence, match them with the grammar rules, and create a new node as the left child of the root node; match the first lexical unit, make it the child of the newly created left child node, match the operator, and create a new node as the right child of the root node; continue matching according to the written production rule, create a new node as the right child node te of the right child node of the root node, match the second lexical unit, make it the right child node of the newly created te, match the operator, create a new node as the left child node tw of the right child node te; continue matching the third lexical unit according to the written production rule, create a new node as the right child node of tw, and repeat until the extraction of lexical units in the lexical unit sequence is completed; that is, the construction of the grammar tree is completed.

7. A multi-terminal code mapping and translation method for an Internet medical platform according to claim 6, characterized in that the method of reorganizing the hierarchical structure of the nodes in the syntax tree comprises:

The tree reconstruction rules are defined as a set of pattern matching rules, each of which consists of a matching pattern and a reconstruction action;

There are m2 matching modes, which define the structure of the subtree that needs to be reconstructed; the reconstruction actions include node promotion, node merging, and node splitting;

For each pattern matching rule, assign a priority weight; for any two pattern matching rules that respectively match subtrees in the syntax tree, if the matching patterns they match have a nested relationship, calculate the nested value between them;

if the nested value satisfies the first condition, the first matched subtree is reconstructed first; if it satisfies the second condition, the second matched subtree is reconstructed first; otherwise, the weights of the two rules are compared, and the one with the higher weight is reconstructed first;

in the corresponding formula, the terms are: the adjustment coefficients; the nearest common ancestor node of the two matched subtrees; the root node; the size difference between the two subtrees; the degree of overlap between the two subtrees; and the depths of the two nodes in the syntax tree;

The reconstruction process includes:

Construct a priority queue Q. Initially, put all pattern matching rules into the priority queue Q in descending order of priority. Each time a node n' is traversed, take out the pattern matching rule with the highest priority from Q and match its matching pattern against the subtree rooted at n'. If the match is successful, perform the reconstruction action to obtain the reconstructed subtree, and adjust the other pattern matching rules in Q according to the reconstruction action;

If the match fails, continue to match the pattern matching rule corresponding to the next priority until all pattern matching rules are traversed and all reconstructed subtrees replace the original syntax tree, thus completing the generation of the abstract syntax tree.

8. A multi-terminal code mapping and translation system for an Internet medical platform, which is used to implement a multi-terminal code mapping and translation method for an Internet medical platform as described in any one of claims 1 to 7, characterized in that it comprises: a source code acquisition and analysis module, which is used to acquire a source code file to be translated of the medical platform, the source code file comprising n code segments, perform lexical analysis on the source code file, and generate an abstract syntax tree;

The fragment extraction and analysis module is used to traverse the abstract syntax tree, extract key code fragments, and calculate the information entropy value and context relevance weight of each key code fragment;

A fragment sorting module is used to sort each key code fragment according to the corresponding information entropy value and context relevance weight to obtain an ordered code fragment sequence;

A code mapping and translation module is used to map and translate the ordered code snippet sequence to generate a target code file;

The distribution mapping module is used to distribute the target code file to different terminal medical devices, and the different terminal medical devices reversely map and run the target code file.