[go: up one dir, main page]

CN110347448A - A method of the model when operation of construction terminal applies behavior - Google Patents

A method of the model when operation of construction terminal applies behavior Download PDF

Info

Publication number
CN110347448A
CN110347448A CN201910498727.XA CN201910498727A CN110347448A CN 110347448 A CN110347448 A CN 110347448A CN 201910498727 A CN201910498727 A CN 201910498727A CN 110347448 A CN110347448 A CN 110347448A
Authority
CN
China
Prior art keywords
activity
activities
runtime
model
terminal application
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910498727.XA
Other languages
Chinese (zh)
Other versions
CN110347448B (en
Inventor
蔡华谦
黄罡
张颖
刘譞哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University filed Critical Peking University
Priority to CN201910498727.XA priority Critical patent/CN110347448B/en
Publication of CN110347448A publication Critical patent/CN110347448A/en
Priority to PCT/CN2019/119272 priority patent/WO2020248512A1/en
Application granted granted Critical
Publication of CN110347448B publication Critical patent/CN110347448B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/448Execution paradigms, e.g. implementations of programming paradigms
    • G06F9/4482Procedural

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

本发明公开了一种构造终端应用行为的运行时模型的方法,通过行为解释器,生成一个完整、准确、详实的应用行为自述,即终端应用应用行为的运行时模型,克服了现有技术在动态、多变、难控的应用运行时环境对终端应用应用行为的监测上的不足,实现了对终端应用应用行为的灵活、完整的监测,为后续实现对终端应用应用行为的指令级控制提供了技术保障。

The invention discloses a method for constructing a runtime model of a terminal application behavior. Through a behavior interpreter, a complete, accurate, and detailed self-statement of the application behavior is generated, that is, a runtime model of the terminal application behavior, which overcomes the limitations of the prior art. The dynamic, changeable, and difficult-to-control application runtime environment has insufficient monitoring of terminal application behaviors, and realizes flexible and complete monitoring of terminal application behaviors, providing a basis for the subsequent realization of command-level control of terminal application behaviors. technical support.

Description

一种构造终端应用行为的运行时模型的方法A Method for Constructing a Runtime Model of Terminal Application Behavior

技术领域technical field

本发明涉及计算机技术,尤其涉及一种构造终端应用行为的运行时模型的方法。The present invention relates to computer technology, in particular to a method for constructing a runtime model of terminal application behavior.

背景技术Background technique

网构软件(也称终端应用)是在互联网开放、动态和多变环境下软件系统基本形态的一种抽象,它既是传统软件结构的自然延伸,又具有区别于集中封装环境下发展起来的传统软件形态的独有基本特征:1)自主性,指网构软件系统中的软件实体具有相对独立性、主动性和自适应性。自主性使其区别于传统软件系统中软件实体的依赖性和被动性;2)协同性,指网构软件系统中软件实体与软件实体之间可按多种静态连接和动态合作方式在开放的网络环境下加以互连、互通、协作和联盟。协同性使其区别于传统软件系统在封闭集中环境下单一静态的连接模式;3)反应性,指网构软件具有感知外部运行和使用环境并对系统演化提供有用信息的能力。反应性使网构软件系统具备了适应开放、动态和多变环境的感知能力;4)演化性,指网构软件结构可根据应用需求和网络环境变化而发生动态演化,主要表现在其实体元素数目的可变性,结构关系的可调节性和结构形态的动态可配置性。演化性使网构软件系统具备了适应比开放、动态和多变环境的应变能力;5)多态性,指网构软件系统的效果体现出相容的多目标性。它可根据某些基本协同原则,在动态变化的网络环境下,满足多种相容的目标形态。多态性使网构软件系统在网络环境下具备了一定的柔性和满足个性化需求的能力。Internet architecture software (also known as terminal application) is an abstraction of the basic form of software systems in the open, dynamic and changeable environment of the Internet. It is not only a natural extension of traditional software structures, but also has a traditional Unique basic features of software form: 1) Autonomy, which means that software entities in Internet-based software systems have relative independence, initiative and adaptability. Autonomy makes it different from the dependence and passivity of software entities in traditional software systems; 2) Collaboration, which means that software entities in Internet-based software systems can communicate with each other in an open network according to various static connections and dynamic cooperation methods. Interconnection, intercommunication, collaboration and alliance in the network environment. Synergy makes it different from the single static connection mode of traditional software systems in a closed and centralized environment; 3) Reactivity refers to the ability of Internet-based software to perceive external operating and usage environments and provide useful information for system evolution. Reactivity enables the Internet-based software system to have the perception ability to adapt to open, dynamic and changeable environments; 4) Evolvability, which means that the Internet-based software structure can dynamically evolve according to application requirements and changes in the network environment, mainly manifested in its physical elements The variability of the number, the adjustability of the structural relationship and the dynamic configurability of the structural form. Evolvability enables Internet-based software systems to have the ability to adapt to open, dynamic and changeable environments; 5) Polymorphism, which means that the effects of Internet-based software systems reflect compatible multi-objectives. It can meet a variety of compatible target forms in a dynamically changing network environment according to some basic synergy principles. Polymorphism makes Internet-structured software systems have a certain degree of flexibility and the ability to meet individual needs in the network environment.

上述网构软件特征的实现,往往需要在运行态修改软件以保障或改善质量、优化或新增功能。经典软件工程方法与技术强调在开发态修改软件,不支持运行态直接修改软件。The realization of the above-mentioned Internet-structured software features often requires software modification in the running state to ensure or improve quality, optimize or add new functions. Classical software engineering methods and techniques emphasize modifying software in the development state, and do not support direct modification of software in the running state.

与之对应,编程语言、操作系统、中间件等系统软件,提供了一种常见的运行态监测与控制应用的主要机制——计算反射(computational reflection,简称反射)。基于计算反射可以实现各种开发框架、测试框架,以提高开发人员在代码开发、测试甚至运行部署中的效率。在计算机领域,B.Smith给出了通用的反射性的定义:反射性是实体具有按照描述、操作和处理实体所面临的主要问题域的相同方式来描述、操作和处理实体自身的一种能力。该定义后续被解释为:反射性是程序具有在运行时刻操纵一组数据的能力,这组数据描述了该程序的运行状态,操纵有两方面含意:1)监测(Introspection),程序可以观测并推理自身的状态;2)控制(Intercession),程序可以改变自身的运行或语义。而这两方面都需要能将程序执行的状态编码为数据,而提供这种编码便称为反射也就是说,反射其实是将程序的运行状态映射为一组可操作的数据。前一部分组成基层实体,后一部分组成元层实体,而基层实体与元层实体之间保持了因果关联。根据基层实体的不同,计算反射主要分为结构反射和行为反射。结构反射的基层实体为当前程序及其抽象数据类型(可视为应用的状态),而行为反射的基层实体则是当前程序的执行行为及其执行所需的数据(可视为应用的行为)。Correspondingly, system software such as programming languages, operating systems, and middleware provide a common main mechanism for operating state monitoring and control applications—computational reflection (reflection for short). Based on computational reflection, various development frameworks and testing frameworks can be implemented to improve the efficiency of developers in code development, testing, and even deployment. In the computer field, B.Smith gave a general definition of reflectivity: reflectivity is the ability of an entity to describe, manipulate and process the entity itself in the same way as it describes, manipulates and processes the main problem domains faced by the entity . This definition is subsequently interpreted as: reflexivity is the ability of a program to manipulate a set of data at runtime. This set of data describes the running status of the program. Manipulation has two meanings: 1) Monitoring (Introspection), the program can observe and reasoning about its own state; 2) control (Intercession), the program can change its own operation or semantics. Both of these aspects need to be able to encode the state of program execution into data, and providing this encoding is called reflection. In other words, reflection actually maps the running state of the program into a set of operable data. The former part constitutes the base-level entity, and the latter part constitutes the meta-level entity, and the causal relationship between the base-level entity and the meta-level entity is maintained. According to different base entities, computational reflection is mainly divided into structural reflection and behavioral reflection. The basic entity of structural reflection is the current program and its abstract data type (which can be regarded as the state of the application), while the basic entity of behavioral reflection is the execution behavior of the current program and the data required for its execution (which can be regarded as the behavior of the application) .

结构反射是指编程语言提供对当前程序及其抽象数据类型反射的能力,由于与编程语言框架(runtime或framework)的能力类似而自然存在,是大多数编程语言框架固有的能力。Structural reflection refers to the ability of a programming language to provide reflection on the current program and its abstract data types. It exists naturally because it is similar to the ability of a programming language framework (runtime or framework), and is an inherent ability of most programming language frameworks.

行为反射是指编程语言提供对自身的执行语义及其执行所需的数据反射的能力,也就是编程语言框架自身需要被反射,行为反射在监测和控制上面临两个挑战:其一,需要完整描述既有的应用行为,也就是对应用的执行进行监测。应用的执行可视为一组运行时活动的集合,活动的粒度越细,监测的信息也越丰富,监测功能占用的资源就越大,其与业务逻辑之间的资源竞争也就越严重。此时,应用行为监测的复杂性和规模性就成为终端应用行为反射的首要挑战。其二,现有编程语言以及操作系统和中间件等系统软件的行为反射,均不支持指令级的行为控制,根本原因在于指令序列蕴含的复杂的数据和控制依赖,因此,应用行为的指令级控制就成为终端应用行为反射的主要难点。Behavior reflection refers to the ability of a programming language to provide its own execution semantics and the data reflection required for its execution, that is, the programming language framework itself needs to be reflected. Behavior reflection faces two challenges in monitoring and control: first, it needs complete Describe the existing application behavior, that is, monitor the execution of the application. The execution of an application can be regarded as a collection of runtime activities. The finer the granularity of the activities, the richer the monitoring information will be. The more resources the monitoring function will occupy, the more serious the resource competition between it and the business logic will be. At this time, the complexity and scale of application behavior monitoring become the primary challenge for terminal application behavior reflection. Second, the existing programming language and the behavior reflection of system software such as operating system and middleware do not support instruction-level behavior control. The root cause is the complex data and control dependencies contained in the instruction sequence. Control becomes the main difficulty of terminal application behavior reflection.

发明内容Contents of the invention

本发明主要目的在于,提供一种构造终端应用行为的运行时模型的方法,克服上述第一个挑战,实现对终端应用运行时行为的完整监测。The main purpose of the present invention is to provide a method for constructing a runtime model of terminal application behavior, overcome the first challenge above, and realize complete monitoring of terminal application runtime behavior.

本发明是通过如下技术方案实现的:The present invention is achieved through the following technical solutions:

为解决上述技术问题,本发明提出了一种构造终端应用行为的运行时模型的方法,所述运行时模型包括运行时栈模型和运行时堆模型,所述方法包括构造所述终端应用行为的运行时栈模型的步骤和构造所述终端应用行为的运行时堆模型的步骤;In order to solve the above technical problems, the present invention proposes a method for constructing a runtime model of terminal application behavior, the runtime model includes a runtime stack model and a runtime heap model, and the method includes constructing a runtime model of the terminal application behavior Steps of a runtime stack model and a step of constructing a runtime heap model of the behavior of the terminal application;

所述构造所述终端应用行为的运行时栈模型的步骤包括:The step of constructing the runtime stack model of the terminal application behavior includes:

在所述终端应用运行时,获取所述终端应用的内存中真正执行的代码,并对所述真正执行的代码进行抽象,生成控制流图;When the terminal application is running, obtain the code actually executed in the memory of the terminal application, and abstract the code actually executed to generate a control flow graph;

针对所述控制流图,将需要监测的控制流图输入至预设的行为解释器;For the control flow graph, input the control flow graph to be monitored into a preset behavior interpreter;

利用所述行为解释器对所述需要监测的控制流图进行解释执行,生成所述终端应用运行时的栈活动;Using the behavior interpreter to interpret and execute the control flow graph that needs to be monitored, and generate stack activities when the terminal application is running;

在所述终端应用运行时,生成所述栈活动的控制流间的依赖关系,得到所述终端应用行为的运行时栈模型;When the terminal application is running, generate a dependency between the control flows of the stack activities, and obtain a runtime stack model of the terminal application behavior;

所述构造所述终端应用行为的运行时堆模型的步骤包括:The step of constructing the runtime heap model of the terminal application behavior includes:

在所述终端应用运行时,生成堆区的初始状态;When the terminal application is running, generate an initial state of the heap;

生成堆操作活动,得到所述终端应用行为的运行时堆模型。A heap operation activity is generated to obtain a runtime heap model of the terminal application behavior.

进一步的,所述方法包括类筛选器和活动类型筛选器;其中,所述类筛选器基于包和类名正则匹配的粗粒度筛选,用于去除开发人员不关心的程序活动;所述活动类型筛选器基于活动类型的细粒度筛选,用于去除与开发者不关心的活动类型。Further, the method includes a class filter and an activity type filter; wherein, the class filter is based on a coarse-grained screening of a regular match between a package and a class name, and is used to remove program activities that developers do not care about; the activity type Filters are based on fine-grained filtering of activity types and are used to remove activity types that developers do not care about.

进一步的,所述栈活动的活动类型包括方法开始与方法结束,字段读,数组读和同步指令;Further, the activity type of the stack activity includes method start and method end, field read, array read and synchronization instruction;

利用所述行为解释器对所述需要监测的控制流图进行解释执行,生成所述终端应用运行时的栈活动的步骤包括:Using the behavior interpreter to interpret and execute the control flow graph that needs to be monitored, the step of generating the stack activity when the terminal application is running includes:

利用对所述终端应用的应用行为具有监测功能的行为解释器对所述需要监测的控制流图进行解释执行,获得所述终端应用运行时的活动;Using a behavior interpreter capable of monitoring the application behavior of the terminal application to interpret and execute the control flow graph that needs to be monitored, to obtain the running activities of the terminal application;

根据所关注的类,利用所述类筛选器对所述终端应用运行时的活动进行粗粒度筛选,生成由所述类引起的栈活动;According to the concerned class, use the class filter to perform coarse-grained screening on the activities of the terminal application during runtime, and generate stack activities caused by the class;

针对所述栈活动的活动类型,利用所述活动类型筛选器对所述栈活动进行细粒度筛选。For the activity type of the stack activity, the activity type filter is used to perform fine-grained screening on the stack activity.

进一步的,所述构造所述终端应用行为的运行时堆模型的步骤包括:Further, the step of constructing the runtime heap model of the terminal application behavior includes:

所述终端应用运行时的活动包括实例化活动、修改活动和回收活动。The running activities of the terminal application include instantiation activities, modification activities and recycling activities.

进一步的,所述堆操作活动的活动类型包括对象实例化,数组实例化,对象字段写,数组元素写,清除活动和压缩活动;Further, the activity type of the heap operation activity includes object instantiation, array instantiation, object field writing, array element writing, clearing activity and compression activity;

所述生成堆操作活动的步骤包括:The steps of generating heap operation activities include:

根据所关注的类,利用所述类筛选器对所述终端应用运行时的活动进行粗粒度筛选,生成由所述类引起的堆操作活动;According to the concerned class, use the class filter to perform coarse-grained screening on the activities of the terminal application during runtime, and generate heap operation activities caused by the class;

针对所述堆操作活动的活动类型,利用所述活动类型筛选器对所述堆操作活动进行细粒度筛选。For the activity type of the heap operation activity, the activity type filter is used to perform fine-grained screening on the heap operation activity.

进一步的,所述依赖关系包括同步依赖和通信依赖。Further, the dependencies include synchronization dependencies and communication dependencies.

进一步的,在生成控制流间的同步依赖关系时,对于一个方法的结束依赖另一个方法的结束的情况,利用时间戳由后往前寻找其他线程中可匹配的活动,如果找到,则对应于一个同步依赖关系,对于一个活动的开始依赖另一个活动的结束的情况,先对当前线程进行检查,如果该活动为当前线程中的第一个执行的活动,则该活动是依赖另一个线程结束活动的,否则该活动仅是正常的方法调用,并不依赖另一个线程的活动。Furthermore, when generating synchronization dependencies between control flows, for the case where the end of one method depends on the end of another method, use the timestamp to find matching activities in other threads from the back to the front, and if found, it corresponds to A synchronization dependency. For the case where the start of an activity depends on the end of another activity, the current thread is checked first. If the activity is the first activity executed in the current thread, the activity is dependent on the end of another thread. activity, otherwise the activity is just a normal method call and does not depend on another thread's activity.

进一步的,在生成所述栈活动的控制流间的依赖关系时,对所有与活动间通信依赖相关的类进行总结,并将所述类的相关方法与线程依赖相关的方法一起作为生成通信依赖的知识库。Further, when generating the dependencies between the control flows of the stack activities, summarize all the classes related to the communication dependencies between the activities, and use the related methods of the classes and the methods related to the thread dependencies as the generated communication dependencies knowledge base.

进一步的,在所述运行时模型生成时,将所述运行时模型中的活动序列存放于可配置大小的缓冲区中,当活动数量超过预设时,则将所述缓冲区的活动进行序列化,并持久化到本地存储中。Further, when the runtime model is generated, the activity sequence in the runtime model is stored in a buffer with a configurable size, and when the number of activities exceeds a preset value, the activities in the buffer are sequenced and persist to local storage.

进一步的,所述运行时堆模型以巴科斯范式的形式表示。Further, the runtime heap model is expressed in the form of Backus-Naur Form.

与现有技术相比,本发明通过行为解释器,生成一个完整、准确、详实的应用行为自述,即终端应用应用行为的运行时模型,克服了现有技术在动态、多变、难控的应用运行时环境对终端应用应用行为的监测上的不足,实现了对终端应用应用行为的灵活、完整的监测,为后续实现对终端应用应用行为的指令级控制提供了技术保障。Compared with the prior art, the present invention uses a behavior interpreter to generate a complete, accurate, and detailed application behavior statement, that is, a runtime model of the terminal application application behavior, which overcomes the dynamic, changeable, and difficult-to-control problems of the prior art. The lack of monitoring of terminal application behavior by the application runtime environment enables flexible and complete monitoring of terminal application behavior, and provides technical support for the subsequent realization of command-level control of terminal application behavior.

附图说明Description of drawings

图1是现有3G无线资源控制状态机;FIG. 1 is an existing 3G radio resource control state machine;

图2(a)是一个网络请求合并的实例中合并前的网络请求控制流示意图;Fig. 2 (a) is the network request control flow schematic diagram before merging in the example of a network request merging;

图2(b)是一个网络请求合并的实例中合并后的网络请求控制流示意图;Fig. 2 (b) is a network request control flow diagram after merging in an example of network request merging;

图3是一个线程之间通信依赖的例子——生产者-消费者模式示意图;Figure 3 is an example of communication dependencies between threads - a schematic diagram of the producer-consumer model;

图4是本发明一种构造终端应用行为的运行时模型的方法的步骤流程图;FIG. 4 is a flow chart of steps of a method for constructing a runtime model of terminal application behavior in the present invention;

图5是安卓多线程编程示例;Fig. 5 is an example of Android multi-thread programming;

图6是一个多线程编程间依赖示例;Figure 6 is an example of dependencies between multi-threaded programming;

图7(a)是执行前堆区对象;Figure 7(a) is the heap object before execution;

图7(b)是执行后堆区对象;Figure 7(b) is the heap object after execution;

图8是本发明示例Reflectall模型生成子系统架构示意图;Fig. 8 is a schematic diagram of the architecture of the reflectall model generation subsystem of the present invention;

图9是本发明示例Reflectall的接口运行子系统的结构示意图;Fig. 9 is the structural representation of the interface operating subsystem of the example Reflectall of the present invention;

图10(a)是开源应用集上的实验结果;Figure 10(a) is the experimental result on the open source application set;

图10(b)是闭源应用集上的实验结果;Figure 10(b) is the experimental result on the closed-source application set;

图11是Reflectall与Emma生成代码覆盖率报告时的应用启动时间结果对比图。Figure 11 is a comparison of application startup time results when Reflectall and Emma generate code coverage reports.

具体实施方式Detailed ways

为使本发明的目的、技术方案和优点更加清楚明白,下面结合实施例和附图,对本发明作进一步详细说明。In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the embodiments and accompanying drawings.

为了更好理解本申请的技术问题,本发明采用选取两个典型案例的应用功能演化场景进行分析,以此来明确现有的行为反射不适用的根本原因。In order to better understand the technical problems of this application, the present invention adopts the application function evolution scenarios of two typical cases for analysis, so as to clarify the root cause of the inapplicability of the existing behavioral reflection.

案例一:Case number one:

随着智能手机的发展,终端的移动应用越来越依赖云端提供的软件、硬件资源,以提供更好的服务。然而,云端与终端之间的通讯需要消耗大量的电能。联网应用(如天气、邮件、新闻等)呈现了网构软件典型的构件化特点,利用网络实现了终端与云端各构件的通讯。特别在3G/4G环境下,联网应用在后台长时间、间隔式地利用网络获取相应的推送消息。这种长时间、间隔式的消息推送给电池容量有限的智能手机的续航带来了巨大压力。3G和4G是当前主流使用的移动蜂窝网络,其耗电特点更为复杂。一方面,因为蜂窝网络移动性较强,对于一个移动设备,随着物理位置的移动,可能快速切换到不同蜂窝网络基站。因此,对于蜂窝网络基站,不可能将一个信道一直分配给一台移动设备。另一方面,由于移动设备续航有限,而长时间连接至蜂窝网络基站,会大大提高其功耗,影响续航。因此,蜂窝网络标准中,进一步对无线资源控制(Radio Resource Control,RRC)模块的状态进行了规定。With the development of smart phones, terminal mobile applications increasingly rely on software and hardware resources provided by the cloud to provide better services. However, the communication between the cloud and the terminal consumes a lot of power. Networked applications (such as weather, email, news, etc.) present the typical component-based characteristics of Internet architecture software, and use the network to realize the communication between the terminal and the components of the cloud. Especially in the 3G/4G environment, networked applications use the network to obtain corresponding push messages in the background for a long time and at intervals. This kind of long-term and intermittent message push has brought great pressure on the battery life of smartphones with limited battery capacity. 3G and 4G are currently the mainstream mobile cellular networks, and their power consumption characteristics are more complicated. On the one hand, because the mobility of the cellular network is strong, a mobile device may quickly switch to a different cellular network base station as the physical location moves. Therefore, it is not possible for a cellular network base station to assign a channel to a mobile device all the time. On the other hand, due to the limited battery life of mobile devices, connecting to a cellular network base station for a long time will greatly increase its power consumption and affect battery life. Therefore, in the cellular network standard, the state of the radio resource control (Radio Resource Control, RRC) module is further specified.

以移动设备中的3G网络模块为例,一共包含三个状态,如图1所示。Taking the 3G network module in the mobile device as an example, it contains three states in total, as shown in FIG. 1 .

(1)IDLE:即空闲状态,在此状态下,3G模块的耗电最低,同时也不能发送、接收任何数据。在这个状态下,如果要发送或接收数据,则会转移至CELL_DCH状态。(1) IDLE: It is the idle state. In this state, the power consumption of the 3G module is the lowest, and it cannot send or receive any data. In this state, if data is to be sent or received, it will transfer to the CELL_DCH state.

(2)CELL_DCH:在这个状态下,3G模块的带宽达到最大,此时能以最大的速率进行数据传输,同时它的功耗也是最大的。如果持续一段时间,仍然没有数据传输的话,它会转移至CELL_FACH状态。根据不同的运营商的设置,持续运行于CELL_DCH状态的时间通常是5秒至10秒。(2) CELL_DCH: In this state, the bandwidth of the 3G module reaches the maximum. At this time, data transmission can be performed at the maximum rate, and its power consumption is also the maximum. If there is still no data transmission for a period of time, it will transfer to the CELL_FACH state. According to the settings of different operators, the time for continuous operation in the CELL_DCH state is usually 5 seconds to 10 seconds.

(3)CELL_FACH:在这个状态下,3G模块的耗电比CELL_DCH要节省50%,同时,在这个状态下,其网络传输速率也较低。如果在这个状态,发送或接收的数据大于某个阀值,则会重新转移至CELL_DCH状态。而如果在CELL_FACH状态持续一段时间没有发送或接收数据,则会转移至IDLE状态。一般来说这段时间通常是10秒至15秒。(3) CELL_FACH: In this state, the power consumption of the 3G module is 50% less than that of CELL_DCH. At the same time, in this state, its network transmission rate is also low. If in this state, the data sent or received is greater than a certain threshold, it will re-transfer to the CELL_DCH state. And if no data is sent or received in the CELL_FACH state for a period of time, it will transfer to the IDLE state. Generally speaking, this period of time is usually 10 seconds to 15 seconds.

图2展示了一个网络请求合并的实例。图2(a)为合并前的网络请求与无线通讯模块耗电,其横轴为时间,上半部分为无线通讯模块的功耗;下半部分的虚线为发起这两个网络请求的线程;下半部分的实线为其控制流。首先,一个后台的新闻推送线程唤醒了负责发送网络请求的线程①;该线程被唤醒后,发起了网络请求②,此时,无线通讯模块的功耗也由IDLE状态下的低功耗,变为CELL_DCH状态下的高功耗;当整个请求完成后,负责发送网络请求的线程将结果返回给新闻推送线程③,此时虽然无线通讯模块没有接收或发送数据,但它仍会保持在高功耗状态,由此开始的无线通讯模块耗电被称之为“尾时间耗电”,对应为图2(a)中用的斜线部分;新闻推送线程在收到返回结果后④,对结果进行处理,并在通知栏上进行提示⑤。又过了一段时间后,另一个版本更新线程也执行了类似的逻辑⑥,发送了网络请求。如图2(a)所示,由于这两个网络请求间隔了几十秒,导致无线通讯模块被唤醒了两次,因此也有了对应的两次“尾时间”,由此导致了额外的网络能耗。Figure 2 shows an example of network request coalescing. Figure 2(a) shows the power consumption of the network request and the wireless communication module before merging, the horizontal axis is time, the upper half is the power consumption of the wireless communication module; the dotted line in the lower half is the thread that initiated the two network requests; The solid line in the lower part is its control flow. First, a background news push thread wakes up the thread responsible for sending network requests ①; after the thread is woken up, it initiates a network request ②. At this time, the power consumption of the wireless communication module also changes from the low power consumption in the IDLE state to It is high power consumption in the CELL_DCH state; when the entire request is completed, the thread responsible for sending the network request will return the result to the news push thread ③, although the wireless communication module does not receive or send data at this time, it will still maintain high power consumption The power consumption of the wireless communication module starting from this point is called "tail time power consumption", which corresponds to the slash part used in Figure 2(a); after the news push thread receives the returned result ④, it checks the result Process and prompt on the notification bar ⑤. After another period of time, another version update thread also executed similar logic ⑥ and sent a network request. As shown in Figure 2(a), since the two network requests are separated by dozens of seconds, the wireless communication module is woken up twice, so there are also two corresponding "tail times", which leads to additional network requests. energy consumption.

对于安卓应用,有很大一部分后台请求可以被延迟几十秒、甚至两、三分钟,而不会影响用户的体验。例如上述的新闻推送、版本更新推送等。对于这些网络请求,如果在时间维度上进行合并,即两个请求同时发送,而不是间隔了几十秒发送,就可以减少“尾时间”的网络耗电。图2(b)为图2(a)中两个请求进行了合并后的控制流及其无线通讯模块耗电情况。首先,负责发送网络请求的线程被新闻推送线程唤醒后,并不直接发送网络请求,而是进入一个等待状态⑦。一段时间后,另一个网络请求线程被后台更新推送线程唤醒,同时,它也进入一个等待状态⑧。在等待状态结束后⑨,这两个线程同时发送网络请求,对应的无线通讯模块也只被唤醒一次。如图2(b)所示,合并后的网络请求的耗电要远小于合并前的网络请求耗电。For Android applications, a large part of background requests can be delayed for tens of seconds, or even two or three minutes, without affecting the user experience. For example, the above-mentioned news push, version update push, etc. For these network requests, if they are combined in the time dimension, that is, two requests are sent at the same time instead of being sent with an interval of tens of seconds, the network power consumption of the "tail time" can be reduced. Fig. 2(b) shows the control flow and the power consumption of the wireless communication module after the two requests in Fig. 2(a) are merged. First of all, after the thread responsible for sending network requests is awakened by the news push thread, it does not directly send network requests, but enters a waiting state⑦. After a period of time, another network request thread is woken up by the background update push thread, and at the same time, it also enters a waiting state⑧. After the waiting state is over⑨, the two threads send network requests at the same time, and the corresponding wireless communication module is only woken up once. As shown in Figure 2(b), the power consumption of the combined network request is much smaller than that of the network request before the combination.

为了实现网络请求合并,需要1)一种网络请求调度机制,即,使原本直接发送的网络请求可被延迟发送;2)一种网络请求调度算法,即找出可被延迟调度的请求,同时利用调度机制进行延迟发送。利用结构反射可以实现自动化重构移动应用的网络请求执行逻辑,并把调度机制内置进应用。然而这就要求不同应用的开发者均使用同一个自动重构框架,并且需要对所有应用进行重新的编译、部署和运行。这对于大量的、分属于不同应用开发者的闭源应用显然不现实。In order to achieve network request merging, 1) a network request scheduling mechanism is needed, that is, the network requests that were originally sent directly can be delayed; 2) a network request scheduling algorithm is to find out the requests that can be delayed scheduling, and at the same time Use the scheduling mechanism for delayed sending. Structural reflection can be used to automatically refactor the network request execution logic of mobile applications, and build the scheduling mechanism into the application. However, this requires developers of different applications to use the same automatic refactoring framework, and all applications need to be recompiled, deployed, and run. This is obviously unrealistic for a large number of closed-source applications belonging to different application developers.

案例二:Case two:

随着微信的普及,微信已经不仅仅是一个简单的通讯应用,它还成为了工作交流的必备工具;催生了利用微信朋友圈、公众号进行营销的“微商”;成为了最大的自媒体的发布平台。微信其核心作为一个通讯工具,其功能还是以满足普通用户为主。即便如此,它也很难满足普通用户的特定需求。例如,随着微信使用的时间越来越长,其缓存的聊天记录文件也越来越大,对于普通用户而言,很难对自己的聊天日志进行管理。更进一步地,微信难以满足微商、自媒体人等特殊群体的特定的需求。要实现微信应用内数据与功能的开放共享,就需要将面向用户的用户界面接口转换为面向互操作的可编程接口。一般而言,对于面向用户的用户界面接口,其执行的起点在于用户界面元素的点击、拖动和输入等操作。经过部分的逻辑处理,以网络请求、数据库查询、文件读写的方式,访问外部资源,获取相应的数据或实现相应的功能。在该处理流程中,有大部分的逻辑是与面向互操作的可编程接口的执行逻辑是相似的,只是其执行的起点不同。然而,现有的行为反射监测和控制的粒度为方法级别。基于现有的行为反射,在既有应用的执行流程中插入某些执行逻辑的方式,难以实现将面向用户的用户界面接口转换为面向互操作的可编程接口:既有的功能可对应于运行时的一组程序活动,以方法为粒度的行为反射其监测的内容有限,无法监测方法内的指令的执行,继而也无法控制。这就导致了现有的解决方案往往是基于既有的代码和文档,对于一个开发团队而言,开发人员的流动、文档的缺失、甚至是不规范的源码注释都会使得移动应用的迭代开发变得难以进行。With the popularity of WeChat, WeChat is not just a simple communication application, it has also become an essential tool for work communication; it has given birth to the "WeChat business" that uses WeChat Moments and official accounts for marketing; it has become the largest self- Media publishing platform. The core of WeChat is a communication tool, and its function is mainly to satisfy ordinary users. Even so, it struggles to meet the specific needs of the average user. For example, as WeChat is used for a longer period of time, its cached chat log files are also getting larger and larger. For ordinary users, it is difficult to manage their own chat logs. Furthermore, it is difficult for WeChat to meet the specific needs of special groups such as WeChat merchants and self-media people. In order to realize the open sharing of data and functions in the WeChat application, it is necessary to convert the user-oriented user interface interface into an interoperable programmable interface. Generally speaking, for a user-oriented user interface interface, the starting point of its execution lies in operations such as clicking, dragging, and inputting elements of the user interface. After some logical processing, access to external resources in the form of network requests, database queries, and file reading and writing, to obtain corresponding data or realize corresponding functions. In this processing flow, most of the logic is similar to the execution logic of the interoperability-oriented programmable interface, but the starting point of its execution is different. However, the granularity of existing behavioral reflection monitoring and control is method level. Based on the existing behavioral reflection, it is difficult to convert the user-oriented user interface interface into an interoperable programmable interface by inserting some execution logic into the execution flow of the existing application: the existing functions can correspond to the running A set of program activities at the time, the behavior at the granularity of the method reflects the limited monitoring content, and it is impossible to monitor the execution of the instructions in the method, and then it cannot be controlled. This has led to the fact that existing solutions are often based on existing codes and documents. For a development team, the flow of developers, lack of documents, and even non-standard source code comments will make iterative development of mobile applications change. It is difficult to carry out.

由上述两个案例分析可以看出,难以实现移动应用互操作接口根本原因在于现有的工作由于缺乏对应用行为的一个完整、详实的描述,同时也没有对这种指令粒度的自描述的控制的方法。因而,能否给出一个完整描述应用行为、并且可操作的运行时模型也就成了解决本发明问题的难点和关键。From the analysis of the above two cases, it can be seen that the root cause of the difficulty in realizing the mobile application interoperability interface is that the existing work lacks a complete and detailed description of the application behavior, and at the same time, there is no control over the self-description of the instruction granularity. Methods. Therefore, whether a runtime model that fully describes the application behavior and is operable can be given becomes the difficulty and the key to solve the problem of the present invention.

针对上述技术问题,本发明实施例提出了一种构造终端应用行为的运行时模型的方法,所述运行时模型包括运行时栈模型和运行时堆模型。In view of the above technical problems, an embodiment of the present invention proposes a method for constructing a runtime model of terminal application behavior, and the runtime model includes a runtime stack model and a runtime heap model.

应用在操作系统中运行起来之后,可视为一个或多个进程,操作系统将移动应用所需的可执行文件加载到内存中,并开始执行。一般而言,一个进程所占的内存可分为三个区域:After the application runs in the operating system, it can be regarded as one or more processes, and the operating system loads the executable files required by the mobile application into the memory and starts execution. In general, the memory occupied by a process can be divided into three areas:

代码段:存放执行代码的一块内存区域,具有只读属性;Code segment: a memory area that stores the execution code, which has a read-only attribute;

堆区:可分为用于存放全局变量的一块内存区域(数据段),和用于进程运行中动态分配的内存区域,例如,面向对象编程语言Java中,线程创建一个新的对象相当于在堆区中申请了一片内存;Heap area: It can be divided into a memory area (data segment) for storing global variables and a memory area for dynamic allocation during process running. For example, in the object-oriented programming language Java, creating a new object by a thread is equivalent to A piece of memory is applied for in the heap area;

栈区:用于临时存局部变量等。例如,面向对象编程语言Java中,线程调用一个方法时,会新申请一个帧(frame),帧中保存了方法所需的参数等数据。Stack area: used for temporary storage of local variables, etc. For example, in the object-oriented programming language Java, when a thread calls a method, it will apply for a new frame (frame), which stores data such as parameters required by the method.

经过发明人仔细研究发现,终端应用在运行时,代码段的执行会引起堆区和栈区内存数据的变化。应用的运行时模型需要能反映应用在一段时间内的:1)代码的执行情况:在开发时,移动应用的代码可抽象为控制流图,那么对应于运行时,代码的执行情况可抽象为控制流图的一条或多条路径;2)内存数据(如堆区)的变化:在开发时,开发人员会设计好各种数据结构以表示应用的数据模型(Data Model),而在运行时,代码的执行引起对这些数据结构的实例的创建、修改、删除,也就是对应于一组内存的分配和修改操作。从内存区域角度上看,程序执行所影响的最主要的区域是内存的栈区和堆区。1)中的控制流图中的路径可视为对栈变化的一个描述,而2)中主要体现的是堆区数据的变化。After careful research by the inventor, it is found that when the terminal application is running, the execution of the code segment will cause changes in the memory data in the heap area and the stack area. The runtime model of the application needs to be able to reflect the following: 1) The execution of the code: during development, the code of the mobile application can be abstracted as a control flow graph, and corresponding to the runtime, the execution of the code can be abstracted as One or more paths of the control flow graph; 2) Changes in memory data (such as the heap area): During development, developers will design various data structures to represent the application's data model (Data Model), while at runtime , the execution of the code causes the creation, modification, and deletion of instances of these data structures, that is, the allocation and modification operations corresponding to a set of memory. From the perspective of memory area, the most important area affected by program execution is the stack area and heap area of memory. The path in the control flow graph in 1) can be regarded as a description of the stack change, while 2) mainly reflects the change of the heap area data.

因此,本发明所构造的应用运行时模型包括一个描述栈变化的运行时栈模型和一个描述堆变化的运行时堆模型。其中运行时栈模型还包括对代码的获取,以此将一个进程所占的内存完整的分为三个区域。通过本发明实施例的运行时栈模型,可以了解在任意时刻移动应用的代码执行情况;而通过运行时堆模型,可以了解在任意时刻代码执行所依赖的对象数据状态。Therefore, the application run-time model constructed by the present invention includes a run-time stack model describing stack changes and a run-time heap model describing heap changes. The runtime stack model also includes the acquisition of codes, so that the memory occupied by a process is completely divided into three areas. Through the runtime stack model of the embodiment of the present invention, it is possible to know the code execution status of the mobile application at any time; and through the runtime heap model, it is possible to know the object data state on which the code execution depends at any time.

运行时栈模型runtime stack model

控制流图为一个有向图G=<B,P>;The control flow graph is a directed graph G=<B, P>;

其中,B={b1,b2,…,bn}为基本块;Among them, B={b 1 , b 2 ,..., b n } is a basic block;

为控制流路径; is the control flow path;

对于任意pi=(bi1,bi2),pi∈P,当且仅当bi2可能bi1之后执行。而在运行时,控制流图会实例化为一个或多个控制流,并按照控制流图中的路径执行基本块。本发明称在某一时刻执行的基本块为活动,则一段时间内的运行时栈模型是由控制流图,一个或多个控制流,和一组活动序列组成。当基本块的粒度为指令粒度时,活动序列就是指令执行序列。下面给出本发明所述的运行时栈模型的形式化定义。For any p i =(b i1 , b i2 ), p i ∈ P is executed if and only if b i2 is possible after b i1 . At runtime, the control flow graph will be instantiated into one or more control flows, and the basic blocks will be executed according to the path in the control flow graph. The present invention calls the basic blocks executed at a certain moment as activities, and then the runtime stack model within a period of time is composed of a control flow graph, one or more control flows, and a group of activity sequences. When the granularity of the basic block is the instruction granularity, the active sequence is the instruction execution sequence. The formal definition of the runtime stack model of the present invention is given below.

定义运行时栈模型为一个或多个控制流在一段时间内发生的活动的集合M=<G,T,A,I,E>,Define the runtime stack model as a set of activities M=<G, T, A, I, E> of one or more control flows occurring over a period of time,

其中,G=<B,P>为控制流图,T={t1,t2,…,tn}为一组时刻,I={i1,i2,…,in},表示t1至tn时刻的程序的堆区状态。Among them, G=<B, P> is the control flow graph, T={t 1 , t 2 ,…,t n } is a set of time instants, I={i 1 , i 2 ,…,i n }, representing t The heap status of the program from time 1 to t n .

令F={f1,f2,…,fn},为一组控制流,则A=F×I×T×B为一段时间内发生的活动的集合,表示两个活动发生的前后关系的集合。Let F={f 1 , f 2 ,...,f n } be a group of control flows, then A=F×I×T×B is a collection of activities that occur within a period of time, Represents a collection of contexts in which two activities occur.

运行时栈模型可视为控制流图的多条路径的集合,因此,在运行时栈模型中的边必须在控制流图中有相应的边。即:其中ai(fi1,ti2,bi3),aj=(fj1,tj2,bj3)aj=(fj1,tj2,bj3),有(bi3,bj3)∈P。除此之外,运行时栈模型的边表示两个活动发生的前后关系,对于在同一控制流中的两个活动,具有时间上的前后顺序关系;对于在不同控制流中的两个活动如果存在边,则表明这两个活动间还具有依赖关系。The run-time stack model can be viewed as a collection of multiple paths in the control flow graph, therefore, edges in the run-time stack model must have corresponding edges in the control flow graph. which is: Where a i (f i1 , t i2 , b i3 ), a j = (f j1 , t j2 , b j3 ) a j = (f j1 , t j2 , b j3 ), there is (b i3 , b j3 )∈ p. In addition, the edge of the runtime stack model represents the context of two activities. For two activities in the same control flow, there is a sequence relationship in time; for two activities in different control flows, if The presence of an edge indicates that there is also a dependency between the two activities.

在同一个控制流中,若有两个活动具有前后发生关系,则对于任意的其他活动,都不可能在同一控制流的这两个活动之间发生,即其中ai=(fi1,ti2,bi3),aj=(fj1,tj2,bj3),如果fi1≠fj1,则 In the same control flow, if there are two activities that have the relationship of happening before and after, it is impossible for any other activity to happen between these two activities in the same control flow, that is where a i = (f i1 , t i2 , b i3 ), a j = (f j1 , t j2 , b j3 ), if f i1 ≠ f j1 , then

在不同控制流中,若两个活动具有前后发生关系,则对于后一个活动所在的控制流,在发生前一活动的时刻之后,可以先发生其他活动。其中ai=(fi1,ti2,bi3),aj=(fj1,tj2,bj3),如果fi1≠fj1,则ti2<tj2In different control flows, if two activities have a occurrence relationship, then for the control flow where the latter activity is located, after the moment when the previous activity occurs, other activities can occur first. Where a i =(f i1 , t i2 , b i3 ), a j =(f j1 , t j2 , b j3 ), if f i1 ≠f j1 , then t i2 <t j2 .

定义程序活动aj同步依赖于程序活动ai,如果aj的开始或结束由ai的执行决定,一般地,而ai往往是一些线程同步操作。称aj通信依赖于ai,如果aj的某项数据依赖是由ai活动产生。以面向对象编程语言Java为例,基本块的粒度为源代码的基本块的粒度。运行栈模型的每一个控制流对应于一个Java线程的执行序列。线程的状态转移共有六种状态:Define program activity a j synchronization depends on program activity a i , if the start or end of a j is determined by the execution of a i , generally, and a i is often some thread synchronization operation. It is said that the communication of a j depends on a i , if a certain data dependency of a j is generated by the activity of a i . Taking the object-oriented programming language Java as an example, the granularity of the basic block is the granularity of the basic block of the source code. Each control flow of the execution stack model corresponds to the execution sequence of a Java thread. There are six states in thread state transition:

创建:线程对象刚创建,还未开始时处于该状态;Creation: The thread object is just created and is in this state before it starts;

运行:线程处于正在运行的状态,在该状态的线程可能等待一些系统资源,如CPU;Running: The thread is in the running state, and the thread in this state may wait for some system resources, such as CPU;

阻塞:线程正在等待某个监控锁(Monitor Lock),例如线程在进入synchronized关键字修改的方法或是代码块时,线程会进入阻塞状态;Blocking: The thread is waiting for a monitor lock (Monitor Lock). For example, when the thread enters the method or code block modified by the synchronized keyword, the thread will enter the blocked state;

等待/定时等待:线程正在等待,例如,当线程调用某个对象的wait方法进入等待状态。当该对象的notify方法被该用时,线程会重新进入运行状态;Waiting/timed waiting: The thread is waiting, for example, when the thread calls the wait method of an object to enter the waiting state. When the notify method of the object is used, the thread will re-enter the running state;

死亡:当一个线程的run方法执行结束后,会进入死亡状态。Death: When the execution of the run method of a thread ends, it will enter the death state.

由以上的状态转换可以发现,在某些情况下,某一个处于运行状态的线程可以唤醒另一个处于非运行状态的线程进入运行状态。本发明称线程之间的这种关系为同步依赖关系。在这些线程间唤醒发生时,对应于运行时栈模型中,处于运行状态的线程发生的活动会与非运行状态线程进入运行状态时发生的活动存在一条跨线程(跨控制流)的边。从Java的语言层面,可将这些线程依赖归纳为四类,如表1所示。From the above state transitions, it can be found that in some cases, a thread in the running state can wake up another thread in the non-running state to enter the running state. The present invention calls this relationship between threads a synchronization dependency. When these inter-thread wakeups occur, corresponding to the runtime stack model, there is a cross-thread (cross-control flow) edge between the activities of the running thread and the activities of the non-running thread entering the running state. From the Java language level, these thread dependencies can be classified into four categories, as shown in Table 1.

表1:Java语言层面的同步依赖关系分类Table 1: Classification of synchronization dependencies at the Java language level

在表1中,每个运行线程的活动都会与非运行线程的活动对应。因此,称表1中的线程间依赖关系为同步依赖关系。基于以上线程的状态转换,Java提供了多种多线程编程库,以支持,例如java.util.concurrent中提供了读写锁、可重入锁、阻塞锁和线程池等。In Table 1, the activities of each running thread correspond to the activities of non-running threads. Therefore, the inter-thread dependencies in Table 1 are called synchronization dependencies. Based on the state transitions of the above threads, Java provides a variety of multi-threaded programming libraries to support, for example, java.util.concurrent provides read-write locks, reentrant locks, blocking locks, and thread pools.

图3展示了一个线程之间通信依赖的例子——生产者-消费者模式。在该例子中,Task类表示计算任务;静态字段tasks表示待处理任务队列;postTask方法表示生成并提交任务;handleTask方法表示处理任务。如图3所示,存在两个线程:1)线程1表示生产者线程,会提交任务至待处理任务队列;2)线程2表示消费者线程,每隔一段时间便会查看待处理任务队列,并处理相应的任务。在这个例子中,生产者线程与消费者线程之间并无同步依赖关系——消费者线程每隔一段时间便自动由定时等待状态转为运行状态,但是却存在通信的依赖关系——如果生产者线程不提交任务,消费者线程的task.run方法便不会被调用。Figure 3 shows an example of communication dependencies between threads - producer-consumer pattern. In this example, the Task class represents computing tasks; the static field tasks represents the pending task queue; the postTask method represents generating and submitting tasks; the handleTask method represents processing tasks. As shown in Figure 3, there are two threads: 1) thread 1 represents the producer thread, which will submit tasks to the pending task queue; 2) thread 2 represents the consumer thread, which will check the pending task queue every once in a while, and handle the corresponding tasks. In this example, there is no synchronization dependency between the producer thread and the consumer thread - the consumer thread automatically changes from the timing waiting state to the running state every once in a while, but there is a communication dependency - if the production If the consumer thread does not submit the task, the task.run method of the consumer thread will not be called.

由上述例子可以发现,运行时栈模型中的活动关系的生成必须依赖运行时的相应数据。在经典的数据流分析中,数据流分析算法会根据控制流图的结构计算数据流方程,并迭代至稳定点。因此,应用的运行时模型除了上述的运行时栈模型外,还需一个描述内存堆区数据状态变化的运行时堆模型。It can be found from the above examples that the generation of activity relations in the runtime stack model must depend on the corresponding data at runtime. In classic data flow analysis, the data flow analysis algorithm calculates the data flow equation according to the structure of the control flow graph, and iterates to a stable point. Therefore, in addition to the above-mentioned runtime stack model, the runtime model of the application also needs a runtime heap model that describes the data state changes in the memory heap area.

运行时堆模型runtime heap model

经典的数据流图常用于需求分析阶段。软件利用数据流图由抽象到具体、逐层分解待开发的软件系统。数据流图为一个有向图,包含了两种不同类型的边和多种不同的节点用以描述数据从一个初始节点出发,进行层层计算,最后得到最终结果。在运行时,数据流图的某个节点本质上对应了一组内存数据的变化。因此,本发明的应用的行为运行时模型的堆模型重点关注内存数据的变化,而不是变化的操作。本发明所述的运行时的堆模型仅从内存数据变化角度对应用运行时的内存的堆区进行建模。The classic data flow diagram is often used in the requirements analysis phase. The software uses the data flow graph to decompose the software system to be developed layer by layer from abstraction to concrete. The data flow graph is a directed graph, which contains two different types of edges and a variety of different nodes to describe data starting from an initial node, performing layer-by-layer calculations, and finally obtaining the final result. At runtime, a node in the data flow graph essentially corresponds to a set of memory data changes. Therefore, the heap model of the behavior runtime model of the application of the present invention focuses on the change of memory data, rather than the changed operation. The runtime heap model described in the present invention only models the memory heap area when the application is running from the perspective of memory data changes.

运行时堆模型是由一组内存数据的初始值和在一段时间内发生堆的内存修改活动的集合M=<D,A,T,R>;The runtime heap model is a set M=<D, A, T, R> of a set of initial values of memory data and heap memory modification activities that occur within a period of time;

其中,D={d1,d2,…,dn},为一组内存地址的初始值,A={i1,i2,…,in},为引起内存数据变化的活动,T={t1,t2,…,tn},为时间戳。Among them, D={d 1 , d 2 ,..., d n }, is the initial value of a group of memory addresses, A={i 1 , i 2 ,..., i n }, is the activity that causes the memory data to change, T ={t 1 , t 2 , . . . , t n }, which is the time stamp.

对于不同的面向对象编程语言,它们会提供不同的应用编程接口以实现内存的动态分配和回收。例如C/C++语言,通过在标准库函数中提供malloc和free函数实现内存的分配和回收;而Java语言,通过new关键字可创建新的对象,实现内存的分配,通过自动的垃圾回收机制,实现对内存的回收。For different object-oriented programming languages, they will provide different application programming interfaces to realize the dynamic allocation and recovery of memory. For example, the C/C++ language realizes memory allocation and recovery by providing malloc and free functions in the standard library functions; while the Java language can create new objects through the new keyword to achieve memory allocation, and through the automatic garbage collection mechanism, Realize the recovery of memory.

针对本发明的技术问题,接下来对如何构造本发明实施例的上述运行时模型进行详细阐述。Aiming at the technical problem of the present invention, how to construct the above runtime model of the embodiment of the present invention will be described in detail next.

参照图4,示出了本发明一种构造终端应用行为的运行时模型的方法的步骤流程图,所述方法包括构造所述终端应用行为的运行时栈模型的步骤和构造所述终端应用行为的运行时堆模型的步骤。Referring to FIG. 4 , it shows a flow chart of steps of a method for constructing a runtime model of a terminal application behavior according to the present invention, the method including the steps of constructing a runtime stack model of the terminal application behavior and constructing the terminal application behavior The steps of the runtime heap model.

所述构造所述终端应用行为的运行时栈模型的步骤具体可以包括:The step of constructing the runtime stack model of the terminal application behavior may specifically include:

步骤S401:在所述终端应用运行时,获取所述终端应用的内存中真正执行的代码,并对所述真正执行的代码进行抽象,生成控制流图;Step S401: when the terminal application is running, obtain the actually executed code in the memory of the terminal application, and abstract the actually executed code to generate a control flow graph;

步骤S402:针对所述控制流图,将需要监测的控制流图输入至预设的行为解释器;Step S402: For the control flow graph, input the control flow graph to be monitored into a preset behavior interpreter;

步骤S403:利用所述行为解释器对所述需要监测的控制流图进行解释执行,生成所述终端应用运行时的栈活动;Step S403: using the behavior interpreter to interpret and execute the control flow graph that needs to be monitored, and generate the stack activity when the terminal application is running;

步骤S404:在所述终端应用运行时,生成所述栈活动的控制流间的依赖关系,得到所述终端应用行为的运行时栈模型;Step S404: When the terminal application is running, generate dependencies between control flows of the stack activities, and obtain a runtime stack model of the terminal application behavior;

在本发明实施例中,构建运行时栈模型包含以下三个基本要素:1)控制流图:包含了所有可能的活动和所有可能的活动关系,是程序的源代码,或是中间代码的抽象表示;2)运行时发生的一组活动,即控制流图中的一条路径,可视为栈模型中的一组节点;3)运行时发生的活动之间的关系,即栈模型的边。In the embodiment of the present invention, building the runtime stack model includes the following three basic elements: 1) Control flow graph: contains all possible activities and all possible activity relationships, and is the source code of the program, or the abstraction of the intermediate code Represents; 2) a group of activities that occur at runtime, that is, a path in the control flow graph, which can be regarded as a group of nodes in the stack model; 3) the relationship between activities that occur at runtime, that is, the edges of the stack model.

栈模型的构造需要着力解决三方面的挑战:第一,由于编译优化、运行时即时编译等技术,应用的源代码和编译生成的字节码可能与运行时内存中的代码段不同,如何保障控制流图与运行时的活动能正确映射;第二,如何生成不同粒度的活动以描述复杂的应用运行状态的改变;第三,由于现在的应用大量使用多线程编译以保证界面的响应速度,提升用户体验,如何生成运行时控制流之间的依赖关系。针对以上三个挑战,本发明实施例的第一步是通过在运行时获取内存中真正执行的代码,对当前执行的代码进行抽象,可以保证控制流图与运行时活动的准确映射。第二步提出一种行为解释器,该行为解释器以上步生成的控制流图作为输入,对其进行解释执行。第三步,解释执行过程中生成应用运行时的活动;而模型生成的最后一步,是在运行时生成控制流之间的依赖关系。The construction of the stack model needs to focus on solving three challenges: First, due to technologies such as compilation optimization and runtime just-in-time compilation, the source code of the application and the bytecode generated by compilation may be different from the code segment in the runtime memory. The control flow graph and runtime activities can be correctly mapped; second, how to generate activities of different granularities to describe the change of the complex application running state; third, because current applications use a lot of multi-threaded compilation to ensure the response speed of the interface, To improve user experience, how to generate dependencies between runtime control flows. In response to the above three challenges, the first step of the embodiment of the present invention is to abstract the currently executed code by obtaining the code actually executed in the memory at runtime, so as to ensure the accurate mapping between the control flow graph and the runtime activities. The second step proposes a behavior interpreter, which takes the control flow graph generated in the previous step as input, and interprets and executes it. The third step is to explain the activities of generating application runtime during execution; and the last step of model generation is to generate dependencies between control flows at runtime.

下面,对运行时栈模型的生成步骤作进一步概述。Below, a further overview of the steps for generating the runtime stack model is given.

一、控制流图生成。在应用的开发过程中,由于安装包发布会包含应用编译过的中间代码,因此,出于保护的目的,应用通过会使用各种混淆工具对生成的中间代码,例如安卓下的dex字节码,进行混淆。这会导致直接提供的源代码难以与应用运行时执行的活动进行映射。这些混淆过的代码会通过应用运行时环境进行加载执行。例如,在安卓应用的Dex字节码会在Android Runtime(ART)中执行。本发明采用修改应用运行时的方式获取应用的字节码。这种方式可以带来两个好处,一是无需提供与匹配的中间代码或源代码,提高了方法的实用性;二是在应用运行时生成的中间代码可保证与执行的活动的一致性,从而保障了控制流图与运行时的控制流的匹配。1. Control flow graph generation. During the development process of the application, since the installation package release contains the compiled intermediate code of the application, for the purpose of protection, the application will use various obfuscation tools to generate the intermediate code, such as the dex bytecode under Android , for confusion. This makes it difficult to map the directly provided source code with the activities performed by the application at runtime. These obfuscated codes will be loaded and executed through the application runtime environment. For example, the Dex bytecode in an Android application will be executed in the Android Runtime (ART). The present invention obtains the bytecode of the application by modifying the running time of the application. This method can bring two advantages. First, there is no need to provide matching intermediate code or source code, which improves the practicability of the method. Second, the intermediate code generated when the application is running can ensure consistency with the executed activities. Thus, the matching between the control flow graph and the control flow at runtime is ensured.

具体的,控制流图的生成:Specifically, the generation of the control flow graph:

根据指令类别得出基本块的边界,一条指令是基本块的开始当且仅当:1)它是一个方法的第一条指令或2)有某一条指令可能跳转至当前指令。而一条指令是基本块的结束,当且仅当:1)它是方法的返回,如return,throw指令;或者2)它是一条跳转指令,如if,goto,或者该指令可能抛出异常。在界定好基本块的起始和结束后,本发明的控制流图生成算法分为以下三个步骤:The boundary of the basic block is derived according to the instruction category. An instruction is the beginning of a basic block if and only if: 1) it is the first instruction of a method or 2) there is an instruction that may jump to the current instruction. And an instruction is the end of a basic block if and only if: 1) it is a method return, such as return, throw instruction; or 2) it is a jump instruction, such as if, goto, or the instruction may throw an exception . After defining the start and end of the basic block, the control flow graph generation algorithm of the present invention is divided into the following three steps:

计算所有跳转指令(包括显式跳转和异常跳转)的目标地址,将该地址的指令标记为可作为基本块开始的指令。Calculate the target address of all jump instructions (including explicit jumps and exception jumps), and mark the instruction at this address as an instruction that can be used as the start of a basic block.

初始化基本块队列为空,并由低至高遍历每一条指令,若该指令是基本块的开始,或是当前基本块为空,则新建一个基本块,将其作为当前基本块,并放到基本块队列末尾;若该指令是基本块的结束,则将其放到当前基本块内,并新建一个基本块作为当前基本块,并放到基本块队列末尾。Initialize the basic block queue to be empty, and traverse each instruction from low to high, if the instruction is the beginning of the basic block, or the current basic block is empty, create a new basic block, use it as the current basic block, and put it in the basic block The end of the block queue; if the instruction is the end of the basic block, put it into the current basic block, and create a new basic block as the current basic block, and put it at the end of the basic block queue.

遍历整个基本块队列,建立基本块的前趋和后继关系:一个基本块的最后一条指令如果是可跳转的指令,则在该基本块和跳转的目标基本块添加一条有向边;如果一个基本块不是return或goto指令,则与队列中的下一个基本块添加一条有向边。Traverse the entire basic block queue to establish the predecessor and successor relationship of the basic block: if the last instruction of a basic block is a jumpable instruction, add a directed edge between the basic block and the basic block of the jump target; if If a basic block is not a return or goto instruction, add a directed edge to the next basic block in the queue.

二、按需重分配控制流图的执行,以及由行为解释器生成应用运行时的活动。在应用运行时,每一个线程会对应一条控制流,每一条控制流可视为一组有序的活动。这组活动可视为上步生成的控制流图的一条路径。因此,本发明提出一个适用于监测程序执行的行为解释器。根据配置,将需要监测的控制流图分配至行为解释器中执行。如果将每一条指令的执行都对应于一个活动,会导致这组指令序列变得规模极大而难以处理:1、数值计算语句难以与语义对应;2、程序循环产生的大量活动会湮没真正的处理逻辑。因此,本发明将活动分为数值计算、分支控制、方法调用等,并在行为解释器中实现了提供多种粒度的活动筛选的活动筛选器,以便生成合适的栈模型。Second, on-demand redistribution of the execution of the control flow graph, and the generation of the activities of the application runtime by the behavioral interpreter. When the application is running, each thread corresponds to a control flow, and each control flow can be regarded as a set of ordered activities. This set of activities can be regarded as a path of the control flow graph generated in the previous step. Therefore, the present invention proposes a behavioral interpreter suitable for monitoring program execution. According to the configuration, the control flow graph that needs to be monitored is assigned to the behavior interpreter for execution. If the execution of each instruction corresponds to an activity, this group of instruction sequences will become extremely large and difficult to handle: 1. Numerical calculation statements are difficult to correspond to semantics; 2. A large number of activities generated by program loops will obliterate the real processing logic. Therefore, the present invention divides activities into numerical calculation, branch control, method invocation, etc., and implements an activity filter that provides activity screening of various granularities in the behavior interpreter, so as to generate a suitable stack model.

在本发明实施例的构造方法中,包括类筛选器和活动类型筛选器;其中,所述类筛选器基于包和类名正则匹配的粗粒度筛选,用于去除开发人员不关心的程序活动;所述活动类型筛选器基于活动类型的细粒度筛选,用于去除与开发者不关心的活动类型。In the construction method of the embodiment of the present invention, a class filter and an activity type filter are included; wherein, the class filter is based on a coarse-grained screening of a regular match between a package and a class name, and is used to remove program activities that developers do not care about; The activity type filter is based on fine-grained filtering of activity types, and is used to remove activity types that are not concerned by developers.

其中,所述栈活动的活动类型包括方法开始与方法结束,字段读,数组读和同步指令;基于上述活动类型,步骤S403的实现方法包括:Wherein, the activity type of the stack activity includes method start and method end, field read, array read and synchronization instruction; based on the above activity type, the implementation method of step S403 includes:

利用对所述终端应用的应用行为具有监测功能的行为解释器对所述需要监测的控制流图进行解释执行,获得所述终端应用运行时的活动;Using a behavior interpreter capable of monitoring the application behavior of the terminal application to interpret and execute the control flow graph that needs to be monitored, to obtain the running activities of the terminal application;

根据所关注的类,利用所述类筛选器对所述终端应用运行时的活动进行粗粒度筛选,生成由所述类引起的栈活动;According to the concerned class, use the class filter to perform coarse-grained screening on the activities of the terminal application during runtime, and generate stack activities caused by the class;

针对所述栈活动的活动类型,利用所述活动类型筛选器对所述栈活动进行细粒度筛选。For the activity type of the stack activity, the activity type filter is used to perform fine-grained screening on the stack activity.

本发明实施例通过灵活地指定特定的包、类及指令类型,可以生成所需的栈模型,提高了易用性。The embodiment of the present invention can generate a required stack model by flexibly specifying specific packages, classes and instruction types, thereby improving usability.

为提高构造模型的准确性,本发明把执行方法调用指令的开始和结束都视作活动并进行记录。从Java的方法调用上看,其调用显现的是一种树状的结构:对于某一个方法调用,在执行过程中可能又发生多个方法的调用。因此,为了保证生成的序列能还原成这种树状的结构,本发明以下标s表示方法调用开始的活动,下标e表示方法调用结束的活动。对于上述示例的两种程序执行情况会对应为两个不同的序列:In order to improve the accuracy of constructing the model, the invention regards the beginning and the end of executing the method calling instruction as activities and records them. From the method call of Java, its call shows a tree structure: for a certain method call, multiple method calls may occur during the execution process. Therefore, in order to ensure that the generated sequence can be restored to this tree structure, the subscript s in the present invention indicates the activity starting from the method call, and the subscript e indicates the activity ending the method call. For the two program execution situations of the above example, there will be two different sequences:

1)如果calculate都是在doInBackground中被调用,则序列为ds→cs→ce→cs→ce→de1) If calculate is called in doInBackground, the sequence is d s →c s →c e →c s →c e →d e ;

2)如果有一个calculate是自身的递归调用,则序列为ds→cs→cs→ce→ce→de2) If there is a recursive call of calculate itself, the sequence is d s →c s →c s →c e →c e →d e .

将生成的活动序列重新构造成树状结构的方法调用可采用调用树构建算法。算法的过程其实是模拟Java虚拟机执行流程的过程。算法开始时,每一个线程的活动对应一个actions对象。对于每一个线程,维护两个数据结构:1)为已执行过的子控制流队列;2)为当前控制流的执行的函数栈。按序遍历actions中的各个活动,并进行以下判断:A call tree construction algorithm may be used to reconstruct the generated sequence of activities into a tree-structured method call. The process of the algorithm is actually the process of simulating the execution flow of the Java virtual machine. At the beginning of the algorithm, the activities of each thread correspond to an actions object. For each thread, two data structures are maintained: 1) a sub-control flow queue that has been executed; 2) a function stack for the execution of the current control flow. Traverse each activity in actions in order, and make the following judgments:

如果没有当前控制流,则实例化一个,并压入函数栈。If there is no current control flow, one is instantiated and pushed onto the function stack.

如果当前活动为方法开始类型,则实例化一个新的子控制流,将新实例化的子控制流压入函数栈,并添加进当前控制流的活动队列。最后将当前控制流设置为刚实例化的子控制流。If the current activity is a method start type, instantiate a new sub-control flow, push the newly instantiated sub-control flow onto the function stack, and add it to the activity queue of the current control flow. Finally, set the current control flow to the sub-control flow that was just instantiated.

如果当前活动为方法结束类型,则进行弹栈操作。如果弹栈结束后,函数栈为空,则说明当前线程的子控制流已经执行完毕,可以添加进已执行的子控制流队列中;如果函数栈不为空,则当前的控制流设置为函数栈顶的那个子控制流。If the current activity is a method end type, pop the stack. If the function stack is empty after popping the stack, it means that the sub-control flow of the current thread has been executed and can be added to the executed sub-control flow queue; if the function stack is not empty, the current control flow is set to function The child control flow at the top of the stack.

否则将当前活动压入子控制流的活动队列中。Otherwise push the current activity onto the activity queue of the child control flow.

类似于方法调用指令,其他类型的指令都可以有指令开始执行和执行结束两种活动。因为这些指令具有原子性,即在同一线程中,指令执行的开始与结束之间具有不会发生其他活动,所以,对于这些类型的指令仅需有指令开始执行的活动即可。Similar to method call instructions, other types of instructions can have two activities: instruction start execution and execution end. Because these instructions are atomic, that is, in the same thread, there is no other activity between the beginning and the end of the instruction execution, so for these types of instructions, only the activity that the instruction starts to execute is required.

在具体实现中,运行时的活动表示实现可以有存储形式:可以是内存中的对象,也可以是持久化的二进制文件或ASIC II文件。本发明中,运行时堆模型可以巴科斯范式的形式表示。In a specific implementation, the runtime activity representation implementation can have a storage form: it can be an object in memory, or a persistent binary file or an ASIC II file. In the present invention, the runtime heap model can be expressed in the form of Backus-Naur Form.

本发明通过一个活动的序列化与反序列化的机制来实现这种伸缩性。在运行时模型生成时,将运行时模型中的活动序列存放于可配置大小的缓冲区中,当活动数量超过预设时,则将缓冲区的活动进行序列化,并持久化到本地存储中。The present invention realizes this scalability through an active serialization and deserialization mechanism. When the runtime model is generated, the activity sequence in the runtime model is stored in a buffer with a configurable size. When the number of activities exceeds the preset, the activities in the buffer are serialized and persisted to the local storage .

三、控制流之间依赖关系的生成。多线程编程已成为安卓应用开发中重要的一部分。利用多线程编程可实现用户界面的高效响应,和多项计算任务的并行加速。多线程编程中的线程同步和相互唤醒(称之为线程依赖关系),可抽象为对于栈模型中控制流之间的边。线程的依赖关系是与时间相关的一个关系:在某时刻,主线程可以向后台线程发送计算任务,此时为后台线程执行的活动依赖主线程的活动;而在下一时刻,后台线程完成计算任务后,通知主线程进行界面更新;此时,为主线程执行的活动依赖后台线程的活动。因此,本发明对这些线程间依赖关系进行分类,并对不同类型的依赖进行处理,以在运行时生成这些依赖关系。Third, the generation of dependencies between control flows. Multithreaded programming has become an important part of Android application development. Utilizing multi-thread programming can realize efficient response of user interface and parallel acceleration of multiple computing tasks. Thread synchronization and mutual awakening in multithreaded programming (called thread dependencies) can be abstracted as edges between control flows in the stack model. The thread dependency is a relationship related to time: at a certain moment, the main thread can send computing tasks to the background thread, and the activities performed for the background thread at this time depend on the activities of the main thread; and at the next moment, the background thread completes the computing tasks After that, notify the main thread to update the interface; at this time, the activities performed by the main thread depend on the activities of the background thread. Therefore, the present invention classifies these inter-thread dependencies and processes different types of dependencies to generate them at runtime.

在本发明实施例中,所述依赖关系包括同步依赖和通信依赖。线程间利用Java语言规范中提供的线程状态转移相关的方法,如Thread.join,Object.wait,Object.notify等实现多个线程之间的协同,本发明称这些线程间的依赖关系为同步依赖。而线程间利用对象,实现多线程间的协同的,本发明称这些线程间的依赖关系为通信依赖。在实际开发中,应用开发者会复用框架层提供的各种多线程编程类,以提升开发效率。框架层虽然提供良好语义的应用编程接口给应用层的类,屏蔽了实现细节,但是为了保障框架的性能和鲁棒性,往往实现较为复杂。利用这些编程框架实现的程序,在运行时线程之间既可能是同步依赖,也可能是通信依赖。In the embodiment of the present invention, the dependencies include synchronization dependencies and communication dependencies. Threads use the thread state transfer related methods provided in the Java language specification, such as Thread.join, Object.wait, Object.notify, etc. to realize the collaboration between multiple threads. The present invention calls the dependency between these threads as synchronization dependency. . However, if objects are used between threads to realize coordination among multiple threads, the present invention refers to the dependency between these threads as communication dependency. In actual development, application developers will reuse various multi-threaded programming classes provided by the framework layer to improve development efficiency. Although the framework layer provides application programming interfaces with good semantics for classes in the application layer and shields the implementation details, in order to ensure the performance and robustness of the framework, the implementation is often more complicated. Programs implemented using these programming frameworks may be either synchronization dependencies or communication dependencies between runtime threads.

以图5中的BackgroundTask.execute方法调用的开始至onPostExecute方法调用的结束为例,全局共有两个活跃的线程,它们之间相互发生了同步依赖和通信依赖。该过程的方法调用如图6所示:图中的上下两条轴分别表示前台线程和后台线程随时间变化的方法栈的情况;图中的框表示方法,其中灰色的框表示框架层的方法,白色的框表示应用层的方法,即应用开发人员实现的方法;图中的箭头表示线程间的依赖关系,其中实线箭头表示同步依赖,而虚线箭头表示通信依赖。在图6所展示的方法调用过程中,BackgroundTask.execute方法在执行过程中(活动①)会调用ThreadPoolExecutor.execute方法,进而调用后台线程对象的start方法(活动②),进一步会引起后台线程的run方法的调用(活动③)。在后台线程的run方法开始执行后,经过层层调用,最终会调用BackgroundTask.doInBackground方法(活动④),在该方法的执行过程,除了会调用calculate方法进行计算任务外,还会调用AsyncTask.publishProgress方法(活动⑤),以使前台线程调用onProgressUpdated方法(活动⑥)对界面进行更新。随后,后台线程结束计算任务后,会再次用类似的方式通知前台进程当前任务已结束。其中,活动②与活动③之间为同步依赖,而活动⑤与活动⑥之间为通信依赖。Take the BackgroundTask.execute method call in Figure 5 from the beginning to the end of the onPostExecute method call as an example, there are two active threads globally, and there are synchronization dependencies and communication dependencies between them. The method call of this process is shown in Figure 6: the upper and lower axes in the figure respectively represent the method stack of the foreground thread and the background thread over time; the boxes in the figure represent the methods, and the gray boxes represent the methods of the framework layer , the white box represents the method of the application layer, that is, the method implemented by the application developer; the arrows in the figure represent the dependencies between threads, where the solid arrows represent synchronization dependencies, and the dashed arrows represent communication dependencies. In the method call process shown in Figure 6, the BackgroundTask.execute method will call the ThreadPoolExecutor.execute method during execution (activity ①), and then call the start method of the background thread object (activity ②), which will further cause the background thread to run Method call (activity ③). After the run method of the background thread starts to execute, after layers of calls, the BackgroundTask.doInBackground method (activity ④) will eventually be called. During the execution of this method, in addition to calling the calculate method to perform calculation tasks, AsyncTask.publishProgress will also be called method (activity ⑤), so that the foreground thread calls the onProgressUpdated method (activity ⑥) to update the interface. Afterwards, after the background thread finishes the calculation task, it will notify the foreground process again that the current task has ended in a similar way. Among them, activity ② and activity ③ are synchronization dependencies, and activities ⑤ and activities ⑥ are communication dependencies.

同步依赖的生成:Generation of synchronous dependencies:

为了实现同步依赖的生成,Java中与同步依赖相关的方法都会被认为是需要收集的活动。这样运行时栈模型就能收集到如表1的各种与同步依赖相关的活动。In order to realize the generation of synchronous dependencies, methods related to synchronous dependencies in Java will be considered as activities that need to be collected. In this way, the runtime stack model can collect various activities related to synchronization dependencies as shown in Table 1.

对于存在同步依赖的两个活动,后发生的活动可能是一个方法的结束或是一个方法的开始,据此,可将同步依赖分为两种,并进行分别处理:For two activities that have synchronization dependencies, the activity that occurs later may be the end of a method or the start of a method. Accordingly, synchronization dependencies can be divided into two types and processed separately:

对于一个方法的结束依赖另一个方法的结束的情况,利用时间戳由后往前寻找其他线程中可匹配的活动,如果找到,则对应于一个同步依赖关系。例如Thread.join的结束依赖Thread.run的结束;Object.wait的结束依赖Object.notify的结束,对于如Thread.join或Object.wait的方法结束活动,就可利用时间戳由后往前寻找其他线程中可匹配的活动。如果找到了,则对应于一个同步依赖关系。For the case where the end of one method depends on the end of another method, use the timestamp to find matching activities in other threads from the back to the front. If found, it corresponds to a synchronization dependency. For example, the end of Thread.join depends on the end of Thread.run; the end of Object.wait depends on the end of Object.notify. For methods such as Thread.join or Object.wait to end activities, you can use the timestamp to find other Matchable activities in the thread. If found, it corresponds to a sync dependency.

在生成控制流间的同步依赖关系时,对于一个方法的结束依赖另一个方法的结束的情况,利用时间戳由后往前寻找其他线程中可匹配的活动,如果找到,则对应于一个同步依赖关系;例如Thread.join的结束依赖Thread.run的结束;Object.wait的结束依赖Object.notify的结束,对于如Thread.join或Object.wait的方法结束活动,就可利用时间戳由后往前寻找其他线程中可匹配的活动。如果找到了,则对应于一个同步依赖关系。When generating synchronization dependencies between control flows, when the end of one method depends on the end of another method, use the timestamp to find matching activities in other threads from the back to the front. If found, it corresponds to a synchronization dependency. Relationship; for example, the end of Thread.join depends on the end of Thread.run; the end of Object.wait depends on the end of Object.notify. For methods such as Thread.join or Object.wait to end activities, timestamps can be used from back to front Look for matching activity in other threads. If found, it corresponds to a sync dependency.

对于一个活动的开始依赖另一个活动的结束的情况,先对当前线程进行检查,如果该活动为当前线程中的第一个执行的活动,则该活动是依赖另一个线程结束活动的,否则该活动仅是正常的方法调用,并不依赖另一个线程的活动。例如Thread.run的开始依赖Thread.start的结束,由于在Java中,一个活动的开始(即某个方法的调用)可以在任意地方进行任意次数,包括Thread.run方法。因此,要判断一个Thread.run方法的调用是否依赖Thread.start方法的调用不能直接根据时间戳由后往面查找匹配,需要先对当前的Thread.run进行检查:如果该活动为当前线程中的第一个执行的活动,则它是依赖另一个线程Thread.start结束活动的,否则它仅是正常的方法调用,并不依赖另一个线程的活动。For the case where the start of an activity depends on the end of another activity, first check the current thread, if the activity is the first executed activity in the current thread, then the activity depends on another thread to end the activity, otherwise the Activities are just normal method calls and do not depend on another thread's activity. For example, the beginning of Thread.run depends on the end of Thread.start, because in Java, the beginning of an activity (that is, the call of a certain method) can be performed any number of times anywhere, including the Thread.run method. Therefore, to determine whether a Thread.run method call depends on the Thread.start method call, you cannot directly search for a match based on the timestamp from the back, you need to check the current Thread.run first: if the activity is The first activity to execute, it depends on another thread Thread.start to end the activity, otherwise it is just a normal method call and does not depend on the activity of another thread.

通信依赖的生成:Generation of communication dependencies:

线程间不基于Java提供的能实现线程状态转移的方法,实现多线程间的协同的,本发明称这些线程间的依赖关系为通信依赖。The thread is not based on the method provided by Java that can realize thread state transfer and realize the coordination among multiple threads. The present invention refers to the dependency relationship between these threads as communication dependency.

以图6中活动⑤与⑥为例,其具体实现是基于MessageQueue的next方法与enqueueMessage方法。在这过程中,如果前台线程的待处理队列中还有元素排队等待处理的话,则活动⑤引起的enqueueMessage方法,仅会将当前任务添加到该队列中,并不会显式地唤醒前台线程。但从逻辑上,可认为对于某一MessageQueue对象而言,其next方法所返回的Message对象,会被Handler以参数的形式传入dispatchMessage方法中,因此可认为MessageQueue的next方法的结束依赖MessageQueue.enqueueMessage,并且该依赖是以参数Message为匹配对象的而不是MessageQueue。Taking activities ⑤ and ⑥ in Figure 6 as an example, the specific implementation is based on the next method and enqueueMessage method of MessageQueue. During this process, if there are still elements waiting to be processed in the pending queue of the foreground thread, the enqueueMessage method triggered by activity ⑤ will only add the current task to the queue, and will not explicitly wake up the foreground thread. But logically, it can be considered that for a MessageQueue object, the Message object returned by its next method will be passed into the dispatchMessage method by the Handler as a parameter, so it can be considered that the end of the next method of the MessageQueue depends on MessageQueue.enqueueMessage , and the dependency takes the parameter Message as the matching object instead of MessageQueue.

在生成控制流间的通信依赖关系时,对所有与活动间通信依赖相关的类进行总结,并将这些类的相关方法与线程依赖相关的方法一起作为生成通信依赖的知识库。该知识库亦可支持应用的进行自定义。When generating communication dependencies between control flows, all classes related to communication dependencies between activities are summarized, and the related methods of these classes and methods related to thread dependencies are used as a knowledge base for generating communication dependencies. The knowledge base can also support the customization of the application.

参照图4,所述构造所述终端应用行为的运行时堆模型的步骤具体可以包括:Referring to FIG. 4, the step of constructing the runtime heap model of the terminal application behavior may specifically include:

步骤S405:在所述终端应用运行时,生成堆区的初始状态;Step S405: When the terminal application is running, generate an initial state of the heap;

步骤S406:生成堆操作活动,得到所述终端应用行为的运行时堆模型。Step S406: Generate a heap operation activity to obtain a runtime heap model of the terminal application behavior.

在本发明实施例中,运行时堆模型包含以下基本要素:1)一个堆区的初始状态;2)运行时发生的一组影响堆区数据的活动。本发明首先给出堆区初始状态的描述方法,并在运行时生成符合该表示的堆数据初始状态。其次,本发明给出了堆操作活动的描述方法,并在运行时构造运行时堆模型中的活动。最后,给出了堆区初始状态与堆操作活动的BNF表示。In the embodiment of the present invention, the runtime heap model includes the following basic elements: 1) the initial state of a heap; 2) a group of activities that affect the data in the heap during runtime. The invention first provides a description method of the initial state of the heap area, and generates the initial state of the heap data conforming to the representation at runtime. Secondly, the invention provides a description method for heap operation activities, and constructs the activities in the heap model during runtime. Finally, the BNF representation of the initial state of the heap area and the operation activities of the heap is given.

下面,对运行时堆模型的生成步骤作进一步概述。Below, a further overview of the steps involved in generating the runtime heap model is given.

一、是对堆区初始状态的生成。堆区的初始状态为开始时刻的堆区数据的状态。在Java虚拟机规范中,仅对堆区给出了最简单的描述:堆是运行时用于分析所有类实例和数组的区域,该区域由一个自动存储管理系统(即垃圾回收器)进行管理。堆中的对象从来都不会显式地回收,而是会由垃圾回收器自动回收。堆区的初始状态可视为某一时刻堆区数据的快照,因此,如果在生成内存堆区数据状态时,如果有别的线程继续执行,并进行堆区操作(例如创建对象、执行垃圾回收等),就会破坏初始状态的原子性。因此,本发明首先给出了一种描述堆区初始状态的BNF表示,并在生成堆区数据初始状态时采用“冻结”堆区数据的方式,保证了初始状态生成过程的原子性。One is to generate the initial state of the heap area. The initial state of the heap is the state of the heap data at the beginning. In the Java virtual machine specification, only the simplest description is given for the heap area: the heap is the area used to analyze all class instances and arrays at runtime, and this area is managed by an automatic storage management system (ie, garbage collector) . Objects in the heap are never explicitly reclaimed, but are automatically reclaimed by the garbage collector. The initial state of the heap area can be regarded as a snapshot of the heap area data at a certain moment. Therefore, if other threads continue to execute and perform heap area operations (such as creating objects, performing garbage collection, etc.) etc.), it will destroy the atomicity of the initial state. Therefore, the present invention first provides a BNF representation for describing the initial state of the heap area, and adopts the method of "freezing" the heap area data when generating the initial state of the heap area data, which ensures the atomicity of the initial state generation process.

二、堆模型中的活动的生成。在应用运行时,Java的垃圾回收器可以产生回收内存的活动。除了这些活动外,其他活动可视为运行时栈模型中的活动的一个子集。一方面,如果将每一个会影响堆区的数据的操作都对应于一个活动,会导致这组活动数量变得极大而难以处理。例如某应用中存在大文件的I/O操作,如果全部操作都以活动的形式记录下来,则活动的数据量将不小于大文件的数据量;另一方面,类似于控制流模型,可能仅关注部分类、方法的执行,生成过大的堆模型反而难以分析。这里扩展了活动的描述使其支持描述垃圾回收活动,类似于运行时栈模型,提供多种粒度的活动选择筛选选项,以便生成合适的堆模型。生成的堆模型详实地描述了所关心的对象的变化情况,因此,利用基于时间戳的堆对象状态查询算法可以查询堆对象在任意时刻的状态。Second, the generation of activities in the heap model. While the application is running, Java's garbage collector can generate activities to reclaim memory. In addition to these activities, other activities can be considered as a subset of the activities in the runtime stack model. On the one hand, if each operation that affects the data in the heap corresponds to an activity, the number of activities in this group will become extremely large and unmanageable. For example, if there are large file I/O operations in an application, if all operations are recorded in the form of activities, the data volume of activities will not be less than the data volume of large files; on the other hand, similar to the control flow model, it may only Pay attention to the execution of some classes and methods, and generate an excessively large heap model, which is difficult to analyze. The description of activities is extended here to support the description of garbage collection activities, which is similar to the runtime stack model and provides multiple granularity of activity selection and filtering options in order to generate a suitable heap model. The generated heap model describes the changes of the concerned objects in detail. Therefore, the state of the heap objects at any time can be queried by using the time stamp-based heap object state query algorithm.

接下来,采用一个具体的例子对运行时堆模型建模过程进行介绍:Next, a specific example is used to introduce the modeling process of the runtime heap model:

Java堆区的数据仅包括实例化的对象和数组。对于应用而言,对象的创建可能发生在应用层代码或框架层代码,因此,我们将应用中的对象分为应用层和框架层,以图5实现的代码为例,在触发点击事件前,堆区的对象如图7(a)所示。图中每一个圆表示一个对象,而圆与圆之间的连线表示引用关系。图7(a)中与应用业务逻辑相关对象有展示界面FloatActivity、可触发后台计算任务的按钮Button、待处理的后台任务BackgroundTask、用于展示任务计算结果的TextView和点击事件监听对象OnClickListener。除了与业务逻辑相关的对象,还有许多框架层的对象。例如,框架层一个MessageQueue对象。实现后台处理任务可以通知前台进行更新。图7中的实线箭头表示对象的引用关系,即对象A如果有某个字段指向另一个对象B,则由A至B有一条有向连接;虚线箭头表示在整个事件的处理过程中,存在过对象的引用关系,而在结束时,已没有引用关系了。The data in the Java heap includes only instantiated objects and arrays. For the application, the object creation may occur in the application layer code or the framework layer code. Therefore, we divide the objects in the application into the application layer and the framework layer. Taking the code implemented in Figure 5 as an example, before triggering the click event, The objects in the heap area are shown in Figure 7(a). Each circle in the figure represents an object, and the lines between the circles represent the reference relationship. The objects related to the application business logic in Figure 7(a) include the display interface FloatActivity, the button Button that can trigger the background calculation task, the pending background task BackgroundTask, the TextView used to display the task calculation result, and the click event listening object OnClickListener. In addition to objects related to business logic, there are many framework layer objects. For example, a MessageQueue object in the framework layer. Implementing background processing tasks can notify the foreground to update. The solid arrow in Figure 7 indicates the reference relationship of the object, that is, if an object A has a field pointing to another object B, there is a directed connection from A to B; the dotted arrow indicates that during the entire event processing, there is There is no reference relationship between objects, and at the end, there is no reference relationship.

在事件触发过程中,会创建如下对象:在BackgroundTask.execute方法在执行过程中,会调用ThreadPoolExecutor.execute方法,此时,由于是该对象首次执行execute方法,该对象会新建一个Thread对象①,为后台执行的线程;在调用后台线程对象的start方法之后,进一步会引起后台线程的run方法的调用并正式开始调用doInBackground方法。在该方法中,除了会调用calculate方法进行计算任务外,还会调用AsyncTask.publishProgress方法,在执行该方法前,传入的参数会被封装到新创建的Integer[]对象中②;而在方法执行过程中,则是会新创建一个Message对象③,并放至全局的一个MessageQueue队列中。当前台线程接收到该Message,并执行onProgressPublished时,会创建一个StringBuilder对象④,以构造setText所需的参数,并通过StringBuilder.toString方法实例化出一个新的String对象⑤。而当doInBackground方法执行结束前,又会创建一个新的StringBuilder对象⑥,并且计算出返回值String对象⑦。该String对象会被再次封装到一个新创建的Message对象⑧中,并通知前台线程执行onPostExecute方法。上述过程已对部分步骤进行简化,在实际运行中会存在更多的对象创建。例如,Thread对象并不是直接依赖BackgroundTask而是会经过FutureTask,Callable等对象的层层封装,间接依赖BackgroundTask对象。在该过程结束后,对象①至⑧都可能会在某次垃圾回收中被回收。During the event triggering process, the following objects will be created: during the execution of the BackgroundTask.execute method, the ThreadPoolExecutor.execute method will be called. At this time, since this object executes the execute method for the first time, the object will create a new Thread object①, as The thread executed in the background; after calling the start method of the background thread object, it will further cause the call of the run method of the background thread and officially start calling the doInBackground method. In this method, in addition to calling the calculate method to perform calculation tasks, the AsyncTask.publishProgress method will also be called. Before executing this method, the incoming parameters will be encapsulated into the newly created Integer[] object②; and in the method During execution, a new Message object ③ will be created and placed in a global MessageQueue queue. When the foreground thread receives the Message and executes onProgressPublished, it will create a StringBuilder object ④ to construct the parameters required by setText, and instantiate a new String object ⑤ through the StringBuilder.toString method. And before the execution of the doInBackground method ends, a new StringBuilder object ⑥ will be created, and the return value String object ⑦ will be calculated. The String object will be encapsulated again into a newly created Message object ⑧, and notify the foreground thread to execute the onPostExecute method. Some steps have been simplified in the above process, and there will be more object creation in actual operation. For example, the Thread object does not directly depend on the BackgroundTask, but indirectly depends on the BackgroundTask object through layer-by-layer encapsulation of FutureTask, Callable and other objects. After the process ends, objects ① to ⑧ may all be recycled in a garbage collection.

在堆模型中,本发明将图7(b)中的每个对象的实例化、字段赋值以及回收都视为一个活动。类似运行时栈模型的表示,下面本发明优选以巴科斯范式的形式给出一种运行时堆模型的表示。In the heap model, the present invention regards the instantiation, field assignment and recycling of each object in Fig. 7(b) as an activity. Similar to the representation of the runtime stack model, the following present invention preferably presents a runtime heap model representation in the form of Backus-Naur Form.

其中,DataAction与运行时栈模型构造关键技术中所描述的ControlAction类似,ControlAction用于描述执行的指令情况,而DataAction用于描述内存数据的变化情况。Number表示数字类型,可以是数值,也可以是内存地址;String表示字符串类型。从上述表示中,可以发现,模型的复杂度主要取决于:1)初始状态中对象的数量;2)堆区数据活动的数量。Among them, DataAction is similar to ControlAction described in the key technology of runtime stack model construction. ControlAction is used to describe the executed instructions, and DataAction is used to describe the change of memory data. Number represents a numeric type, which can be a value or a memory address; String represents a character string type. From the above representation, it can be found that the complexity of the model mainly depends on: 1) the number of objects in the initial state; 2) the number of data activities in the heap area.

在安卓的具体实现中,其堆区可分为三个子区域:1)应用堆(App Heap),当前应用实例化对象和数组时使用的内存区域;2)镜像堆(Image Heap),装载了当前应用镜像的内存区域;3)孵化堆(Zygote Heap),系统启动时加载的系统类存放的内存区域。本发明所述的初始状态主要关注运行时变化最大的应用堆。对于安卓平台上的Dalvik虚拟机和ART虚拟机而言,它们实现了可以在任一时刻,将当前应用堆的状态以文件的状态保存下来(堆转储操作)。该文件是一种私有的内存镜像格式,通过安卓开发者工具可将其转换为符合J2EE平台规定的hprof格式。In the specific implementation of Android, its heap area can be divided into three sub-areas: 1) Application Heap (App Heap), the memory area used when the current application instantiates objects and arrays; 2) Image Heap (Image Heap), loaded with The memory area of the current application image; 3) Incubation heap (Zygote Heap), the memory area where the system classes loaded when the system starts are stored. The initial state described in the present invention mainly focuses on the application heap that changes the most during runtime. For the Dalvik virtual machine and the ART virtual machine on the Android platform, they realize that the state of the current application heap can be saved in the state of a file at any time (heap dump operation). This file is a private memory image format, which can be converted to the hprof format compliant with the J2EE platform through the Android developer tool.

然而直接基于现在的应用堆转储仅能反映某一时刻的堆状态,难以实现反映一段时间内的任意时刻的堆状态。首先,执行一次堆转储操作需要挂起所有线程,耗时高,生成的文件从几十兆字节至几百兆字节不等,难以通过每隔一段时间执行一次堆转储操作实现反映一段时间内的任意时刻的堆状态。其次,执行堆转储操作并不会转储那些被回收器回收的对象,包括那些在执行过程中产生的临时对象,然而对于一个执行过程而言,产生的临时对象对描述过程的执行也十分重要,例如,图7中的对象②至⑧都是临时对象,这些对象不能直接利用堆转储进行持久化。However, based on the current application heap dump, it can only reflect the heap state at a certain moment, and it is difficult to reflect the heap state at any time within a period of time. First of all, performing a heap dump operation needs to suspend all threads, which takes a long time, and the generated files range from tens of megabytes to hundreds of megabytes, so it is difficult to reflect by performing a heap dump operation every once in a while. The state of the heap at any moment in a period of time. Secondly, performing a heap dump operation will not dump those objects recovered by the collector, including those temporary objects generated during execution. However, for an execution process, the generated temporary objects are also very important to the execution of the description process. Important, for example, objects ② to ⑧ in Figure 7 are all temporary objects, and these objects cannot be directly persisted by using the heap dump.

在本发明实施例中,所述构造所述终端应用行为的运行时堆模型的步骤包括:所述终端应用运行时的活动包括实例化活动、修改活动和回收活动。In the embodiment of the present invention, the step of constructing the runtime heap model of the terminal application behavior includes: the runtime activities of the terminal application include instantiation activities, modification activities and recycling activities.

其中,实例化活动(NewAction),即创建新的对象、新的数组的活动,可对应于字节码中的newInstance、newArray等指令在运行时的执行。Wherein, the instantiation activity (NewAction), that is, the activity of creating a new object or a new array, may correspond to the execution of instructions such as newInstance and newArray in the bytecode at runtime.

修改活动(ModifiyAction),即修改类的静态字段、对象的字段、数组的元素的值的活动,可对应于字节码中的sput,iput,aput等指令。The modification activity (ModifiyAction), that is, the activity of modifying the value of the static field of the class, the field of the object, and the element of the array, can correspond to sput, iput, aput and other instructions in the bytecode.

回收活动(GCAction),即执行垃圾回收时,对堆中的对象产生影响的活动。针对回收活动,垃圾回收机制是一种自动的内存管理机制。当一片内存中的数据不再会被用到时,会予以回收释放,以便于进行下次的分配。具体的垃圾回收算法实现有引用计数法、可达性分析算法等。Recycling activity (GCAction), that is, the activity that affects the objects in the heap when garbage collection is performed. For recycling activities, the garbage collection mechanism is an automatic memory management mechanism. When the data in a piece of memory is no longer used, it will be reclaimed and released for the next allocation. The specific implementation of garbage collection algorithm includes reference counting method, reachability analysis algorithm, etc.

回收活动在运行时是对应不到dex字节码的指令的,因为它的具体实现是在虚拟机层面。回收活动可进一步细分为清除活动和压缩活动。所谓清除活动就是清除不再需要的对象;而所谓的压缩活动则是对活跃的对象整理到连续的内存空间上,以避免分配大内存时由于碎片化导致分配失败的情况。The recycling activity does not correspond to the instructions of the dex bytecode at runtime, because its specific implementation is at the virtual machine level. Recycling activities can be further subdivided into removal activities and compaction activities. The so-called clearing activity is to clear objects that are no longer needed; the so-called compaction activity is to organize active objects into a continuous memory space, so as to avoid allocation failure due to fragmentation when allocating large memory.

除此之外,除了实现的应用层的代码会创建对象以外,框架层的代码也会创建大量对象,在某些情况下,甚至会是几倍于应用层创建的对象。需要提供一种堆模型复杂性管理的机制,以保证生成的运行时栈模型的准确性和易用性。与前述两级筛选机制类似,对于堆模型的活动生成,也有一个两级的筛选机制。基于包和类名正则匹配的粗粒度筛选和基于活动类型的细粒度筛选。In addition, in addition to the objects created by the code of the implemented application layer, the code of the framework layer will also create a large number of objects, in some cases, even several times the number of objects created by the application layer. It is necessary to provide a mechanism for managing the complexity of the heap model to ensure the accuracy and ease of use of the generated runtime stack model. Similar to the aforementioned two-level screening mechanism, there is also a two-level screening mechanism for the activity generation of the heap model. Coarse-grained filtering based on regular matching of package and class names and fine-grained filtering based on activity type.

本发明优选提供了6种堆操作活动,所述堆操作活动的活动类型包括对象实例化,数组实例化,对象字段写,数组元素写,清除活动和压缩活动;The present invention preferably provides 6 kinds of heap operation activities, and the activity types of the heap operation activities include object instantiation, array instantiation, object field writing, array element writing, clearing activity and compression activity;

所述生成堆操作活动的步骤包括:The steps of generating heap operation activities include:

根据所关注的类,利用所述类筛选器对所述终端应用运行时的活动进行粗粒度筛选,生成由所述类引起的堆操作活动;According to the concerned class, use the class filter to perform coarse-grained screening on the activities of the terminal application during runtime, and generate heap operation activities caused by the class;

针对所述堆操作活动的活动类型,利用所述活动类型筛选器对所述堆操作活动进行细粒度筛选。For the activity type of the heap operation activity, the activity type filter is used to perform fine-grained screening on the heap operation activity.

针对行为反射在控制上面临的挑战,即本发明背景技术所阐述的第二个挑战,支持指令级的行为控制,由于不是本发明的重点,在此不多赘述。The challenge faced in the control of behavior reflection, that is, the second challenge described in the background of the present invention, supports instruction-level behavior control. Since it is not the focus of the present invention, it will not be described here.

下面,采用一具体实例验证本发明实施例对终端应用行为监测的有效性。Next, a specific example is used to verify the effectiveness of the embodiment of the present invention in monitoring terminal application behavior.

针对移动互联网中广泛使用的安卓移动应用,给出行为反射框架的原型系统实现:Reflectall。Reflectall的全称为Reflection at low level interpreter,具有两重含意,一是基于底层的行为解释器实现的反射;二是它可以监测和控制指令级别的应用行为。Reflectall基于安卓操作系统开源项目。为了实现移动应用行为的监测与控制,Reflectall平台可分为行为运行时模型构造子系统、模型分析与代码生成子系统和运行子系统,实现了行为反射框架中的监测和控制。For Android mobile applications widely used in the mobile Internet, a prototype system implementation of the behavioral reflection framework is given: Reflectall. The full name of Reflectall is Reflection at low level interpreter, which has two meanings. One is the reflection based on the underlying behavior interpreter; the other is that it can monitor and control the application behavior at the instruction level. Reflectall is based on the Android OS open source project. In order to realize the monitoring and control of mobile application behavior, the Reflectall platform can be divided into the behavior runtime model construction subsystem, the model analysis and code generation subsystem and the running subsystem, realizing the monitoring and control in the behavior reflection framework.

参照图8,为Reflectall模型生成子系统架构示意图。Reflectall的行为运行时模型构造子系统实现了移动应用行为运行时模型的构造,其核心实现在系统层,由优化-反优化器、行为解释器、模型构建和接口层四个模块组成。这个四个模块实现了移动应用行为的监测与控制。Referring to Figure 8, a schematic diagram of the subsystem architecture is generated for the Reflectall model. The behavioral runtime model construction subsystem of Reflectall realizes the construction of the mobile application behavioral runtime model. Its core implementation is at the system layer, which consists of four modules: optimization-de-optimizer, behavior interpreter, model construction and interface layer. These four modules realize the monitoring and control of mobile application behavior.

其中,优化-反优化器:安卓运行时环境可加载CPU可直接执行的原生指令。因此,需要将以原生指令切换为字节码,即反优化,并通过行为解释器进行解释执行,从而实现对移动应用运行时活动的监测。移动应用由于其复杂性,难以监测移动应用执行中的所有活动,因此引入了一个两级筛选机制。优化-反优化器实现了两级筛选机制中的类筛选机制,通过优化-反优化器,可以按需地将待监测的类反优化为字节码,并进行解释执行;而对于未被监测的类,则仍然在原生执行器中执行。优化-反优化器会在如下三种情况下触发:1)收到开始监测的命令时,会根据所配置的参数,对当前已经加载的类的方法进行筛选,并进行反优化;2)收到结束监测命令时,会对当前已反优化后的类进行再优化,令其重新进入原生执行器中执行;3)类链接器加载新的类时,会对类进行类似于情况1)中的筛选和反优化过程。为了保证程序执行的正确性,反优化的过程需要和部分垃圾回收算法一样,暂时挂起所有线程的执行,待反优化执行结束后,再恢复线程的执行。通过这种局部反优化,并保持解释执行与原生执行共存的状态,可以大大减少监测的性能开销。Among them, optimization-deoptimizer: the Android runtime environment can load native instructions that can be directly executed by the CPU. Therefore, it is necessary to switch native instructions into bytecodes, that is, de-optimize, and interpret and execute them through a behavioral interpreter, so as to monitor the runtime activities of mobile applications. Due to the complexity of mobile applications, it is difficult to monitor all activities in the execution of mobile applications, so a two-level screening mechanism is introduced. The optimization-de-optimizer implements the class screening mechanism in the two-level screening mechanism. Through the optimization-de-optimizer, the class to be monitored can be de-optimized into bytecode as needed, and interpreted and executed; while for unmonitored class, it is still executed in the native executor. The optimization-de-optimizer will be triggered in the following three situations: 1) When receiving the command to start monitoring, it will filter the methods of the currently loaded class according to the configured parameters and perform de-optimization; 2) Receive When the monitoring command ends, the class that has been deoptimized will be re-optimized to re-enter the native executor for execution; 3) When the class linker loads a new class, it will perform a class similar to the case 1) screening and de-optimization process. In order to ensure the correctness of program execution, the de-optimization process needs to be the same as some garbage collection algorithms, temporarily suspend the execution of all threads, and resume the execution of threads after the de-optimization execution is completed. Through this kind of local anti-optimization and maintaining the coexistence of interpreted execution and native execution, the performance overhead of monitoring can be greatly reduced.

行为解释器:行为解释器是解释执行dex格式字节码的解释器,能在解释执行的过程中,监测当前的程序执行中发生的活动。移动应用行为运行时模型中的活动大多是由行为解释器生成。除了行为解释器所生成的活动,垃圾回收器也可以生成部分活动——垃圾回收活动。行为解释器还实现了两级筛选机制中的活动筛选机制,可根据配置的活动收集粒度产生不同种类的活动。Behavior interpreter: Behavior interpreter is an interpreter that interprets and executes bytecodes in dex format, and can monitor the activities that occur in the current program execution during the process of interpreting and executing. Most of the activities in the mobile application behavior runtime model are generated by the behavior interpreter. In addition to the activities generated by the behavior interpreter, the garbage collector can also generate some activities - garbage collection activities. The behavior interpreter also implements the activity screening mechanism in the two-level screening mechanism, which can generate different kinds of activities according to the configured activity collection granularity.

模型构建器:行为解释器与垃圾回收器所产生的活动会在模型构建器中构建。当运行时产生的活动较多时,会导致内存占用较大。因此,模型构建器实现了在线和离线的模型构建。当活动较少时,模型构建器运行于在线的模型构建模式时,当活动数量达到配置的阀值,模型构建器会将当前已产生的活动序列持久化,并以文件的形式持久化至存储。ModelBuilder: The activities generated by the Behavior Interpreter and Garbage Collector are built in ModelBuilder. When there are many activities generated at runtime, it will lead to a large memory usage. Therefore, Model Builder enables both online and offline model building. When there are few activities, when the model builder runs in the online model building mode, when the number of activities reaches the configured threshold, the model builder will persist the currently generated activity sequence and persist it to the storage in the form of a file .

接口层:对优化-反优化器、行为解释器和模型构建器所提供的功能进行封装。同时也提供反序列化活动所需的接口,例如根据地址查找对象和给定对象转换为地址等。Interface layer: Encapsulates the functions provided by optimizer-deoptimizer, behavior interpreter and model builder. At the same time, it also provides the interfaces required for deserialization activities, such as finding objects based on addresses and converting a given object to an address.

本发明实例所实现的原型中,可以生成两种移动应用行为运行时模型:1)包含了运行时数据依赖的精细化模型;2)不包含运行时数据依赖的精简模型。基于系统层的实现,在框架层,Reflectall包括一组行为反射接口,可监测不同粒度的应用的活动,生成不同粒度的应用行为运行时模型;一组远程调试连接接口,可控制应用活动监测的开始与结束。在应用层,对框架层的接口进行封装,实现了一个安卓应用,可以以Web服务的形式对外提供远程调试接口。In the prototype realized by the example of the present invention, two types of runtime models of mobile application behavior can be generated: 1) a refined model that includes runtime data dependencies; 2) a simplified model that does not include runtime data dependencies. Based on the implementation of the system layer, at the framework layer, Reflectall includes a set of behavioral reflection interfaces that can monitor the activities of applications with different granularities and generate runtime models of application behaviors with different granularities; a set of remote debugging connection interfaces that can control the monitoring of application activities start and end. In the application layer, the interface of the framework layer is encapsulated, and an Android application is implemented, which can provide a remote debugging interface in the form of a Web service.

Reflectall的分析与代码生成子系统为浏览器-服务端架构。分析与代码生成子系统实现了:Reflectall's analysis and code generation subsystem is a browser-server architecture. The analysis and code generation subsystem implements:

版本管理:利用git管理不同版本的移动应用与互操作接口。同时支持服务器端编译,并利用客户端的接口管理应用将编译好的dex字节码推送至客户端。Version management: use git to manage different versions of mobile applications and interoperable interfaces. At the same time, it supports server-side compilation, and uses the client-side interface management application to push the compiled dex bytecode to the client.

栈模型可视化:提供了一个树状的视图,并支持基于关键字的数据依赖沾染分析。Stack model visualization: Provides a tree view and supports keyword-based data dependency contamination analysis.

Reflectall的接口运行子系统在安卓开源项目的框架层上添加了一个行为反射类加载器,如图9所示,为本发明示例Reflectall的接口运行子系统的结构示意图。当应用进程启动时,会检查是否有可加载的行为反射接口字节码文件。如果存在适合当前应用的行为反射接口字节码文件,则通过行为反射类加载器将其加载进应用进程,同时,利用Binder通信机制向接口管理应用注册当前应用提供的互操作接口。接口管理应用提供了接口转发、状态检测等服务。调用者进程通过接口管理应用可实现与指定应用互操作。The interface operation subsystem of Reflectall adds a behavioral reflection class loader on the framework layer of the Android open source project, as shown in FIG. 9 , which is a schematic structural diagram of the interface operation subsystem of the example Reflectall of the present invention. When the application process starts, it will check whether there is a behavior reflection interface bytecode file that can be loaded. If there is a behavior reflection interface bytecode file suitable for the current application, load it into the application process through the behavior reflection class loader, and at the same time, use the Binder communication mechanism to register the interoperability interface provided by the current application with the interface management application. The interface management application provides services such as interface forwarding and status detection. The caller process can interoperate with the specified application through the interface management application.

具体验证时,本发明以一个包含69个开源安卓应用的开源应用集和一个包含39个闭源应用的闭源应用集对Reflectall模型生成的性能进行验证。During specific verification, the present invention verifies the performance generated by the Reflectall model with an open source application set containing 69 open source Android applications and a closed source application set containing 39 closed source applications.

移动应用行为运行时模型的构造开销与模型的活动数量正相关——应用越复杂、活动越多,生成行为运行时模型的开销也就越大。相比闭源应用,开源应用在实现的复杂程度远低于闭源应用。开源应用集的类的数量的中位数为58个,方法的数量的中位数为246个;75%的开源应用集中的应用的类的数量不大于167个;方法的数量不大于859个。而闭源应用集中的应用,应用的类的数量的中位数为14266,方法的数量的中位数是87717个,是开源应用集中对应的数值的245倍和102倍。实验所使用的硬件配置如下:1)使用Android智能手机红米2A,它的CPU是1.5GHz,内存为1GB,运行安卓操作系统版本为5.1.1。2)实验使用一台普通PC作为远程控制端控制手机进行实验,该PC的CPU为Intel酷睿i5 3427U(1.8GHz),内存为4GB,运行OSX 10.11操作系统。The construction cost of a mobile application behavioral runtime model is positively related to the number of activities in the model—the more complex the application and the more activities, the more expensive it is to generate the behavioral runtime model. Compared with closed-source applications, the implementation complexity of open-source applications is much lower than that of closed-source applications. The median number of classes in the open source application set is 58, and the median number of methods is 246; the number of classes in 75% of the open source application sets is not greater than 167; the number of methods is not greater than 859 . For the applications in the closed-source application collection, the median number of application classes is 14,266, and the median number of methods is 87,717, which are 245 times and 102 times the corresponding values in the open-source application collection. The hardware configuration used in the experiment is as follows: 1) Use the Android smart phone Redmi 2A, its CPU is 1.5GHz, the memory is 1GB, and the running Android operating system version is 5.1.1. 2) The experiment uses an ordinary PC as a remote control The terminal controls the mobile phone for experiments. The CPU of the PC is Intel Core i5 3427U (1.8GHz), the memory is 4GB, and the OSX 10.11 operating system is running.

目前,监测应用执行流程的方法除了本发明所述的实现行为解释器的方式外,还包括运行时元消息绑定和编译时字节码重构两种方式。表2给出了三种方式所支持的活动监测的粒度。Reflectall在活动监测的粒度上比基于运行时绑定的方法要更细:支持到指令级别的活动监测;同时也比基于字节码重构的方法适应范围更广:基于字节码重构的方式需要修改原应用的编译流程,难以直接在经过混淆、加固的应用上使用。At present, in addition to the method of implementing the behavior interpreter described in the present invention, the method for monitoring the execution flow of the application also includes two methods of runtime meta-message binding and compilation-time bytecode reconstruction. Table 2 shows the granularity of activity monitoring supported by the three methods. Reflectall is finer in the granularity of activity monitoring than the method based on runtime binding: it supports activity monitoring at the instruction level; it is also more adaptable than the method based on bytecode refactoring: based on bytecode refactoring This method needs to modify the compilation process of the original application, and it is difficult to directly use it on the obfuscated and hardened application.

表2:监测程序执行流程的方法比较Table 2: Comparison of methods for monitoring program execution flow

执行流程监测的粒度Granularity of Execution Process Monitoring 是否需要字节码Is bytecode required ReflectallReflect all 支持方法级别、指令级别的监测Support method-level and instruction-level monitoring 不需要unnecessary 基于运行时绑定的方法Runtime Binding Based Approach 支持方法级别的监测Supports method-level monitoring 不需要unnecessary 基于字节码重构的方法Method based on bytecode reconstruction 支持方法级别、指令级别的监测Support method-level and instruction-level monitoring 需要need

本发明在实验一对比Reflectall与基于运行时绑定的方法在监测程序执行流程方面的性能。在实验二对比Reflectall与基于字节码重构的方法在监测指令粒度的活动的性能。In Experiment 1, the present invention compares the performance of Reflectall and the method based on runtime binding in monitoring program execution flow. In Experiment 2, the performance of Reflectall and the method based on bytecode reconstruction is compared in monitoring the activity of instruction granularity.

实验一:对比运行时元消息绑定的方法Experiment 1: Comparing the methods of runtime meta-message binding

Xposed框架是一款可以在不修改APK的情况下监测和修改程序运行行为的框架服务(rovo89,2012)。类似于Reflectall,Xposed框架也是在安卓操作系统的系统层上进行修改。Xposed框架实现了元消息模型的行为反射,即Xposed在应用运行时根据配置对指定的方法绑定相应的元对象。在后续的执行中,这些绑定了元对象的方法在执行前、执行后会调用元对象中的before、after方法。本文利用Xposed框架,实现了与Reflectall类似监测程序执行的Xposed模块。本节以应用启动时间作为指标,在两台硬件配置相同的红米2A手机上,分别部署了Reflectall和基于Xposed的监测模块,安卓操作系统均为5.1.1。通过以下6个不同的实验场景,比较Reflectall与基于Xposed框架的方法在开源应用集和闭源应用集上监测应用的所有应用类执行的性能,如表3所示。在每个场景下,每个应用启动10次,实验的结果如图10所示。The Xposed framework is a framework service that can monitor and modify program running behavior without modifying the APK (rovo89, 2012). Similar to Reflectall, the Xposed framework is also modified at the system level of the Android operating system. The Xposed framework implements the behavior reflection of the meta-message model, that is, Xposed binds the corresponding meta-object to the specified method according to the configuration when the application is running. In subsequent executions, these methods bound to the meta-object will call the before and after methods in the meta-object before and after execution. This article uses the Xposed framework to implement an Xposed module that monitors program execution similar to Reflectall. In this section, the application startup time is used as an indicator. On two Redmi 2A mobile phones with the same hardware configuration, Reflectall and Xposed-based monitoring modules are respectively deployed, and the Android operating system is 5.1.1. Through the following six different experimental scenarios, compare the performance of Reflectall and the method based on the Xposed framework in monitoring the execution performance of all application classes of the application on the open source application set and the closed source application set, as shown in Table 3. In each scenario, each application is started 10 times, and the experimental results are shown in Figure 10.

表3:监测程序执行流程的方法比较Table 3: Comparison of methods for monitoring program execution flow

图10(a)为开源应用集上的实验结果。在开源应用集中,69个应用在上述6个场景下均能正常启动。图10(a)中的实线部分为部署了Reflectall的三个场景;虚线部分为部署了Xposed框架的三个场景。在不监测程序执行流程的情况下,部署Reflectall的手机(场景1)平均启动时间为392毫秒,而部署了Xposed框架的手机(场景3)的平台启动时间为449毫秒。这是由于Xposed框架的实现即便不绑定元对象,也在应用加载时会有一定的开销。而Reflectall的优化-反优化器实现了在不监测时,所有代码均在原生执行器中执行。在场景2下,Reflectall的平均启动时间为486毫秒,相比不监测情况(392毫秒),额外开销为23%;而基于Xposed框架的方法,在场景2下的平均启动时间高达2078毫秒,相比不监测情况(449毫秒),其额外开销为368%。而在生成更复杂的行为运行时模型时(场景3和场景6),Reflectall的额外开销仅为27%,而基于Xposed框架的方法高达477%。Figure 10(a) shows the experimental results on the open source application set. In the open source application collection, 69 applications can be started normally in the above six scenarios. The solid line in Figure 10(a) is the three scenarios where Reflectall is deployed; the dotted line is the three scenarios where the Xposed framework is deployed. Without monitoring the program execution process, the average startup time of the mobile phone deployed with Reflectall (Scenario 1) was 392 milliseconds, while the platform startup time of the mobile phone deployed with the Xposed framework (Scene 3) was 449 milliseconds. This is because even if the implementation of the Xposed framework does not bind the meta-object, there will be a certain overhead when the application is loaded. And Reflectall's optimization-deoptimizer realizes that all codes are executed in the native executor when it is not monitored. In scenario 2, the average startup time of Reflectall is 486 milliseconds, compared with the case of no monitoring (392 milliseconds), the additional overhead is 23%; while the method based on the Xposed framework, the average startup time in scenario 2 is as high as 2078 milliseconds, compared with Compared to the case of no monitoring (449 ms), the overhead is 368%. While generating more complex behavioral runtime models (Scenario 3 and Scenario 6), the additional overhead of Reflectall is only 27%, while the method based on the Xposed framework is as high as 477%.

图10(b)为闭源应用集上的实验结果。在不监测的场景下,相比于实现简单的开源应用,闭源应用的平均启动时间为936毫秒(Reflectall)和1010毫秒(Xposed)。而在场景2下,由于应用的复杂性,Reflectall有3个应用无响应,剩余的36个应用的平均启动时间为1601毫秒,相比不监测的场景(936毫秒)额外开销为71%。而场景4下有22个应用无响应,剩余的17个应用的平均启动时间为4593毫秒,相比不监测的场景(1010毫秒),其额外开销为354%。而在生成更复杂的行为运行时模型时(场景3和场景6),Reflectall的额外开销为98%,而基于Xposed框架的方法高达470%。Figure 10(b) shows the experimental results on the closed-source application set. In the non-monitoring scenario, compared with simple open source applications, the average startup time of closed source applications is 936 milliseconds (Reflectall) and 1010 milliseconds (Xposed). In scenario 2, due to the complexity of the application, 3 applications of Reflectall are unresponsive, and the average startup time of the remaining 36 applications is 1601 milliseconds, compared with the non-monitoring scenario (936 milliseconds), the additional overhead is 71%. In scenario 4, 22 applications are unresponsive, and the average startup time of the remaining 17 applications is 4593 milliseconds. Compared with the scenario without monitoring (1010 milliseconds), the additional overhead is 354%. While generating more complex behavioral runtime models (Scenario 3 and Scenario 6), Reflectall has an additional overhead of 98%, while the method based on the Xposed framework is as high as 470%.

Reflectall生成行为运行时模型的性能在开源应用集上和闭源应用集的差异较大,这是由于闭源应用集中的应用实现更为复杂,因此生成的模型规模也更大,可能引起多次的垃圾回收和活动持久化过程,由此带来了更多的性能开销。而基于Xposed的方法在这两个应用集中开销接近,原因在于利用Xposed进行监测时,有22个应用无响应,占整个闭源应用集的57%。Reflectall的性能开销比基于Xposed框架的方法低的一大原因是基于Xposed框架的方法的所使用的编程语言是Java,而Reflectall所实现的行为解释器编程语言为C++。当应用的执行流程较复杂时,基于Xposed框架的方法对内存的分配和回收会比Reflectall更加频繁。通过上述实验表明,本发明Reflectall的这种实现方式可以处理更复杂的应用。The performance of the behavior runtime model generated by Reflectall is quite different between the open source application set and the closed source application set. Garbage collection and activity persistence process, which brings more performance overhead. However, the cost of the Xposed-based method in these two application sets is close. The reason is that when Xposed is used for monitoring, 22 applications are unresponsive, accounting for 57% of the entire closed-source application set. One of the reasons why the performance overhead of Reflectall is lower than that of the method based on the Xposed framework is that the programming language used by the method based on the Xposed framework is Java, and the programming language of the behavior interpreter implemented by Reflectall is C++. When the execution process of the application is complex, the method based on the Xposed framework will allocate and reclaim memory more frequently than Reflectall. The above experiments show that the implementation of Reflectall in the present invention can handle more complex applications.

实验二:对比基于字节码重构的方法Experiment 2: Comparing methods based on bytecode reconstruction

很多常用的Java库都用到了字节码重构框架。字节码重构的一个很重要的使用场景就是程序分析。例如,流行的bug定位工具FindBugs在底层就使用了ASM来分析字节码并定位漏洞。另外一个常见的使用场景就是利用字节码重构生成程序的代码覆盖率报告,例如Emma(Roubtsov,2005)、JCover(JCover,2017)。Reflectall所生成的精简模型可转换为代码覆盖率报告。本实验将对比Reflectall与Emma在生成代码覆盖报告上的区别。由于基于字节码重构的方法不适合应用于仅有应用安装包的闭源应用。因此,这部分的仅针对开源应用集进行。本实现仍以应用启动时间作为指标,在两台硬件配置相同的红米2A手机上,分别部署了Reflectall和未修改的安卓系统,安卓操作系统均为5.1.1。本实验在部署了Reflectall的手机上安装原版应用;在未经修改的安卓手机上安装经过Emma插桩过的应用,在以下3个不同的实验场景,比较Reflectall与Emma生成开源应用集上生成代码覆盖报告的性能。在每个场景下,每个应用启动10次,实验的结果如图11所示。Many commonly used Java libraries use bytecode reconstruction frameworks. A very important usage scenario for bytecode reconstruction is program analysis. For example, the popular bug location tool FindBugs uses ASM at the bottom to analyze bytecode and locate vulnerabilities. Another common usage scenario is to use bytecode refactoring to generate program code coverage reports, such as Emma (Roubtsov, 2005), JCover (JCover, 2017). The compact models generated by Reflectall can be converted into code coverage reports. This experiment will compare the difference between Reflectall and Emma in generating code coverage reports. Because the method based on bytecode reconstruction is not suitable for closed-source applications that only have application installation packages. Therefore, this part is only for open source application sets. In this implementation, the application startup time is still used as an indicator. On two Redmi 2A mobile phones with the same hardware configuration, Reflectall and the unmodified Android system are respectively deployed, and the Android operating system is 5.1.1. In this experiment, the original application is installed on the mobile phone where Reflectall is deployed; the application that has been instrumented by Emma is installed on the unmodified Android mobile phone, and the code generated on the open source application set generated by Reflectall and Emma is compared in the following three different experimental scenarios Override reported performance. In each scenario, each application is started 10 times, and the experimental results are shown in Figure 11.

实验结果表明,Reflectall与Emma的应用平均启动时间接近,平均启动时间为442毫秒和455毫秒,额外开销分别为13%和16%。但是从生成的代码覆盖信息上,Reflectall要比Emma更丰富。表4给出了Reflectall在代码覆盖报告的监测粒度和部署运行的区别。Emma对块覆盖的报告可能存在不准确,并且不支持分支执行次数,而Reflectall基于行为解释器可以保证覆盖报告的准确性,同时也实现了对分支指令(如If-gt、Packed-Switch)的各个分支的执行次数的统计。另一区别是Emma需要进行一定配置,以对字节码进行重构,并且重构之后需要重打包;而Reflectall不需要这些配置,也不会改变移动应用的编译流程。因此,Reflectall比Emma这类基于字节码重构的工具更具易用性和实用性。The experimental results show that the average startup time of Reflectall and Emma's applications is close, the average startup time is 442 milliseconds and 455 milliseconds, and the overhead is 13% and 16% respectively. But from the generated code coverage information, Reflectall is richer than Emma. Table 4 shows the difference between the monitoring granularity and deployment operation of Reflectall in the code coverage report. Emma's report on block coverage may be inaccurate, and does not support the number of branch executions, while Reflectall's behavior-based interpreter can ensure the accuracy of the coverage report, and also implements branch instructions (such as If-gt, Packed-Switch) Statistics of execution times of each branch. Another difference is that Emma needs to be configured to refactor the bytecode, and needs to be repackaged after refactoring; while Reflectall does not need these configurations and will not change the compilation process of mobile applications. Therefore, Reflectall is easier to use and more practical than bytecode refactoring tools such as Emma.

表4:Reflectall与Emma的监测粒度对比Table 4: Comparison of monitoring granularity between Reflectall and Emma

对比类别Compare categories EmmaEmma ReflectallReflect all 类覆盖class coverage 支持support 支持support 方法覆盖method override 支持support 支持support 块覆盖block coverage 部分支持partial support 支持support 分支覆盖branch coverage 部分支持partial support 支持support 行覆盖line coverage 支持support 支持support 分支执行次数number of branch executions 不支持not support 支持support 指令覆盖instruction coverage 不支持not support 支持support 是否需要字节码Is bytecode required 需要need 不需要unnecessary 是否需要重打包Do you need to repackage 需要need 不需要unnecessary

综上,本发明提供的一种构造终端应用行为的运行时模型的方法,能将应用的执行可视为编程语言框架(例如解释器、虚拟机),根据应用的代码段,对内存进行读写操作。什么样的方法被执行,可对应为编程语言框架在栈上的操作;什么样的对象数据被修改,则可对应为编程语言框架对堆上的操作。实现了对终端应用应用行为的灵活、完整的监测,为后续实现对终端应用应用行为的指令级控制提供了技术保障。依据该方法所设计的计算反射引擎,可作为一个单独的运行环境使用,也可被集成到各种主流的开发平台或商业软件中,给开发者提供对应用运行时监测的基本能力。In summary, the present invention provides a method for constructing a runtime model of terminal application behavior, which can regard the execution of the application as a programming language framework (such as an interpreter, a virtual machine), and read the memory according to the code segment of the application. write operation. What kind of method is executed can correspond to the operation of the programming language framework on the stack; what kind of object data is modified can correspond to the operation of the programming language framework on the heap. It realizes the flexible and complete monitoring of the application behavior of the terminal application, and provides a technical guarantee for the subsequent realization of the command-level control of the application behavior of the terminal application. The calculation reflection engine designed according to this method can be used as a separate operating environment, and can also be integrated into various mainstream development platforms or commercial software, providing developers with the basic ability to monitor application runtime.

上述实施例仅为优选实施例,并不用以限制本发明的保护范围,在本发明的精神和原则之内所作的任何修改、等同替换和改进等,均应包含在本发明的保护范围之内。The above-mentioned embodiments are only preferred embodiments, and are not intended to limit the protection scope of the present invention. Any modifications, equivalent replacements and improvements made within the spirit and principles of the present invention should be included in the protection scope of the present invention .

Claims (10)

1.一种构造终端应用行为的运行时模型的方法,其特征在于,所述运行时模型包括运行时栈模型和运行时堆模型,所述方法包括构造所述终端应用行为的运行时栈模型的步骤和构造所述终端应用行为的运行时堆模型的步骤;1. A method for constructing a runtime model of terminal application behavior, wherein the runtime model includes a runtime stack model and a runtime heap model, and the method includes constructing a runtime stack model of the terminal application behavior and the step of constructing a runtime heap model of the terminal application behavior; 所述构造所述终端应用行为的运行时栈模型的步骤包括:The step of constructing the runtime stack model of the terminal application behavior includes: 在所述终端应用运行时,获取所述终端应用的内存中真正执行的代码,并对所述真正执行的代码进行抽象,生成控制流图;When the terminal application is running, obtain the code actually executed in the memory of the terminal application, and abstract the code actually executed to generate a control flow graph; 针对所述控制流图,将需要监测的控制流图输入至预设的行为解释器;For the control flow graph, input the control flow graph to be monitored into a preset behavior interpreter; 利用所述行为解释器对所述需要监测的控制流图进行解释执行,生成所述终端应用运行时的栈活动;Using the behavior interpreter to interpret and execute the control flow graph that needs to be monitored, and generate stack activities when the terminal application is running; 在所述终端应用运行时,生成所述栈活动的控制流间的依赖关系,得到所述终端应用行为的运行时栈模型;When the terminal application is running, generate a dependency between the control flows of the stack activities, and obtain a runtime stack model of the terminal application behavior; 所述构造所述终端应用行为的运行时堆模型的步骤包括:The step of constructing the runtime heap model of the terminal application behavior includes: 在所述终端应用运行时,生成堆区的初始状态;When the terminal application is running, generate an initial state of the heap; 生成堆操作活动,得到所述终端应用行为的运行时堆模型。A heap operation activity is generated to obtain a runtime heap model of the terminal application behavior. 2.如权利要求1所述的方法,其特征在于,所述方法包括类筛选器和活动类型筛选器;其中,所述类筛选器基于包和类名正则匹配的粗粒度筛选,用于去除开发人员不关心的程序活动;所述活动类型筛选器基于活动类型的细粒度筛选,用于去除与开发者不关心的活动类型。2. The method according to claim 1, wherein the method comprises a class filter and an activity type filter; wherein the class filter is based on a coarse-grained screening of regular matching of a package and a class name, and is used to remove Program activities that developers don't care about; the activity type filter is based on fine-grained screening of activity types, and is used to remove activity types that developers don't care about. 3.如权利要求2所述的方法,其特征在于,所述栈活动的活动类型包括方法开始与方法结束,字段读,数组读和同步指令;3. The method according to claim 2, wherein the activity type of the stack activity includes method start and method end, field read, array read and synchronization instructions; 利用所述行为解释器对所述需要监测的控制流图进行解释执行,生成所述终端应用运行时的栈活动的步骤包括:Using the behavior interpreter to interpret and execute the control flow graph that needs to be monitored, the step of generating the stack activity when the terminal application is running includes: 利用对所述终端应用的应用行为具有监测功能的行为解释器对所述需要监测的控制流图进行解释执行,获得所述终端应用运行时的活动;Using a behavior interpreter capable of monitoring the application behavior of the terminal application to interpret and execute the control flow graph that needs to be monitored, to obtain the running activities of the terminal application; 根据所关注的类,利用所述类筛选器对所述终端应用运行时的活动进行粗粒度筛选,生成由所述类引起的栈活动;According to the concerned class, use the class filter to perform coarse-grained screening on the activities of the terminal application during runtime, and generate stack activities caused by the class; 针对所述栈活动的活动类型,利用所述活动类型筛选器对所述栈活动进行细粒度筛选。For the activity type of the stack activity, the activity type filter is used to perform fine-grained screening on the stack activity. 4.如权利要求1所述的方法,其特征在于,所述构造所述终端应用行为的运行时堆模型的步骤包括:4. The method according to claim 1, wherein the step of constructing the runtime heap model of the terminal application behavior comprises: 所述终端应用运行时的活动包括实例化活动、修改活动和回收活动。The running activities of the terminal application include instantiation activities, modification activities and recycling activities. 5.如权利要求2所述的方法,其特征在于,所述堆操作活动的活动类型包括对象实例化,数组实例化,对象字段写,数组元素写,清除活动和压缩活动;5. The method according to claim 2, wherein the activity type of the heap operation activity includes object instantiation, array instantiation, object field writing, array element writing, clearing activity and compression activity; 所述生成堆操作活动的步骤包括:The steps of generating heap operation activities include: 根据所关注的类,利用所述类筛选器对所述终端应用运行时的活动进行粗粒度筛选,生成由所述类引起的堆操作活动;According to the concerned class, use the class filter to perform coarse-grained screening on the activities of the terminal application during runtime, and generate heap operation activities caused by the class; 针对所述堆操作活动的活动类型,利用所述活动类型筛选器对所述堆操作活动进行细粒度筛选。For the activity type of the heap operation activity, the activity type filter is used to perform fine-grained screening on the heap operation activity. 6.如权利要求1所述的方法,其特征在于,所述依赖关系包括同步依赖和通信依赖。6. The method according to claim 1, wherein the dependencies include synchronization dependencies and communication dependencies. 7.如权利要求6所述的方法,其特征在于,在生成控制流间的同步依赖关系时,对于一个方法的结束依赖另一个方法的结束的情况,利用时间戳由后往前寻找其他线程中可匹配的活动,如果找到,则对应于一个同步依赖关系;对于一个活动的开始依赖另一个活动的结束的情况,先对当前线程进行检查,如果该活动为当前线程中的第一个执行的活动,则该活动是依赖另一个线程结束活动的,否则该活动仅是正常的方法调用,并不依赖另一个线程的活动。7. The method according to claim 6, characterized in that when generating synchronization dependencies between control flows, when the end of one method depends on the end of another method, the time stamp is used to find other threads from the back to the front Matchable activities in , if found, correspond to a synchronization dependency; for the case where the start of one activity depends on the end of another activity, first check the current thread, if the activity is the first execution in the current thread , the activity depends on another thread to end the activity, otherwise the activity is just a normal method call and does not depend on the activity of another thread. 8.如权利要求6所述的方法,其特征在于,在生成所述栈活动的控制流间的依赖关系时,对所有与活动间通信依赖相关的类进行总结,并将所述类的相关方法与线程依赖相关的方法一起作为生成通信依赖的知识库。8. The method according to claim 6, wherein when generating the dependency between the control flows of the stack activities, all the classes related to the inter-activity communication dependencies are summarized, and the related classes of the classes are Methods together with methods related to thread dependencies serve as a knowledge base for generating communication dependencies. 9.如权利要求1所述的方法,其特征在于,在所述运行时模型生成时,将所述运行时模型中的活动序列存放于可配置大小的缓冲区中,当活动数量超过预设时,则将所述缓冲区的活动进行序列化,并持久化到本地存储中。9. The method according to claim 1, wherein when the runtime model is generated, the sequence of activities in the runtime model is stored in a buffer with a configurable size, and when the number of activities exceeds a preset , the activities of the buffer are serialized and persisted to the local storage. 10.如权利要求1所述的方法,其特征在于,所述运行时堆模型以巴科斯范式的形式表示。10. The method of claim 1, wherein the runtime heap model is represented in Backus-Naur Form.
CN201910498727.XA 2019-06-10 2019-06-10 Method for constructing runtime model of terminal application behavior Active CN110347448B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910498727.XA CN110347448B (en) 2019-06-10 2019-06-10 Method for constructing runtime model of terminal application behavior
PCT/CN2019/119272 WO2020248512A1 (en) 2019-06-10 2019-11-18 Method for constructing runtime model of terminal application behavior

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910498727.XA CN110347448B (en) 2019-06-10 2019-06-10 Method for constructing runtime model of terminal application behavior

Publications (2)

Publication Number Publication Date
CN110347448A true CN110347448A (en) 2019-10-18
CN110347448B CN110347448B (en) 2021-02-12

Family

ID=68181744

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910498727.XA Active CN110347448B (en) 2019-06-10 2019-06-10 Method for constructing runtime model of terminal application behavior

Country Status (2)

Country Link
CN (1) CN110347448B (en)
WO (1) WO2020248512A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020248512A1 (en) * 2019-06-10 2020-12-17 北京大学 Method for constructing runtime model of terminal application behavior
CN113971124A (en) * 2020-07-24 2022-01-25 腾讯科技(深圳)有限公司 Debugging method and device of sub-application, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010147010A1 (en) * 2009-06-17 2010-12-23 日本電気株式会社 Module classification analysis system, module classification analysis method, and module classification analysis program
CN103473400A (en) * 2013-08-27 2013-12-25 北京航空航天大学 Software FMEA (failure mode and effects analysis) method based on level dependency modeling
CN109189374A (en) * 2018-06-22 2019-01-11 北京大学 Object formation code generating method and system based on object reference chain
CN109189469A (en) * 2018-06-22 2019-01-11 北京大学 Android application micro services method and system based on reflection

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104636256B (en) * 2015-02-17 2017-10-24 中国农业银行股份有限公司 A kind of abnormal detection method and device of internal storage access
US10157055B2 (en) * 2016-09-29 2018-12-18 Microsoft Technology Licensing, Llc Code refactoring mechanism for asynchronous code optimization using topological sorting
CN109240666B (en) * 2018-06-22 2020-08-25 北京大学 Method and system for function call code generation based on call stack and dependency path
CN110362301B (en) * 2019-06-10 2021-04-09 北京大学 A processing method of terminal application behavior reflection
CN110347448B (en) * 2019-06-10 2021-02-12 北京大学 Method for constructing runtime model of terminal application behavior
CN110362363B (en) * 2019-06-10 2021-03-12 北京大学 A method for controlling terminal applications based on runtime model

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010147010A1 (en) * 2009-06-17 2010-12-23 日本電気株式会社 Module classification analysis system, module classification analysis method, and module classification analysis program
CN103473400A (en) * 2013-08-27 2013-12-25 北京航空航天大学 Software FMEA (failure mode and effects analysis) method based on level dependency modeling
CN109189374A (en) * 2018-06-22 2019-01-11 北京大学 Object formation code generating method and system based on object reference chain
CN109189469A (en) * 2018-06-22 2019-01-11 北京大学 Android application micro services method and system based on reflection

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020248512A1 (en) * 2019-06-10 2020-12-17 北京大学 Method for constructing runtime model of terminal application behavior
CN113971124A (en) * 2020-07-24 2022-01-25 腾讯科技(深圳)有限公司 Debugging method and device of sub-application, computer equipment and storage medium

Also Published As

Publication number Publication date
WO2020248512A1 (en) 2020-12-17
CN110347448B (en) 2021-02-12

Similar Documents

Publication Publication Date Title
CN110362363B (en) A method for controlling terminal applications based on runtime model
CN110362301B (en) A processing method of terminal application behavior reflection
Faugere et al. Marte: Also an uml profile for modeling aadl applications
Karmani et al. Actor frameworks for the JVM platform: a comparative analysis
US9600411B2 (en) System and method for determining an object&#39;s lifetime in an object oriented environment
CN103473031B (en) Collaborative concurrent type frog messaging bus, driving member composition model and component method for splitting
CN111309291A (en) Modularized embedded software architecture, customization method and customization system thereof
CN101710281B (en) Dynamic integrated system and method of development platform based on Agent
CN102955721A (en) Device and method for pressure generation for testing
CN110347448B (en) Method for constructing runtime model of terminal application behavior
Kim et al. Extracting, specifying and predicting software system properties in component based real-time embedded software development
CN115904611A (en) A Java bytecode injection method, device, electronic equipment and storage medium
Andreoli et al. Cloudsim 7g: An integrated toolkit for modeling and simulation of future generation cloud computing environments
Johnsen et al. Dynamic resource reallocation between deployment components
Simão et al. A checkpointing‐enabled and resource‐aware Java Virtual Machine for efficient and robust e‐Science applications in grid environments
Becker et al. On mapping rt-uml specifications to rt-java api: bridging the gap
Broch Johnsen et al. Validating timed models of deployment components with parametric concurrency
CN103473032B (en) Independent driving member and driving member composition model and component method for splitting can be run
Yokoyama et al. A development method of time‐triggered object‐oriented software for embedded control systems
Tasneem et al. Android memory optimization
Miller High Velocity Operating Systems Development
Zhu et al. A Domain-Specific Language for Reconfigurable, Distributed Software
Royer et al. The Tiny Java Library for Maintaining Model Provenance
Bixio et al. Adaptive Real Time IoT Stream Processing in Microservices Architecture
Standish et al. Classdesc and graphcode: support for scientific programming in C++

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant