
CN105389220B - A Method of Parallel Linear Algebra Computation in Interactive R Language Platform - Google Patents


Info

Publication number
CN105389220B
Authority
CN
China
Prior art keywords
platform
interactive
language
linear algebra
parallel linear
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510755923.2A
Other languages
Chinese (zh)
Other versions
CN105389220A (en)
Inventor
顾荣
王肇康
黄宜华
樊士庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN201510755923.2A
Publication of CN105389220A
Application granted
Publication of CN105389220B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/547 Remote procedure calls [RPC]; Web services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5022 Mechanisms to release resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/541 Client-server

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention discloses a method for performing parallel linear algebra computation from an interactive R language platform, comprising the following steps. First, two computing platforms are provided: an interactive R language platform and a parallel linear algebra computing platform, which communicate over a computer network. Second, an application programming interface (API) for parallel linear algebra computation is designed and implemented in the interactive R language platform. Finally, the distributed matrix class of that API contains a member variable of the R environment type, and, when an object of the distributed matrix class is initialized, a garbage collection response function for that member variable is registered with the garbage collector of the interactive R language platform through R's reg.finalizer function. The invention overcomes the inability of existing interactive R programming platforms to perform parallel linear algebra computation and extends the computing capability of the interactive R language platform.

Description

Method for performing parallel linear algebra computation in an interactive R language platform
Technical field
The present invention relates to parallel computing, and in particular to a method for performing parallel linear algebra computation in an interactive R language platform.
Background art
R is a programming language widely used in data science. It provides a large number of common statistical functions and lets users write their own programs to extend its functionality. R ships with a set of function libraries, which constitute the base R language platform. User-written extensions are generally added to the platform in the form of R packages. R's main programming paradigm is functional programming, and it also supports modern program design methods such as object-oriented programming.
The R language platform supports two modes of operation: batch and interactive. In interactive mode the platform gives the user an interactive command-line console: the user types instructions into the R platform, which computes according to them and responds to the user. Interactive operation lets users design and adjust at the same time and receive feedback in near real time, so that errors and omissions can be corrected promptly; this eases the development of R programs and is popular with data scientists. According to surveys by the US publisher O'Reilly and the website KDnuggets, R is widely used in the data science community. Because the R language platform was originally designed for a single-computer environment, the base platform cannot make full use of the parallel computing capability offered by multiple processors (CPUs) or multiple computers. In the big data era, with data volumes growing steadily, the base R platform is limited to the computing power of a single computer and cannot handle large-scale data analysis tasks. One major symptom of this limit is that the base R platform cannot handle large-scale linear algebra problems, such as the multiplication of large matrices. How to extend the base R platform so that it can perform large-scale linear algebra computation has become an important problem in computing, and such computation is at present handled mainly by parallel computing.
Existing parallel computing technology based on the Message Passing Interface (MPI), on the other hand, can solve large-scale linear algebra problems. ScaLAPACK, a software library built on MPI, provides a set of application programming interface (API) functions that cover most linear algebra needs. By using MPI, ScaLAPACK overcomes the limit of a single computer's computing power and can make full use of multiple computers. The existing pbdR project (http://r-pbd.org/) uses ScaLAPACK to extend the R language platform so that R users can perform parallel linear algebra computation. However, programs written against ScaLAPACK or developed with pbdR can run only in batch mode; they cannot run directly in the interactive R language platform. Because of this restriction, users of the interactive R platform cannot use ScaLAPACK or pbdR for large-scale parallel linear algebra computation. At present there is no method for performing parallel linear algebra computation in the interactive R language platform.
Summary of the invention
Object of the invention: to overcome the inability of the existing interactive R language platform to perform parallel linear algebra computation, the present invention provides a method for extending the interactive R language platform. The method lets users perform parallel linear algebra computation in the interactive R platform without having to understand how the parallel computation is implemented, and thereby extends the computing capability of interactive R.
Technical solution: to achieve the above object, the present invention adopts a computation method based on the client-server model. The method structurally separates the interactive computing platform from the parallel linear algebra computing platform, and the two platforms cooperate through computer network communication. The technical solution comprises the following steps:
(1) two computing platforms are provided, one an interactive R language platform and the other a parallel linear algebra computing platform; the two platforms communicate over a computer network and compute cooperatively;
(2) in the interactive R language platform, a distributed matrix class is defined; this class provides an application programming interface for parallel linear algebra computation that can be used interactively;
(3) the distributed matrix class contains a member variable of the R environment type;
(4) a garbage collection response function is registered for the R environment-type member variable of the distributed matrix class.
Further, the two computing platforms in step (1) are as follows. The first is the interactive R language platform: a standard interactive R platform that has loaded the extension package implemented by the present invention and interacts directly with the user. The second is the parallel linear algebra computing platform: a computing platform that implements parallel linear algebra functions on top of MPI (Message Passing Interface) and the ScaLAPACK software library. The two platforms communicate with each other over a computer network. The interactive R language platform receives interactive computation commands from the user and sends the corresponding computation instructions to the parallel linear algebra computing platform; the latter performs the actual parallel linear algebra computation and returns the result to the interactive R language platform, which presents it to the user.
Further, the "distributed matrix class" in step (2) is an R S3 class or S4 class. The class is written in R and loaded into the interactive R language platform for users to use. It provides a set of R functions for parallel linear algebra computation; users complete parallel linear algebra tasks by calling the corresponding functions on objects of the distributed matrix class.
Further, step (3) specifies that the "distributed matrix class" contains a special member variable whose type is the R environment type. The main function of this member variable is as follows: when the garbage collector of the interactive R language platform collects an object of the distributed matrix class, it invokes the response function registered for this member variable, which notifies the parallel linear algebra computing platform so that the corresponding matrix data is deleted in step.
Further, in step (4), the garbage collection response function is registered in the constructor of the distributed matrix class, using R's own reg.finalizer function.
The beneficial effects of the present invention are as follows. (1) Separating the interactive R language platform from the parallel linear algebra computing platform lets the two run at the same time in different modes of operation. Through computer network communication, the interactive R platform can call the functions provided by the parallel linear algebra computing platform without losing its interactivity. The invention thus solves the problem that parallel linear algebra computation could not be performed in the interactive R language platform. (2) By wrapping all parallel linear algebra functions in a single distributed matrix class, the invention gives the interactive R platform a user-friendly interface. Users can operate on distributed matrices just as they operate on ordinary single-machine R matrices in order to perform parallel linear algebra computation. R code written with the invention stays highly consistent with native R code, which lowers the learning cost for users. (3) The invention registers a garbage collection response function with the garbage collector through R's reg.finalizer function, which solves the problem of synchronizing garbage collection in the interactive R platform with garbage collection in the parallel linear algebra computing platform. The technical solution also avoids the need, found in similar techniques, to program in C, so that the invention can be implemented entirely in R, which reduces the implementation complexity of the whole solution. (4) Once extended with the invention, the interactive R language platform can overcome the processing limit of a single computer and carry out larger-scale linear algebra computation.
Brief description of the drawings
Fig. 1 is the overall processing flow diagram of the present invention.
Fig. 2 compares the performance of the present invention with that of the single-machine linear algebra computing system provided by the standard R language platform.
Detailed description of the embodiments
The present invention is further described below with reference to the drawings and specific embodiments. These embodiments are intended only to illustrate the invention and not to limit its scope; after reading this disclosure, modifications of equivalent form made by those skilled in the art fall within the scope defined by the appended claims.
The technical solution of the present invention consists mainly of two software modules: the interactive R language platform and the parallel linear algebra computing platform. The interactive R language platform is a standard R computing platform, namely the implementation from the R Project (https://www.r-project.org/); this software is not part of the present invention. The parallel linear algebra computing platform is a computing platform with the following features: (1) it can receive computation instructions over a computer network, parse them, and execute the corresponding parallel linear algebra tasks; (2) it can store matrix data in a distributed manner; (3) it performs parallel linear algebra computation through MPI (Message Passing Interface) and the ScaLAPACK software library (MPI and ScaLAPACK are not part of the present invention; throughout this description, "ScaLAPACK software library" means any software library compatible with the ScaLAPACK application programming interface standard); (4) it can return the computation result, or an identifier of the result, over the computer network to the sender of the instruction. Any computing platform that has all four features can be regarded as a parallel linear algebra computing platform within the meaning of the present invention.
The detailed process by which a user performs parallel linear algebra computation with the present invention is shown in Fig. 1. The interactive R language platform interacts directly with the end user: it receives the user's instructions, executes the computation, and displays the result to the user on the computer screen. When the user issues a parallel linear algebra instruction, the interactive R language platform sends it over the computer network to the parallel linear algebra computing platform, which performs the operation through the ScaLAPACK software library and returns the result over the network to the interactive R language platform; the latter finally returns the result to the user.
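The round trip described above can be sketched in a few lines. The example below is only an illustration in Python, not the patent's R implementation: a socket pair stands in for the computer network, a thread stands in for the parallel linear algebra computing platform, a plain-Python matrix product stands in for the ScaLAPACK call, and the JSON message format and all function names are assumptions. For brevity it ships the matrices themselves, whereas the method described later exchanges only remote matrix identifiers.

```python
import json
import socket
import threading


def matmul(a, b):
    """Plain-Python matrix product; a stand-in for a ScaLAPACK call."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def backend(conn):
    """Serve one instruction: read a JSON line, compute, write a JSON line back."""
    with conn, conn.makefile("rw") as stream:
        request = json.loads(stream.readline())
        result = matmul(request["a"], request["b"])
        stream.write(json.dumps({"result": result}) + "\n")
        stream.flush()


def frontend_multiply(conn, a, b):
    """Send a multiply instruction and wait for the reply (hypothetical format)."""
    with conn.makefile("rw") as stream:
        stream.write(json.dumps({"op": "multiply", "a": a, "b": b}) + "\n")
        stream.flush()
        return json.loads(stream.readline())["result"]


# A connected socket pair plays the role of the network link.
front, back = socket.socketpair()
worker = threading.Thread(target=backend, args=(back,))
worker.start()
product = frontend_multiply(front, [[1, 2], [3, 4]], [[5, 6], [7, 8]])
worker.join()
front.close()
```

After the call, `product` holds the 2 x 2 result computed on the back-end side, while the front end only formatted a request and parsed a reply. That division of labor is what allows the front end to stay interactive.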
The initialization and startup sequence of the two computing platforms comprises the following steps:
(1) the user starts an interactive R language platform on their computer;
(2) the user loads, in the interactive R language platform, the R package that implements the technical solution of the present invention, and instructs the platform to start parallel linear algebra computation;
(3) the interactive R language platform uses the job control mechanism of MPI to start the MPI job corresponding to the parallel linear algebra computing platform;
(4) once the parallel linear algebra computing platform has started, the interactive R language platform establishes a computer network link with the master compute node of the parallel linear algebra computing platform;
(5) the interactive R language platform creates an empty queue to serve as the global garbage collection queue, then notifies the user that the software of the present invention has finished loading, and enters the state of waiting for instructions;
(6) the user carries out parallel linear algebra tasks through the interactive R language platform.
In the interactive R language platform the user does not program directly against ScaLAPACK or MPI, but against the distributed matrix class provided by the present invention. The distributed matrix class is implemented as follows. First, the class is defined as an S3 class or S4 class in R. Next, it is given specific member variables: (1) a remote matrix identifier, whose R type may be integer or character string, provided that it can distinguish different remote matrices; and (2) a variable of the R environment type, which is used for garbage collection of distributed matrix objects. Finally, R's object-oriented mechanisms are used to provide the class with a number of member functions and to overload R's existing linear algebra functions so that they accept the distributed matrix class.
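The shape of such a class can be illustrated as follows. This is a minimal sketch in Python rather than R, under stated assumptions: Python's `@` operator overloading stands in for overloading R's linear algebra functions on an S3 or S4 class, `Token` stands in for the R environment member, and `FakeBackend` is a hypothetical in-process substitute for the remote platform. None of these names come from the patent.

```python
class FakeBackend:
    """Hypothetical in-process stand-in for the parallel linear algebra platform."""

    def __init__(self):
        self.store = {}
        self.next_id = 1

    def put(self, rows):
        """Keep a matrix on the back end and hand out its identifier."""
        remote_id = self.next_id
        self.next_id += 1
        self.store[remote_id] = rows
        return remote_id

    def multiply(self, left_id, right_id):
        """Multiply two stored matrices and return the identifier of the result."""
        a, b = self.store[left_id], self.store[right_id]
        rows = [[sum(x * y for x, y in zip(r, c)) for c in zip(*b)] for r in a]
        return self.put(rows)


class Token:
    """Plays the role of the R environment member: an object with its own identity."""

    def __init__(self, remote_id):
        self.remote_id = remote_id


class DistMatrix:
    def __init__(self, backend, remote_id):
        self.backend = backend
        self.remote_id = remote_id      # member (1): remote matrix identifier
        self.token = Token(remote_id)   # member (2): environment-like holder

    def __matmul__(self, other):
        # Overloaded operator: only identifiers cross the boundary.
        result_id = self.backend.multiply(self.remote_id, other.remote_id)
        return DistMatrix(self.backend, result_id)


backend = FakeBackend()
a = DistMatrix(backend, backend.put([[1, 2], [3, 4]]))
b = DistMatrix(backend, backend.put([[0, 1], [1, 0]]))
c = a @ b   # reads like ordinary matrix code; the data never leaves the back end
```

The user-facing object carries only an identifier, so it stays small however large the matrix it refers to.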
In each member function or linear algebra function of the distributed matrix class, a parallel linear algebra task is carried out in the following steps:
(1) the interactive R language platform reads, from the function call arguments, the remote matrix identifiers of the distributed matrix objects that take part in the computation;
(2) the interactive R language platform sends the computation instruction for this linear algebra operation, the remote matrix identifiers of all operands, and the global garbage collection queue to the parallel linear algebra computing platform over the computer network;
(3) the parallel linear algebra computing platform receives the computation instruction, the remote matrix identifiers of the operands, and the global garbage collection queue. It reads the garbage remote matrix identifiers held in the queue and deletes the matrix data corresponding to them from the computing platform. It then parses the computation instruction and, using the remote matrix identifiers of the operands, retrieves from memory the distributed matrix handles stored in the computing platform;
(4) the parallel linear algebra computing platform performs the parallel linear algebra computation using the distributed matrix handles retrieved in the previous step and the ScaLAPACK library;
(5) the parallel linear algebra computing platform saves the result of the computation, which is a distributed matrix, and assigns a remote matrix identifier to it;
(6) the parallel linear algebra computing platform packs the remote matrix identifier generated in the previous step into a reply instruction and sends it to the interactive R language platform over the computer network;
(7) the interactive R language platform receives the reply instruction from the parallel linear algebra computing platform, extracts the remote matrix identifier, initializes a new object of the distributed matrix class with it, and returns the object to the user. The user can use the returned object in further computations.
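The seven steps above can be traced in a small in-process simulation. The sketch below is in Python and rests on several assumptions: the dictionary-based instruction format, the `Backend` class and the `run_multiply` helper are hypothetical, a plain-Python product stands in for ScaLAPACK, and the front end empties its queue once the contents have been sent (the text does not say when the queue is emptied).

```python
class Backend:
    """Hypothetical stand-in for the parallel linear algebra computing platform."""

    def __init__(self):
        self.matrices = {}
        self.next_id = 1

    def store(self, rows):
        remote_id = self.next_id
        self.next_id += 1
        self.matrices[remote_id] = rows
        return remote_id

    def handle(self, instruction):
        # Step 3: delete the matrices named in the piggybacked garbage queue.
        for dead_id in instruction["garbage"]:
            self.matrices.pop(dead_id, None)
        # Step 3 (continued): look up the operands by remote identifier.
        left, right = (self.matrices[i] for i in instruction["operands"])
        # Step 4: compute; plain Python stands in for ScaLAPACK here.
        if instruction["op"] != "multiply":
            raise ValueError("unsupported operation: " + instruction["op"])
        result = [[sum(x * y for x, y in zip(row, col)) for col in zip(*right)]
                  for row in left]
        # Steps 5 and 6: keep the result, reply with its new identifier only.
        return {"result_id": self.store(result)}


def run_multiply(backend, left_id, right_id, garbage_queue):
    # Steps 1 and 2: gather operand identifiers and attach the garbage queue.
    instruction = {"op": "multiply",
                   "operands": [left_id, right_id],
                   "garbage": list(garbage_queue)}
    garbage_queue.clear()   # assumption: the queue is emptied after sending
    reply = backend.handle(instruction)
    # Step 7: the caller would wrap this identifier in a new matrix object.
    return reply["result_id"]


backend = Backend()
a_id = backend.store([[1, 2], [3, 4]])
b_id = backend.store([[1, 0], [0, 1]])
stale_id = backend.store([[9]])          # pretend its front-end object was collected
garbage_queue = [stale_id]
c_id = run_multiply(backend, a_id, b_id, garbage_queue)
```

Sending the garbage queue along with ordinary instructions means no separate deletion message is needed; stale matrices are removed the next time the user computes anything.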
When the distributed matrix class is defined in R, a constructor must be provided so that the interactive R language platform can correctly handle garbage collection of distributed matrix objects. The workflow of the constructor comprises the following steps:
(1) the constructor receives a remote matrix identifier as its input parameter;
(2) the constructor creates an empty distributed matrix object in memory;
(3) the constructor uses the remote matrix identifier from the input parameter to initialize the remote matrix identifier member variable of the distributed matrix object created in step (2);
(4) the constructor creates a new member variable of the R environment type for the distributed matrix object created in step (2) and stores the remote matrix identifier from the input parameter in that member variable;
(5) through R's reg.finalizer function, the R environment-type member variable from step (4) is registered with the garbage collector of the interactive R language platform, which associates the member variable with a custom garbage collection response function;
(6) the distributed matrix object created in step (2) and processed by the steps above is returned as the return value of the constructor.
Further, step (5) of the constructor workflow above refers to a custom garbage collection response function. The workflow of that function comprises the following steps:
(1) the custom garbage collection response function receives a variable of the R environment type as its input parameter;
(2) the response function reads the remote matrix identifier from the R environment-type variable and adds the identifier to the global garbage collection queue.
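The constructor and the response function together can be illustrated with an analogous mechanism. The sketch below uses Python's `weakref.finalize` as a stand-in for R's reg.finalizer; the `Env` class, the `on_collected` function and the list used as the queue are assumptions. One difference is worth stating: an R finalizer receives the environment itself and reads the identifier from it, whereas a `weakref.finalize` callback must not hold a reference to the finalized object, so the identifier is passed as a separate argument here.

```python
import gc
import weakref

garbage_queue = []   # global garbage collection queue (hypothetical name)


class Env:
    """Stand-in for an R environment: a mutable holder with its own identity."""

    def __init__(self, remote_id):
        self.remote_id = remote_id


def on_collected(remote_id):
    """Garbage collection response function: queue the id for remote deletion."""
    garbage_queue.append(remote_id)


class DistMatrix:
    def __init__(self, remote_id):
        self.remote_id = remote_id     # constructor step (3)
        self.env = Env(remote_id)      # constructor step (4)
        # Constructor step (5): register the response function for the holder.
        weakref.finalize(self.env, on_collected, remote_id)


matrix = DistMatrix(42)
del matrix      # drop the last reference to the front-end object
gc.collect()    # force a collection so the callback has certainly run
```

The callback does no network I/O; it only records the identifier. The actual deletion is deferred until the queue is next sent to the back end, which keeps the callback short and free of side effects on the connection.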
The advantage of the invention is that it lets users perform parallel linear algebra computation in the interactive R language platform; by computing in parallel on multiple computers, it lets users who handle large-scale linear algebra problems in the interactive R platform obtain faster computation than single-machine linear algebra. A prototype system was implemented on top of existing open-source software. As the technical solution requires, the prototype comprises two computing platforms: the interactive R language platform is the one provided by the R Project, and the parallel linear algebra computing platform is a prototype developed according to the technical solution, using software provided by the pbdR project (project home page http://r-pbd.org/). The software provided by the R Project and pbdR is not part of the present invention. In accordance with the technical solution, the prototype also includes a distributed matrix class that runs on the interactive R platform. Using matrix multiplication (one kind of linear algebra operation) as the benchmark, the prototype was evaluated against the single-machine linear algebra computing system provided by the existing R language platform, with computation time as the metric. In the evaluation the prototype computed in parallel on 10 computers, whereas the single-machine system, limited by its functionality, could use only one computer. The results are shown in Fig. 2, where the solid line is the prototype and the dashed line is the single-machine linear algebra computing system provided by the existing R platform. As the size of the matrices grows, the prototype takes less time than the single-machine system to complete the same task. The evaluation shows that the prototype computes large-scale matrix multiplication in less time, which demonstrates the effectiveness of the proposed method and confirms its beneficial effects.

Claims (3)

1. A method for performing parallel linear algebra computation in an interactive R language platform, comprising the following steps:
(1) providing two computing platforms, one an interactive R language platform and the other a parallel linear algebra computing platform, the two platforms communicating over a computer network;
(2) designing, in the interactive R language platform, a distributed matrix class for parallel linear algebra computation, which is provided to users as an application programming interface;
(3) including, in the distributed matrix class, a member variable of the R environment type;
(4) registering a garbage collection response function for the R environment-type member variable;
(5) the interactive R language platform is a standard R computing platform, namely the implementation from the R Project, and the parallel linear algebra computing platform performs parallel linear algebra computation through MPI and the ScaLAPACK software library;
(6) based on the client-server model, the interactive computing platform and the parallel linear algebra computing platform are structurally separated, and the two platforms compute cooperatively through computer network communication;
(7) separating the interactive R language platform from the parallel linear algebra computing platform allows the two to run at the same time in different modes of operation;
(8) in the initialization and startup sequence of the two platforms, the interactive R language platform uses the job control mechanism of MPI to start the MPI job corresponding to the parallel linear algebra computing platform; once the parallel linear algebra computing platform has started, the interactive R language platform establishes a computer network link with the master compute node of the parallel linear algebra computing platform;
(9) the interactive mode of the interactive R language platform gives the user an interactive command-line console; the user types instructions into the R platform, which computes according to them and responds to the user; interactive operation lets users design and adjust at the same time and receive feedback in near real time, so that errors and omissions can be corrected promptly, which eases the development of R programs;
(10) the interactive R language platform receives interactive computation commands from the user and sends the corresponding computation instructions to the parallel linear algebra computing platform; the latter performs the actual parallel linear algebra computation and returns the result to the interactive R language platform, which presents it to the user;
(11) by computing in parallel on multiple computers, users who handle large-scale linear algebra problems in the interactive R language platform obtain faster computation than single-machine linear algebra.
2. The method for performing parallel linear algebra computation in an interactive R language platform according to claim 1, characterized in that:
the workflow of the parallel linear algebra computing platform in step (1) is: first, matrix data is stored in a distributed manner; then computation instructions are received over the computer network and parsed; next, the corresponding parallel linear algebra operations are executed through MPI and the ScaLAPACK software library; finally, the computation result, or an identifier of the result, is returned over the computer network to the sender of the instruction;
the parallel linear algebra computing platform in step (1) saves the result of the computation, which is a distributed matrix, and assigns a remote matrix identifier to it; the platform packs the generated remote matrix identifier into a reply instruction and sends it over the computer network to the interactive R language platform.
3. The method for performing parallel linear algebra computation in an interactive R language platform according to claim 1, characterized in that:
the distributed matrix class is defined in the interactive R language platform and provides an application programming interface for parallel linear algebra computation;
the distributed matrix class is an S3 class or S4 class in R;
in the workflow of the constructor of the distributed matrix class, the constructor receives a remote matrix identifier as its input parameter, creates a new member variable of the R environment type for the newly created distributed matrix object, and stores the remote matrix identifier from the input parameter in that member variable;
the workflow of step (4) is: during initialization of an object of the distributed matrix class, R's reg.finalizer function is used to bind the R environment-type member variable of that object to a garbage collection response function and to register it with the garbage collector of the interactive R language platform;
distributed matrix class, using the reg.finalizer function of the R language, the member variables of the R language environment type in the object are combined with a garbage collection response function. Bind and register with the garbage collector of the interactive R language platform; 所述垃圾回收响应函数的工作流程为:接收一个R环境类型变量作为输入参数,从R环境类型变量中读出远程矩阵标识符,将该标识符加入到全局垃圾回收队列中;The workflow of the garbage collection response function is: receiving an R environment type variable as an input parameter, reading the remote matrix identifier from the R environment type variable, and adding the identifier to the global garbage collection queue; 交互式R语言平台将全局垃圾回收队列通过计算机网络传递给并行线性代数计算平台,并行线性代数计算平台接收相应的全局垃圾回收队列,从全局垃圾回收队列中读取出其保存的垃圾远程矩阵标识符,然后从计算平台中删除这些垃圾远程矩阵标识符对应的矩阵数据。The interactive R language platform transmits the global garbage collection queue to the parallel linear algebra computing platform through the computer network. The parallel linear algebra computing platform receives the corresponding global garbage collection queue and reads the stored garbage remote matrix identifier from the global garbage collection queue. identifiers, and then delete the matrix data corresponding to these junk remote matrix identifiers from the computing platform.
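The remote-identifier and garbage collection flow of claims 2 and 3 can be illustrated with a minimal sketch. It is not the patented implementation: it is written in Python rather than R, `weakref.finalize` stands in for R's `reg.finalizer` on an environment, and a plain in-process dictionary stands in for the network link and the MPI/ScaLAPACK platform. The names `ParallelPlatform`, `DistMatrix`, `gc_queue`, and `flush_gc_queue` are hypothetical and chosen only for illustration.

```python
import gc
import weakref


class ParallelPlatform:
    """Hypothetical in-process stand-in for the remote computing platform."""

    def __init__(self):
        self.matrices = {}   # remote matrix identifier -> matrix data
        self._next_id = 0

    def store(self, rows):
        """Keep a matrix on the platform and reply with its identifier."""
        self._next_id += 1
        self.matrices[self._next_id] = rows
        return self._next_id

    def add(self, id_a, id_b):
        """Compute a result, keep it remotely, and return only its identifier."""
        a, b = self.matrices[id_a], self.matrices[id_b]
        total = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
        return self.store(total)

    def collect(self, garbage_ids):
        """Delete the matrix data named by a received garbage collection queue."""
        for remote_id in garbage_ids:
            self.matrices.pop(remote_id, None)


gc_queue = []  # global garbage collection queue on the interactive side


class DistMatrix:
    """Client-side handle that holds only a remote matrix identifier."""

    def __init__(self, platform, remote_id):
        self.platform = platform
        self.remote_id = remote_id
        # Assumption: weakref.finalize plays the role of reg.finalizer. When
        # this handle is collected, its identifier is queued for deletion.
        weakref.finalize(self, gc_queue.append, remote_id)

    def __add__(self, other):
        result_id = self.platform.add(self.remote_id, other.remote_id)
        return DistMatrix(self.platform, result_id)


def flush_gc_queue(platform):
    """Send the queued identifiers to the platform and empty the queue."""
    platform.collect(list(gc_queue))
    gc_queue.clear()


platform = ParallelPlatform()
a = DistMatrix(platform, platform.store([[1, 2], [3, 4]]))
b = DistMatrix(platform, platform.store([[10, 20], [30, 40]]))
c = a + b            # the result stays remote; only its identifier comes back
del a, b             # the handles become garbage and their finalizers run
gc.collect()
flush_gc_queue(platform)
```

After the flush, only the matrix behind `c` remains on the stand-in platform. Queuing identifiers in the finalizer, instead of contacting the platform from inside it, keeps the finalizer cheap and lets deletions be sent in a batch, which matches the global queue described in claim 3.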

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510755923.2A CN105389220B (en) 2015-11-09 2015-11-09 A Method of Parallel Linear Algebra Computation in Interactive R Language Platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510755923.2A CN105389220B (en) 2015-11-09 2015-11-09 A Method of Parallel Linear Algebra Computation in Interactive R Language Platform

Publications (2)

Publication Number Publication Date
CN105389220A (en) 2016-03-09
CN105389220B (en) 2019-02-15

Family

ID=55421527

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN201510755923.2A Active CN105389220B (en) 2015-11-09 2015-11-09 A Method of Parallel Linear Algebra Computation in Interactive R Language Platform

Country Status (1)

Country Link
CN (1) CN105389220B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106022483B (en) * 2016-05-11 2019-06-14 星环信息科技(上海)有限公司 The method and apparatus converted between machine learning model

Citations (2)

Publication number Priority date Publication date Assignee Title
US5845120A (en) * 1995-09-19 1998-12-01 Sun Microsystems, Inc. Method and apparatus for linking compiler error messages to relevant information
CN1339743A (en) * 2000-08-23 2002-03-13 国际商业机器公司 Method and device for computer software analysis

Non-Patent Citations (2)

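Title
"EXTRINSIC procedure calls in the P_HPF parallel compilation system and support for the ScaLAPACK parallel linear algebra package" (original title: "P_HPF并行编译系统EXTRINSIC过程调用及对ScaLAPACK并行线性代数算法包的支持"); Xie Jun (谢军); 《http://xueshu.baidu.com/s?wd=paperuri%3A%285826e837b9edcac52efc79230d3f0235%29&filter=sc_long_sign&tn=SE_xueshusource_2kduw22v&sc_vurl=http%3A%2F%2Fwww.doc88.com%2Fp-2045556984539.html&ie=utf-8&sc_us=16634907125052348932》; 2014-08-02; Chapters 3-5
"Parallel compilation techniques in p-HPF supporting multi-paradigm parallel computing" (original title: "p-HPF支持多范例并行计算的并行编译技术"); Hu Changjun (胡长军) et al.; Chinese Journal of Computers (《计算机学报》); 2001-07-31; Vol. 24, No. 7; full text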

Also Published As

Publication number Publication date
CN105389220A (en) 2016-03-09

Similar Documents

Publication Publication Date Title
Li et al. Blockchain-based digital twin sharing platform for reconfigurable socialized manufacturing resource integration
Adorf et al. Simple data and workflow management with the signac framework
CN108628605A (en) Stream data processing method, device, server and medium
WO2016048732A1 (en) Cloud-based parallel computation using actor modules
CN110717268B (en) A Portable Component Unit Encapsulation Method Based on FACE Architecture
US9990216B2 (en) Providing hypercall interface for virtual machines
CN105260177A (en) SiPESC platform based Python extension module development method
Kinzer et al. A computational stack for cross-domain acceleration
Shepovalov et al. FPGA and GPU-based acceleration of ML workloads on Amazon cloud-A case study using gradient boosted decision tree library
Matthew et al. GillesPy2: a biochemical modeling framework for simulation driven biological discovery
Medeiros et al. A gpu-accelerated molecular docking workflow with kubernetes and apache airflow
Suram et al. Engineering design analysis utilizing a cloud platform
CN105389220B (en) A Method of Parallel Linear Algebra Computation in Interactive R Language Platform
US10606588B2 (en) Conversion of Boolean conditions
Wozniak et al. Interlanguage parallel scripting for distributed-memory scientific computing
Zhang et al. A low-code development framework for cloud-native edge systems
US9361266B2 (en) System and method for distributed computing
Wang et al. Transformer: A new paradigm for building data-parallel programming models
Sanderson et al. Armadillo: An efficient framework for numerical linear algebra
Diez Dolinski et al. Distributed simulation of P systems by means of map-reduce: first steps with Hadoop and P-Lingua
Chard et al. PDACS-A Portal for Data Analysis Services for Cosmological Simulations
Makarov et al. Supercomputer technologies in social sciences: Existing experience and future perspectives
CN105930262A (en) Application program user interface automated testing method and electronic device
Skjellum et al. Object‐oriented analysis and design of the Message Passing Interface
Hai Socheat Automatic and scalable cloud framework for parametric studies using scientific applications

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventors after: Gu Rong; Wang Zhaokang; Huang Yihua; Fan Shiqing

Inventors before: Huang Yihua; Wang Zhaokang; Gu Rong; Fan Shiqing

GR01 Patent grant