WO2009153658A1 - Memory allocation - Google Patents
Memory allocation
- Publication number
- WO2009153658A1 (PCT/IB2009/006024)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- memory
- processes
- pools
- use data
- size
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1041—Resource optimization
- G06F2212/1044—Space efficiency improvement
Definitions
- MPS: Memory Pool System
- The code which operates the memory manager 36 may be located in a generic library which may be accessed by programs, the kernel 34 and other parts of the operating system. When a process requires memory, the code runs within that process. Although the memory manager 36 is shown in connection with the kernel 34, it will be appreciated that the memory manager's code may be accessed directly, from the generic library, by most programs.
- The kernel 34 is generally responsible for providing memory space in said system memory 38.
- The memory manager 36 code uses this space to allocate memory dynamically to variables a process might require.
- The device 10 further comprises a plurality of user programs 32 which communicate with the kernel 34, so that the kernel 34 controls the manner in which the user programs 32 access hardware devices such as the keyboard 14 and display 16, and the manner in which memory is allocated from the system memory 38 to the user programs 32.
- The memory manager 36 allocates memory to the user programs 32 in the manner herein described.
- When memory is allocated to a program, it is allocated to objects associated with the program.
- Objects may, for example, be semaphores, process descriptors or file objects.
- The memory manager 36 operates in one of two modes: a profile mode and an optimised mode.
- The profile mode is intended for a programmer to determine the memory use characteristics of a particular program or process.
- Profile mode is used to determine a memory use profile, which is in turn used to generate tuning data. This may be done manually or automatically, as further described below.
- In optimised mode, the memory manager 36 defines a set of pre-allocated memory pools to provide memory to objects. The sizes of the memory pools, the number of these pools and the types of object for which pools are provided may all be defined in the tuning data.
- In profile mode, the memory manager 36 uses a standard, known memory allocation technique, such as a free list or a bitmap, to allocate memory to the user programs 32 or the kernel programs.
- The memory manager 36 includes a data log, and data is collected on how memory is allocated according to this process during operation of the programs. Furthermore, data for each deallocation or reallocation is also collected.
- At the start of profiling, the memory manager's data log is reset. Once the particular process has finished, the memory manager 36 stops collecting data. The data log therefore holds a set of data for the program or process being optimised.
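The profile-mode data log described above can be sketched as a simple in-memory trace of allocation events. The structure and function names below are illustrative assumptions, not the patent's actual interface; the recorded fields (operation, size, address) follow the collected-data description:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One record per allocation or deallocation event collected in profile
 * mode: the operation, the block size and the block address. */
typedef enum { OP_ALLOC, OP_FREE } log_op;

typedef struct {
    log_op op;
    size_t size;    /* size of the block allocated or freed */
    void  *address; /* address of the block */
} log_entry;

#define LOG_CAPACITY 4096

static log_entry data_log[LOG_CAPACITY];
static size_t    log_count;

/* Reset the log before profiling a particular process. */
void log_reset(void) { log_count = 0; }

/* Wrappers around a standard allocator that also record each event. */
void *profiled_alloc(size_t size) {
    void *p = malloc(size);
    if (p && log_count < LOG_CAPACITY)
        data_log[log_count++] = (log_entry){ OP_ALLOC, size, p };
    return p;
}

void profiled_free(void *p, size_t size) {
    if (p && log_count < LOG_CAPACITY)
        data_log[log_count++] = (log_entry){ OP_FREE, size, p };
    free(p);
}
```

Once the process finishes, the accumulated `data_log` can be reduced to the per-size statistics discussed below.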
- RTTI: run-time type identification
- The process of profiling a particular program will be described in connection with Figure 3.
- The process begins by running the program to collect the aforementioned data (step 401).
- The memory manager 36 is arranged to generate a histogram which shows the number of operations (i.e. allocations and deallocations) for each object type.
- The software engineer selects whether to do this by size alone, or by size and type. In this case, the software engineer chooses to profile by size alone.
- Figure 4 is a histogram which shows, for a particular time-limited process, the number of operations per object size. The number of operations is shown on the y-axis and the different sizes of objects are shown on the x-axis.
- Sort C may be objects which are 32 bytes in size, and Sort B may be objects which are 16 bytes in size.
- Sort C objects were involved in the greatest number of operations, at just under 10,000.
- Sort B objects were involved in just under 8,000 operations, and so on.
- This histogram gives the software engineer a clear idea of which object sizes will most benefit from optimisation.
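The per-size operation counts behind the Figure 4 histogram can be derived from the profiling trace with a simple tally. The following sketch is illustrative (the table layout and names are assumptions); it counts how many operations were seen for each distinct object size:

```c
#include <assert.h>
#include <stddef.h>

/* Operation counts per object size, as plotted in Figure 4. */
typedef struct {
    size_t   size; /* object size in bytes */
    unsigned ops;  /* allocations plus deallocations of that size */
} size_bucket;

#define MAX_SIZES 64

/* Tally one event of the given size into the bucket table; returns the
 * number of distinct sizes seen so far. */
size_t tally(size_bucket *table, size_t n, size_t size) {
    for (size_t i = 0; i < n; i++) {
        if (table[i].size == size) {
            table[i].ops++;
            return n;
        }
    }
    table[n] = (size_bucket){ size, 1 };
    return n + 1;
}
```

Sorting the resulting buckets by `ops` gives the ranking (Sort C, Sort B, ...) that the engineer uses to decide which sizes to optimise.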
- The software engineer may take account of the amount of RAM available and the importance of the process being optimised. Using the data provided in Figure 4, the engineer can decide how many object sizes should be optimised. If the program uses relatively little space compared to the amount of RAM available, the engineer may choose to optimise all objects and provide memory pools for all of them. Alternatively, if RAM is limited, the engineer may choose to optimise only the three most used object sizes, for example Sorts C, B and Z.
- The memory manager 36 also provides a histogram for each object size which shows operations over time.
- Figure 5 shows Sort C. The number of operations is shown on the y-axis and time is shown on the x-axis. As can be seen, Sort C object operations peak early in the process, at 1200 operations. After around 50 seconds, cell usage drops and settles at around 600 operations.
- The software engineer can use his or her judgement to determine how many blocks to allocate in optimised mode for this particular object size. For example, the engineer could allocate 1200 blocks to this object size; however, this would result in a portion of memory which is only at 50% of capacity for the last 65 seconds of the process.
- The software engineer is able to generate tuning data (step 402) for a particular process.
- The tuning data includes:
- The software engineer may use his or her judgment based on factors such as available RAM, the importance of memory optimisation for the process under consideration, etc.
- The tuning data is stored and the process can be set to run in optimised mode (step 403).
- The tuning data is set for a number of programs and processes during the development of the OS or program.
- The OS or program can then be shipped with tuning data in place.
- The device would then run in optimised mode when used by the end user.
- In optimised mode, the memory manager 36, on boot of the kernel 34 or on initiation of a particular process or program, sets aside an area of system memory 38 containing a set of memory pools, based on the tuning data (step 410).
- Each pool contains a number of pre-allocated blocks or cells of a particular size, and the pools are set up according to the tuning data.
- Pre-allocated pools are created for processes, threads, semaphores, or other types of kernel objects.
- Where type information is available, the data could be partially initialised at boot to reduce object/structure initialisation time after boot.
- The memory manager 36 is able to pre-allocate pools based on objects with a range of sizes. For example, if the engineer finds many allocations for small objects of sizes ranging from 16 to 32 bytes, then he or she could create a single pool with cells of size 32. Anything in the size range 16-32 could be allocated using this pool. This does mean some space is wasted in each cell (up to half the cell), but it extends the speed performance improvement. The final tuning data is roughly equivalent to that described above; however, instead of defining how many individual object sizes or types there are, it would define how many individual "size ranges" or types there are. In addition to the pre-allocated pools, the memory manager 36 may also implement a more traditional memory allocation technique. A backup allocator may be used for object sizes or types that are not catered for by the pools, or to deal with the situation where all pre-allocated blocks in a pool have been used up.
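Serving a range of request sizes from one pool amounts to rounding each request up to the smallest cell size that fits. A sketch of that selection logic follows; the particular cell sizes and the function name are assumptions for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative cell sizes of the pre-allocated pools. A request is
 * served from the first pool whose cells are large enough; requests
 * larger than every cell size fall through to the backup allocator. */
static const size_t cell_sizes[] = { 32, 64, 128 };
#define NUM_POOLS (sizeof cell_sizes / sizeof cell_sizes[0])

/* Return the index of the pool to use, or -1 for the backup allocator. */
int select_pool(size_t request) {
    for (size_t i = 0; i < NUM_POOLS; i++)
        if (request <= cell_sizes[i])
            return (int)i;
    return -1; /* no pool caters for this size: use the backup allocator */
}
```

With these assumed size classes, a 16-byte request lands in the 32-byte pool, wasting up to half the cell, exactly the trade-off described above.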
- Pools may be static or dynamic. In the former case, the pool has a fixed size. In the latter, the pool starts at a nominal size (defined by the tuning data) and can grow up to a maximum (also defined by the tuning data).
- On an allocation request, the memory manager 36 takes the following steps: if the request cannot be met from a matching pre-allocated pool, the backup allocator is used (step 413).
- On deallocation: 1. If the block being deallocated originated from a pre-allocated pool, the block is returned to the corresponding pool; this can be easily ascertained from the address of the deallocated block, as it will lie within the memory bounds of the corresponding pool. 2. The block is added to the pool for the corresponding object type or block size. 3. If the pool is dynamic but has reached its maximum size, the backup allocator's deallocation routine is used to deallocate the block.
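The origin test used when deallocating is a bounds check: a block came from a pool exactly when its address lies between that pool's start and end addresses. A minimal sketch, with the structure and names assumed for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* A pool occupies one contiguous region [start, end). A block belongs
 * to the pool exactly when its address lies within those bounds. */
typedef struct {
    char *start;
    char *end;
} pool_bounds;

int pool_owns(const pool_bounds *p, const void *block) {
    const char *a = block;
    return a >= p->start && a < p->end;
}

/* Route a deallocation: return the index of the owning pool so the
 * block can be returned to it, or -1 to use the backup allocator's
 * deallocation routine. */
int route_free(const pool_bounds *pools, size_t n, const void *block) {
    for (size_t i = 0; i < n; i++)
        if (pool_owns(&pools[i], block))
            return (int)i;
    return -1;
}
```

The linear scan over pools matches the remark elsewhere in the description that the mapping is not O(1), but is cheap when the number of pools is small.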
- Figure 7 illustrates an example of a memory pool 48.
- The pool 48 is first created from a single contiguous portion of memory defined between a start address 52 and an end address 56.
- This portion is then divided into a number of equally sized blocks 54.
- The blocks 54 are linked together into a singly linked list.
- Each block 54 includes a link element 58 which links 60 to the next free block in the pool.
- The first block of the list is the list head 50.
- When allocation takes place, the head 50 of the pool is removed and the object requested by the program is copied into the block.
- When a block is de-allocated and needs to return to the pool, it is inserted at the head 50 of the list. Note also that the link memory is overwritten by object data when the memory block is used by the object's constructor or initialisation routine.
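The singly linked pool of Figure 7 can be sketched as follows: the contiguous region is carved into equal cells, each free cell's first bytes hold the link to the next free cell (element 58), allocation pops the list head 50, and freeing pushes the block back as the new head. The names are illustrative, not taken from the patent:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A free block's first bytes store the link to the next free block;
 * the link is overwritten by object data once the block is in use. */
typedef struct {
    void  *head;       /* list head 50: first free block, or NULL */
    size_t block_size; /* must be at least sizeof(void *) */
} pool;

/* Carve a contiguous region into equally sized blocks and chain them
 * into a singly linked free list. */
void pool_init(pool *p, void *region, size_t region_size, size_t block_size) {
    char *base = region;
    size_t nblocks = region_size / block_size;
    p->block_size = block_size;
    p->head = NULL;
    /* Link blocks from last to first so the list starts at the base. */
    for (size_t i = nblocks; i-- > 0; ) {
        void *block = base + i * block_size;
        memcpy(block, &p->head, sizeof p->head); /* link to previous head */
        p->head = block;
    }
}

/* Allocation removes the head of the list. Returns NULL when the pool
 * is exhausted (the backup allocator would then be used). */
void *pool_alloc(pool *p) {
    void *block = p->head;
    if (block)
        memcpy(&p->head, block, sizeof p->head); /* advance to next free */
    return block;
}

/* Deallocation inserts the block back at the head of the list. */
void pool_free(pool *p, void *block) {
    memcpy(block, &p->head, sizeof p->head);
    p->head = block;
}
```

Both `pool_alloc` and `pool_free` are single pointer operations, which is the speed benefit the pre-allocated pools aim for.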
- This mapping from a block to its pool is not O(1); however, as the number of pools available would generally be small, this should not significantly affect performance.
- The tuning data would contain the following entries: a) Pool size: the size of the pool; in architectures where memory is broken up into pages, this would typically be rounded up to a multiple of the page size. b) Maximum pool size: for dynamic pools, the maximum size; for static pools, the size at boot time (i.e. equal to the pool size). c) Block size or type: the size of each cell, or the type if RTTI is available. If RTTI is available, the tuning data for each pool could also contain some basic initial data or a pointer to a function used to initialise each block when the pool is first created.
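A tuning-data entry per pool might be represented as below. The page size, structure layout and names are assumptions; items (a)-(c) follow the list above, and the rounding rule implements item (a):

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096 /* assumed page size for the target architecture */

/* One tuning-data entry per pre-allocated pool. A variant using RTTI
 * would carry a type tag or an initialiser function pointer in place
 * of (or alongside) the plain block size. */
typedef struct {
    size_t pool_size;     /* (a) initial size, rounded up to pages      */
    size_t max_pool_size; /* (b) growth limit; equals pool_size for a
                             static pool                                */
    size_t block_size;    /* (c) size of each cell                      */
} tuning_entry;

/* Round a requested pool size up to a multiple of the page size. */
size_t round_to_pages(size_t bytes) {
    return (bytes + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
}
```

A static pool would be written with `max_pool_size == pool_size`; a dynamic pool with a larger maximum.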
- Each executable using the memory manager 36 to allocate pools as described would have access to the tuning data, either by using a separate configuration file or configuration system (such as Symbian's central repository or the MS Windows registry), or simply by having some data in the executable itself.
- This last-mentioned method could be suitable for very simple embedded devices or for executables that could not access a file or configuration system at initialisation; for example, the kernel.
- This approach may, however, require a post-linker to place the tuning data into the executable image.
- The collected data may be converted to tuning data manually or automatically.
- In the manual case, an operator will view the collected data and, from this, determine the tuning data.
- The operator may repeat the step of collecting the data to ensure that all permutations of memory use are collected and that the resultant data sample is representative.
- The operator will specify tuning data to establish memory pools for each memory block size and type of object for which collected data exists. The size of the pool is equal to the maximum number of allocations in the collected data.
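Setting the pool size to the maximum number of allocations seen in the collected data can be read as tracking the peak number of simultaneously live blocks of a given size in the trace. A sketch under that assumption, with the trace format simplified to +1/-1 events:

```c
#include <assert.h>

/* Walk a trace of events for one block size (+1 for an allocation,
 * -1 for a deallocation) and return the peak number of simultaneously
 * live blocks: a candidate pool size for that block size. */
int peak_live(const int *events, int n) {
    int live = 0, peak = 0;
    for (int i = 0; i < n; i++) {
        live += events[i];
        if (live > peak)
            peak = live;
    }
    return peak;
}
```

As the surrounding text notes, the operator may then reduce this figure where the profiled conditions were atypical or the program has low priority.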
- The operator may determine that the size of the pool may be set to less than the maximum number where operating conditions for the program are unequal to those experienced during collection, or where the program is given a lower priority than other user programs.
- The tuning data will generally be associated with a particular program.
- The data could either be stored in the executable file of the program or in a separate file or configuration storage area.
- The tuning data may provide information on how to set up the memory pools when the memory manager is in optimised mode.
- The memory manager 36 may, on every run or on specific runs, collect data even when running in optimised mode. At the end of the run, the collected data is converted into tuning data used for the next run.
- Tuning data may also be referred to as memory optimisation data.
- Object category may mean object size, object size range, object type or a combination of these factors.
- Examples of the invention may ensure that memory is allocated more efficiently and effectively when compared with known memory management systems. Furthermore, examples of the invention may allow the effectiveness of different known memory management systems to be compared for a particular hardware configuration.
- Optimised mode may be a mode of operation in which memory use is optimised on the basis of memory use data which is determined at an earlier time.
- The mode of operation in which memory use data is determined may be referred to as a non-optimised mode.
- Optimised mode may also be referred to as an enhanced mode, and non-optimised mode may be referred to as a non-enhanced mode.
- Enhanced mode may refer to a mode of operation in which knowledge of memory use requirements is used to improve the operation of the device.
- Non-enhanced mode may generally refer to a mode of operation in which knowledge of memory requirements is not used to improve the operation of the device.
- In non-enhanced mode, other mechanisms for improving performance, including other unrelated memory use improvements, may be used.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System (AREA)
Abstract
A computing device comprises a memory allocation system arranged to operate in an optimised mode; wherein said device includes memory optimisation data pertaining to the memory requirements of one or more processes and said memory allocation system is arranged to allocate memory to said processes, based on said memory optimisation data; and wherein said memory requirements of the one or more processes are established by running said one or more processes in a non-optimised mode.
Description
MEMORY ALLOCATION
TECHNICAL FIELD
This invention relates generally to memory allocation.
BACKGROUND TO THE INVENTION
During the process of dynamic memory allocation, memory is allocated for use by a program during the execution of that program. Memory may be allocated for any one of a number of objects such as semaphores, process descriptors, file objects etc. A known memory management system is the Memory Pool System (MPS) described, for example, in the publication "The Memory Pool System" by Richard Brooksby and Nicholas Barnes, 2002, Ravenbrook Limited, available at http://www.ravenbrook.com/project/mps/doc/2002-01-30/ismm2002-paper/.
The MPS operates by creating memory pools from which memory can be allocated for use by a program. A number of different classes are defined, each class regulating a different pool. In this manner, the pools may be tailored to suit different objects and to various factors which influence garbage collection. Memory pools are advantageous in some circumstances as they may be arranged according to the size of memory required by a particular object. For example, a pool of pre-allocated process control blocks may be associated with a kernel. When a new process is created, the kernel can extract a process control block from this pool. When a process is deleted, the control block is simply placed back in the pool.
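The extract/return cycle described above can be sketched as a small fixed pool of pre-allocated process control blocks managed as a stack of free entries. The structure below is an illustrative assumption, not the MPS interface:

```c
#include <assert.h>
#include <stddef.h>

/* A process control block; the contents here are illustrative. */
typedef struct {
    int pid;
    int state;
} process_block;

#define POOL_SIZE 8

/* A pre-allocated pool: the blocks themselves plus a stack of
 * pointers to the currently free blocks. */
typedef struct {
    process_block  blocks[POOL_SIZE];
    process_block *free_stack[POOL_SIZE];
    size_t         nfree;
} pcb_pool;

void pcb_pool_init(pcb_pool *p) {
    for (size_t i = 0; i < POOL_SIZE; i++)
        p->free_stack[i] = &p->blocks[i];
    p->nfree = POOL_SIZE;
}

/* On process creation: extract a block from the pool (NULL if empty). */
process_block *pcb_extract(pcb_pool *p) {
    return p->nfree ? p->free_stack[--p->nfree] : NULL;
}

/* On process deletion: place the block back on the pool. */
void pcb_return(pcb_pool *p, process_block *b) {
    p->free_stack[p->nfree++] = b;
}
```

Because every block is the same size and pre-allocated, both operations are constant time and cause no fragmentation, which is the advantage the passage describes.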
SUMMARY OF THE INVENTION
According to a first example of the invention, there is provided an apparatus comprising memory use data pertaining to the memory requirements of one or more processes and a memory allocator arranged to allocate memory to said processes, based on said memory use data; and wherein said memory requirements of the one or more processes are established by running said one or more processes.
According to a second example of the invention, there is provided a method comprising allocating memory to one or more processes based on pre-stored memory use data pertaining to the memory requirements of the one or more processes, wherein said memory requirements of the one or more processes are established by running said one or more processes.
According to a third example of the invention, there is provided an apparatus comprising a memory allocation system arranged to operate in two modes; wherein, in a first mode, said memory allocation system is arranged to determine memory use data pertaining to the memory requirements of one or more processes and, in a second mode, said memory allocation system is arranged to allocate memory to said processes, based on said memory use data.
According to a fourth example of the invention, there is provided a method comprising determining memory use data pertaining to the memory requirements of one or more processes; and allocating memory to said processes based on said memory use data.
According to further examples, there is provided a memory allocation system arranged to set aside a plurality of memory pools and allocate memory to a program from one or more of said pools based on requirements of the program and characteristics of the pools where a process of setting aside said memory pools depends on a previous instantiation of said program.
According to a further example, there is provided a memory allocation system adapted to operate in two modes wherein, in a first mode, said memory allocation system collects data pertaining to the memory requirements of a program and, in a second mode, said memory allocation system allocates memory in dependence on said collected data.
Said memory allocation system may set aside memory in one or more memory pools so that memory is allocated from at least one of said memory pools.
Said memory allocation system may set aside memory in memory pools according to tuning data wherein said tuning data is derived from said collected data.
Said tuning data may specify a size or number of memory pools determined from said collected data. Said tuning data may determine the size and number of memory pools determined from said collected data. Each of said pools may have a plurality of blocks of memory, wherein the blocks of any pool are of equal size and wherein the size of the blocks of any pool is determined with reference to said collected data.
The tuning data may be stored as part of an executable file associated with the program. Alternatively, the tuning data could be stored as a resource file.
The collected data may pertain to a manner in which memory is allocated and deallocated to an object related to the program. The collected data may include, for a plurality of objects relating to said program, a size and address of memory allocated to said object. The collected data may further include, for said plurality of objects, a size and address of memory deallocated in respect of said object.
Said collected data may include an indication of a type of said object. Where said collected data includes an indication of the type of object, said collected data may further comprise the following for each type of object: maximum and minimum numbers of objects allocated or deallocated in a predetermined time period; average number of objects in respect of which memory is allocated or deallocated in a predetermined time period; frequency of memory allocation or deallocation; and number of objects for which memory is allocated or deallocated in a predetermined period.
Said object may be any one of a semaphore, a process descriptor or a file object.
According to further examples, the invention may extend to a method of setting aside memory, said method comprising collecting data pertaining to how memory is dynamically allocated to a program in a first operational mode; converting said collected data to tuning data and, in a second operational mode, allocating memory for later assignment to said program according to said tuning data.
Said allocated memory may be assigned to said program during execution of said program.
Converting said collected data to said tuning data may be performed manually. Alternatively, said step of converting said collected data to tuning data could be performed automatically.
Allocating memory may include arranging memory into one or more pools where each pool is associated with an object type, an object size or both, where an object is a memory resource which may be requested by the program.
Arranging memory into one or more pools may occur at a boot time. The method may also include arranging memory into pools after boot time.
The method may include arranging memory into pools in a manner specified by a user. This can allow the method to allocate pools in the traditional manner, so that object types not specifically catered for can be dealt with, and in case the pools set aside at boot time are all used by the program during execution of the program. As used herein, such pools and the associated allocation and deallocation algorithms are collectively referred to as the "backup allocator", whereas pools which are established at boot time are referred to as "pre-allocated pools".
Said method may include altering said memory pools during execution of the program. Therefore, the pools may be dynamic or static.
In one example, the method includes specifying a maximum size for one or more dynamic pools.
Said method may include instantiating one or more memory pools as linked lists.
In some embodiments, the invention may extend to a computer program, a suite of computer programs, a computer readable medium, a computer program product or any form of software for performing the techniques set out above.
Other features of the present invention are defined in the appended claims. Features and advantages associated with the present invention will be apparent from the following description of specific embodiments.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention are hereinafter described with reference to the accompanying diagrams where:
Figure 1 is a schematic diagram of a mobile computing device in which an embodiment of the invention has been implemented;
Figure 2 is a block diagram representing the mobile computing device of Figure 1 ;
Figure 3 is a flow diagram showing the operation of an embodiment of the invention in profile mode;
Figure 4 is a histogram showing the number of operations per object category according to an embodiment of the invention;
Figure 5 is a histogram showing the number of operations as a function of time for a particular object category according to an embodiment of the invention;
Figure 6 is a flow diagram showing the operation of an embodiment of the invention in optimised mode; and
Figure 7 is a schematic diagram of a memory pool according to a preferred embodiment of the invention.
DESCRIPTION OF PREFERRED EMBODIMENTS
Figure 1 is a schematic diagram of a mobile computing device 10 having a casing 12. The casing 12 encapsulates a keypad 14, a screen 16, a speaker 18 and a microphone 20. The device 10 further includes an antenna 22. The mobile computing device 10 illustrated in Figure 1 may function as a phone and, in this instance, sends and receives telecommunication signals via antenna 22.
Figure 2 is a schematic illustration of certain components of the mobile computing device 10. Device 10 includes a kernel 34 which forms part of the operating system of the device 10 and comprises a plurality of programs. In the embodiment shown, the operating system is the Symbian operating system. The kernel 34 is connected to device drivers 26, 28 and 30 which control the behaviour of, and communication with, respective devices: keyboard 14, display 16 and long term storage 24. Although the device drivers are illustrated as being connected to kernel 34, it is to be realised that in certain situations these device drivers could be considered part of the kernel 34, depending on the type of device driver. The kernel 34 is further connected to a memory manager 36 which is connected to, and controls the operation of, system memory 38. The mobile computing device 10 includes many more devices and components than those illustrated here. Mobile computing devices of this type are known in the art and will therefore not be further described herein.
The code which operates the memory manager 36 may be located in a generic library which may be accessed by programs, the kernel 34 and other parts of the operating system. When a process requires memory, the code runs within that process. Although the memory manager 36 is shown in connection with the kernel 34, it will be appreciated that the memory manager's code may be accessed directly, from the generic library, by most programs. The kernel 34 is generally responsible for providing memory space in said system memory 38. The memory manager 36 code uses this space to allocate memory dynamically to variables a process might require.
The device 10 further comprises a plurality of user programs 32 which communicate with the kernel 34 so that the kernel 34 controls the manner in which the user programs 32 access hardware devices such as the keyboard 14 and display 16 and the manner in which memory is allocated from the system memory 38 to the user programs 32. The memory manager 36 allocates memory to the user programs 32 in the manner herein described.
During execution of the user programs or the kernel, memory is needed for "objects" associated with the program. Such objects may, for example, be semaphores, process descriptors or file objects.
The memory manager 36 operates in one of two modes: a profile mode and an optimised mode.
The profile mode is intended for a programmer to determine the memory use characteristics of a particular program or process. Profile mode is used to determine a memory use profile, which is in turn used to generate tuning data. This may be done manually or automatically as further described below. In optimised mode, the memory manager 36 defines a set of pre-allocated memory pools to provide memory to objects. The sizes of the memory pools, the number of these pools and the types of object for which pools are provided may all be defined in the tuning data.
In profile mode in this embodiment, the memory manager 36 uses a standard, known memory allocation technique, such as a free list or bitmap, to allocate memory to the user programs 32 or the kernel programs. The memory manager 36 includes a data log, and data on how memory is allocated according to this process is collected during operation of the programs. Furthermore, data for each deallocation or reallocation is also collected. When a software engineer wishes to optimise memory for a particular program or process, the memory manager's data log is reset. Once the particular process has finished, the memory manager 36 stops collecting data. The data log therefore holds a set of data for the program or process being optimised.
As noted above, it can be useful to have preallocated memory pools for objects of a particular size. This can speed up memory allocations as the memory allocator does not have to waste time looking for space for each allocation. It can also be useful to allocate memory by type. Profile mode can enable an engineer to optimise memory based on size alone, or based on size and type.
When a particular program or process is run in profile mode, the following information is collected:
1. Size and address of object allocated.
2. Size and address of object de-allocated.
3. If run time type identification (RTTI) is available, or if the object contains any inherent type information, such as a virtual table pointer, the type of object allocated.
With this information it is possible to ascertain the following for each size of memory block allocated or type of object:
• Maximum and minimum number of objects.
• Average number of objects.
• Frequency of allocations and de-allocations.
• The number of objects allocated at any particular time period.
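As an illustration (not taken from the patent itself), the statistics above could be derived from a recorded event log along the following lines; the structure and function names are hypothetical:

```cpp
#include <cstddef>
#include <map>
#include <vector>

// One recorded event from profile mode: an allocation or deallocation
// of a block of a given size (hypothetical structure for illustration).
struct AllocEvent {
    std::size_t size;   // size of the object allocated or freed
    bool isAlloc;       // true = allocation, false = deallocation
};

// Per-size statistics derivable by replaying the event log.
struct SizeStats {
    int live = 0;       // objects currently live during the replay
    int peak = 0;       // maximum simultaneous objects (a pool-size hint)
    int operations = 0; // allocations plus deallocations
};

std::map<std::size_t, SizeStats> profile(const std::vector<AllocEvent>& log) {
    std::map<std::size_t, SizeStats> stats;
    for (const AllocEvent& e : log) {
        SizeStats& s = stats[e.size];
        s.operations += 1;
        s.live += e.isAlloc ? 1 : -1;
        if (s.live > s.peak) s.peak = s.live;
    }
    return stats;
}
```

The per-size peak count is the natural starting point when deciding how many blocks to pre-allocate for that size's pool.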
The process of profiling a particular program according to an embodiment of the invention will be described in connection with Figure 3. The process begins by running the program to collect the aforementioned data (step 401). Based on the above, the memory manager 36 is arranged to generate a histogram which shows the number of operations (i.e. allocations and deallocations) for each object type. The software engineer selects whether to do this by size alone, or by size and type. In this case, the software engineer chooses to profile by size alone. Figure 4 is a histogram which shows, for a particular time-limited process, the number of operations per object size. The number of operations is shown on the y-axis and the different sizes of objects are shown on the x-axis. For example, sort C may be objects which are 32 bytes, sort B may be objects which are 16 bytes, etc. As can be seen from Figure 4, sort C objects were involved in the greatest number of operations at just under 10000. Sort B objects were involved in just under 8000 operations, and so on.
This histogram gives the software engineer a clear idea of which object sizes will most benefit from optimisation. When generating tuning data, the software engineer may take account of the amount of RAM available and the importance of the process being optimised. Using the data provided in Figure 4, the engineer can decide how many object sizes should be optimised. If the program uses relatively little space, compared to the amount of RAM available, the engineer may choose to optimise all objects, and provide memory pools for all of them. Alternatively, if RAM is limited, the engineer may choose to optimise only the three most used object sizes, for example, Sorts C, B and Z.
The memory manager 36 also provides a histogram for each object size which shows operations over time. Figure 5 shows Sort C. The number of operations is shown on the y-axis and time is shown on the x-axis. As can be seen, Sort C object operations peak early in the process, with 1200 operations. After around 50 seconds, cell usage drops and settles at around 600 operations. As above, the software engineer can use his or her judgement to determine how many blocks to allocate in optimised mode for this particular object size. For example, the engineer could allocate 1200 blocks to this object size; however, this would result in a portion of memory which is only at 50% of capacity for the last 65 seconds of the process.
Using the above information the software engineer is able to generate tuning data (step 402) for a particular process. In this embodiment the tuning data includes:
1) the number of object sizes to optimise, i.e. the number of pools of memory to preallocate;
2) the number of blocks to provide within each pool and hence the size of each pool;
3) the overall amount of memory required by the memory allocator for the preallocation of pools.
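A minimal representation of such tuning data might look as follows; the field names are illustrative, not taken from any actual implementation:

```cpp
#include <cstddef>
#include <vector>

// Tuning data for one pre-allocated pool (illustrative field names).
struct PoolConfig {
    std::size_t blockSize;  // cell size, or upper bound of a size range
    std::size_t blockCount; // number of blocks pre-allocated at boot
    std::size_t maxBlocks;  // growth limit for dynamic pools
                            // (equal to blockCount for static pools)
};

// The complete tuning data for a process: one entry per optimised
// object size, from which the overall memory requirement follows.
struct TuningData {
    std::vector<PoolConfig> pools;

    // Item 3 above: total memory the allocator needs for pre-allocation.
    std::size_t totalBytes() const {
        std::size_t total = 0;
        for (const PoolConfig& p : pools)
            total += p.blockSize * p.blockCount;
        return total;
    }
};
```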
As will be appreciated, there is no set algorithm for determining the tuning data. Instead, the software engineer may use his or her judgment based on factors such as available RAM, the importance of memory optimisation for the process under consideration, etc.
Once the process has been optimised in this embodiment, the tuning data is stored and the process can be set to run in optimised mode (step 403). Typically, the tuning data is set for a number of programs and processes during the development of the OS or program. The OS or program can then be shipped with tuning data in place. The device would then run in optimised mode when used by the end user.
The process of running a program in optimised mode in an embodiment of the invention is described in connection with Figure 6. In optimised mode the memory manager 36, on boot of the kernel 34 or on initiation of a particular process or program, sets aside an area of system memory 38 containing a set of memory pools, based on the tuning data (step 410). Each pool contains a number of pre-allocated blocks or cells of a particular size and the pools are set up according to the tuning data. There is one pool for each typically allocated memory block size or object type noted in the tuning data. If RTTI is available, then pools are set aside by type rather than by size, for those objects where the RTTI is available. Thus, for programs of the kernel for example, pre-allocated pools are created for processes, threads, semaphores, or other types of kernel objects. Moreover, in the case where type information is available, the data could be partially initialised at boot to reduce object/structure initialisation time after boot.
In an alternative embodiment, the memory manager 36 is able to preallocate pools based on objects within a range of sizes. For example, if the engineer finds many allocations for small objects of sizes ranging from 16 to 32 bytes, he or she could create a single pool with cells of size 32. Anything in the size range 16-32 could then be allocated from this pool. This does mean some space is wasted on each cell (up to half the cell) but extends the speed performance improvement. The final tuning data is roughly equivalent to that described above, except that instead of defining individual object sizes or types, it defines individual "size ranges" or types.
In addition to the pre-allocated pools, the memory manager 36 may also implement a more traditional memory allocation technique. A backup allocator may be used for object sizes or types that are not catered for by the pools, or to deal with the situation where all pre-allocated blocks in a pool have been used up.
Depending on the implementation, pools may be static or dynamic. In the former case the pool has a fixed size. In the latter the pool starts at a nominal size (defined by the tuning data), and can grow up to a maximum (also defined by the tuning data).
When an object of a program requests memory, the memory manager 36 takes the following steps:
1. If this is a memory block size or object type that is catered for by one of the pools, and a pre-allocated block is available, then the block is removed from the corresponding pool and returned to the requesting program (step 411).
2. If no block is available in the corresponding pool, a new block is allocated using the backup allocator (step 412).
3. If no corresponding pool is available for an object of that object type or block of that block size, the backup allocator is used (step 413).
With memory deallocation the following steps would take place:
1. If the block being deallocated originated from a pre-allocated pool, the block is returned to the corresponding pool. This information can be easily ascertained from the address of the deallocated block, as it will lie within the memory bounds of the corresponding pool.
2. Otherwise, if there is a pool for the corresponding object type or block size, the pool is dynamic and the number of blocks in that pool has not reached its maximum, the block is added to the pool for the corresponding object type or block size.
3. If the pool is dynamic but has reached its maximum size, the backup allocator's de-allocation routine is used to deallocate the block.
4. If the object has no pool associated with it, the backup allocator de-allocation routine is used to deallocate the block.
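The allocation and deallocation steps above can be sketched as follows. This is a deliberately simplified illustration: `Pool` is a hypothetical structure, plain `malloc`/`free` stand in for the backup allocator, and the pool's initial blocks are themselves taken from the backup allocator rather than from one contiguous region as the patent describes:

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

// Simplified pool: pre-allocated blocks of one size, capped at
// maxBlocks (maxBlocks equals the initial count for a static pool).
struct Pool {
    std::size_t blockSize;
    std::size_t maxBlocks;
    std::vector<void*> freeBlocks;
};

// Build a pool whose initial blocks come from the backup allocator
// (a simplification; the patent's pools use one contiguous region).
Pool makePool(std::size_t blockSize, std::size_t count, std::size_t maxBlocks) {
    Pool p{blockSize, maxBlocks, {}};
    for (std::size_t i = 0; i < count; ++i)
        p.freeBlocks.push_back(std::malloc(blockSize));
    return p;
}

// Allocation: prefer a pre-allocated block (step 1); otherwise fall
// back to the backup allocator (steps 2 and 3).
void* poolAlloc(std::map<std::size_t, Pool>& pools, std::size_t size) {
    auto it = pools.find(size);
    if (it != pools.end() && !it->second.freeBlocks.empty()) {
        void* block = it->second.freeBlocks.back();
        it->second.freeBlocks.pop_back();
        return block;
    }
    return std::malloc(size);
}

// Deallocation: return the block to a pool with room below its
// maximum (steps 1 and 2); otherwise use the backup de-allocator
// (steps 3 and 4).
void poolFree(std::map<std::size_t, Pool>& pools, std::size_t size, void* p) {
    auto it = pools.find(size);
    if (it != pools.end() &&
        it->second.freeBlocks.size() < it->second.maxBlocks) {
        it->second.freeBlocks.push_back(p);
        return;
    }
    std::free(p);
}
```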
It will be realised that algorithms having O(1) constant time can easily be developed for both allocation and deallocation of objects coming from a pool if the pool is arranged as a linked list. Figure 7 illustrates an example of a memory pool 48. The pool 48 is first created from a single contiguous portion of memory defined between a start address 52 and an end address 56. This portion is then divided into a number of equally sized blocks 54. Finally, the blocks 54 are linked together into a singly linked list.
Each block 54 includes a link element 58 which links 60 to the next free block in the pool. The first block of the list is a list head 50. When allocation takes place, the head 50 of the pool is removed and the object requested by the program is copied into the block. When the block is de-allocated and needs to return to the pool, it is inserted at the head 50 of the list. Note also that the link memory is overwritten by object data when the memory block is used by the object's constructor or initialisation routine.
Using this simple list approach, dynamic pools are also possible.
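The singly linked free list of Figure 7 can be sketched as follows. The link element lives in the first bytes of each free block, so it costs nothing once a block is in use; the class and member names are illustrative, and a block size of at least `sizeof(void*)` is assumed:

```cpp
#include <cstddef>
#include <cstdlib>

// Free-list pool over one contiguous memory region, as in Figure 7.
// Each free block's first bytes hold a pointer to the next free block;
// this link is overwritten by object data once the block is allocated.
class FreeListPool {
public:
    FreeListPool(std::size_t blockSize, std::size_t blockCount)
        : region_(std::malloc(blockSize * blockCount)), head_(nullptr) {
        // Divide the region into equal blocks and chain them together.
        char* p = static_cast<char*>(region_);
        for (std::size_t i = 0; i < blockCount; ++i) {
            void* block = p + i * blockSize;
            *static_cast<void**>(block) = head_;  // link to previous head
            head_ = block;
        }
    }
    ~FreeListPool() { std::free(region_); }
    FreeListPool(const FreeListPool&) = delete;
    FreeListPool& operator=(const FreeListPool&) = delete;

    // O(1): pop the list head.
    void* allocate() {
        if (!head_) return nullptr;  // pool exhausted: caller uses backup
        void* block = head_;
        head_ = *static_cast<void**>(block);
        return block;
    }

    // O(1): push the block back as the new list head.
    void deallocate(void* block) {
        *static_cast<void**>(block) = head_;
        head_ = block;
    }

private:
    void* region_;  // contiguous memory between start and end address
    void* head_;    // the list head 50 of Figure 7
};
```

Growing such a pool dynamically amounts to chaining the blocks of a newly obtained region onto the same free list.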
While O(1) constant time can be guaranteed to add or remove a block from a pool, the allocation routine also needs a fast method to access the right pool for a given memory block size or object type. There are various known techniques that could be used to create this mapping. In the embodiment shown, this mapping is not O(1); however, as the number of pools available would generally be small, this should not significantly affect performance.
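One such mapping, simple but not O(1) as just noted, is a linear scan over the configured size ranges; the structure and names below are hypothetical:

```cpp
#include <cstddef>
#include <vector>

// One configured pool covering a range of object sizes (illustrative).
struct RangePool {
    std::size_t maxObjectSize;  // upper bound of the size range served
    int poolId;                 // index of the pool to allocate from
};

// Linear scan, acceptable because the number of pools is small.
// Entries are assumed sorted by ascending maxObjectSize. Returns -1
// when no pool caters for the requested size (the backup case).
int findPool(const std::vector<RangePool>& pools, std::size_t size) {
    for (const RangePool& p : pools)
        if (size <= p.maxObjectSize)
            return p.poolId;
    return -1;
}
```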
For each pool of the type shown in Figure 7, the tuning data would contain the following entries: a) Pool size: size of the pool. In architectures where memory is broken up into pages this would typically be rounded up to a multiple of page size; b) Maximum pool size: for dynamic pools the maximum size and for static pools the size at boot time (i.e. equal to the pool size); and c) Block size or type: size of each cell, or type if RTTI is available.
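Rounding the pool size up to a multiple of the page size, as entry (a) suggests, is a one-line computation; the 4 KiB page size here is assumed purely for illustration, not queried from any real system:

```cpp
#include <cstddef>

// Assumed page size for illustration (real code would query the OS).
constexpr std::size_t kPageSize = 4096;

// Round a requested pool size up to the next multiple of the page
// size, per entry (a) of the tuning data.
constexpr std::size_t roundToPage(std::size_t bytes) {
    return (bytes + kPageSize - 1) / kPageSize * kPageSize;
}
```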
If RTTI is available, the tuning data for each pool could also contain some basic initial data or a pointer to a function used to initialise each block when the pool is first created.
In one embodiment, each executable using the memory manager 36 to allocate pools as described would have access to the tuning data, either by using a separate configuration file, or configuration system (such as Symbian's central repository or MS Windows registry), or simply by having some data in the executable itself. This last-mentioned method could be suitable for very simplistic embedded devices or for executables that could not access a file or configuration system at initialisation; for example the kernel. This approach may however require a post linker to place the tuning data into the executable image.
As noted above, the collected data may be converted to tuning data manually or automatically. In the manual manner of doing this, an operator will view the collected data and, from this, determine the tuning data. The operator may repeat the step of collecting the data to ensure that all permutations of memory use are collected and that the resultant data sample is representative. In the simplest manner of doing this, the operator will specify tuning data to establish memory pools for each memory block size and type of object for which collected data exists. The size of the pool is equal to the maximum number of allocations in the collected data.
In an alternative embodiment, the operator may determine that the size of the pool may be set less than the maximum number where operating conditions for the program are unequal to those experienced during collection, or where the program is given a lower priority than other user programs.
The tuning data will generally be associated with a particular program. The data could either be stored in the executable file of the program or in a separate file or configuration storage area. The tuning data may provide information on how to set up the memory pools when the memory manager is in optimised mode.
If the process is created to automatically convert the collected data to tuning data, the memory manager 36 may, on every run, or on specific runs, collect data, even
when running in optimised mode. At the end of the run the collected data is converted into tuning data used for the next run.
Tuning data may be referred to as memory optimisation data. Object category may mean object size, object size range, object type or a combination of these factors.
By basing the allocation on historical performance-related data, examples of the invention may ensure that the memory is allocated more efficiently and effectively when compared to known memory management systems. Furthermore, examples of the invention may allow the effectiveness of different known memory management systems to be compared for a particular hardware configuration.
The above examples have been described in the context of a device which operates in an optimised mode. As described above, optimised mode may be a mode of operation in which memory use is optimised on the basis of memory use data which is determined at an earlier time. The mode of operation in which memory use data is determined may be referred to as a non-optimised mode. Optimised mode may also be referred to as an enhanced mode, and non-optimised mode may be referred to as a non-enhanced mode. More generally, enhanced mode may refer to a mode of operation in which knowledge of memory use requirements is used to improve the operation of the device. Non-enhanced mode may generally refer to a mode of operation in which knowledge of memory requirements is not used to improve the operation of the device. However, during non-enhanced mode, although the knowledge of memory requirements used in enhanced mode is not generally used, other mechanisms for improving performance, including other, unrelated memory use improvements, may be used.
Various modifications, changes, and/or alterations may be made to the above described embodiments to provide further embodiments which use the underlying inventive concept, falling within the spirit and/or scope of the invention. Any such further embodiments are intended to be encompassed by the appended claims.
Claims
1. Apparatus comprising memory use data pertaining to the memory requirements of one or more processes and a memory allocator arranged to allocate memory to said processes, based on said memory use data; and wherein said memory requirements of the one or more processes are established by running said one or more processes.
2. A device according to claim 1, wherein said processes include objects and said memory allocator is further arranged to allocate at least one pool of memory to at least one category of object.
3. A device according to claim 2, wherein said object category is one or more of object size, object size range and object type.
4. A device according to claim 3, wherein said memory use data specifies the number of pools which the memory allocation system should establish at run-time, the category of objects which the pools are for, and the size of the pools.
5. A device according to claim 4, wherein each of said pools comprises a plurality of blocks, the blocks of any one pool are of equal size and the size of the blocks of any one pool is set by said memory use data.
6. A device according to claim 5, wherein the size of a pool may be dynamically adjusted.
7. A device according to claim 6, wherein said memory use data includes a maximum size for said dynamic pools.
8. A device according to claim 1, wherein said memory allocator is further arranged to allocate memory to said processes, based on said memory use data, in an enhanced mode.
9. A device according to claim 8, wherein said memory use data is determined in a non-enhanced mode.
10. A method comprising allocating memory to one or more processes based on pre-stored memory use data pertaining to the memory requirements of the one or more processes, wherein said memory requirements of the one or more processes are established by running said one or more processes.
11. Apparatus comprising a memory allocation system arranged to operate in two modes; wherein, in a first mode, said memory allocation system is arranged to determine memory use data pertaining to the memory requirements of one or more processes and, in a second mode, said memory allocation system is arranged to allocate memory to said processes, based on said memory use data.
12. A method comprising determining memory use data pertaining to the memory requirements of one or more processes; and allocating memory to said processes based on said memory use data.
13. A method according to claim 12, wherein determining memory use data includes the recording of profile data, relating to allocations and deallocations of process objects.
14. A method according to claim 13, wherein said profile data includes the number of operations per object category and the time of each operation per object category.
15. A method according to claim 14, wherein a user determines memory use data based on a consideration of said profile data.
16. A method according to claim 15, wherein said user specifies that at least the object category with the most operations should have a pool of memory during optimisation.
17. A method according to claim 16, wherein the user specifies the size of a pool to be at least large enough to accommodate the average number of objects in use at any one time during the running of the process.
18. A method according to claim 17, wherein said one or more processes may include a program and said allocated memory is assigned to said program during execution of said program.
19. A method according to claim 18, wherein said memory allocation system includes a back-up allocator for allocating memory to object categories which are not listed in said memory use data and for use when said pools are full.
20. A method according to any of claims 12 to 19, wherein said method further includes instantiating one or more memory pools as linked lists.
21. A computer program or suite of computer programs arranged such that when executed by a computer they cause the computer to operate in accordance with the method of any of claims 10 and 12 to 20.
22. A computer readable medium storing the computer program, or at least one of the suites of computer programs, according to claim 21.
23. An operating system for causing a computing device to operate in accordance with a method as claimed in any one of claims 10 and 12 to 20.
24. A computing device substantially as described hereinbefore and as shown in Figures 1 to 7.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| GB0811297.1 | 2008-06-19 | ||
| GB0811297A GB0811297D0 (en) | 2008-06-19 | 2008-06-19 | Memory allocation |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2009153658A1 true WO2009153658A1 (en) | 2009-12-23 |