
CN117827689A - Acceleration device and method for embedded flash memory - Google Patents


Info

Publication number
CN117827689A
Authority
CN
China
Prior art keywords: unit, data, read, acceleration, address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311748189.8A
Other languages
Chinese (zh)
Inventor
刘峰 (Liu Feng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Minsi Microelectronics Co ltd
Original Assignee
Anhui Minsi Microelectronics Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Minsi Microelectronics Co ltd filed Critical Anhui Minsi Microelectronics Co ltd
Priority to CN202311748189.8A priority Critical patent/CN117827689A/en
Publication of CN117827689A publication Critical patent/CN117827689A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/0223 User address space allocation, e.g. contiguous or non-contiguous base addressing
    • G06F 12/023 Free address space management
    • G06F 12/0238 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G06F 12/0246 Memory management in non-volatile memory in block erasable memory, e.g. flash memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0806 Multiuser, multiprocessor or multiprocessing cache systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The present disclosure provides an acceleration device and method for an embedded flash memory, belonging to the field of computers. The device comprises a buffer module and an acceleration module. The buffer module comprises a first first-in first-out (FIFO) buffer unit and a second FIFO buffer unit, both used for storing prediction data determined according to a first read instruction from an MCU. The acceleration module comprises a read acceleration unit, which is used for receiving a second read instruction from the MCU, where the second read instruction is used for accessing a first address of the embedded flash memory and is the read instruction following the first read instruction, and for generating, according to the first address, a third read instruction used for accessing the embedded flash memory, the first FIFO buffer unit, or the second FIFO buffer unit. By using two FIFO buffers to accelerate both the read process and the write process of the embedded flash memory, the read/write acceleration efficiency of the embedded flash memory is improved.

Description

Acceleration device and method for embedded flash memory
Technical Field
The present disclosure relates to the field of computers, and in particular to an acceleration apparatus and method for an Eflash (embedded flash memory).
Background
An MCU (Microcontroller Unit) mainly uses Eflash as its flash memory, which stores programs and data. Eflash has the advantage of fast random access, making it suitable for executing code and reading key data; its disadvantage is that reading or programming (writing) data is slow and time-consuming. The read process of the Eflash therefore needs to be accelerated by an acceleration device.
In the related art, the acceleration device includes an acceleration module and a cache module composed of registers. When the MCU is idle, the acceleration module stores, in advance, data from the Eflash that the MCU may need to read into the cache module, so that when the MCU needs the data it can read it directly from the cache module instead of from the Eflash, thereby accelerating the Eflash read process.
However, because the capacity of a register is small, it can only hold a fixed, small amount of data; once the MCU needs to access the Eflash continuously, the data held in the small-capacity register cannot satisfy the MCU's read demand. In that case the MCU must still access the Eflash directly to read the data, so the acceleration effect of such a device is poor.
Disclosure of Invention
The present disclosure provides an acceleration apparatus and method for an Eflash that can accelerate the Eflash read process. The technical solutions include at least the following:
In a first aspect, an acceleration device for an embedded flash memory is provided, the acceleration device comprising a buffer module and an acceleration module. The buffer module comprises a first first-in first-out (First In First Out, FIFO) buffer unit and a second FIFO buffer unit, both used for storing prediction data determined according to a first read instruction from an MCU. The acceleration module comprises a read acceleration unit connected to the first FIFO buffer unit and the second FIFO buffer unit respectively. The read acceleration unit is used for receiving a second read instruction from the MCU, where the second read instruction is used for accessing a first address of the Eflash and is the read instruction following the first read instruction, and for generating a third read instruction according to the first address, where the third read instruction is used for accessing the Eflash, the first FIFO buffer unit, or the second FIFO buffer unit.
Optionally, the read acceleration unit is configured to generate the third read instruction according to the first address in any one of the following manners: in response to the predicted data containing first data corresponding to the first address and the predicted data being stored in the first FIFO buffer unit, generating a third read instruction for accessing the first FIFO buffer unit according to the first address; or, in response to the predicted data containing the first data corresponding to the first address and the predicted data being stored in the second FIFO buffer unit, generating a third read instruction for accessing the second FIFO buffer unit according to the first address; or, in response to the predicted data not containing the first data corresponding to the first address, generating a third read instruction for accessing the Eflash according to the first address.
Optionally, the read acceleration unit is further configured to read data from the Eflash and store the data in the second FIFO buffer unit while the MCU reads data from the first FIFO buffer unit; or the read acceleration unit is further configured to read data from the Eflash and store the data in the first FIFO buffer unit while the MCU reads data from the second FIFO buffer unit.
Optionally, the first FIFO buffer unit and the second FIFO buffer unit are programmable FIFO buffer units, and the read acceleration unit is further configured to set depths of the first FIFO buffer unit and the second FIFO buffer unit according to at least one of a kernel type of the MCU and a type of a program executed by the MCU.
Optionally, the acceleration module further comprises a programming acceleration unit connected to the first FIFO buffer unit and the second FIFO buffer unit respectively. The programming acceleration unit is used for receiving a first programming instruction from the MCU and, according to the first programming instruction, sequentially applying a high voltage to a plurality of rows of storage units in the Eflash and sequentially writing a plurality of bytes of data from a target cache unit into the Eflash, where the target cache unit is the first FIFO cache unit or the second FIFO cache unit.
Optionally, the programming acceleration unit is configured to write the i-th byte of data in the target cache unit into the Eflash in the following manner: determining the programming address of the i-th byte according to the programming address of the (i-1)-th byte, where the programming address of the (i-1)-th byte is determined, ultimately, from the programming start address corresponding to the first programming instruction; applying a high voltage to the row of memory cells corresponding to the programming address of the i-th byte, and writing the i-th byte into that row; where i is an integer, i is greater than 1, and i is less than or equal to the number of bytes in the plurality of bytes of data.
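As an illustrative sketch (not part of the claims), the byte-by-byte address chain above can be modeled in C against a mock flash array; the names `eflash_mock`, `eflash_program_byte`, and `program_sequence` are hypothetical and stand in for the hardware operation of applying the high voltage to the addressed row and writing one byte:

```c
#include <stdint.h>
#include <stddef.h>

#define EFLASH_SIZE 256u
static uint8_t eflash_mock[EFLASH_SIZE];  /* mock flash memory, illustration only */

/* Stand-in for applying a high voltage to the row addressed by 'addr'
 * and writing one byte into that row (hypothetical model). */
static void eflash_program_byte(uint32_t addr, uint8_t b)
{
    eflash_mock[addr % EFLASH_SIZE] = b;
}

/* The programming address of byte i is derived from the address of
 * byte i-1; the address of byte 1 comes from the programming start
 * address carried by the first programming instruction. */
static void program_sequence(uint32_t start_addr, const uint8_t *bytes, size_t n)
{
    uint32_t addr = start_addr;
    for (size_t i = 0; i < n; i++) {
        eflash_program_byte(addr, bytes[i]);
        addr += 1;  /* next byte's programming address from the previous one */
    }
}
```

Because each address follows directly from the previous one, the programming acceleration unit never has to recompute addresses from scratch between bytes, which is what allows the row voltages to be applied back-to-back.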
Optionally, of the first FIFO cache unit and the second FIFO cache unit, the cache unit other than the target cache unit is further used to store data to be programmed sent by the MCU while the programming acceleration unit reads data from the target cache unit and writes it to the Eflash.
Optionally, the programming acceleration unit is further configured to receive an erase command from the MCU, and erase a plurality of pages or sectors in the Eflash one by one according to the erase command.
In a second aspect, an acceleration method for an Eflash is provided, the method comprising: determining predicted data according to a first read instruction from the MCU, and storing the predicted data in a first FIFO buffer unit or a second FIFO buffer unit; receiving a second read instruction from the MCU, where the second read instruction is used for accessing a first address of the Eflash and is the read instruction following the first read instruction; and generating a third read instruction according to the first address, where the third read instruction is used for accessing the Eflash, the first FIFO buffer unit, or the second FIFO buffer unit.
Optionally, the generating a third read instruction according to the first address, where the third read instruction is used to access the Eflash, the first FIFO buffer unit, or the second FIFO buffer unit, includes: in response to the predicted data containing first data corresponding to the first address and the predicted data being stored in the first FIFO buffer unit, generating a third read instruction for accessing the first FIFO buffer unit according to the first address; or, in response to the predicted data containing the first data corresponding to the first address and the predicted data being stored in the second FIFO buffer unit, generating a third read instruction for accessing the second FIFO buffer unit according to the first address; or, in response to the predicted data not containing the first data corresponding to the first address, generating a third read instruction for accessing the Eflash according to the first address.
Optionally, the method further comprises: determining whether the MCU is in an idle state; and when the MCU is determined to be in an idle state, storing the predicted data to the cache module.
Optionally, the method further comprises: reading data from the Eflash and storing it in the second FIFO buffer unit while the MCU reads data from the first FIFO buffer unit; or, reading data from the Eflash and storing it in the first FIFO buffer unit while the MCU reads data from the second FIFO buffer unit.
Optionally, the first FIFO buffer unit and the second FIFO buffer unit are both programmable FIFO buffer units, and the method further comprises: and setting the depths of the first FIFO buffer memory unit and the second FIFO buffer memory unit according to at least one of the kernel type of the MCU and the type of the program executed by the MCU.
Optionally, the method further comprises receiving a first programming instruction from the MCU; according to the first programming instruction, applying high voltage to the multiple rows of memory cells in the Eflash row by row; writing data of a plurality of bytes in a target cache unit into the Eflash line by line according to bytes, wherein each line of storage unit stores data of one byte, and the target cache unit is the first FIFO cache unit or the second FIFO cache unit.
Optionally, the method further includes: determining the programming address of the i-th byte according to the programming address of the (i-1)-th byte, where the programming address of the (i-1)-th byte is determined, ultimately, from the programming start address corresponding to the first programming instruction; applying a high voltage to the row of memory cells corresponding to the programming address of the i-th byte, and writing the i-th byte into that row; where i is an integer, i is greater than 1, and i is less than or equal to the number of bytes in the plurality of bytes of data.
Optionally, the method further includes storing data to be programmed sent by the MCU while reading data from the target cache unit and writing the data to the Eflash.
Optionally, the method further comprises: receiving an erase instruction from the MCU, and erasing a plurality of pages or sectors in the Eflash one by one according to the erase instruction.
The technical solutions provided by the embodiments of the present disclosure have at least the following beneficial effects:
The read acceleration unit acquires the predicted data and stores it into a FIFO buffer unit, so that the MCU can read data from the FIFO buffer unit continuously, improving the efficiency with which the MCU reads data from the Eflash. Moreover, compared with a register, a FIFO buffer unit has a larger capacity and can store more predicted data, which helps improve the hit rate of the second read instruction.
Drawings
Fig. 1 illustrates a schematic diagram of an application scenario of an acceleration device for an Eflash provided in an exemplary embodiment of the present disclosure;
fig. 2 is a schematic diagram illustrating a structure of an acceleration apparatus for an Eflash according to an exemplary embodiment of the present disclosure;
fig. 3 illustrates a schematic structural diagram of an acceleration device for an Eflash according to an exemplary embodiment of the present disclosure;
fig. 4 shows a schematic diagram of the internal structure of an Eflash;
FIG. 5 shows a schematic diagram of programming two adjacent bytes in the related art;
FIG. 6 illustrates a schematic diagram of programming two adjacent bytes after employing the acceleration device provided in an embodiment of the present disclosure;
FIG. 7 shows a schematic diagram of the state machine of the read acceleration unit;
FIG. 8 shows a schematic diagram of the state machine of the programming acceleration unit;
fig. 9 is a flowchart illustrating an acceleration method for embedded flash memory according to an exemplary embodiment of the present disclosure.
Detailed Description
To make the objectives, technical solutions, and advantages of the present disclosure clearer, the embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of an application scenario of an acceleration device for an Eflash provided in an embodiment of the present disclosure. Referring to fig. 1, the acceleration device 302 is connected to the MCU 301 and the Eflash 303, respectively. The acceleration device 302 may accelerate the read process and/or the programming process (i.e., the write process) of the Eflash 303 by the MCU 301.
In the embodiment of the present disclosure, the acceleration device 302 accelerates the read process and/or the write process of the Eflash 303 using FIFO buffer units, thereby improving the read/write acceleration efficiency of the Eflash.
The working principle of the FIFO buffer unit is described below.
The FIFO buffer unit contains a read pointer; data is read from the FIFO buffer unit at the address pointed to by the read pointer.
When reading data from the FIFO buffer unit, the read starts at the address pointed to by the read pointer. After the data at that address has been read, the read pointer is incremented by 1; if data continues to be read from the FIFO buffer unit, the data at the address now pointed to by the read pointer is read, thus achieving sequential reading of the data.
The FIFO buffer unit also contains a write pointer; data is written into the FIFO buffer unit at the address pointed to by the write pointer.
When writing data into the FIFO buffer unit, writing starts at the address pointed to by the write pointer, analogous to reading. After the data is written, the write pointer is incremented by 1. If data continues to be written, it is written at the address now pointed to by the write pointer, thus achieving sequential writing of the data.
When every address in the FIFO buffer unit already holds data and data continues to be written, the newly written data replaces the data at the address that was written earliest, and so on in a cycle, thereby realizing first-in first-out behavior.
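The pointer behavior described above can be sketched as a small ring buffer in C. This is a software model for illustration only (the depth, type names, and functions are hypothetical); the actual FIFO in the device is a hardware unit:

```c
#include <stdint.h>
#include <stdbool.h>

#define FIFO_DEPTH 8u  /* hypothetical depth for illustration */

typedef struct {
    uint8_t  data[FIFO_DEPTH];
    uint32_t rd;     /* read pointer  */
    uint32_t wr;     /* write pointer */
    uint32_t count;  /* number of valid entries */
} fifo_t;

/* Write one byte at the address pointed to by the write pointer, then
 * increment the pointer. When the FIFO is full, the oldest entry is
 * replaced, matching the first-in first-out replacement described above. */
static void fifo_write(fifo_t *f, uint8_t b)
{
    f->data[f->wr] = b;
    f->wr = (f->wr + 1u) % FIFO_DEPTH;        /* pointer incremented by 1, wraps */
    if (f->count == FIFO_DEPTH)
        f->rd = (f->rd + 1u) % FIFO_DEPTH;    /* earliest-written data replaced */
    else
        f->count++;
}

/* Read one byte from the address pointed to by the read pointer, then
 * increment the pointer, achieving sequential reading. */
static bool fifo_read(fifo_t *f, uint8_t *out)
{
    if (f->count == 0u)
        return false;                          /* FIFO empty */
    *out = f->data[f->rd];
    f->rd = (f->rd + 1u) % FIFO_DEPTH;
    f->count--;
    return true;
}
```

For example, writing ten bytes into a depth-8 FIFO overwrites the two earliest bytes, so the next read returns the third byte written.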
Fig. 2 illustrates a schematic structure of an acceleration apparatus for an Eflash according to an exemplary embodiment of the present disclosure. Referring to fig. 2, the acceleration device 302 includes: a caching module 311 and an acceleration module 312.
The buffer module 311 includes a first FIFO buffer unit 14 and a second FIFO buffer unit 15, which are used for storing prediction data determined from the first read instruction from the MCU. The first FIFO buffer unit 14 and/or the second FIFO buffer unit 15 may be programmable FIFO buffer units.
The acceleration module 312 includes a read acceleration unit 12, connected to the first FIFO buffer unit 14 and the second FIFO buffer unit 15 respectively. The read acceleration unit 12 is configured to receive a second read instruction from the MCU, where the second read instruction is used to access a first address of the Eflash and is the read instruction following the first read instruction, and to generate a third read instruction according to the first address, where the third read instruction is used to access the Eflash, the first FIFO buffer unit 14, or the second FIFO buffer unit 15.
In the related art, a buffer is generally composed of two registers. Because the length of data stored into a register at one time is fixed (for example, 1 byte), a segment of continuous data must be stored through multiple operations, which makes storing data inefficient. Furthermore, because the buffer capacity is fixed (small compared with a FIFO buffer unit), the predicted data is short even when the buffer is full; and the longer the predicted data, the higher the hit probability. Thus, since only a short stretch of predicted data can be stored in such a buffer, the hit probability when using it is low.
In the embodiments of the present disclosure, a hit means that the cached prediction data contains the data at the Eflash address the MCU is about to access.
In the related art, SRAM (Static Random Access Memory) may also be used as the buffer memory. SRAM can store continuous data, but its size and cost are relatively large, and using SRAM as a buffer increases the size and cost of the chip.
In the embodiment of the present disclosure, the first FIFO buffer unit 14 and the second FIFO buffer unit 15 serve as the buffers. On one hand, compared with registers, their capacity is larger and they can store more predicted data, which helps improve the hit rate of the second read instruction. On the other hand, compared with SRAM, a FIFO buffer unit is low-cost, small, and widely applicable, so using FIFO buffers reduces the cost of the acceleration device 302 and broadens its applicability. In addition, the read acceleration unit 12 acquires the predicted data and stores it into a FIFO buffer unit, so the MCU can read data from the FIFO buffer unit continuously, improving the efficiency of the MCU's reads from the Eflash. A FIFO buffer unit also reads data faster than SRAM and can even operate at the same frequency as the MCU, so the MCU can read from the FIFO buffer unit at full speed, further increasing the speed at which the MCU reads data from the Eflash.
Fig. 3 illustrates a schematic structure of an acceleration apparatus for an Eflash according to an exemplary embodiment of the present disclosure. Referring to fig. 3, the acceleration device 302 includes: a caching module 311 and an acceleration module 312. The buffering module 311 comprises a first FIFO buffer unit 14 and a second FIFO buffer unit 15. The acceleration module 312 includes a bus interface unit 11, a read acceleration unit 12, and a control signal conversion unit 16.
Optionally, the bus interface unit 11 is configured to receive a read instruction (e.g., a first read instruction, a second read instruction, etc. hereinafter) from the MCU, and send the read instruction to the read acceleration unit 12.
Illustratively, the bus interface unit 11 receives the aforementioned read instructions from a bus. Optionally, the bus is an AHB (Advanced High-performance Bus), and the MCU may send a read instruction and its corresponding start address to the bus interface unit 11 over the AHB bus. Illustratively, the bus may also be an AXI (Advanced eXtensible Interface) bus; embodiments of the present disclosure do not limit the type of bus.
Optionally, the read acceleration unit 12 is configured to store prediction data into the buffer module 311 according to the first read instruction sent by the MCU. The first read instruction is the most recently executed read instruction. The predicted data is the data that the read acceleration unit 12 predicts the MCU's next read instruction will read. Because the predicted data is only a prediction by the read acceleration unit 12, it may or may not contain the data the MCU's next read instruction actually needs.
Optionally, storing the prediction data into the cache module 311 according to the first read instruction includes storing, into the cache module 311, the data held at the X addresses following a second address in the Eflash. The second address is the end address of the second data corresponding to the first read instruction (i.e., the Eflash address of the last byte of the second data), and the data at the X addresses following the second address is the predicted data, where X is a positive integer.
In the embodiment of the present disclosure, X ranges from 10 to 50; the predicted data may be, for example, the data at the 10, 20, or 50 addresses following the second address. Illustratively, if the second address is 00001 and X is 10, the predicted data is the data from address 00002 up to address 0000B.
It should be noted that, the embodiment of the present disclosure does not limit the manner of determining the prediction data, and any manner in the related art may be adopted.
Optionally, X is less than or equal to the depth of the first FIFO buffer unit 14. In some examples, X equals the depth of the first FIFO buffer unit 14; in that case, the predicted data corresponding to the first read instruction exactly fills the first FIFO buffer unit 14.
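The prefetch range above can be sketched as a small helper, assuming the convention stated in the text that the predicted data starts at the address immediately after the second address and spans X consecutive addresses (the type and function names are hypothetical):

```c
#include <stdint.h>

/* First and last Eflash addresses covered by the predicted data. */
typedef struct { uint32_t first, last; } prefetch_range_t;

/* Given the end address of the data returned by the first read
 * instruction (the "second address") and the prefetch length X,
 * compute the address range of the predicted data. */
static prefetch_range_t prefetch_range(uint32_t second_addr, uint32_t x)
{
    prefetch_range_t r;
    r.first = second_addr + 1u;  /* prediction starts at the next address */
    r.last  = second_addr + x;   /* X consecutive addresses in total */
    return r;
}
```

With second address 0x00001 and X = 10, this yields the range 0x00002 to 0x0000B, i.e. exactly ten addresses, matching the worked example in the text.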
In some examples, the first FIFO buffer unit 14 and the second FIFO buffer unit 15 are non-programmable FIFO buffer units.
In other examples, the first FIFO buffer unit 14 and the second FIFO buffer unit 15 are programmable FIFO buffer units. A programmable FIFO buffer unit can adjust relevant FIFO parameters (such as depth) to the actual requirements of different application scenarios. For example, in a scenario requiring a large buffer capacity, the depth of the programmable FIFO buffer unit may be set to its maximum; in a scenario requiring a small buffer capacity, the depth may be set to half the maximum (or some other value below the maximum). In this way, the parameters of the programmable FIFO buffer unit can be adjusted to actual requirements, adapting it to the needs of different application scenarios.
When the programmable FIFO buffer units are adopted, parameters such as the depth of the first FIFO buffer unit 14 and the second FIFO buffer unit 15 can be set according to at least one of the type of program executed by the MCU and the type of the MCU core, so as to achieve the optimal acceleration performance.
In one possible implementation manner, the application programs in the MCU may be pre-divided into a first type program and a second type program according to the characteristic of accessing the Eflash, so as to obtain a program type list; the list of program types is then stored into a register. The read acceleration unit 12 may periodically query the program being executed by the MCU, determine whether the program being executed by the MCU is a first type program or a second type program according to the program type list in the register and the program being executed by the MCU, and then set and modify the depths of the first FIFO buffer unit 14 and the second FIFO buffer unit 15 according to the program type.
The division of the MCU's application programs into first-type and second-type programs is performed in advance and can be determined through testing or code analysis.
In this case, the read acceleration unit 12 is configured to set the depths of the first FIFO buffer unit 14 and the second FIFO buffer unit 15 to a first value when the MCU executes a first-type program, i.e., a program that accesses the Eflash sequentially; and/or to set the depths of the first FIFO buffer unit 14 and the second FIFO buffer unit 15 to a second value when the MCU executes a second-type program, i.e., a program that accesses the Eflash with jumps, where the second value is smaller than the first value.
Alternatively, the first value is the maximum depth of the first FIFO buffer unit 14 and the second FIFO buffer unit 15, and the second value is a set proportion of the first value, which may be set according to actual needs, for example, 1/2 or 1/3, or the like.
When the program being executed by the MCU accesses the Eflash sequentially, the MCU will keep accessing the Eflash, i.e., the read acceleration unit 12 will keep receiving read instructions; the FIFO depth should then be increased to store more predicted data and raise the hit probability of the predicted data. When the program being executed accesses the Eflash with jumps, the MCU will access the Eflash only intermittently; the FIFO depth can then be reduced to avoid storing excessive predicted data. Changing the depth of the FIFO buffer units according to how the software being executed by the MCU accesses the Eflash thus effectively improves acceleration performance.
In another possible embodiment, the read acceleration unit 12 is configured to set the depths of the first FIFO buffer unit 14 and the second FIFO buffer unit 15 to a first value when the core of the MCU is a first type core, and/or to set the depths of the first FIFO buffer unit 14 and the second FIFO buffer unit 15 to a second value when the core of the MCU is a second type core, wherein the second value is smaller than the first value.
The first-type core is a core that generates a larger volume of operations when the MCU accesses the Eflash, for example an ARM (Advanced RISC Machine) core; the second-type core generates a smaller volume of operations when the MCU accesses the Eflash, for example an 8051 core.
In yet another possible implementation, the read acceleration unit 12 is configured to set the depths of the first FIFO buffer unit 14 and the second FIFO buffer unit 15 to a first value when the core of the MCU is a first type core and the MCU executes the first type program; when the core of the MCU is the second type core and the MCU executes the second type program, setting the depths of the first FIFO buffer cell 14 and the second FIFO buffer cell 15 to be a second value; when the core of the MCU is the first type core and the MCU executes the second type program, setting the depths of the first FIFO buffer cell 14 and the second FIFO buffer cell 15 to be a third value; when the core of the MCU is the second type core and the MCU executes the first type program, the depths of the first FIFO buffer unit 14 and the second FIFO buffer unit 15 are set to the fourth value. Wherein the first value is greater than the third value, the third value is greater than or equal to the fourth value, and the fourth value is greater than the second value.
Here, the foregoing embodiments are referred to with respect to the program type and the kernel type.
Optionally, the third value and the fourth value are set as proportions of the first value, where the third value is greater than or equal to the fourth value, and the fourth value is greater than the second value.
Illustratively, the first value is a maximum depth, the third value is equal to the fourth value and is 1/2 of the first value, and the second value is 1/3 of the first value.
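The depth policy above can be sketched in software terms. The following Python sketch is illustrative only: the maximum depth of 32, the core/program type labels, and the 1/2 and 1/3 proportions are taken from the example above and are assumptions, not part of any concrete hardware implementation.

```python
MAX_DEPTH = 32  # assumed maximum FIFO depth (the "first value")

def fifo_depth(core_type: str, program_type: str) -> int:
    """Select the depth of both FIFO buffer units.

    first value  = MAX_DEPTH       (first-type core, sequential program)
    third value  = MAX_DEPTH // 2  (first-type core, jump program)
    fourth value = MAX_DEPTH // 2  (second-type core, sequential program)
    second value = MAX_DEPTH // 3  (second-type core, jump program)
    """
    if core_type == "first" and program_type == "sequential":
        return MAX_DEPTH
    if core_type == "second" and program_type == "jump":
        return MAX_DEPTH // 3
    return MAX_DEPTH // 2  # third value == fourth value in this example
```

With these numbers the required ordering (first > third >= fourth > second) holds: 32 > 16 >= 16 > 10.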
Optionally, the read acceleration unit 12 includes an access time detection subunit 101 and a prefetch subunit 102. The access time detection subunit 101 is configured to detect when the MCU accesses the Eflash: when the MCU is accessing the Eflash, the MCU is considered to be in a working state; when the MCU is not accessing the Eflash, the MCU is considered to be in an idle state. Whether the MCU is in an idle state can thus be determined by the access time detection subunit 101. The prefetch subunit 102 is configured to store the predicted data to the cache module 311 when the access time detection subunit 101 determines that the MCU is in an idle state.
Optionally, while the predicted data is stored in the buffer module 311, the starting address of the predicted data in the Eflash (i.e. the address next to the second address) and the length (X) of the predicted data are stored in a register, so that the address range in the Eflash corresponding to the predicted data can be determined simply by querying the starting address and the length of the predicted data in the register. For example, if the starting address of the predicted data is A and the length of the predicted data is X, the address range in the Eflash corresponding to the predicted data is A to (A+X-1); that is, the data at addresses A to (A+X-1) in the Eflash is stored in the buffer module 311.
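The hit test implied by this register scheme is a simple range check. A minimal Python sketch, assuming the register holds the start address A and length X as plain integers (function names are illustrative):

```python
def predicted_address_range(start: int, length: int) -> range:
    """Address range A .. A+X-1 covered by the predicted data."""
    return range(start, start + length)

def hits_buffer(first_address: int, start: int, length: int) -> bool:
    """True if the data for first_address is already in the buffer module."""
    return first_address in predicted_address_range(start, length)
```

A requested address inside A to (A+X-1) is served from the buffer module; anything outside must go to the Eflash.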
In the embodiment of the present disclosure, the access time detection subunit 101 may obtain the working state of the MCU by querying an MCU working flag bit. The MCU working flag bit has a high level and a low level: when the MCU needs to access the Eflash, the MCU working flag bit becomes high level; when the MCU is in an idle state, the MCU working flag bit becomes low level. Therefore, the working state of the MCU can be detected by querying the MCU working flag bit.
Optionally, the read acceleration unit 12 periodically queries the MCU working flag bit. When the MCU is in a working state, the read acceleration unit 12 waits for the next query of the MCU working flag bit; when the MCU is in an idle state, the read acceleration unit 12 determines the predicted data according to a first read instruction from the MCU.
Because the working state of the MCU needs to be acquired in a timely manner to improve the acceleration efficiency, the period for querying the MCU working flag bit must not be too long. In the embodiment of the disclosure, the period for querying the MCU working flag bit ranges from 1 microsecond to 10 microseconds, and may be, for example, 1 microsecond, 2 microseconds, 5 microseconds, or 10 microseconds.
In some embodiments, the read acceleration unit 12 continuously queries the MCU working flag bit.
Optionally, the read acceleration unit 12 is further configured to receive a second read instruction from the MCU, where the second read instruction is a next read instruction of the first read instruction, and the second read instruction is used to access the first address of the Eflash; after receiving the second read instruction, the read acceleration unit 12 generates a third read instruction according to the first address, where the third read instruction is used to access the Eflash, the first FIFO buffer unit 14, or the second FIFO buffer unit 15.
In the embodiment of the disclosure, the second read instruction is used to access data in an address range in the Eflash, and the first address may be any address in the address range.
Alternatively, the read acceleration unit 12 generates a third read instruction from the first address in the following manner.
In the first step, the read acceleration unit 12 generates a third read instruction according to the relationship between the first address and the address range in the Eflash corresponding to the predicted data.
If the first address is located in the address range, it means that the first data corresponding to the first address exists in the buffer module 311, and execute the second step to the third step; if the first address is outside the address range, it means that the first data corresponding to the first address does not exist in the buffer module 311, and the fourth step to the fifth step are performed.
The second step, the read acceleration unit 12 generates a third read instruction for accessing the cache module 311.
Optionally, the second step further comprises: determining in which cache unit of the cache module 311 the first data corresponding to the first address is located.
When the first data corresponding to the first address is located in the first FIFO buffer unit 14, the third read instruction is used to access the first FIFO buffer unit 14; alternatively, the third read instruction is used to access the first FIFO buffer unit 14 first, and then access the second FIFO buffer unit 15 after all the data in the first FIFO buffer unit 14 is read.
When the first data corresponding to the first address is located in the second FIFO buffer unit 15, the third read instruction is used to access the second FIFO buffer unit 15; alternatively, the third read instruction is used to access the second FIFO buffer unit 15 first, and then access the first FIFO buffer unit 14 after all the data in the second FIFO buffer unit 15 is read.
Alternatively, the read acceleration unit 12 may set an identifier indicating the cache unit in which the data was last stored. In this way, the cache unit of the next stored data can be quickly determined from the identifier. For example, when predicted data is stored in the first FIFO buffer unit 14, the identifier is set to a first value; when the predicted data is stored in the second FIFO buffer unit 15, the identifier is set to a second value. Thus, the read acceleration unit 12 can determine whether the predicted data is stored in the first FIFO buffer unit 14 or the second FIFO buffer unit 15 by looking at the identifier. Optionally, the first value is 1 and the second value is 0; alternatively, the second value is 1 and the first value is 0.
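The one-bit identifier can be modelled as follows. The encoding (1 for the first FIFO buffer unit, 0 for the second) is one of the two options the text allows, and the class and method names are illustrative:

```python
FIFO1, FIFO2 = 1, 0  # assumed encoding of the identifier

class PrefetchSelector:
    """Tracks which cache unit received the most recent predicted data."""

    def __init__(self, last_stored: int = FIFO2):
        self.identifier = last_stored

    def record_store(self, unit: int) -> None:
        """Set the identifier when predicted data is stored to a unit."""
        self.identifier = unit

    def next_target(self) -> int:
        """The unit to fill next is the one NOT holding the latest data."""
        return FIFO1 if self.identifier == FIFO2 else FIFO2
```

Looking at the identifier is enough to alternate between the two units, which is what makes the ping-pong arrangement below possible.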
And thirdly, the read acceleration unit 12 sends the data in the buffer module 311 to the MCU according to a third read instruction.
The fourth step, the read acceleration unit 12 generates a third read instruction for accessing the Eflash.
The third read instruction is used for accessing the first address in the Eflash.
And fifthly, after the read acceleration unit 12 reads the first data from the Eflash according to the third read instruction, the read first data is sent to the MCU.
Optionally, the read acceleration unit 12 is further configured to read data from the Eflash and store the data in the second FIFO buffer unit 15 while the MCU reads data from the first FIFO buffer unit 14; alternatively, the read acceleration unit 12 is further configured to read data from the Eflash and store the data in the first FIFO buffer unit 14, while the MCU reads data from the second FIFO buffer unit 15. That is, the first FIFO buffer unit 14 and the second FIFO buffer unit 15 constitute a ping-pong FIFO.
In the embodiment of the present disclosure, the read acceleration unit 12 is configured to store new predicted data to the buffer module 311 according to the second read instruction while generating the third read instruction and transmitting data to the MCU according to the third read instruction. The new predicted data is determined from the second read instruction. For the details of storing the predicted data to the buffer module 311 according to the second read instruction, refer to the description for the first read instruction; a detailed description is omitted here.
The cache unit in which the new predicted data is stored is determined based on the third read instruction. For example, if the third read instruction is for accessing the first FIFO buffer unit 14, the predicted data is stored to the second FIFO buffer unit 15; if the third read instruction is for accessing the second FIFO buffer unit 15, the predicted data is stored to the first FIFO buffer unit 14; if the third read instruction is for accessing the Eflash, the predicted data is stored to either one of the first FIFO buffer unit 14 and the second FIFO buffer unit 15.
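That selection rule can be written directly. The labels 'fifo1', 'fifo2' and 'eflash' are illustrative stand-ins for the access target of the third read instruction:

```python
def new_prediction_target(third_read_target: str) -> str:
    """Choose the cache unit for the new predicted data, per the rule above."""
    if third_read_target == "fifo1":
        return "fifo2"   # do not overwrite the unit being read
    if third_read_target == "fifo2":
        return "fifo1"
    # Eflash access: neither unit is being read, so either may be used
    return "fifo1"
```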
In some embodiments, the read acceleration unit 12 further comprises: the address determination subunit 103 is accessed. The access address determination subunit 103 is configured to generate the aforementioned third read instruction according to the first address.
Optionally, the acceleration module 312 further comprises a programmed acceleration unit 13. The programming acceleration unit 13 is connected to the bus interface unit 11 and the control signal conversion unit 16, respectively. The bus interface unit 11 is further configured to receive a programming instruction (e.g., a first programming instruction, a second programming instruction, etc.) from the MCU, and send the programming instruction to the program acceleration unit 13.
The MCU stores the data to be written into the Eflash into the target cache unit before sending a programming instruction to the program acceleration unit 13.
The programming acceleration unit 13 is configured to receive a first programming instruction from the MCU, where the first programming instruction is configured to instruct the programming acceleration unit 13 to start programming; according to the first programming instruction, high voltage is sequentially applied to a plurality of rows of storage units in the Eflash, and data of a plurality of bytes in a target cache unit is sequentially written into the Eflash, wherein the target cache unit is a first FIFO cache unit 14 or a second FIFO cache unit 15.
Here, applying a high voltage to a plurality of rows of memory cells in an Eflash line by line means sequentially applying a high voltage to word lines to which the plurality of rows of memory cells are connected.
The high voltage is the voltage that needs to be applied when writing data into a row of memory cells of the Eflash, and it is greater than the operating voltage of the chip. For example, the operating voltage of the chip is 2.7 V to 5.5 V, while the voltage required for writing data into a row of memory cells of the Eflash is as high as 12 V; the high voltage is therefore much greater than the operating voltage of the chip. When a high voltage is applied to a row of memory cells in the Eflash, the high voltage needs to be started first, and the start-up process takes a long time.
As shown in fig. 4, the Eflash includes a plurality of memory cells 401 arranged in an array, and each memory cell 401 may store one bit of data. The memory cells 401 of each row share a word line 412, and the memory cells 401 of each column share a bit line 411. In fig. 4, a row of eight memory cells 401 connected to the same word line 412 can store 8 bits, i.e., 1 byte, of data.
In the related art, for an Eflash supporting byte programming, it is necessary during programming to apply a high voltage to the word line of a row of memory cells according to a programming instruction, and then write data into each memory cell of the row by controlling the state of the bit lines (e.g., floating or high voltage). Typically, a row of memory cells stores 8 bits of data, and thus one byte of data can be written per programming instruction. When a plurality of bytes need to be programmed continuously, a plurality of programming instructions need to be sent; each time a programming instruction is received, the programming address corresponding to that instruction needs to be read, so that a high voltage can be applied to the word line of the row of memory cells corresponding to the programming address. Therefore, a certain interval is required between the programming of two adjacent bytes, resulting in lower programming efficiency.
Fig. 5 shows a schematic diagram of adjacent two byte programming in the related art. Referring to fig. 5, 501 is a timing sequence of a programming process, where address 0 and address 1 represent programming addresses corresponding to two adjacent rows of memory cells, and programming 0 is writing one byte into address 0 by one programming instruction; programming 1 is to write a byte to address 1 by a second programming instruction in succession.
Because each time a programming instruction is received, the corresponding programming address needs to be read and the high voltage started, and because the high voltage is turned off after the programming instruction is executed and restarted when the next programming instruction is received, this process takes a long time. Therefore, when two adjacent bytes are programmed into two adjacent rows of memory cells, the two programming operations are necessarily separated by an interval of length T0.
Fig. 6 shows an exemplary schematic diagram of adjacent two-byte programming after the accelerator 302 according to an embodiment of the present disclosure is employed. Referring to fig. 6, 501 is a timing sequence of the programming process, where address 0 and address 1 represent the programming addresses corresponding to two adjacent rows of memory cells; programming 0 is writing one byte into address 0 by one programming instruction, and programming 1 is writing the next byte into address 1. It can be seen that, in the embodiment of the present disclosure, a high voltage is applied row by row to the word lines connected to each row of memory cells according to one programming instruction, and the data of a plurality of bytes in the target cache unit is written into the Eflash row by row, byte by byte. This process only needs to receive one programming instruction and start the high voltage once; after the high voltage is started, since it is not turned off, it can be continuously applied to the next row of memory cells without being restarted. Therefore, when two adjacent bytes are programmed into two adjacent rows of memory cells, no interval of length T0 is needed, which reduces the waiting time between two byte programming operations and improves the programming efficiency of the Eflash.
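The saving can be expressed with a rough timing model. All names and numbers below are illustrative assumptions; t0 stands for the high-voltage start-up overhead (the T0 interval) separating two programming operations in the related art:

```python
def total_program_time(n_bytes: int, t0: float, t_byte: float,
                       accelerated: bool) -> float:
    """Total time to program n_bytes consecutive bytes.

    Related art: the high voltage is restarted for every byte, so each
    byte costs t0 + t_byte. Accelerated: the high voltage is started
    once and kept on, so t0 is paid only once for the whole burst.
    """
    if accelerated:
        return t0 + n_bytes * t_byte
    return n_bytes * (t0 + t_byte)
```

For any burst longer than one byte, the accelerated path saves (n_bytes - 1) * t0, which grows with the amount of data written.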
In the embodiment of the present disclosure, the program acceleration unit 13 is configured to sequentially apply a high voltage to a plurality of rows of memory cells in the Eflash according to the first programming instruction, and sequentially write data of a plurality of bytes in the target cache unit into the Eflash by using the following manner:
for the data of the 1 st byte in the target cache unit, the program acceleration unit 13 is configured to apply a high voltage to a row of memory cells corresponding to the program start address according to the program start address corresponding to the first program instruction, and write the 1 st byte in the target cache unit into the row of memory cells corresponding to the program start address.
For the data of the ith byte in the target cache unit, the programming acceleration unit 13 is configured to determine the programming address of the ith byte according to the programming address of the (i-1)th byte, where the programming address of the 1st byte is the programming start address corresponding to the first programming instruction; apply a high voltage to the row of memory cells corresponding to the programming address of the ith byte; and write the ith byte into the row of memory cells corresponding to the programming address of the ith byte; where i is an integer, i is greater than 1, and i is less than or equal to the number of bytes of the data of the plurality of bytes.
Wherein determining the programming address of the ith byte according to the programming address of the (i-1)th byte includes: adding 1 to the programming address of the (i-1)th byte to obtain the programming address of the ith byte.
That is, the programming acceleration unit 13 writes the 1st byte in the target cache unit into the row of memory cells corresponding to the programming start address in the Eflash set by the MCU; after the first byte is written, 1 is added to the programming start address, and the 2nd byte in the target cache unit is written into the row of memory cells corresponding to the incremented address, which is then incremented by 1 again. And so on, until all bytes in the target cache unit are written into the Eflash.
In implementation, the program start address corresponding to the first programming instruction may be written into the control register by the program acceleration unit 13.
Optionally, the accelerator 302 further includes a plurality of registers, and the plurality of registers are connected to the accelerator module 312. The plurality of registers includes at least one control register for storing a program start address and a program address. After the program acceleration unit 13 writes the first byte in the target cache unit into a row of memory cells corresponding to the program start address, the program acceleration unit 13 may increment the program start address stored in the control register by 1, and then each time the program acceleration unit 13 writes one byte of data into the Eflash, the program acceleration unit 13 may increment the address stored in the control register by 1. In this way, the programming acceleration unit 13 can sequentially write all bytes in the target cache unit into the Eflash according to the address stored in the control register.
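The address auto-increment described above amounts to the following loop. The Eflash is modelled here as a byte-addressable dict and the control register as a plain variable; both are simplifications for illustration only:

```python
def program_from_cache(eflash: dict, start_address: int,
                       target_cache: list) -> int:
    """Write every byte of the target cache unit to consecutive addresses.

    Returns the final value of the (modelled) control register. In
    hardware the high voltage stays on across rows; here only the address
    arithmetic is modelled: address of byte i = address of byte i-1 + 1.
    """
    control_register = start_address
    for byte in target_cache:
        eflash[control_register] = byte  # write one byte to one row
        control_register += 1            # auto-increment after each byte
    return control_register
```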
Optionally, while data is read from the target buffer unit and written into the Eflash, the buffer unit other than the target buffer unit among the first FIFO buffer unit 14 and the second FIFO buffer unit 15 is further used to store the data to be programmed sent by the MCU.
The programming acceleration unit 13 is further configured to receive a second programming instruction from the MCU after reading data from the target cache unit and writing the data into the Eflash, apply a high voltage to the plurality of rows of memory cells in the Eflash according to the second programming instruction, and sequentially write the data of a plurality of bytes in the other cache unit than the target cache unit into the Eflash.
For the process of programming according to the second programming instruction, refer to the process of programming according to the first programming instruction; a detailed description is omitted here.
In this way, the first FIFO buffer unit 14 and the second FIFO buffer unit 15 can be programmed as ping-pong FIFOs, so as to realize continuous writing of data into the Eflash, thereby accelerating the speed of programming the Eflash.
The embodiments of the present disclosure are particularly applicable to scenarios where a large amount of data needs to be written into the Eflash. For example, when a large amount of data needs to be burned into the Eflash before the chip leaves the factory, the programming acceleration unit 13 in the embodiment of the disclosure can be adopted to quickly write the data into the Eflash. In this case, the programming acceleration unit 13 is configured to receive a third programming instruction from the host computer, read data from the target cache unit, and write the data into the Eflash. For the process of programming according to the third programming instruction, refer to the process of programming according to the first programming instruction; a detailed description is omitted here.
When a large amount of data is burned into the Eflash, the host computer first sends the data to the SRAM; after receiving the data, the SRAM sequentially stores the data, a plurality of bytes at a time, into the target cache unit and the other cache unit. After the programming acceleration unit 13 has written the data of the target cache unit into the Eflash, it continues by receiving a fourth programming instruction from the MCU, reading the data from the other cache unit and writing it into the Eflash; meanwhile, the SRAM continues to store data into the target cache unit. This cycle repeats until all of the data in the SRAM has been written into the Eflash.
In this process, since a large amount of data contains many bytes and the programming acceleration unit 13 reduces the waiting time between every two byte programming operations, employing the programming acceleration unit 13 greatly reduces the time taken for this process.
Optionally, the program acceleration unit 13 is further configured to erase a page or a sector where the plurality of rows of memory cells are located before applying a high voltage to the plurality of rows of memory cells in the Eflash line by line according to the first programming instruction, and writing the data of the plurality of bytes in the target cache unit into the Eflash line by line according to the bytes.
In general, before performing a programming operation, the state of the memory cells corresponding to the programming address is determined, and if the state of the memory cells is valid, the page or sector where the memory cells are located needs to be erased first, and then the programming data is written into the memory cells corresponding to the programming address. If the state of the memory cell is erased, the programming data can be directly written into the memory cell corresponding to the programming address.
Optionally, the program acceleration unit 13 is further configured to receive an erase command from the MCU, and erase a plurality of pages or sectors in the Eflash one by one according to the erase command. The erased Eflash comprises a plurality of pages or sectors, wherein the pages or sectors comprise a plurality of rows of memory cells.
In implementation, the start addresses of the pages or sectors to be erased in the Eflash may be sequentially stored into the buffer module 311 by the MCU; when the programming acceleration unit 13 receives an erase command from the MCU, it implements continuous erasing by reading the stored start addresses of the pages or sectors to be erased from the buffer module 311, thereby accelerating the erase process.
Based on a similar principle to the adjacent two-byte programming process, when two consecutive pages or sectors are erased, since the buffer module 311 stores the start addresses of the two consecutive pages or sectors, the program acceleration unit 13 can implement consecutive erasing of the pages or sectors by reading the start addresses of the pages or sectors from the buffer module 311, without receiving the erase command transmitted by the MCU once again every time one page or sector is erased, thereby accelerating the speed of the erase process.
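Continuous erasing can be sketched the same way: the buffer module holds a queue of page/sector start addresses, and a single erase command drains it. The function name and the dict model of the Eflash pages are illustrative:

```python
def continuous_erase(eflash_pages: dict, queued_starts: list) -> int:
    """Erase every page/sector whose start address the MCU queued.

    One erase command from the MCU triggers the whole loop, instead of
    one command per page/sector. Returns the number of erasures done.
    """
    count = 0
    for start in queued_starts:       # read addresses from the buffer module
        eflash_pages[start] = "erased"
        count += 1
    return count
```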
In the embodiment of the present disclosure, the MCU does not send the read instruction and the program instruction to the bus interface unit at the same time, so the read acceleration unit 12 and the program acceleration unit 13 do not operate at the same time, and thus the bus interface unit, the first FIFO buffer unit 14, the second FIFO buffer unit 15, and the control signal conversion unit can be used together by the read acceleration unit 12 and the program acceleration unit 13 without generating a conflict in use. In this way, the cost of the accelerator 302 can be reduced.
Optionally, the control signal conversion unit 16 is configured to convert the read instruction sent by the read acceleration unit 12 and the program instruction sent by the program acceleration unit 13 into an instruction conforming to the Eflash interface. Therefore, when facing different types of Eflash, only the control signal conversion unit 16 needs to be modified, so that the instructions sent by the read acceleration unit 12 and the programming acceleration unit 13 can be effectively identified by Eflash.
Optionally, the control signal conversion unit 16 is further configured to perform synchronous processing on different signals, thereby preventing contention hazards among the signals. For the details of the synchronous processing of different signals, refer to the related art; a detailed description is omitted here.
In the embodiment of the disclosure, the read acceleration unit 12 acquires the predicted data and stores the predicted data into the FIFO buffer units, so that the MCU can continuously read data from the FIFO buffer units, which improves the efficiency with which the MCU reads data from the Eflash; the data to be programmed sent by the MCU is stored into the target cache unit through the programming acceleration unit, and the data of a plurality of bytes in the target cache unit is sequentially written into the Eflash, which improves the programming efficiency.
The division of the devices in the embodiments of the present disclosure is schematically shown as only one logic function division, and there may be another division manner when actually implementing the division, and in addition, each functional device in the embodiments of the present disclosure may be integrated in one processor, or may exist separately and physically, or two or more devices may be integrated into one device. The integrated device can be realized in a form of hardware or a form of a software functional device.
Fig. 7 shows a schematic diagram of a state machine of the read acceleration unit, referring to fig. 7, in which the read acceleration unit 12 is in an idle state 701 in an initial state; the address arbitration state 702 is entered from the idle state 701 upon receipt of a second read instruction.
In the address arbitration state 702, the read acceleration unit 12 determines, according to the first address indicated by the second read instruction, whether to send the data in the first FIFO buffer unit 14, the Eflash, or the second FIFO buffer unit 15 to the MCU. Here, since the second read instruction is used to access data in an address range in the Eflash, the first address may be any address in that range; and because the data accessed by a single read instruction is usually continuous in the address space, in the actual implementation, when the address arbitration state 702 is entered for the first time, the first address of the address range is simply taken as the first address.
When the read acceleration unit 12 determines to send the data in the first FIFO buffer unit 14 to the MCU, it enters a first FIFO buffer unit read state 703; when the read acceleration unit 12 determines to send the data in the Eflash to the MCU, entering an Eflash read state 704; when the read acceleration unit 12 determines to send data in the second FIFO buffer unit 15 to the MCU, it enters a second FIFO buffer unit read state 705.
If the read acceleration unit 12 determines, according to the address indicated by the received read instruction, to send both data in the Eflash and data in the buffer module to the MCU (i.e., the data stored in the buffer module hits), the Eflash read state 704 is entered first. In the Eflash read state 704, the read acceleration unit 12 determines the predicted data from the Eflash according to the first address and stores the predicted data in the first FIFO buffer unit 14 or the second FIFO buffer unit 15. After the predicted data has been stored in the first FIFO buffer unit 14 or the second FIFO buffer unit 15 (i.e., after the storing of the predicted data is completed), the data stored in the buffer module hits completely; the next address arbitration state 706 is entered, then the address arbitration state 702 is entered directly, and the above procedure is repeated.
If the read acceleration unit 12 determines, according to the address indicated by the received read instruction, to send the data in the first FIFO buffer unit 14 to the MCU, the first FIFO buffer unit read state 703 is entered from the address arbitration state 702 and the data is sent to the MCU. After one byte of data is read from the first FIFO buffer unit 14, the next address arbitration state 706 is entered; in the next address arbitration state 706, reading continues from the first FIFO buffer unit 14, that is, the first FIFO buffer unit read state 703 is entered again. The above procedure is repeated until the data in the first FIFO buffer unit 14 has been read completely.
After the data in the first FIFO buffer unit 14 has been read completely, the read acceleration unit 12 is in the next address arbitration state 706. If a read instruction (e.g., a third read instruction) from the MCU continues to be received, the address arbitration state 702 is entered and the above procedure is repeated for the third read instruction. If no read instruction from the MCU is received after the data in the first FIFO buffer unit 14 has been read completely, it indicates that the last read instruction has been executed, that is, the MCU has finished reading data, and the state machine returns to the idle state 701.
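The state machine of fig. 7 can be sketched as a transition function. The state names mirror the figure's reference numbers; the inputs (read_req, source, bytes_left) are simplified stand-ins for the conditions described above, not signal names from the patent:

```python
from enum import Enum, auto

class S(Enum):
    IDLE = auto()       # 701
    ARB = auto()        # 702, address arbitration
    FIFO1_RD = auto()   # 703
    EFLASH_RD = auto()  # 704
    FIFO2_RD = auto()   # 705
    NEXT_ARB = auto()   # 706

def step(state: S, read_req: bool = False, source: str = "",
         bytes_left: int = 0) -> S:
    """One transition of the read-acceleration state machine (simplified)."""
    if state is S.IDLE:
        return S.ARB if read_req else S.IDLE
    if state is S.ARB:
        # arbitration decides where the requested data comes from
        return {"fifo1": S.FIFO1_RD, "eflash": S.EFLASH_RD,
                "fifo2": S.FIFO2_RD}[source]
    if state in (S.FIFO1_RD, S.EFLASH_RD, S.FIFO2_RD):
        return S.NEXT_ARB       # one byte served, arbitrate again
    # NEXT_ARB: keep arbitrating while data or a new instruction remains
    if bytes_left > 0 or read_req:
        return S.ARB
    return S.IDLE
```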
FIG. 8 shows a schematic diagram of the state machine of the programming acceleration unit, see FIG. 8, in an initial state the programming acceleration unit is in an idle state 801; the address arbitration state 802 is entered after receiving the first programming instruction, where the address arbitration state 802 is used to determine whether the data sent by the MCU to be written into the Eflash exists in the first FIFO buffer unit 14 or the second FIFO buffer unit 15 (i.e. determining the target buffer unit).
If the data to be written into the Eflash sent by the MCU exists in the first FIFO buffer unit 14 (i.e., the first FIFO buffer unit 14 is the target buffer unit), the first FIFO buffer unit read state 803 is entered, followed by the byte programming state 805; that is, the data of the first byte in the target buffer unit is read from the first FIFO buffer unit 14 and written into the row of memory cells corresponding to the programming start address in the Eflash. The first FIFO buffer unit read state 803 is then entered again from the byte programming state 805, followed by the byte programming state 805, so that the data of the second byte in the target buffer unit is written into the address next to the programming start address; and so on, so that the data in the target buffer unit is written into the Eflash row by row. The case where the second FIFO buffer unit 15 is the target buffer unit is similar, except that the first FIFO buffer unit read state 803 is replaced with the second FIFO buffer unit read state 804; details are omitted here.
After the data in the target cache unit has been written into the Eflash, the programming acceleration unit is in the byte programming state 805. If it continues to receive a next programming instruction (e.g., a second programming instruction) from the MCU, it enters the address arbitration state 802 to write the data indicated by the second programming instruction into the Eflash. If it does not receive a next programming instruction from the MCU, it indicates that the last programming instruction has been executed, i.e., the programming has been completed, and it returns to the idle state 801.
The following are method embodiments of the present application; for details not described in the method embodiments, reference may be made to the apparatus embodiments described above.
Fig. 9 is a flowchart illustrating an acceleration method for an Eflash according to an exemplary embodiment of the present disclosure. The method may be implemented based on the acceleration device described above. Referring to fig. 9, the method includes:
In step 901, predicted data is acquired, and the predicted data is stored in the first FIFO buffer unit or the second FIFO buffer unit.
In step 902, a second read command from the MCU is received, the second read command being for accessing the first address of the Eflash, the second read command being a next read command to the first read command.
In step 903, a third read instruction is generated according to the first address, where the third read instruction is used to access the Eflash, the first FIFO buffer unit, or the second FIFO buffer unit.
Optionally, step 903 includes: in response to the predicted data containing first data corresponding to the first address and the predicted data being stored in the first FIFO buffer unit, generating, according to the first address, a third read instruction for accessing the first FIFO buffer unit; or,
in response to the predicted data containing the first data corresponding to the first address and the predicted data being stored in the second FIFO buffer unit, generating, according to the first address, a third read instruction for accessing the second FIFO buffer unit; or,
in response to the predicted data not containing the first data corresponding to the first address, generating, according to the first address, a third read instruction for accessing the Eflash.
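The three branches of step 903 amount to a hit/miss routing decision. A minimal C sketch follows; the `fifo_window` bookkeeping and the function name are assumptions made for illustration, not structures disclosed by the application:

```c
#include <assert.h>
#include <stdint.h>

/* Possible targets of the third read instruction. */
enum read_target { TARGET_FIFO1, TARGET_FIFO2, TARGET_EFLASH };

/* Assumed bookkeeping: the address range currently buffered in a FIFO. */
struct fifo_window { uint32_t base; uint32_t len; };

/* Route the read: hit in either FIFO serves the predicted data,
 * a miss falls through to the embedded flash itself. */
enum read_target route_third_read(uint32_t first_addr,
                                  const struct fifo_window *f1,
                                  const struct fifo_window *f2)
{
    if (first_addr >= f1->base && first_addr < f1->base + f1->len)
        return TARGET_FIFO1;   /* predicted data stored in the first FIFO */
    if (first_addr >= f2->base && first_addr < f2->base + f2->len)
        return TARGET_FIFO2;   /* predicted data stored in the second FIFO */
    return TARGET_EFLASH;      /* no first data for this address was predicted */
}
```

On a hit the MCU is served without a slow flash access, which is the source of the read speedup described above.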
Optionally, the method further includes: determining whether the MCU is in an idle state; and storing the predicted data in the buffer module when the MCU is determined to be in the idle state.
Optionally, the method further includes applying a ping-pong mechanism to the buffer module: reading data from the Eflash and storing the data in the second FIFO buffer unit while the MCU reads data from the first FIFO buffer unit; or reading data from the Eflash and storing the data in the first FIFO buffer unit while the MCU reads data from the second FIFO buffer unit.
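The ping-pong alternation can be sketched as a toy C model in which one FIFO is filled from the flash while the other is drained by the MCU, and the roles swap each round. The `DEPTH` value and the `run_pingpong` helper are illustrative assumptions:

```c
#include <assert.h>
#include <stdint.h>

#define DEPTH 4  /* illustrative FIFO depth */

/* Toy model: each round the prefetcher fills one FIFO from `flash`
 * while the MCU drains the other into `out`; then the roles swap.
 * In hardware the fill and drain of a round happen in parallel. */
int run_pingpong(const uint8_t *flash, int n, uint8_t *out)
{
    uint8_t fifo[2][DEPTH];
    int count[2] = {0, 0};
    int fetched = 0, consumed = 0;
    int drain = 0, fill = 1;

    /* prime the first FIFO before the first round */
    while (count[0] < DEPTH && fetched < n)
        fifo[0][count[0]++] = flash[fetched++];

    while (consumed < n) {
        count[fill] = 0;                              /* prefetch into `fill` */
        while (count[fill] < DEPTH && fetched < n)
            fifo[fill][count[fill]++] = flash[fetched++];
        for (int i = 0; i < count[drain]; i++)        /* MCU drains `drain` */
            out[consumed++] = fifo[drain][i];
        count[drain] = 0;
        drain ^= 1; fill ^= 1;                        /* swap roles */
    }
    return consumed;
}
```

Because the drain of one buffer overlaps the fill of the other, the MCU never stalls waiting for a flash read, which is the point of the ping-pong scheme.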
Optionally, the first FIFO buffer unit and the second FIFO buffer unit are programmable FIFO buffer units, and the method further includes: setting the depths of the first FIFO buffer unit and the second FIFO buffer unit according to at least one of the core type of the MCU and the type of program executed by the MCU.
Optionally, the method further includes: receiving a first programming instruction from the MCU; applying a high voltage row by row to a plurality of rows of storage units in the Eflash according to the first programming instruction; and writing a plurality of bytes of data in a target cache unit into the Eflash row by row, byte by byte, wherein each row of storage units stores one byte of data, and the target cache unit is the first FIFO buffer unit or the second FIFO buffer unit.
Optionally, before the applying, according to the first programming instruction, a high voltage row by row to the plurality of rows of storage units in the Eflash and the writing of the plurality of bytes of data in the target cache unit into the Eflash, the method further includes: receiving an erase instruction sent by the MCU, and erasing a plurality of pages or sectors in the Eflash one by one according to the erase instruction, where the plurality of pages or sectors include the pages or sectors in which the plurality of rows of storage units are located.
Optionally, the method further includes: determining a programming address of an i-th byte according to a programming address of an (i-1)-th byte, wherein the programming address of the (i-1)-th byte is determined according to a programming start address corresponding to the first programming instruction; and applying a high voltage to a row of storage units corresponding to the programming address of the i-th byte, and writing the i-th byte into that row of storage units; where i is an integer, i is greater than 1, and i is less than or equal to the number of bytes in the plurality of bytes of data.
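The address chain described here resolves to "byte 1 goes to the start address, and each later byte goes to the previous address plus one row". A minimal C sketch under that reading (the function name is invented for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Compute the programming address of the i-th byte (i >= 1) by chaining:
 * addr_1 is the programming start address, and addr_i is derived from
 * addr_(i-1) by advancing one row, as the embodiment describes. */
uint32_t program_addr(uint32_t start_addr, int i)
{
    uint32_t addr = start_addr;   /* address of byte 1 */
    for (int k = 2; k <= i; k++)
        addr = addr + 1;          /* addr_k from addr_(k-1): next row */
    return addr;
}
```

The loop makes the recursive definition explicit; in hardware this would simply be an incrementing address register seeded by the first programming instruction.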
Optionally, the method further includes: while data is read from the target cache unit and written into the Eflash, storing, in the cache unit other than the target cache unit, data to be programmed sent by the MCU.
Optionally, the method further includes: receiving an erase instruction sent by the MCU, and erasing a plurality of pages or sectors in the Eflash one by one according to the erase instruction.
In the embodiments of the present disclosure, the read acceleration unit acquires predicted data and stores it in the FIFO buffer units, so that the MCU can read data from the FIFO buffer units continuously, which improves the efficiency with which the MCU reads data from the Eflash; the programming acceleration unit stores the data to be programmed sent by the MCU in the target cache unit and writes the plurality of bytes of data in the target cache unit into the Eflash sequentially, which likewise improves the efficiency with which the MCU writes data into the Eflash.
The foregoing description of the preferred embodiments of the present disclosure is provided for the purpose of illustration only and is not intended to limit the disclosure to the particular embodiments disclosed; on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the disclosure.

Claims (10)

1. An acceleration device for an embedded flash memory, the acceleration device comprising: a buffer module (311) and an acceleration module (312),
the buffer module (311) comprises a first first-in first-out buffer unit (14) and a second first-in first-out buffer unit (15), wherein the first first-in first-out buffer unit (14) and the second first-in first-out buffer unit (15) are configured to store predicted data, and the predicted data is determined according to a first read instruction from a micro control unit;
the acceleration module (312) comprises a read acceleration unit (12), wherein the read acceleration unit (12) is connected to the first first-in first-out buffer unit (14) and the second first-in first-out buffer unit (15), respectively; the read acceleration unit (12) is configured to receive a second read instruction from the micro control unit, the second read instruction being used to access a first address of the embedded flash memory, and to generate a third read instruction according to the first address, wherein the second read instruction is the next read instruction after the first read instruction, and the third read instruction is used to access the embedded flash memory, the first first-in first-out buffer unit (14), or the second first-in first-out buffer unit (15).
2. The acceleration device according to claim 1, wherein the read acceleration unit (12) is configured to generate the third read instruction according to the first address in any one of the following manners:
in response to the predicted data containing first data corresponding to the first address and the predicted data being stored in the first first-in first-out buffer unit (14), generating, according to the first address, a third read instruction for accessing the first first-in first-out buffer unit (14); or,
in response to the predicted data containing the first data corresponding to the first address and the predicted data being stored in the second first-in first-out buffer unit (15), generating, according to the first address, a third read instruction for accessing the second first-in first-out buffer unit (15); or,
in response to the predicted data not containing the first data corresponding to the first address, generating, according to the first address, a third read instruction for accessing the embedded flash memory.
3. The acceleration device according to claim 1, wherein the read acceleration unit (12) is further configured to read data from the embedded flash memory and store the data in the second first-in first-out buffer unit (15) while the micro control unit reads data from the first first-in first-out buffer unit (14);
or, the read acceleration unit (12) is further configured to read data from the embedded flash memory and store the data in the first first-in first-out buffer unit (14) while the micro control unit reads data from the second first-in first-out buffer unit (15).
4. The acceleration device according to claim 1, wherein the first first-in first-out buffer unit (14) and the second first-in first-out buffer unit (15) are programmable first-in first-out buffer units;
the read acceleration unit (12) is further configured to set depths of the first first-in first-out buffer unit (14) and the second first-in first-out buffer unit (15) according to at least one of a core type of the micro control unit and a type of program executed by the micro control unit.
5. The acceleration device according to any one of claims 1-4, wherein the acceleration module (312) further comprises a programming acceleration unit (13), the programming acceleration unit (13) being connected to the first first-in first-out buffer unit (14) and the second first-in first-out buffer unit (15), respectively;
the programming acceleration unit (13) is configured to receive a first programming instruction from the micro control unit, apply a high voltage row by row to a plurality of rows of storage units in the embedded flash memory according to the first programming instruction, and write a plurality of bytes of data in a target cache unit into the embedded flash memory row by row, byte by byte, with each row of storage units storing one byte of data, wherein the target cache unit is the first first-in first-out buffer unit (14) or the second first-in first-out buffer unit (15).
6. The acceleration device according to claim 5, wherein the programming acceleration unit (13) is configured to write the i-th byte in the target cache unit into the embedded flash memory in the following manner:
determining a programming address of the i-th byte according to a programming address of the (i-1)-th byte, wherein the programming address of the (i-1)-th byte is determined according to a programming start address corresponding to the first programming instruction;
applying a high voltage to a row of storage units corresponding to the programming address of the i-th byte, and writing the i-th byte into that row of storage units;
wherein i is an integer, i is greater than 1, and i is less than or equal to the number of bytes in the plurality of bytes of data.
7. The acceleration device according to claim 6, wherein, of the first first-in first-out buffer unit (14) and the second first-in first-out buffer unit (15), the buffer unit other than the target cache unit is further configured to store data to be programmed sent by the micro control unit while data is read from the target cache unit and written into the embedded flash memory.
8. The acceleration device according to claim 6 or 7, wherein the programming acceleration unit (13) is further configured to receive an erase instruction from the micro control unit, and to erase a plurality of pages or sectors in the embedded flash memory one by one according to the erase instruction.
9. An acceleration method for embedded flash memory, the method comprising:
determining predicted data according to a first read instruction from a micro control unit, and storing the predicted data in a first first-in first-out buffer unit or a second first-in first-out buffer unit;
receiving a second read instruction from the micro control unit, wherein the second read instruction is used to access a first address of the embedded flash memory, and the second read instruction is the next read instruction after the first read instruction; and
generating a third read instruction according to the first address, wherein the third read instruction is used to access the embedded flash memory, the first first-in first-out buffer unit, or the second first-in first-out buffer unit.
10. The method according to claim 9, wherein the method further comprises:
receiving a first programming instruction from the micro control unit;
according to the first programming instruction, applying high voltage to a plurality of rows of memory cells in the embedded flash memory row by row;
writing a plurality of bytes of data in a target cache unit into the embedded flash memory row by row, byte by byte, wherein each row of storage units stores one byte of data, and the target cache unit is the first first-in first-out buffer unit or the second first-in first-out buffer unit.
CN202311748189.8A 2023-12-18 2023-12-18 Acceleration device and method for embedded flash memory Pending CN117827689A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311748189.8A CN117827689A (en) 2023-12-18 2023-12-18 Acceleration device and method for embedded flash memory

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311748189.8A CN117827689A (en) 2023-12-18 2023-12-18 Acceleration device and method for embedded flash memory

Publications (1)

Publication Number Publication Date
CN117827689A true CN117827689A (en) 2024-04-05

Family

ID=90508762

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311748189.8A Pending CN117827689A (en) 2023-12-18 2023-12-18 Acceleration device and method for embedded flash memory

Country Status (1)

Country Link
CN (1) CN117827689A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118860294A (en) * 2024-09-20 2024-10-29 上海励驰半导体有限公司 Data reading method, system and electronic device
CN118860294B (en) * 2024-09-20 2024-12-27 上海励驰半导体有限公司 Data reading method and system and electronic equipment

Similar Documents

Publication Publication Date Title
US11237728B2 (en) Method for accessing extended memory, device, and system
KR100562906B1 (en) Priority-based flash memory control device for BPI in serial flash memory, memory management method using same, and flash memory chip accordingly
JP4901285B2 (en) Memory card that can improve read performance
US6389514B1 (en) Method and computer system for speculatively closing pages in memory
CN101477453B (en) Embedded system, prefetching module for embedded system and control method thereof
US8862963B2 (en) Nonvolatile memory, memory controller, nonvolatile memory accessing method, and program
JP3680142B2 (en) Storage device and access method
US20040230738A1 (en) Apparatus and method for controlling execute-in-place (XIP) in serial flash memory, and flash memory chip using the same
CN108139994B (en) Memory access method and memory controller
US20050055493A1 (en) [method for accessing large block flash memory]
CN111752484B (en) SSD controller, solid state disk and data writing method
CN106802870B (en) high-efficiency Nor-Flash controller of embedded system chip and control method
EP0375121A2 (en) Method and apparatus for efficient DRAM control
KR20090050382A (en) Storage management method and management system
US20110029741A1 (en) Data management method and memory deivce
CN117827689A (en) Acceleration device and method for embedded flash memory
US5829010A (en) Apparatus and method to efficiently abort and restart a primary memory access
JPH07114500A (en) Nonvolatile memory device
CN112256203B (en) Writing method, device, equipment, medium and system of FLASH memory
CN108538332B (en) NAND gate flash memory reading method
CN110941565A (en) Memory management method and device for chip storage access
CN116149554B (en) RISC-V and extended instruction based data storage processing system and method thereof
CN111158753A (en) Flash controller structure with data prefetching function and implementation method thereof
US7512753B2 (en) Disk array control apparatus and method
CN114168495A (en) Enhanced read-ahead capabilities for storage devices

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication