SUMMARY OF THE UTILITY MODEL
The present application provides a computer system that can flexibly adapt to the changing requirements of blockchain algorithms. In one embodiment of the computer system, a Wafer-on-Wafer (WoW) technique is used to stack a memory device on one wafer and a core logic circuit on another wafer in a three-dimensional structure. This approach keeps the spacing between the two wafers small and allows thousands of connection pads to be used directly as signal transmission paths. Since the number of transmission lines is no longer limited by a planar design, a large number of dedicated wires can be used to resolve the performance bottleneck of data transmission.
The memory device of the present application is arranged on a wafer layer dedicated to memory and may include a plurality of memory arrays (banks). Each memory array is mainly composed of common lines and a plurality of memory cells. In this embodiment, the common lines may be data lines or address lines, and each common line is connected to a corresponding row or column of the memory cells. A memory cell is the basic unit that stores one bit of information; it is typically turned on by an address signal and read or written by a data signal.
The memory device further comprises a line driver connected to the common lines for driving the memory cells. The line driver may be, for example, a data driver or an address decoder.
As mentioned above, the computer system includes a logic circuit layer integrated with the memory crystal layer into a Wafer-on-Wafer stack. The logic circuit layer includes a plurality of connection pads for transmitting signals.
A delay controller is configured in the logic circuit layer of the computer system and is connected to the memory array through the connection pads. The design objective is to flexibly adjust the number of memory cells connected to each common line, thereby dynamically changing the delay characteristics of the memory array.
In a further embodiment, a plurality of multiplexers are configured in each memory array, each multiplexer separated by a specific number of rows or columns. The multiplexers divide a memory array into a plurality of memory regions, each memory region including a specific number of rows or columns of memory cells. In other words, a multiplexer is arranged between every two adjacent memory regions and is connected to the line driver by a dedicated line.
When the delay controller transmits a control signal through a connection pad to activate a multiplexer, the common line is disconnected into a first line segment and a second line segment, and the second line segment is switched to the dedicated line.
Since the common line originally connects a plurality of memory cells in series, disconnecting it into two line segments logically forms two sub-arrays: the memory region corresponding to the first segment forms a first sub-array, and the memory region corresponding to the second segment forms a second sub-array. For ease of management, the disconnection in the present embodiment may be a halving, so that a memory array is equally divided into two sub-arrays of equal size; the two sub-arrays can be further divided into four by additional multiplexers, and so on.
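As an illustrative sketch (an assumption made here for clarity, not part of the claimed circuit), the halving scheme can be modeled as repeatedly splitting a run of memory cells into equal segments:

```python
# Hypothetical model of the halving scheme: each round of multiplexer
# activation cuts every common-line segment in half, so k rounds yield
# 2**k equal sub-arrays.

def split_segments(cells_per_line, rounds):
    """Return the memory-cell count on each segment after `rounds`
    successive halvings of the common line."""
    segments = [cells_per_line]
    for _ in range(rounds):
        segments = [half for seg in segments for half in (seg // 2, seg // 2)]
    return segments

print(split_segments(64, 1))  # [32, 32] -> two equal sub-arrays
print(split_segments(64, 2))  # [16, 16, 16, 16] -> four sub-arrays
```

Each segment carries half as many cells as its parent, which is what later shortens the line delay.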
In one embodiment, the dimension of the memory array, i.e., the formation of the sub-arrays, is changed by disconnecting a common data line into two shorter data lines. The line driver includes a data driver. The common lines here represent one or more common data lines, each of which connects the data driver to a corresponding column of the memory cells for transmitting their data signals. After the multiplexer is activated, the second sub-array no longer shares the common data line of the first sub-array, because the common data line has been disconnected. Instead, the multiplexer additionally provides a dedicated line for transmitting data signals to the memory cells in the second sub-array. This method reduces the number of memory cells on the shared line, thereby reducing the capacitive load and accelerating the response of the data driver.
As for the second sub-array, because its data lines are instead connected to the data driver by dedicated lines, it independently receives a different set of data signals while enjoying the same low-load, high-speed benefit. Further, the address lines of the second sub-array may be changed to share those of the first sub-array. The original memory array dimensions are thereby changed: the number of data lines (array width) is doubled, and the number of address lines (array height) is halved. The memory device originally comprises a plurality of common address lines, each connecting the address decoder to a corresponding row of memory cells for transmitting their address signals. The line driver includes an address decoder connected to each row of memory cells in the memory array via the common address lines. In practice, after the multiplexer is activated, the address decoder, according to the control signal, makes the memory cells in the second sub-array share the common address lines of the first sub-array, or synchronously drives the first and second sub-arrays with the same address signals. In other words, the common address lines of the second sub-array and the corresponding rows of the first sub-array receive the same address signals.
In another specific embodiment, the dimension of the memory array, i.e., the formation of the sub-arrays, may be changed by disconnecting a common address line into two shorter address lines. In this case, the common lines represent one or more common address lines for transmitting the address signals of the memory cells. After the multiplexer is activated, the memory cells in the second sub-array receive address signals over the dedicated line. Meanwhile, according to the control signal, the data driver makes the memory cells in the second sub-array share the common data lines of the first sub-array, or drives the first and second sub-arrays with the same data signals. Since the memory cells in the second sub-array use different address lines than the first sub-array, the change in memory array dimension is equivalent to halving the number of data bits (array width) and doubling the number of address lines (array height).
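A minimal sketch of the two dimension changes described above (the function and its arguments are illustrative assumptions, not the application's notation):

```python
# Hypothetical model: one multiplexer cut changes the logical array
# dimensions depending on which common line is disconnected.

def split_array(height, width, cut):
    """Return (height, width) after a single halving cut.

    cut="data": the common data line is halved; the second sub-array
    gets dedicated data lines, so width doubles and height halves.
    cut="address": the common address line is halved; the second
    sub-array gets dedicated address lines, so height doubles and
    width halves.
    """
    if cut == "data":
        return height // 2, width * 2
    if cut == "address":
        return height * 2, width // 2
    raise ValueError(f"unknown cut: {cut}")

# The 8 x 8 example used in the embodiments:
print(split_array(8, 8, "data"))     # (4, 16): 16 data lines, 4 address lines
print(split_array(8, 8, "address"))  # (16, 4): 4 data bits, 16 address lines
```

The two cuts are duals of each other, which is why the embodiments can trade width against height freely.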
In a further embodiment, the logic circuit layer further comprises a memory controller coupled to the memory array through the connection pads, and a core coupled to the memory controller and the delay controller for executing an application. According to an application condition required by the application, the core can set the multiplexers in the memory array through the delay controller so that the memory array changes dimension; that is, the memory array is divided into a plurality of power-of-two sub-arrays and recombined into a new array dimension that meets the application condition. While executing the application, the core may use the reconfigured memory array through the memory controller.
In a further embodiment, the application condition comprises a response time required by the application. In the embodiment of disconnecting the data lines, the greater the number of multiplexers enabled by the delay controller, the shorter the response time of the newly formed memory array.
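This relationship can be illustrated with a first-order model (an assumption made here for illustration; the application does not specify a delay formula) in which data-line delay is proportional to the number of memory cells loading the line:

```python
# Assumed first-order model: segment delay scales linearly with the
# number of memory cells loading the data-line segment.

def relative_delay(cells_per_line, muxes_enabled):
    """Per-segment delay relative to the unsplit common line (1.0)."""
    segments = muxes_enabled + 1  # m equally spaced cuts -> m + 1 segments
    return (cells_per_line // segments) / cells_per_line

for m in (0, 1, 3):
    print(m, relative_delay(512, m))  # 0 -> 1.0, 1 -> 0.5, 3 -> 0.25
```

Under this model, doubling the number of segments halves the per-segment delay, matching the trend the embodiment describes.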
The present application further provides a memory control method applied to the above computer system and memory device. When an application is executed by the core, the core sets the multiplexers in the memory array through the delay controller according to an application condition required by the application, so that the memory array is divided into two or more sub-arrays meeting the application condition, and the sub-arrays are then used through the memory controller while the application executes.
In summary, the present application provides a memory architecture capable of flexibly adjusting array dimensions based on a wafer stacking technique, so that a blockchain server product can adapt to the requirements of future algorithms.
DETAILED DESCRIPTION
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 shows an embodiment of a three-dimensional wafer product 100 of the present application. The three-dimensional wafer product 100 is formed by stacking at least one memory crystal layer 110, a logic circuit layer 120, and a substrate 130. The substrate 130 provides additional routing space in addition to structural support. A plurality of connection pads 102 or 104 are disposed between the layers to provide signal channels. The three-dimensional wafer product 100 of the present embodiment is a semi-finished product of the computer system 700; after dicing, a plurality of computer systems 700 are produced that operate independently. As shown in FIG. 1, each computer system 700 may include a plurality of memory devices 112 and a plurality of logic circuits 122, all sharing the same three-dimensional wafer structure. In other words, the memory devices 112 and the logic circuits 122 of each computer system 700 are arranged in advance in the memory crystal layer 110 and the logic circuit layer 120, respectively, and then fabricated into a three-dimensional structure by wafer stacking. In this three-dimensional structure, the wiring between the dies does not occupy extra planar area, and thousands of connection pads 102 and 104 can be used directly as signal transmission paths, effectively solving the efficiency problem of data transmission and realizing the computer system 700 of the present application.
Fig. 2 shows an embodiment of a memory device of the present application. The memory device 112 is disposed on the memory crystal layer 110 dedicated to memory. Fabrication may be modular, and each memory device 112 may include multiple memory arrays 200, also known as banks. The operation of each memory array is controlled by an array selection signal #SL. Each memory array 200 is composed of a plurality of memory cells 202 arranged in rows and columns: each row shares an address line and receives one of the address signals numbered R0 through Rn, and each column shares a data line for transmitting one of the data signals B0 through Bn. In other words, each common line is connected to a corresponding row or column of the memory cells 202. The memory cell 202 is the basic unit that stores one bit of information, and is usually turned on by an address signal and read or written by a data signal. The address lines are connected to an address decoder 210 for carrying the address signals 214 it generates, turning on the selected row or rows of memory cells 202. The data lines are connected to a data driver 220 for transferring data written to or read from the memory cells 202. The architecture disclosed in fig. 2 is merely an example; in actual manufacturing, the numbers of memory arrays 200, address decoders 210, and data drivers 220 are not limited to one, and the connections between them may be one-to-one, one-to-many, or many-to-many. In summary, the address decoder 210 and the data driver 220 in the memory device are the line drivers, and the data lines and address lines serve as common lines that drive the plurality of memory cells 202 in a mesh-like interleaved manner.
Figs. 3-5 illustrate an embodiment of a memory array 200 and multiplexers 302 according to the present application. To achieve dynamically adjustable delay characteristics, the present embodiment configures a plurality of multiplexers 302 in each memory array 200, each multiplexer 302 separated by a specific number of rows or columns. The multiplexers 302 divide a memory array 200 into a plurality of memory regions 310, each containing a specific number of rows or columns of memory cells 202. Taking fig. 3 as an example, a multiplexer 302 is disposed between every two adjacent memory regions 310.
Fig. 4 shows the operation of the multiplexer 302 when it is activated. The multiplexer 302 is connected to the data driver 220 via a dedicated line 224. When a control signal #S is transmitted from the logic circuit layer 120 shown in FIG. 1 to the multiplexer through one of the connection pads 102, the shared data line 222 is disconnected at the location of the multiplexer 302, so that the upper and lower memory regions 310 no longer share the same data line 222. The data line 222 is divided into a first line segment in the upper memory region 310 and a second line segment in the lower memory region 310. In this embodiment, the upper memory region 310 continues to receive the original data signals B0-B7, but because the first segment is shared by fewer memory cells, its capacitive load is significantly reduced, effectively shortening the delay time of that memory region 310, i.e., increasing its response speed. The multiplexer 302 switches the second segment of the lower memory region 310 to the dedicated line 224, so that the lower memory region 310 remains controlled by the data driver 220; for example, it continues to receive the data signals numbered B0-B7 from the data driver 220 via the dedicated line 224. Since fewer memory cells 202 share the second segment than before, a delay reduction is achieved there as well.
Fig. 5 shows another embodiment of the multiplexer 302 when activated. In this embodiment, in addition to changing the delay characteristics of the memory array 200, the dimension of the memory array 200 may also be changed. The sub-arrays are formed by similarly dividing the common data line 222 into shorter upper and lower portions. In other words, since the data line 222 originally connects a plurality of memory cells 202 in series, breaking it into two line segments logically forms two sub-arrays: the memory region 310 corresponding to the first line segment forms a first sub-array, and the memory region 310 corresponding to the second line segment forms a second sub-array. After the multiplexer 302 is activated, the second sub-array no longer shares the common data line 222 of the first sub-array. Rather, the multiplexer 302 provides dedicated lines 224 for routing data signals to the memory cells 202 in the second sub-array. In the present embodiment, the data driver 220 is modified so that the data signals transmitted via the dedicated lines 224 are not numbered B0 to B7 but are newly added signals B8 to B15. Furthermore, the address decoder can be modified so that the second sub-array shares the address lines of the first sub-array, or receives the same address signals R0 to R3 as the first sub-array.
That is, according to the control signal #S, the address decoder 230 may make the memory cells in the second sub-array share the common address lines of the first sub-array, or synchronously drive the first and second sub-arrays with the same address signals. The dimensions of the original memory array 200 are thus changed: the number of data lines (array width) is doubled from 8 to 16, while the number of address lines (array height) is halved from 8 to 4. Although the embodiment illustrates the division and rearrangement with an 8 × 8 memory array 200, it is understood that in actual manufacturing each memory array 200 may be a large array with a capacity of hundreds of megabits.
Fig. 6 shows a further embodiment of the memory device 112 of the present application. A memory array 200 may have n multiplexers 402#1 through 402#n configured therein to divide the memory array 200 into n memory regions 410#1 through 410#n. The memory array 200 maintains conventional operation when no multiplexer is enabled. In addition to transmitting data signals over the conventional common data lines 222, the data driver 220 also provides a plurality of dedicated lines 224 to the multiplexers 402#1 through 402#n. The memory device 112 further includes an address decoder 230 for sending address signals #A to each of the memory regions 410#1 through 410#n via address lines 232. The data lines 222 and address lines 232 are shown as single lines, but it will be understood that an implementation may include multiple lines, one for each row or column of the memory array. As in conventional designs, each memory cell in the memory array 200 is commonly connected to a reference voltage, or ground #Gnd.
In practice, each multiplexer may or may not be activated upon receiving a control signal #S. For example, the control signal #S may carry a power-of-two value, i.e., 2, 4, 8, 16, and so on, instructing the multiplexers 402#1 through 402#n to divide the memory array 200 into the corresponding number of sub-arrays. When the control signal #S is 2, one multiplexer is required to divide the memory array 200 into two equal sub-arrays; the multiplexer numbered n/2 in the memory array 200 is then enabled in response to the control signal. Similarly, when the value of the control signal #S is 4, three multiplexers are required to divide the memory array 200 into four equal sub-arrays; the multiplexers numbered n/4, 2n/4, and 3n/4 are activated in response to the control signal #S to achieve the four-way division. In this design, the value of n may be preset to a power of two to facilitate such division.
In another implementation, the control signal #S may instead specify how many memory regions each sub-array should contain. For example, when the value of the control signal #S is 1, every memory region is isolated: all multiplexers 402#1 through 402#n are activated, so that the memory array 200 becomes n sub-arrays, each containing one memory region. When the value of the control signal #S is 2, each sub-array should contain two memory regions; accordingly, the multiplexers numbered 2, 4, 6, 8, and so on (those divisible by 2) are enabled in response to the control signal, making the memory array n/2 sub-arrays of two memory regions each.
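The two interpretations of the control signal #S can be sketched as follows (a hypothetical model; the function names and the 1-based multiplexer numbering are assumptions for illustration):

```python
# Hypothetical selection logic for n multiplexers numbered 1..n.

def muxes_for_subarray_count(n, s):
    """First interpretation: #S = s is a power-of-two sub-array count.
    Enable the s - 1 multiplexers at positions n/s, 2n/s, ..., (s-1)n/s."""
    step = n // s
    return [k * step for k in range(1, s)]

def muxes_for_region_stride(n, s):
    """Second interpretation: #S = s is the number of memory regions per
    sub-array. Enable every multiplexer whose number is divisible by s."""
    return [i for i in range(1, n + 1) if i % s == 0]

# With n = 8 multiplexers:
print(muxes_for_subarray_count(8, 2))  # [4]          -> two sub-arrays
print(muxes_for_subarray_count(8, 4))  # [2, 4, 6]    -> four sub-arrays
print(muxes_for_region_stride(8, 2))   # [2, 4, 6, 8] -> two regions each
```

Presetting n to a power of two, as the embodiment suggests, keeps every division in both schemes exact.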
In further implementations, the memory array 200 may be partitioned in a more flexible manner. For example, each multiplexer receives different control signals to determine whether to activate. The segmentation possibilities that can be generated in practice are therefore not limited to the above-described embodiments.
In the embodiment of fig. 6, the data driver 220 and the address decoder 230 may be further modified to change the data signals or address signals provided to each memory region according to the division indicated by the control signal #S. This enables the memory array 200 to dynamically change its logical length and width, as described in the embodiment of fig. 5.
Figs. 7-8 show various memory array embodiments of the present application. Fig. 7 shows a memory array 500a generated by dividing and recombining the memory array 200 of fig. 6 with the multiplexers. Each of the memory regions 410#1 through 410#n originally has W columns of memory cells (width) and H address lines (height). After dimension reorganization, a memory array 500a is formed that includes a plurality of sub-arrays 502a. All the sub-arrays 502a share H address lines, and the bit width is extended to nW columns. This means the multiplexers must provide dedicated lines connecting the nW columns of memory cells to the data driver, i.e., nW dedicated lines. With the support of wafer stacking technology, this implementation difficulty is easily overcome. In the embodiment of fig. 7, the original memory array dimension nH × W is reformed to H × nW. Therefore, when the data line of each column of memory cells is driven, the capacitive load to be overcome is reduced by a factor of n, making the response of the memory cells correspondingly faster.
Fig. 8 shows a memory array 500b generated after the memory array 200 of fig. 6 is divided and recombined by the multiplexers. Each of the memory regions 410#1 through 410#n originally has W columns of memory cells (width) and H address lines (height). Here the dimensions are rearranged by activating a multiplexer at every other memory region, forming a memory array 500b comprising a plurality of sub-arrays 502b. Each sub-array 502b includes two memory regions, 2H rows high and W columns wide, and the sub-arrays 502b in the memory array 500b share 2H address lines. More specifically, through the modification of the address decoder 230, all the memory sub-arrays 502b can share the same address lines, or the address decoder 230 can transmit the same address signals to the memory sub-arrays according to the division of the memory array 500b. The bit width of the memory array 500b is extended to nW/2 columns, meaning the multiplexers need to provide a corresponding number of dedicated lines connecting the nW/2 columns of memory cells to the data driver. Compared with fig. 7, the delay time of the embodiment of fig. 8 is longer because sub-array 502b has a greater height (more address lines), but fewer dedicated lines are required. This illustrates that the architecture of the present embodiment can be flexibly adjusted according to different requirements.
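The trade-off between figs. 7 and 8 can be summarized numerically (a sketch under the assumption, consistent with the text, that the dedicated data lines required equal the regrouped width):

```python
# Hypothetical model of regrouping n memory regions (each H address
# lines high and W memory cells wide) into sub-arrays of g regions apiece.

def regroup(n, h, w, g):
    """Return (height, width) of the reorganized array; the number of
    dedicated data lines the multiplexers must supply equals the width."""
    sub_arrays = n // g
    return g * h, sub_arrays * w

print(regroup(4, 8, 8, 1))  # (8, 32): fig. 7 case, H x nW, most dedicated lines
print(regroup(4, 8, 8, 2))  # (16, 16): fig. 8 case, 2H x nW/2, fewer lines
```

Smaller g gives shorter data lines (lower delay) at the cost of more dedicated wiring; larger g reverses the trade.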
FIG. 9 shows a further embodiment of a memory layer 600 in a computer system 700. Based on the concepts described in the foregoing embodiments, the memory layer 600 may be one region of the computer system cut out of the memory crystal layer 110 of fig. 1, and includes a plurality of memory devices 510a through 510d. Each memory device 510a through 510d may be configured with different delay characteristics by applying a different control signal. For example, a computer system 700 may pre-configure each memory device 510a through 510d in firmware, configure the memory devices via control signals #S1 through #S4 after booting, and then load the operating system. In a still further aspect, the computer system 700 of the present application may also be designed to allow dynamic and seamless changes to the memory delay characteristics during operation. For example, when an application is loaded, its memory latency requirement is determined, and control signals are dynamically issued to change the dimensions of the memory devices and thereby their delay characteristics.
FIG. 10 shows a further embodiment of a computer system 700 of the present application. After the wafer stacking process is completed, the three-dimensional wafer product 100 shown in fig. 1 is diced to form a plurality of computer systems 700. Shown in the memory layer 600 are memory devices 510a through 510c configured in accordance with the embodiment of FIG. 9. A system layer 620 is stacked with the memory layer 600. The system layer 620 is cut from the logic circuit layer 120 of FIG. 1 and includes the logic circuits necessary for various computer architectures, such as the core 616 and the memory controllers 614a through 614c. Each memory controller 614a through 614c is coupled to the memory devices 510a through 510c in the memory layer 600 through an interface module 612a through 612c. The interface module is an interface designed to ensure reliable data transmission, commonly referred to as a physical layer interface (PHY). As in fig. 1, signals are transmitted between the stacked memory layer 600 and system layer 620 via a plurality of connection pads (not shown). The system layer 620 is also fixed on the substrate 130 through a plurality of connection pads 104. The substrate 130 may provide additional routing space in addition to structural support. The memory controllers 614a through 614c may provide the address signals #A to the memory devices 510a through 510c via the interface modules 612a through 612c and the connection pads to access the data signals #D. This architecture of the computer system 700 is merely an example; the numbers of memory controllers, interface modules, and memory devices are not limited to three, and the core 616 may be a multi-core architecture.
In the computer system 700 of fig. 10, a delay controller 602 is configured to send control signals #S to the memory devices 510a through 510c via one or more connection pads connected to the memory layer 600. As described in the previous embodiments, these memory devices 510a through 510c can flexibly adjust the number of memory cells connected to the common lines in each memory array according to the control signal #S, dynamically changing the delay characteristics of the memory arrays. The delay controller 602 may be controlled by the core 616. When the core 616 executes an application, it may determine the latency requirement of the application in real time and instruct the delay controller 602 to adjust the memory devices 510a through 510c. For example, by changing the dimensions of the memory arrays, each memory array can be divided into power-of-two sub-arrays and then recombined into a new array dimension that meets the application requirements. While executing the application, the core 616 may use the suitably reconfigured memory devices 510a through 510c through the memory controllers 614a through 614c.
FIG. 11 shows a further embodiment of a memory array and multiplexer of the present application. Memory cells 202a and 202b are used here to illustrate how the multiplexer 402 reduces the capacitive loading of the data line. The basic logic of a memory cell is a switch controlling a capacitor, charging or discharging it to represent one bit of data. The address lines carrying address signals R0 and R1 are coupled to the gates of memory cells 202a and 202b, respectively. The data line carrying data signal B0 is coupled to one terminal of the switches of memory cells 202a and 202b. When the multiplexer 402 is not activated, the first line segment 222a and the second line segment 222b of the data line are connected as one line, so that the memory cells 202a and 202b share the same data line, receive the data signal B0, and operate normally. When the multiplexer 402 is turned on by the control signal #S, the switch in the multiplexer 402 disconnects the second segment 222b from the first segment 222a and reconnects the second segment 222b to a dedicated line 224. The dedicated line is coupled to the data driver 220 so that memory cell 202b can still receive data signals. In this configuration, since the first line segment 222a and the second line segment 222b each drive half of the memory cells, the delay caused by the load capacitance is reduced, and the response speed of the memory is increased. Although the embodiment shows only one column and two rows of memory cells, it should be understood that in practice a plurality of multiplexers 402 may be interposed between the rows of a memory array, jointly controlling the sharing and disconnection of the data lines. The data signal transmitted on the dedicated line 224 is not limited to being the same as that of the first line segment 222a.
The data driver 220 may also be modified so that memory cells 202a and 202b receive different data signals after the disconnection. Correspondingly, the address decoder may be modified so that the address lines that originally carried the address signals R0 and R1 share the same address signal, such as R0, after the multiplexer 402 is activated, turning on memory cells 202a and 202b simultaneously. In this case, memory cells 202a and 202b may be logically viewed as different bits on the same row; that is, the width and height dimensions of the memory array change from 1 × 2 to 2 × 1. This architecture is highly effective for flexibly adapting to different delay requirements. The actual circuit structures of the memory cells 202a and 202b are well known in the art, so the present embodiment is only illustrative and is not limited to a detailed implementation.
FIG. 12 shows an embodiment in which a multiplexer is placed on the address lines. The foregoing embodiments mainly explain how to shorten the common data lines, but embodiments of the present application may also start with the address lines. The dimension of the memory array, i.e., the first sub-array 810 and the second sub-array 820, may be changed by disconnecting a common address line into two shorter address lines. In this case, a plurality of common address lines carry the address signals R0 through R7 for the memory cells 202. When the multiplexer 802 receives the control signal #S and is activated, the disconnected address lines in the second sub-array 820 are switched to dedicated lines connected to the address decoder. Further, the data driver 220 and the address decoder 230 may be modified so that, according to the control signal #S, the second sub-array 820 shares the common data lines of the first sub-array 810, or the first sub-array 810 and the second sub-array 820 are driven by the same data signals. At the same time, the second sub-array 820 is made to use a different address signal source than the first sub-array, e.g., R8 through R15 (not shown). This is logically equivalent to creating a new memory array whose width (number of data bits) is half the original and whose number of address lines (array height) is doubled. In the method of FIG. 12, since the address lines are shortened, their driving load is reduced, likewise changing the delay characteristics of the memory array.
Fig. 13 is a flowchart of a memory control method according to the present application, applied to the above computer system and memory device. In step 901, when a core executes an application, the core instructs the delay controller to send a control signal according to an application condition required by the application. In step 903, the multiplexers in the memory array change the dimension of the memory array according to the control signal; for example, the memory array is divided into two or more sub-arrays that meet the application condition. In step 905, the core uses the memory sub-arrays via the memory controller while executing the application.
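The three steps above can be sketched in Python-flavored pseudocode (the class and method names are illustrative assumptions, not the application's terminology; the data-line-cut variant is modeled):

```python
# Hypothetical sketch of steps 901-905: the core derives a control signal
# from the application's condition, the multiplexers re-dimension the
# array, and the core then uses the resulting sub-arrays.

class MemoryArray:
    def __init__(self, height, width):
        self.height, self.width = height, width

    def split_into(self, s):
        # Step 903: dividing into s sub-arrays along the data lines
        # shrinks the height and widens the array by the same factor.
        self.height //= s
        self.width *= s

class DelayController:
    def __init__(self, array):
        self.array = array

    def send_control_signal(self, s):
        self.array.split_into(s)

def run_application(required_sub_arrays, controller):
    # Step 901: the core instructs the delay controller.
    controller.send_control_signal(required_sub_arrays)
    # Step 905: the core accesses the reconfigured array through the
    # memory controller (the access itself is omitted in this sketch).
    return controller.array.height, controller.array.width

array = MemoryArray(64, 8)
print(run_application(4, DelayController(array)))  # (16, 32)
```

The sketch shows only the control flow; the physical switching of the multiplexers and the memory-controller accesses are abstracted away.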
In summary, the present application provides a memory architecture capable of flexibly adjusting array dimensions based on a wafer stacking technique, so that a blockchain server product can adapt to the requirements of future algorithms.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.