The 8051 Microcontroller Instruction Set and Assembly Programming
University of Technology
Department of Electrical Engineering
Microcontroller Lab
Experiment 1:
The 8051 Microcontroller Instruction Set and
Assembly Programming
Reported by
Supervised by
Writing a program for the microcontroller mainly consists of giving instructions
(commands) in the specific order in which they should be executed to carry out a
specific task. The complete collection of commands is known as the INSTRUCTION SET. All
microcontrollers compatible with the 8051 have a total of 255 instruction opcodes, i.e. 255
different words available for program writing. Depending on the operation they perform, the
instructions are divided into several groups:
• Data Transfer Instructions
• Arithmetic Instructions
• Branch Instructions
• Logic Instructions
• Bit-oriented Instructions
2. Addressing Modes
The CPU can access data in various ways. The data could be in a register, or in memory, or
be provided as an immediate value. These various ways of accessing data are called
addressing modes. The 8051 provides a total of six distinct addressing modes. They are as
follows:
1. Immediate Addressing Mode
2. Register Addressing Mode
3. Direct Addressing Mode
4. Indirect Addressing Mode
5. External Indirect Addressing Mode
6. Indexed Addressing Mode
2.1 Immediate Addressing Mode
In immediate addressing, the value to be loaded is part of the instruction itself. For example:
MOV A, #20H
This instruction uses Immediate Addressing because the Accumulator will be loaded with the
value that immediately follows; in this case 20 (hexadecimal). Notice that the immediate data
must be preceded by the pound sign, “#”. This addressing mode can be used to load
information into any of the registers, including the DPTR register.
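As a brief illustration, the lines below sketch immediate loads into several registers. The
values are arbitrary and chosen only for this example.
        MOV A, #25H      ; load 25H into the accumulator
        MOV R4, #62      ; load decimal 62 (3EH) into R4
        MOV B, #40H      ; load 40H into register B
        MOV DPTR, #4521H ; load the 16-bit value 4521H into DPTR (DPH = 45H, DPL = 21H)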
Notice that we can also use immediate addressing mode to send data to 8051 ports. For
example, “MOV P1, #55H” is a valid instruction.
Immediate addressing is very fast since the value to be loaded is included in the instruction.
However, since the value to be loaded is fixed at assembly time, it is not very flexible.
2.2 Register Addressing Mode
Register addressing mode involves the use of registers to hold the data to be manipulated.
Examples of register addressing mode are shown below:
It should be noted that the source and destination registers must match in size. In other words,
coding “MOV DPTR, A” will give an error, since the source is an 8-bit register and the
destination is a 16-bit register.
Notice that we can move data between the accumulator and Rn (for n = 0 to 7) but movement
of data between "R" registers is not allowed. For example, the instruction “MOV R4, R7” is
invalid.
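A short sketch of register addressing follows, including the usual workaround of copying one
"R" register to another through the accumulator. The register choices are arbitrary and made
only for this example.
        MOV A, R0        ; copy the contents of R0 into A
        MOV R2, A        ; copy the contents of A into R2
        ADD A, R5        ; add the contents of R5 to A
        MOV A, R7        ; to copy R7 into R4, go through the accumulator:
        MOV R4, A        ; R4 now holds the value that was in R7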
2.3 Direct Addressing Mode
Direct addressing is so named because the value to be stored in memory is obtained by
directly retrieving it from another memory location. For example: MOV A, 30H
This instruction will read the data out of Internal RAM address 30 (hexadecimal) and store it
in the Accumulator. Further examples of the direct addressing mode are shown below:
Direct addressing is generally fast since, although the value to be loaded isn't included in the
instruction, it is quickly accessible since it is stored in the 8051's Internal RAM. It is also
much more flexible than Immediate Addressing Mode since the value to be loaded is
whatever is found at the given address. Also, it is important to note that when using direct
addressing any instruction which refers to an address between 00H and 7FH is referring to
Internal Memory. Any instruction which refers to an address between 80H and FFH is
referring to the SFR control registers that control the 8051 microcontroller itself. For
example, the instruction MOV 0E0H, #30H has the same effect as MOV A, #30H, because the
Accumulator is the SFR located at address E0H.
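The lines below sketch a few direct-addressing forms, covering both Internal RAM and an SFR.
The addresses and values are arbitrary choices made for this example.
        MOV R0, 40H      ; copy the contents of RAM location 40H into R0
        MOV 56H, A       ; save the contents of A in RAM location 56H
        MOV R4, 7FH      ; copy the contents of RAM location 7FH into R4
        MOV 0F0H, #12H   ; write 12H to SFR address F0H, which is register B
        MOV B, #12H      ; same effect as the previous line, using the SFR name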
2.4 Indirect Addressing Mode
Indirect addressing is a very powerful addressing mode which in many cases provides an
exceptional level of flexibility. Indirect addressing is also the only way to access the extra
128 bytes of Internal RAM found on an 8052.
Indirect addressing appears as follows:
MOV A,@R0
This instruction causes the 8051 to read the value of the R0 register. The 8051 will then
load the accumulator with the value from Internal RAM which is found at the address
held in R0. For example, if R0 holds 40H and Internal RAM location 40H holds 67H, the
accumulator ends up holding 67H.
If the data is inside the on-chip RAM, only registers R0 and R1 are used as pointers. In other
words, R2 – R7 cannot be used to hold the address of an operand located in RAM when using
this addressing mode.
Indirect addressing mode always refers to Internal RAM; it never refers to an SFR.
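A minimal sketch of indirect addressing used as a pointer follows. It clears a block of
Internal RAM. The block start (40H), its length (16 bytes), the counter register R2, and the
label NEXT are assumptions made for this example.
        MOV R0, #40H     ; R0 points to the first byte of the block
        MOV R2, #16      ; R2 counts the bytes still to be cleared
        CLR A            ; A = 0, the value to be written
NEXT:   MOV @R0, A       ; store A at the RAM address held in R0
        INC R0           ; advance the pointer to the next byte
        DJNZ R2, NEXT    ; repeat until all 16 bytes (40H to 4FH) are cleared
Because the address is held in a register, the same three-instruction loop body works for any
block start and length, which is not possible with direct addressing.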
2.5 External Indirect Addressing Mode
In addition to its code memory, the 8051 family also has 64K bytes of data memory space. In
other words, the 8051 has 128K bytes of address space of which 64K bytes are set aside for
program code and the other 64K bytes are set aside for data. Program space is accessed using
the program counter (PC) to locate and fetch instructions, but the data memory space is
accessed using the DPTR register and an instruction called MOVX, where X stands for
external (meaning that the data memory space must be implemented externally).
This method of addressing is used to access data stored in external data memory. The two
commands that use external indirect addressing through DPTR are:
MOVX A, @DPTR
MOVX @DPTR, A
As you can see, both commands utilize DPTR. In these instructions, DPTR must first be
loaded with the address of external memory that you wish to read or write. Once DPTR holds
the correct external memory address, the first command will move the contents of that
external memory address into the Accumulator. The second command will do the opposite: it
will allow you to write the value of the Accumulator to the external memory address pointed
to by DPTR.
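As a sketch of the sequence described above, the lines below write a byte to external data
memory and read it back. The address 2000H and the value 55H are assumptions for this
example; they presume that external RAM is actually connected at that address.
        MOV DPTR, #2000H ; point DPTR at external address 2000H (assumed to exist)
        MOV A, #55H      ; value to store
        MOVX @DPTR, A    ; write A to external data memory at 2000H
        CLR A            ; clear A so that the read-back is visible
        MOVX A, @DPTR    ; read external address 2000H back into A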
2.6 Indexed Addressing Mode
Indexed addressing mode is widely used in accessing data elements of look-up table entries
located in the program ROM space of the 8051. The instruction used for this purpose is:
“MOVC A, @A+DPTR”
The 16-bit register DPTR and register A are used to form the address of the data element
stored in on-chip ROM. Because the data elements are stored in the program (code) space
ROM of the 8051, the instruction MOVC is used instead of MOV. The “C” means code. In
this instruction the contents of A are added to the 16-bit register DPTR to form the 16-bit
address of the needed data.
One major difference between the code space and data space is that, unlike code space, the
data space cannot be shared between code and data.
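A minimal look-up table sketch follows. The table of squares, the index value 3, and the
labels TABLE and HERE are assumptions made for this example.
        ORG 0H
        MOV DPTR, #TABLE ; DPTR holds the base address of the table
        MOV A, #3        ; index of the wanted entry
        MOVC A, @A+DPTR  ; A = code byte at TABLE + 3, i.e. 9
HERE:   SJMP HERE        ; stay here
TABLE:  DB 0, 1, 4, 9, 16, 25 ; squares of 0 to 5, stored in code memory
        END
Placing the table after the endless loop keeps the CPU from running into the data bytes and
executing them as instructions.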
3. Some Specific Instructions
In order to write simple assembly programs for the 8051 microcontroller, certain widely used
instructions are explored. These instructions belong to the different groups of
instruction types listed above.
3.1 The MOV Instruction
The MOV instruction is a data transfer instruction that copies data from one location to
another. It has the following format:
MOV destination, source ; copy source to destination
1. Values can be loaded directly into any of registers A, B, or R0 – R7. However, to indicate
that it is an immediate value it must be preceded with a pound sign (#), as in the
instructions MOV A,#23H and MOV R5,#0F9H. Notice in the instruction “MOV R5, #0F9H”
that a 0 is placed between the # and F to indicate that F is a hex digit and not a letter. In
other words, “MOV R5, #F9H” will cause an error.
2. If values 0 to F are moved into an 8-bit register, the rest of the bits are assumed to be all
zeros. For example, in the instruction “MOV A, #5” the result will be A = 05; that is, A =
00000101 in binary.
3. Moving a value that is too large into a register will cause an error. For example, the
instruction MOV A,#7F2H is illegal because 7F2H exceeds the 8-bit maximum of FFH.
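3.2 The ADD Instruction
The ADD instruction is an arithmetic instruction that adds a source operand to the accumulator
and leaves the result in the accumulator. For example, the following instructions add the
contents of register A and register R2 and put the result in the accumulator.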
MOV A,#25H ; load 25H into A
MOV R2,#34H ; load 34H into R2
ADD A,R2 ; add R2 to accumulator (A = 25H + 34H = 59H)
3.3 The INC Instruction
The INC instruction is an arithmetic instruction, and is used to increment the contents of a
register or memory location by 1. For example: INC A, INC R0, INC 7H, INC DPTR.
3.4 The DEC Instruction
The DEC instruction is an arithmetic instruction, and is used to decrement the contents of a
register or memory location by 1. For example: DEC A, DEC R1, DEC @R0. The DEC
instruction cannot be used to decrement the contents of the data pointer register DPTR.
3.5 The DJNZ Instruction
The DJNZ is an important conditional jump instruction that is widely used for making loops
in the 8051 programs. This instruction takes the following syntax:
DJNZ register, label
DJNZ (decrement and jump if not zero) tells the CPU to decrement the register and then jump
to the location given by the label if the register content is not zero. The jump target must
lie within –128 to +127 bytes of the instruction that follows DJNZ.
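A minimal sketch of a counted loop built with DJNZ follows. The pass count of 5, the registers
R3 and R5, and the label AGAIN are assumptions made for this example.
        MOV R3, #5       ; R3 = number of passes
        CLR A            ; A = 0
AGAIN:  INC A            ; loop body: count the passes in A
        DJNZ R3, AGAIN   ; decrement R3; jump back while R3 is not zero
        MOV R5, A        ; R5 = 5: the body ran five times
DJNZ also accepts a direct RAM address in place of Rn, for example DJNZ 30H, AGAIN, which
is useful when all the R registers are already in use.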
3.6 The SJMP (Short Jump) Instruction
The SJMP instruction is a 2-byte unconditional jump instruction which is used to make jumps
in the program. The jump to the target address will be within -128 to +127 bytes of memory
relative to the address of the current PC (program counter). The syntax of the SJMP
instruction is:
SJMP label
3.7 The LJMP (Long Jump) Instruction
The LJMP instruction is a 3-byte unconditional jump instruction which is used to make
jumps in the program. The target address allows a jump to any memory location from 0000H to
FFFFH within the 64KB program memory address space. The syntax of the LJMP instruction is:
LJMP label
3.8 The JNZ Instruction (Jump if A ≠ 0)
In this instruction the content of register A is checked. If it is not zero, it jumps to the target
address. Notice that the JNZ instruction can be used only for register A. It can only check to
see whether the accumulator is zero or not, and it does not apply to any other register.
3.9 The JZ Instruction (Jump if A = 0)
In this instruction the content of register A is checked. If it is zero, it jumps to the target
address. It is applied only to the accumulator register.
3.10 JNC (jump if no carry)
In this instruction, the carry flag bit in the flag (PSW) register is used to make the decision
whether to jump. In executing “JNC label”, the processor looks at the carry flag to see if it is
raised (C =1). If it is not, the CPU starts to fetch and execute instructions from the address of
the label. If C = 1, it will not jump but will execute the next instruction below JNC.
3.11 JC (Jump if carry)
In the JC instruction, if C = 1 it jumps to the target address. On the other hand if no carry
occurs, the processor will execute the next instruction below JC.
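The sketch below uses JNC to add three bytes and count the carries, so that the total is kept
as a 16-bit value. The three data values, the registers R0 and R5, and the labels N1, N2 and
OVER are assumptions made for this example.
        MOV A, #0        ; A = 0, low byte of the sum
        MOV R5, A        ; R5 = 0, counts the carries (high byte of the sum)
        ADD A, #79H      ; A = 79H, no carry
        JNC N1           ; C = 0, so the increment is skipped
        INC R5
N1:     ADD A, #0F5H     ; 79H + F5H = 16EH, so A = 6EH and C = 1
        JNC N2           ; C = 1, so no jump
        INC R5           ; carry occurred: R5 = 1
N2:     ADD A, #0E2H     ; 6EH + E2H = 150H, so A = 50H and C = 1
        JNC OVER         ; C = 1, so no jump
        INC R5           ; carry occurred: R5 = 2
OVER:   MOV R0, A        ; R0 = 50H; the 16-bit total is R5:R0 = 0250H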
3.12 The CJNE Instruction
The CJNE (Compare and jump if not equal) instruction is used to compare the content of a
general purpose register with a certain value or direct memory location and jump to a certain
address if the two values are not equal. The syntax of the instruction is:
CJNE destination, source, label
The destination can be register A or the bank registers (R0-R7). With A as the destination,
the source may be an immediate value or a direct memory address; with R0-R7, the source must
be an immediate value.
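A minimal sketch of CJNE used as a loop test follows. The limit of 20, the register R1, and
the label NEXT are assumptions made for this example.
        MOV R1, #0       ; start counting from 0
NEXT:   INC R1           ; count up by one
        CJNE R1, #20, NEXT ; keep looping while R1 is not equal to 20
        MOV A, R1        ; A = 20 (14H) once the loop ends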
3.13 The CLR Instruction
The CLR instruction is used to clear the content of the accumulator. It is a one byte logic
instruction, which has the syntax CLR A.
3.14 The CPL Instruction
The CPL (Complement) instruction is a one byte logic instruction that is used to complement
(negate) the bits of register A. After execution of this instruction, bits 1 in register A become
0 and vice versa. It takes the syntax CPL A. It can also be used as a bitwise instruction to
complement a bit.
3.15 The NOP Instruction
The NOP (No operation) instruction does not do anything. It is used only for making desired
delays in the programs.
3.16 The SETB and CLR Instructions
The SETB instruction is a bit oriented instruction that is used to set a certain bit in a register
or bit addressable memory location. The CLR instruction, on the other hand, is used to clear a
certain bit in a register or bit addressable memory location. For example the instruction SETB
PSW.3 is used to set bit 3 (RS0) of the program status word register (The Flag Register). The
instruction CLR C (or CLR PSW.7) is used to clear the carry bit in the flag register.
3.17 The JNB and JB Instructions
The JNB (jump if no bit) and JB (jump if bit = 1) instructions are also widely used single-bit
operations. They allow us to monitor a bit and make a decision depending on whether it is 0
or 1.
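The lines below sketch how the bit instructions work together: the program waits for an input
pin to go high and then issues a short pulse on an output pin. The pin assignments (P1.0 as
input, P2.0 as output), the label WAIT, and the presence of an external signal on P1.0 are
assumptions made for this example.
        SETB P1.0        ; write 1 to the P1.0 latch so the pin can be used as an input
WAIT:   JNB P1.0, WAIT   ; stay here while P1.0 reads 0
        SETB P2.0        ; P1.0 went high: drive P2.0 high
        CLR P2.0         ; then drive it low again (a short pulse)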
4. The Pseudo Instructions
Pseudo instructions are not translated to machine code, and therefore they have no operation
code. They are used by the assembler to organize the source code file (asm file). There are
several pseudo instructions supported by 8051 assemblers. Some of the widely used
pseudo instructions are discussed below.
4.1 ORG (Origin)
The ORG pseudo instruction is used to indicate the address at which the code or data that
follows it begins. The number that comes after ORG can be either in hex or in decimal. If the
number is not followed by H, it is decimal and the assembler will convert it to hex.
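As a sketch of how ORG positions code, the lines below place a jump at address 0000H and the
main program at address 0030H. The address 30H and the labels MAIN and HERE are
assumptions made for this example.
        ORG 0H           ; the following code starts at address 0000H
        LJMP MAIN        ; jump over the skipped area
        ORG 30H          ; the following code starts at address 0030H
MAIN:   MOV A, #0        ; first instruction of the main program
HERE:   SJMP HERE        ; stay here
        END
Writing ORG 48 instead of ORG 30H gives the same result, since decimal 48 equals 30H.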
4.2 DB (Define Byte)
The DB pseudo instruction is the most widely used data instruction in the assembler. It is
used to define the 8-bit data. When DB is used to define data, the numbers can be in decimal,
binary, hex, or ASCII formats. For decimal, the “D” after the decimal number is optional, but
using “B” (binary) and “H” (hexadecimal) for the others is required. Regardless of which is
used, the assembler will convert the numbers into hex. To indicate ASCII, simply place the
characters in quotation marks (‘like this’). The assembler will assign the ASCII code for the
numbers or characters automatically. Either single or double quotation marks can be used
around the ASCII strings.
Following are some DB examples:
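The lines below sketch one DB definition for each number format described above. The start
address 500H, the labels, and the data values are assumptions made for this example.
        ORG 500H
NUM1:   DB 28            ; decimal (stored as 1CH)
NUM2:   DB 00110101B     ; binary (stored as 35H)
NUM3:   DB 39H           ; hexadecimal
STR1:   DB "2591"        ; ASCII digits, stored as 32H, 35H, 39H, 31H
STR2:   DB 'My name'     ; ASCII characters in single quotes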
4.3 EQU (Equate)
This is used to define a constant without occupying a memory location. The EQU pseudo
instruction does not set aside storage for a data item but associates a constant value with a
data label so that when the label appears in the program, its constant value will be substituted
for the label. The following uses EQU for the counter constant and then the constant is used
to load the R3 register.
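Consistent with the description that follows (COUNT equal to 25, loaded into R3), the example
can be sketched as:
COUNT   EQU 25           ; COUNT stands for the constant 25; no memory is used
        MOV R3, #COUNT   ; R3 = 25 (19H)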
When executing the instruction “MOV R3, #COUNT”, the register R3 will be loaded with
the value 25 (notice the # sign). What is the advantage of using EQU? Assume that there is a
constant (a fixed value) used in many different places in the program, and the programmer
wants to change its value throughout. By the use of EQU, the programmer can change it once
and the assembler will change all of its occurrences, rather than search the entire program
trying to find every occurrence.
4.4 END
Another important pseudo instruction is the END directive. This indicates to the assembler
the end of the source (asm) file. The END pseudo instruction is the last line of an 8051
program, meaning that in the source code anything after the END directive is ignored by the
assembler.
5. Structure of the 8051 Assembly Language Programs
An Assembly language program consists of, among other things, a series of lines of
Assembly language instructions. An Assembly language instruction consists of a mnemonic,
optionally followed by one or two operands. The operands are the data items being
manipulated, and the mnemonics are the commands to the CPU, telling it what to do with
those items.
Figure 1: Sample of an Assembly Language Program
A given Assembly language program (see Figure 1) is a series of statements, or lines, which
are either Assembly language instructions such as ADD and MOV, or statements called
pseudo instructions. While instructions tell the CPU what to do, pseudo-instructions (also
called directives) give directions to the assembler. For example, in the above program while
the MOV and ADD instructions are commands to the CPU, ORG and END are directives to
the assembler. ORG tells the assembler to place the opcode at memory location 0 while END
indicates to the assembler the end of the source code. In other words, one is for the start of
the program and the other one for the end of the program.
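Based on the opcode walkthrough given later in this section, the sample program of Figure 1
can be sketched as follows. The comments are added here for illustration.
        ORG 0H           ; start (origin) at location 0
        MOV R5, #25H     ; load 25H into R5
        MOV R7, #34H     ; load 34H into R7
        MOV A, #0        ; load 0 into A
        ADD A, R5        ; add the contents of R5 to A (A = 25H)
        ADD A, R7        ; add the contents of R7 to A (A = 59H)
        ADD A, #12H      ; add 12H to A (A = 6BH)
HERE:   SJMP HERE        ; stay in this loop
        END              ; end of the asm source file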
An Assembly language instruction consists of four fields:
[label:] mnemonic [operands] [;comment]
Brackets indicate that a field is optional, and not all lines have them. Brackets should not be
typed in. Regarding the above format, the following points should be noted.
1. The label field allows the program to refer to a line of code by name. The label field
cannot exceed a certain number of characters. Check your assembler for the rule.
2. The Assembly language mnemonic (instruction) and operand(s) fields together perform
the real work of the program and accomplish the tasks for which the program was
written. In Assembly language statements such as
ADD A, R5
MOV A,#0
ADD and MOV are the mnemonics, which produce op-codes; and “A, R5” and “A, #0” are
the operands. Instead of a mnemonic and an operand, these two fields could contain
assembler pseudo-instructions. Remember that pseudo-instructions do not generate any
machine code (opcode) and are used only by the assembler, as opposed to instructions that
are translated into machine code (opcode) for the CPU to execute. In Figure 1, the commands
ORG (origin) and END are examples of pseudo-instructions (some 8051 assemblers use
.ORG and .END).
3. The comment field begins with a semicolon comment indicator “;”. Comments may be
at the end of a line or on a line by themselves. The assembler ignores comments, but
they are indispensable to programmers. Although comments are optional, it is
recommended that they be used to describe the program and make it easier for someone
else to read and understand, or for the programmers to remember what they wrote.
4. Notice the label “HERE” in the label field in Figure 1. Any label referring to an
instruction must be followed by a colon symbol, “:”. With the SJMP (short jump)
instruction on that line, the 8051 is told to stay in this loop indefinitely. If your system has a
monitor program you do not need this line and it should be deleted from your program.
By choosing label names that are meaningful, a programmer can make a program much
easier to read and maintain. There are several rules that names must follow. First, each label
name must be unique. The names used for labels in Assembly language programming consist
of alphabetic letters in both uppercase and lowercase, the digits 0 through 9, and the special
characters question mark (?), period (.), at (@), underline (_), and dollar sign ($). The first
character of the label must be an alphabetic character. In other words it cannot be a number.
Every assembler has some reserved words that must not be used as labels in the program.
Foremost among the reserved words are the mnemonics for the instructions. For example,
“MOV” and “ADD” are reserved since they are instruction mnemonics. In addition to the
mnemonics there are some other reserved words depending on the specific assembler to be
used.
5.2 Assembling an 8051 Program
The assembler is used to create a ready to run machine code file in HEX format from the
entered source code file. This generated HEX file will be burned into the program ROM of
the microcontroller to do a specific task. There are many commercial 8051 assemblers that
can perform the assembly process. The ASEM-51 assembler will be used in our lab for
producing the object files of the written assembly programs. The source file (asm
file) should be written using an editor such as the MS-DOS EDIT program or the Windows
Notepad editor. The ASEM-51 assembler will generate two output files, the list file and the
machine code HEX file as shown in Figure 2. The list file is very useful to the programmer
because it lists all the opcodes and addresses as well as errors that the assembler may detect.
The programmer uses the list file to find syntax errors in order to generate the required HEX
file. This file can be accessed by an editor and displayed on the monitor.
Figure 2: Creating and Assembling 8051 Code Files
The source code file is created from the command prompt window after changing the path
into the ASEM51 folder containing the assembler program. To change the path into the
directory ASEM51, type the following command from the root directory:
C:\>CD ASEM51
C:\ASEM51>
To create or edit the source file (mycode.asm), type either of the following commands:
C:\ASEM51>notepad mycode.asm
C:\ASEM51>EDIT mycode.asm
After typing and saving the source code file, it should then be assembled by typing the
command below:
C:\ASEM51>ASEM mycode.asm
If the source file is free of syntax errors, the assembler will generate the HEX file in addition
to the list file. Otherwise, the syntax errors can be detected by displaying the list file as
shown below:
C:\ASEM51>EDIT mycode.lst
To list all the created files on the screen, type the following command:
C:\ASEM51>DIR mycode.*
The assembler program generates two important files from the entered source code file (asm
file). These files are the list file and the machine code file in hexadecimal format (HEX file).
The HEX file is the program code that will be burned into the microcontroller's on-chip
ROM.
First, we examine the list file of the sample program presented in Figure 1 and how the code
is placed in the ROM of an 8051 chip. As we can see, the opcode and operand for each
instruction are listed on the left side of the list file as shown in Figure 3.
Figure 3: The Contents of the List File Generated by the Assembler
After the program is burned into ROM of an 8051 family member such as 8751 or AT8951 or
87C52, the opcode and operands are placed in ROM memory locations starting at 0000 as
shown in the list below in Figure 4.
Figure 4: ROM Locations for the Assembled Program
The list shows that address 0000 contains 7D, which is the opcode for moving a value into
register R5, and address 0001 contains the operand (in this case 25H) to be moved to R5.
Therefore, the instruction “MOV R5,#25H” has a machine code of “7D25”, where 7D is the
opcode and 25 is the operand. Similarly, the machine code “7F34” is located in memory
locations 0002 and 0003 and represents the opcode and the operand for the instruction “MOV
R7,#34H”. In the same way, machine code “7400” is located in memory locations 0004 and
0005 and represents the opcode and the operand for the instruction “MOV A, #0”. The
memory location 0006 has the opcode of 2D, which is the opcode for the instruction “ADD
A, R5” and memory location 0007 has the content 2F, which is the opcode for the “ADD A,
R7” instruction. The opcode for the instruction “ADD A, #12H” is located at address 0008
and the operand 12H at address 0009. The memory location 000A has the opcode for the
SJMP instruction and its target address is located in location 000B.
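The walkthrough above can be summarized in a listing such as the one below. The bytes for
“ADD A, #12H” (24 12) and for SJMP (80 FE) are not stated in the text; they are filled in here
from the standard 8051 instruction encodings, where FE is the relative offset –2 that sends
SJMP back to its own address.
ROM Address   Machine Code   Source
0000          7D25           MOV R5, #25H
0002          7F34           MOV R7, #34H
0004          7400           MOV A, #0
0006          2D             ADD A, R5
0007          2F             ADD A, R7
0008          2412           ADD A, #12H
000A          80FE           HERE: SJMP HERE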
6. Example Programs
Some examples are presented to illustrate how to write source code files for the 8051
microcontroller and its family members, and how to assemble and simulate them. A general
purpose emulator program (MIDE-51) that runs under Windows will be used to generate the
HEX files necessary for implementing practical programs. This emulator has a built-in
assembler which is compatible with ASEM-51. Appendix-B gives a tutorial about this
software. Another advanced simulator (MCU 8051 IDE) can also be used to test, run, and
simulate the written assembly files. A general user guide for this software is provided in
Appendix-C.
Example 1
ORG 0H
MOV A,50H
INC A
MOV 51H,A
END
The machine code (hex code) for this program will be stored in the ROM memory, starting at
location 0000H:
Memory      Machine Code (Hex)          Assembly Code
Location    Byte 1   Byte 2   Byte 3
                                        ORG 0H
0000H       E5       50                 MOV A,50H
0002H       04                          INC A
0003H       F5       51                 MOV 51H,A
                                        END
Example 2
A program that adds the two data bytes in memory locations 30H and 31H and stores the result in
memory location 32H can be developed in many ways.
Below are two possible programs which can run the above operation.
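Two possible versions are sketched here as an illustration. The first uses direct addressing
and the second uses indirect addressing through R0; the choice of R0 as the pointer is an
assumption made for this example.
Program A (direct addressing):
        ORG 0H
        MOV A, 30H       ; A = contents of location 30H
        ADD A, 31H       ; A = A + contents of location 31H
        MOV 32H, A       ; store the sum in location 32H
        END
Program B (indirect addressing):
        ORG 0H
        MOV R0, #30H     ; R0 points to location 30H
        MOV A, @R0       ; A = contents of location 30H
        INC R0           ; R0 points to location 31H
        ADD A, @R0       ; A = A + contents of location 31H
        INC R0           ; R0 points to location 32H
        MOV @R0, A       ; store the sum in location 32H
        END
Program A is shorter, while Program B is easier to extend to a longer run of consecutive
locations because only the pointer changes.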
Example 3
Write a program that obtains in register B the result of 10 times 8 by using the ADD instruction.
ORG 00H
MOV B,A ; move the result from A to register B
END
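One way to complete this program is sketched below: 8 is added to the accumulator ten times
and the total is then copied to B. The loop counter register R2 and the label AGAIN are
assumptions made for this sketch.
        ORG 00H
        MOV A, #0        ; clear the running total
        MOV R2, #10      ; R2 counts the 10 additions (register choice assumed)
AGAIN:  ADD A, #8        ; add 8 to the total
        DJNZ R2, AGAIN   ; repeat until R2 reaches zero; A = 80 (50H)
        MOV B, A         ; move the result from A to register B
        END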
7. Procedure
Implement the examples presented above and verify their results with the aid of the MCU 8051
IDE simulator.
8. Discussion
BACK: MOV R5, #100
HERE: DJNZ R5, HERE
DJNZ R6, BACK
5. Write an 8051 program to exchange the contents of the stack pointer SP and the flag register
PSW.
MOV A, SP
MOV B, PSW
MOV PSW, A
MOV SP, B
END
6. What is the content of the list file produced by the assembler?
A-Machine language
B-Assembly language