1 Intro
1 Intro
Yulei Sui
University of Technology Sydney, Australia
1
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Software Is Everywhere
2
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Modern System Software
– Extremely Large and Complex
3
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Software Becomes More Buggy
More
Complex!
Memory Leaks
Buffer Overflows
Data-races
4
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Software Becomes More Buggy
More
Complex!
Memory Leaks
Buffer Overflows
Data-races
https://www.slideshare.net/innotech_conference/hp-cloud-security-inno-tech-20140501
5
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Let us take a look at some real-world vulnerability examples ...
6
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Memory Leak
• A dynamically allocated object is not freed along some execution path of a
program
• A major concern of long running server applications due to gradual loss of
available memory.
1 /* CVE−2012−0817 allows remote attackers to cause a denial of service through adversarial connection requests.*/
2 /* Samba −−libads/ldap.c:ads leave realm */.
3
4 host = memAlloc(hostname);
5 ...
6 if (...) {...; return ADS ERROR SYSTEM(ENOENT);} // The programmer forgot to release host on error.
7
7
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Buffer Overflow
• Attempt to put more data in a buffer than it can hold.
• Program crashes, undefined behavior or zero-day exploit1 .
1 /* A simplified example from ”Young and Mchugh, IEEE S&P 1987”, exploited by attackers to bypass verfication*/
2
3 void verfiyPassword(){
4 char buff [15]; int pass = 0;
5 printf ( ”\n Enter the password : \n”);
6 gets(buff );
7
8 if (strcmp(buff, ”thegeekstuff”)){ // return non−zero if the two strings do not match
9 printf ( ”\n Wrong Password \n”);
10 }
11 else{ // return zero if two strings matched or a buffer overrun
12 printf ( ”\n Correct Password \n”);
13 pass = 1;
14 }
15 if (pass)
16 printf ( ”\n Root privileges given to the user \n”);
17 }
18
1
Heartbleed, a well-known vulnerability in OpenSSL is also caused by buffer overflow (It took more than 2 years to discover and fix it since first
patch, and over 500,000 websites were affected). Vulnerability is exploited when more data can be read than should be allowed.
8
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Uninitialized Variable
• Stack variables in C and C++ are not initialized by default.
• Undefined behavior or denial of service via memory corruption
1 /* An uninitialized variable vulnerability simplified from gnuplot (CVE−2017−9670) */
2
3 void load(){
4 switch ( ctl ) {
5 case −1:
6 xN = 0; yN = 0;
7 break;
8 case 0:
9 xN = i ; yN = −i;
10 break;
11 case 1:
12 xN = i + NEXT SZ; yN = i − NEXT SZ;
13 break;
14 default :
15 xN = −1; xN = −1; // xN is accidentally set twice while yN is uninitialized
16 break;
17 }
18 plot (xN, yN);
19 }
20
21
9
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Use-After-Free
10
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Code Review by Developers
However!!#
!" intractable due to potentially unbounded number of paths that must be analyze
!" undecidable in the presence of dynamically allocated memory and recursive data structures
11
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
How about real-world large programs?
Whole-Program CFG of 300.twolf (20.5K lines of code)
#functions: 194
13
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Software Verification
• Software Verification is useful for proving the correctness, safety and security
of a program, and a key aspect of testing can execute as expected.
• ”Have we made what we were trying to make?”
• Are we building the system right?
• Does our design meet the user expectations?
• Does the implementation conform to the specifications?
14
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Why Software Verification?
• Better quality in terms of more secure and reliable software
• Help reduce the chances of system failures and crashes
• Cut down the number of defects found during the later stages of development
• Rule out the existence of any backdoor vulnerability to bypass a program’s
authentication
• Reduce time to market
• Less time for debugging.
• Less time for later phase testing and bug fixing
• Consistent with user expectations/specifications
• Assist the team in developing a software product that conforms to the specified
requirements
• Help get a better understanding of (legacy) parts of a software product
15
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
What Types of Software Verification We Have?
Code verification vs design verification
• Design verification: verifying design of a software system.
• Design specs: specification languages for components of a system. For
example,
• Z language for business requirements,
• Promela for Communicating Sequential Processes
• B method based on Abstract Machine Notation.
• Specification Language (VDM-SL)
• ...
16
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
What Types of Software Verification We Have?
Code verification vs design verification
• Design verification: verifying design of a software system.
• Design specs: specification languages for components of a system. For
example,
• Z language for business requirements,
• Promela for Communicating Sequential Processes
• B method based on Abstract Machine Notation.
• Specification Language (VDM-SL)
• ...
• Code verification: verifying correctness of source code (This subject)
• Code specs (e.g., return a sorted list):
• Assertions and pre/postconditions in Hoare logic (design by contract)
• Type systems
• Well-formed comments or annotations
• ...
16
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
How to Perform Software Verification?
Verify or prove the correctness of your code against the specifications via
• Dynamic verification (Checking code behavior during program execution)
• Per path verification which aims to find bugs by exercising one execution path
a time based on specific testing inputs
• Stress testing
• Model-based testing
• Fuzz testing
• ...
17
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
How to Perform Software Verification?
Verify or prove the correctness of your code against the specifications via
• Dynamic verification (Checking code behavior during program execution)
• Per path verification which aims to find bugs by exercising one execution path
a time based on specific testing inputs
• Stress testing
• Model-based testing
• Fuzz testing
• ...
• Static verification (inspecting the code before it runs) (This subject)
• All path verification which aims to prove that a program satisfies the
specification of its behavior by reasoning about all possible program paths
• Model checking (exhaustive exploration of state space by modeling programs as
state transition systems)
• Abstract interpretation (a general theory of sound approximation of a program
through program abstractions or abstract values)
• Symbolic execution (a practical way to use symbolic expressions instead of
concrete values to explore the possible program paths)
17
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Assertions as Specifications in Software Verification
18
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Assertions as Specifications in Software Verification (Example)
• Assertions are program statements used to test assumptions made by
software designers/programmers/testers.
• An assertion is a expression/predicate that always should evaluate to true
at that point during code execution.
19
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Assertions as Specifications in Software Verification (Example)
• Assertions are program statements used to test assumptions made by
software designers/programmers/testers.
• An assertion is a expression/predicate that always should evaluate to true
at that point during code execution.
1 #include <assert.h>
2 void main() {
3 int x_axis = 5;
4 int y_axis = 1;
5 // assertion succeeds
6 assert(x_axis > 0 && y_axis > 0);
7 plot(x_axis, y_axis);
8 int y_axis = x - 5;
9 // assertion fails and program crashes
10 assert(x_axis > 0 && y_axis > 0);
11 }
19
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Assertions as Specifications in Software Verification (Example)
• Assertions are program statements used to test assumptions made by
software designers/programmers/testers.
• An assertion is a expression/predicate that always should evaluate to true
at that point during code execution.
1 void display_number(int* myInt) {
1 #include <assert.h> 2 assert(myInt != nullptr);
2 void main() { 3 printf("%d", *myInt);
3 int x_axis = 5; 4 }
4 int y_axis = 1; 5 int main () {
5 // assertion succeeds 6 int myptr = 5;
6 assert(x_axis > 0 && y_axis > 0); 7 int* first_ptr = &myptr;
7 plot(x_axis, y_axis); 8 int* second_ptr = nullptr;
8 int y_axis = x - 5; 9 // assertion succeeds
9 // assertion fails and program crashes 10 display_number(first_ptr);
10 assert(x_axis > 0 && y_axis > 0); 11 // assertion fails and program crashes
11 } 12 display_number(second_ptr);
13 }
19
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Assertion-Based Software Verification
Assertions can be seen as partial software specifications that
• help a programmer read the code
• help the program detect its own defects
• help catch errors earlier and pinpoint sources of errors
20
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Assertion-Based Software Verification
Assertions can be seen as partial software specifications that
• help a programmer read the code
• help the program detect its own defects
• help catch errors earlier and pinpoint sources of errors
Assertions are typically used in the following scenarios.
• Software developers can add assertions during their programming or
implementation to verify their expected results.
• Software testers can add assertions as a part of the unit testing process.
• Project managers or third parties can add assertions in the middle or end of a
program execution to verify and understand code bases.
20
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Software Verification vs Software Analysis?
21
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
The Project of This Subject
Goal of this subject: develop your own software verification tool in 12 weeks.
More concretely: develop a static symbolic execution engine in C++ to verify C
programs with assertions at compile time.
1 void Overflow_case()
1void Overflow_case()
2{ source
2{
3 char buff[3] = {'\0','\0','\0'};
3 char buff[3] = {'\0','\0','\0'};
4
Automated 4 char input[] = "Overflow";
char input[] = "Overflow";
5 int input_length = 8; Assertion 5 int input_length = 8;
6 assert(input_length !"3);
6 assert(input_length !"3); Prover 7 strcpy(buff, input);
7 strcpy(buff, input);
8}
8} sink
Program
Control-flow reachability
Program Static Symbolic
Analysis
Execution Constraints:
Report
(buff[3] == {'\0','\0','\0'})
&& (input[] == "Overflow")
&& (input_length == 8)
22
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
The Project of This Subject
Program
Control-flow reachability
Program Static Symbolic
Analysis
Execution Constraints:
Report
(buff[3] !# {'\0','\0','\0'})
∧ (input[] !# "Overflow")
∧ (input_length !# 8)
23
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
The Project of This Subject
Control-flow reachability
Program Static Symbolic
Execution Constraints:
(buff[3] !# {'\0','\0','\0'})
∧ (input[] !# "Overflow")
∧ (input_length !# 8)
24
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
How about real-world large programs?
Whole-Program CFG of 300.twolf (20.5K lines of code)
#functions: 194
26
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
Call Graph of 176.gcc
26
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
The Project of This Subject
27
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
The Project of This Subject
Static Symbolic Execution
Assignment-1
C"# LLVM compiler
1void Overflow_case() programming & IR 1 void Overflow_case()
2{ 2{ source
3 char buff[3] = {'\0','\0','\0'}; Graph representation of code 3 char buff[3] = {'\0','\0','\0'};
4
Automated 4 char input[] = "Overflow";
char input[] = "Overflow"; Assignment-2
5 int input_length = 8; Assertion 5 int input_length = 8;
6 assert(input_length !"3); Control-flow
Proverreachability 6 assert(input_length !"3);
7 strcpy(buff, input); 7 strcpy(buff, input);
8} Z3 theorem prover 8} sink
Assignment-3
Manual assertion prover Control-flow reachability
Program Static Symbolic
Execution
Assignment-4 Constraints:
Automated assertion prover (buff[3] !# {'\0','\0','\0'})
∧ (input[] !# "Overflow")
28
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
The Project of This Subject
Static Symbolic Execution
Assignment-1
C"# LLVM compiler
1void Overflow_case() programming & IR 1 void Overflow_case()
2{ 2{ source
3 char buff[3] = {'\0','\0','\0'}; Graph representation of code 3 char buff[3] = {'\0','\0','\0'};
4
Automated 4 char input[] = "Overflow";
char input[] = "Overflow"; Assignment-2
5 int input_length = 8; Assertion 5 int input_length = 8;
6 assert(input_length !"3); Control-flow
Proverreachability 6 assert(input_length !"3);
7 strcpy(buff, input); 7 strcpy(buff, input);
8} Z3 theorem prover 8} sink
Assignment-3
Manual assertion prover Control-flow reachability
Program Static Symbolic
Execution
Assignment-4 Constraints:
Automated assertion prover (buff[3] !# {'\0','\0','\0'})
∧ (input[] !# "Overflow")
28
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
The Project of This Subject
Static Symbolic Execution
Assignment-1
C"# LLVM compiler
1void Overflow_case() programming & IR 1 void Overflow_case()
2{ 2{ source
3 char buff[3] = {'\0','\0','\0'}; Graph representation of code 3 char buff[3] = {'\0','\0','\0'};
4
Automated 4 char input[] = "Overflow";
char input[] = "Overflow"; Assignment-2
5 int input_length = 8; Assertion 5 int input_length = 8;
6 assert(input_length !"3); Control-flow
Proverreachability 6 assert(input_length !"3);
7 strcpy(buff, input); 7 strcpy(buff, input);
8} Z3 theorem prover 8} sink
Assignment-3
Manual assertion prover Control-flow reachability
Program Static Symbolic
Execution
Assignment-4 Constraints:
Automated assertion prover (buff[3] !# {'\0','\0','\0'})
∧ (input[] !# "Overflow")
28
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
The Project of This Subject
Static Symbolic Execution
Assignment-1
C"# LLVM compiler
1void Overflow_case() programming & IR 1 void Overflow_case()
2{ 2{ source
3 char buff[3] = {'\0','\0','\0'}; Graph representation of code 3 char buff[3] = {'\0','\0','\0'};
4
Automated 4 char input[] = "Overflow";
char input[] = "Overflow"; Assignment-2
5 int input_length = 8; Assertion 5 int input_length = 8;
6 assert(input_length !"3); Control-flow
Proverreachability 6 assert(input_length !"3);
7 strcpy(buff, input); 7 strcpy(buff, input);
8} Z3 theorem prover 8} sink
Assignment-3
Manual assertion prover Control-flow reachability
Program Static Symbolic
Execution
Assignment-4 Constraints:
Automated assertion prover (buff[3] !# {'\0','\0','\0'})
∧ (input[] !# "Overflow")
28
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification
The Project of This Subject
Static Symbolic Execution
Assignment-1
C"# LLVM compiler
1void Overflow_case() programming & IR 1 void Overflow_case()
2{ 2{ source
3 char buff[3] = {'\0','\0','\0'}; Graph representation of code 3 char buff[3] = {'\0','\0','\0'};
4
Automated 4 char input[] = "Overflow";
char input[] = "Overflow"; Assignment-2
5 int input_length = 8; Assertion 5 int input_length = 8;
6 assert(input_length !"3); Control-flow
Proverreachability 6 assert(input_length !"3);
7 strcpy(buff, input); 7 strcpy(buff, input);
8} Z3 theorem prover 8} sink
Assignment-3
Manual assertion prover Control-flow reachability
Program Static Symbolic
Execution
Assignment-4 Constraints:
Automated assertion prover (buff[3] !# {'\0','\0','\0'})
∧ (input[] !# "Overflow")
29
Software Verification https://github.com/SVF-tools/Teaching-Software-Verification