4-functions-web

The document discusses pointers and function arguments in C++ and Python, emphasizing the importance of understanding pointer arithmetic and the differences between pass-by-value and pass-by-reference. It explains how function parameters are handled in both languages, including the implications of returning pointers to local variables and the behavior of arrays as function parameters. Additionally, it highlights the potential pitfalls of pointer manipulation in C++ compared to Python's reference handling.

Uploaded by

cranckcracker123
Functions

CS 61:
Lecture 4
9/18/2023
C++: Pointer Arithmetic
• A pointer holds the location (i.e., memory address) of an object
  • Ex: An int* holds the address of an int
  • Ex: A float* holds the address of a float
• C++ allows some kinds of pointer arithmetic
• IT IS VERY IMPORTANT FOR YOU TO BECOME COMFORTABLE WITH POINTERS AND POINTER ARITHMETIC

[Diagram: the address space, from Byte 0 at the bottom to Byte N-1 at the top, holding Code, Static data, Heap, and Stack. A pointer value represents a location in this space. Typically a bad idea to create pointers to unallocated space; we’ll return to this topic when we discuss virtual memory!]

    char a[5] = {'C','S','6','1','!'};
    char* p1 = &a[1];
    char* p2 = &a[4];

    Index:  0  1  2  3  4
    Value:  C  S  6  1  !
            ^  ^        ^
            a  p1       p2
Let’s play around with functions!
Function Arguments: Pass-by-value in Python

    def f(arg):
        arg = 0

    x = 2
    f(x)
    print("x is {}".format(x))

Console output: x is 2

• The program passes an integer variable x to a function
• The function modifies its parameter . . .
• . . . but the change does not modify the caller’s value!
• The reason is that the function parameter is a copy of the caller’s value; changes to the function’s copy do not affect the caller’s value
• How is this implemented?
  • The two values might live in different registers
  • Or maybe one value lives in memory and the other lives in a register
  • Or maybe the two values live in
Function Arguments: Pass-by-reference in Python

    def f(arg):
        arg = ["P"]

    x = ["C", "S", "6", "1"]
    f(x)
    print("x is {}".format(x))

Console output: x is ['C', 'S', '6', '1']

    def f(arg):
        arg[0] = "P"

    x = ["C", "S", "6", "1"]
    f(x)
    print("x is {}".format(x))

Console output: x is ['P', 'S', '6', '1']

• In Python (and other languages like Java and C#), non-primitive values use pass-by-reference
  • The callee’s argument is a pointer to the caller’s value
  • The callee’s pointer can be changed to point to something else; this does not affect what the caller’s pointer is pointing to! (first example)
  • However, if the callee changes the referenced value, that change is seen by the caller (second example)
• Pass-by-reference avoids copying the value, which can result in big savings if the value is big!
Pass-by-reference in Python: One Implementation
1. The array is created and the reference x is stored in %rbx
2. The function f(x) is invoked, such that a copy of the reference (not a copy of the array) is placed in %rax as a function argument
3. The function does mov $0x50, 0x0(%rax), where 0x50 is the ASCII value for the character 'P'!

    def f(arg):
        arg[0] = "P"              #Step 3

    x = ["C", "S", "6", "1"]      #Step 1
    f(x)                          #Step 2
    print("x is {}".format(x))

Console output: x is ['P', 'S', '6', '1']

[Diagram: the array in memory, with the start address of the array pointing at its first element, where C has been overwritten by P, followed by S, 6, and 1]
Pass-by-value in Python: One Implementation
1. The register %rbx is used to hold x’s value of 2
2. The function f(x) is invoked, such that a copy of x’s value (which is 2) is placed in %rax as a function argument
3. The function does mov $0x0, %rax, to implement the Python code arg = 0!

    def f(arg):
        arg = 0                   #Step 3

    x = 2                         #Step 1
    f(x)                          #Step 2
    print("x is {}".format(x))

Console output: x is 2

[Diagram: register contents, showing the values 2, 2, and 0]
Python References vs. C++ Pointers
• Both a Python reference and a C++ pointer contain the address of an object in memory
• However, the C++ abstract machine allows a program to inspect that address and perform pointer arithmetic
• In contrast, the Python abstract machine disallows inspection of that address or pointer arithmetic involving that address
• Prohibiting address manipulation eliminates many kinds of programming bugs (or at least detects them synchronously)!

    int main() {
        float f_arr[2] = {4.99, 5.00};
        float* fp = f_arr;
        printf("f_arr:\t%p\n", f_arr);
        printf("fp:\t%p\n", fp++);
        printf("fp:\t%p\n", fp++); //After the increment, fp is now OOB!
        printf("*fp: %f\n", *fp);  //OH NO
        return 0;
    }

Console output:
    f_arr:  0x7ffdde27bb20
    fp:     0x7ffdde27bb20
    fp:     0x7ffdde27bb24
    *fp:    -302192256.000000

    a = ["CS","61"]
    b = a
    b = b + 1 #OH NO

Console output:
    Traceback (most recent call last):
        b = b + 1
            ~~^~~
    TypeError: can only concatenate list (not "int") to list
Function Parameters in C++ on x86-64

    void f(int arg) {
        arg = 0;                 //Step 3
    }

    int main() {
        int x = 2;               //Step 1
        f(x);                    //Step 2
        printf("x is %d\n", x);
        return 0;
    }

Console output: x is 2

• The default approach is pass-by-value
  • If an argument can fit in a register (e.g., an int or a T*), the compiler will place the copy of the caller’s value in a register
  • If an argument cannot fit in a register (e.g., a large struct), the compiler will place the copy of the caller’s value in memory
• Compilers often store local variables in the stack (but may keep them in registers too)

[Diagram: stack frame for main(), with the stack frame for f() below it]

Step 1: The local variable x, which lives in the stack region for main(), receives the value 2
Step 2: main() invokes f(x), creating a new stack region for f(), and placing a copy of x’s value in %rdi (the first argument register in the System V calling convention)
Step 3: f() assigns to its argument, updating %rdi; main()’s copy of x is unchanged!
Function Parameters in C++ on x86-64

    struct _s {int x; int y;};
    void f(struct _s* arg) {
        arg->x = 0;              //Step 3
    }
    int main() {
        struct _s s = {2,3};     //Step 1
        f(&s);                   //Step 2
        printf("s.x is %d\n", s.x);
        return 0;
    }

Console output: s.x is 0

• The default approach is pass-by-value
  • If an argument can fit in a register (e.g., an int or a T*), the compiler will place the copy of the caller’s value in a register
  • If an argument cannot fit in a register (e.g., a large struct), the compiler will place the copy of the caller’s value in memory
• Compilers often store local variables in the stack (but may keep them in registers too)

[Diagram: stack frame for main(), containing main()’s s, with the stack frame for f() below it]

Step 1: The local variable s, which lives in the stack frame for main(), has its members initialized to 2 and 3, respectively
Step 2: main() invokes f(&s), creating a new stack frame for f(), and placing the address of s in %rdi (the first argument register in the System V calling convention)
Step 3: f() updates the single copy of the struct via reference!
Function Parameters in C++ on x86-64

    struct _s {int x; int y;};
    void f(struct _s* arg) {
        arg->x = 0;              //Step 3
    }
    int main() {
        struct _s s = {2,3};     //Step 1
        f(&s);                   //Step 2
        printf("s.x is %d\n", s.x);
        return 0;
    }

Console output: s.x is 0

    struct _s {int x; int y;};
    void f(struct _s& arg) {
        arg.x = 0;               //Step 3
    }
    int main() {
        struct _s s = {2,3};     //Step 1
        f(s);                    //Step 2
        printf("s.x is %d\n", s.x);
        return 0;
    }

Console output: s.x is 0

• The default approach is pass-by-value
  • If an argument can fit in a register (e.g., an int or a T*), the compiler will place the copy of the caller’s value in a register
  • If an argument cannot fit in a register (e.g., a large struct), the compiler will place the copy of the caller’s value in memory
• Compilers often store local variables in the stack (but may keep them in registers too)
• A programmer can pass-by-reference using:
  • Explicit C++ pointers (first example)
  • Implicit C++ references (second example)
    • In Step 2, the compiler implicitly uses the address of s as the argument for f()
    • In Step 3, the compiler implicitly adds code to dereference a pointer
C++: Passing Complicated Values
• A vector<int> object contains:
  • arr: Pointer to the array which stores the elements of the vector
  • size: The number of elements currently in the vector, as changed by methods like .push_back(), .insert(), and .erase()
  • capacity: The actual size of the array (which may be larger than the vector’s size!)
• What happens if you try to pass a vector<int> by value to a function?
  • A copy of the vector will be placed in the callee’s stack frame, with a copy of the vector’s array placed in the heap
  • The callee will interact with this copy
  • The copy will be destroyed when the callee returns
• Pass-by-value is expensive if the object to copy is large; use pass-by-

    //Note that f accepts the
    //vector argument by value,
    //not by reference.
    void f(std::vector<int> v) {
        printf("&v in f:\t%p\n", &v);
        v[0] = -1;
    } //f()'s v is now destroyed!

    int main() {
        std::vector<int> v = {42, 41, 40};
        printf("sizeof(v) is %zu\n", sizeof(v)); //sizeof yields a size_t, so use %zu
        printf("&v in main:\t%p\n", &v);
        f(v);
        printf("v[0] is %d\n", v[0]);
        return 0;
    }

Console output:
    sizeof(v) is 24
    &v in main: 0x7ffd292e05f0
    &v in f:    0x7ffd292e0620
    v[0] is 42

[Diagram: the address space. The stack frame for main() and the stack frame for f() each hold their own arr, size, and capacity; each arr points to a separate array in the Heap, above Static data and Code]
C++ Arrays as Function Parameters and Return Values

    void f(int arg[]) {
        arg[0] = -1;
    }

    int main() {
        int arr[2] = {99, 100};
        f(arr);
        printf("arr[0] is %d\n", arr[0]);
        return 0;
    }

Console output: arr[0] is -1

• Passing an array as a function parameter (or returning one as a return value) is equivalent to passing/returning a pointer to the first array element
  • In other words, an array is passed/returned by reference
  • So, passing/returning an array does not create a new copy of the array!
• We recommend passing/returning a pointer to an array’s first element instead of passing/returning the array itself, to make these semantics more
NEVER RETURN A POINTER TO A LOCAL VARIABLE
• The lifetime of a local variable is the lifetime of the enclosing function
• So, when the enclosing function goes away, the memory belonging to the object becomes invalid
• Trying to access that object’s memory later will result in undefined behavior!
NEVER RETURN A POINTER TO A LOCAL VARIABLE

    int* f(int arg) {
        int local = arg + 42;
        int* p = &local;
        printf("f() &local:\t%p\n", p);
        return p; //OH NO
    }

    void g(int* ptr) {
        int local = 999;
        int* p = ptr;
        printf("g() &local:\t%p\n", &local);
        printf("g() ptr:\t%p\n", ptr);
        printf("g() *ptr:\t%d\n", *ptr);
    }

    int main() {
        int* p = f(0);
        g(p);
        return 0;
    }

[Diagram, while f() runs: main()’s stack frame holds int* p; f()’s stack frame holds int local = 42 and int* p = 0x7fffcd443e94]
[Diagram, after f() returns and g() is called: main()’s int* p is 0x7fffcd443e94; g()’s stack frame occupies the memory that still holds 42 and 0x7fffcd443e94. The pointer points to an invalid object]
[Diagram, after g() initializes its local: the 42 is overwritten with 999. g()’s stack frame reuses space that was previously occupied by f()’s stack frame!]

Console output:
    f() &local: 0x7fffcd443e94
    g() &local: 0x7fffcd443e94
    g() ptr:    0x7fffcd443e94
    g() *ptr:   999 //Not 42!
NEVER RETURN A POINTER TO A LOCAL VARIABLE
• The lifetime of a local variable is the lifetime of the enclosing function
• So, when the enclosing function goes away, the memory belonging to the object becomes invalid
• Trying to access that object’s memory later will result in undefined behavior!
• This kind of error can lead to subtle bugs that

Console output (for the same f(), g(), and main()):
    f() &local: 0x7fffcd443e94
    g() &local: 0x7fffcd443e94
    g() ptr:    0x7fffcd443e94
    g() *ptr:   999 //Not 42!
How do functions really work on x86-64?
x86-64: Registers
• 16 general-purpose integer registers
• %rbp: base pointer
• %rsp: stack pointer
• %rax, %rbx, %rcx, %rdx, %rsi, %rdi, %r8, %r9, %r10,
%r11, %r12, %r13, %r14, %r15: used for various
purposes
• %rip: instruction pointer (i.e., holds the address
of the next instruction to execute)
• Incremented or decremented by more than one
instruction during jumps, function calls/returns
• Various special-purpose registers like:
• %cr3: the page table pointer
x86-64: System V Calling Convention
• Simplest case: callee arguments+retval are integers or pointers
  • Caller stores first six arguments in %rdi, %rsi, %rdx, %rcx, %r8, and %r9
  • Remaining arguments are passed via the stack
  • Callee places return value in %rax

    int64_t foo(int64_t a, int64_t b,
                int64_t c, int64_t d,
                int64_t e, int64_t f,
                int64_t g, int64_t h){
        int64_t local0 = a*b*c;
        int64_t local1 = d*e*f;
        //Rest of function . . .
    }

[Diagram: the stack, from higher to lower addresses: caller info, h, g, return addr, then the stack frame for foo() with saved %rbp (pointed to by %rbp), local0, and local1 (pointed to by %rsp). The arguments a, b, c, d, e, f are in %rdi, %rsi, %rdx, %rcx, %r8, %r9]
    //Code at start of function.
    //Save old base pointer.
    pushq %rbp
    movq %rsp, %rbp
    //Allocate local vars.
    subq $0x10, %rsp
    //Rest of function . . .

[Diagram: the same stack layout for foo(): caller info, h, g, return addr, saved %rbp (pointed to by %rbp), local0, and local1 (pointed to by %rsp), with a, b, c, d, e, f in %rdi, %rsi, %rdx, %rcx, %r8, %r9]
    int f3() {
        int f3_local = 3;
        printf("&f3_local:\t%p\n", &f3_local);
        return f3_local;
    }

    int f2() {
        int f2_local = 2;
        printf("&f2_local:\t%p\n", &f2_local);
        return f3();
    }

    int f1() {
        int f1_local = 1;
        printf("&f1_local:\t%p\n", &f1_local);
        return f2();
    }

    int main() {
        int main_local = 0;
        printf("&main_local:\t%p\n", &main_local);
        return f1();
    }

Console output:
    &main_local: 0x7ffeaa7f27cc
    &f1_local:   0x7ffeaa7f27ac
    &f2_local:   0x7ffeaa7f278c
    &f3_local:   0x7ffeaa7f276c

Recall that the stack on x86 grows downwards!
x86-64: System V Calling Convention
• If a compound value is too big to fit inside a register, the compiler will pass it via memory, or via a combination of registers and memory
• Recall that a CPU can read/write a register much faster than memory
  • Accessing a register takes ~0.5ns, whereas accessing memory takes ~60ns (i.e., ~100x slower!)
  • So, the compiler tries to place arguments in registers if possible
