Ownership and move semantics are among the features that make Rust unique. To follow this topic, you need a basic understanding of what the Stack and the Heap are. I wrote a post about that! You can check it out if you need a refresher on those concepts. This feature takes some getting used to because it forces you to think about things that you didn’t have to worry about in other languages. Enough introduction, let’s cut to the chase!
GDB
In this post, I am going to explore what is happening in memory using the GNU Debugger (gdb), launched through the rust-gdb command:
$ rust-gdb ./target/debug/move_semantics
I am going to use the x command a lot to examine the stack, together with the $sp value (the Stack Pointer).
The three rules of ownership
There are three rules that govern the ownership system:
- Every initialized value has an owner: Every initialized value has a variable that is its owner.¹
- There is only one owner per value: You can’t have two or more variables that own the same value in memory. You can’t share ownership between variables.²
- If a variable’s scope ends, its value gets freed: When a scope ends, all values owned by variables contained in that scope get automatically freed.
¹ But not every variable owns a value; some just hold a reference. I’ll talk about this in the “References and Borrowing” article.
² Actually, you can have more than one owner in safe Rust, but you have to use special types such as Rc (the multiple owners do not own the value directly, though).
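To make the second footnote a bit more concrete, here is a minimal sketch of shared ownership with Rc from the standard library. The helper name shared_owner_count is made up for this illustration.

```rust
use std::rc::Rc;

// Hypothetical helper for this illustration: returns how many Rc handles
// share the same heap-allocated String.
fn shared_owner_count() -> usize {
    let first = Rc::new(String::from("hello"));
    // Rc::clone does not copy the String; it only bumps a reference count.
    let second = Rc::clone(&first);
    // Both handles see the same text.
    assert_eq!(*first, *second);
    Rc::strong_count(&first)
}

fn main() {
    println!("owners: {}", shared_owner_count());
}
```

The String is only freed once the last Rc handle goes away, which is why the handles are said to share ownership rather than own the value directly.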
Let’s test the rules! But before that, a little reminder of how the String type is represented in memory:
where:
- ptr: A pointer to the first address on the Heap that contains the string itself (in this case hello).
- len: How much memory, in bytes, the contents of the string are currently using.
- capacity: The total amount of memory, in bytes, allocated for that string.
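These three fields can also be inspected from Rust itself. The following is a small sketch using the standard as_ptr, len and capacity methods; the helper name string_parts and the requested capacity of 16 are arbitrary choices for this example.

```rust
// Hypothetical helper for this illustration: returns (len, capacity)
// of a small String after printing its heap pointer.
fn string_parts() -> (usize, usize) {
    // Ask for room for at least 16 bytes, then use only 5 of them.
    let mut s = String::with_capacity(16);
    s.push_str("hello");
    // as_ptr() exposes the Heap address stored in the ptr field.
    println!("ptr = {:p}", s.as_ptr());
    (s.len(), s.capacity())
}

fn main() {
    let (len, capacity) = string_parts();
    println!("len = {}, capacity = {}", len, capacity);
}
```

Here len is 5 because hello takes 5 bytes, while capacity is at least the 16 bytes that were requested.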
Rule 1: Every initialized value has an owner.
Consider the following code:
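A minimal sketch of that program, pieced together from the compiler warning and the GDB session in this section, could look like this. The main function that simply calls hello_world is my assumption.

```rust
fn hello_world() {
    String::from("hello! I am a free initialized String!");
    println!("{}", 42);
}

// Assumption: main does nothing but call hello_world.
fn main() {
    hello_world();
}
```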
In the hello_world function, we have an initialized String value that is free (not assigned to a variable). Did Rust initialize the value in memory or just ignore it? We can’t use it, so why would Rust keep it? Let’s check what happens! When we compile this code we get the following warning:
warning: unused return value of `from` that must be used
--> src/main.rs:2:5
|
2 | String::from("hello! I am a free initialized String!");
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unused_must_use)]` on by default
Rust warns us that we should use the value returned by the String::from function; otherwise, we can’t access it in any way. What happens in memory? Let’s check it out with GDB!
First, we set a breakpoint at the beginning of the hello_world function and execute the String initialization:
Breakpoint 1, move_semantics::hello_world () at src/main.rs:2
2 String::from("hello! I am a free initialized String!");
(gdb) n
3 println!("{}", 42);
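At this point, the String expression has been evaluated, but its result was never bound to a variable. It was only a temporary, and Rust drops temporaries at the end of the statement that created them. Let’s check what is left on the stack: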
[GDB stack dump omitted]
It seems that the String value is there: lines 4 to 6 look like our initialized value. Memory addresses 0x7fffffffd998 and 0x7fffffffd9a0 (lines 5 and 6) hold the value 38, and the string happens to be 38 bytes long. 0x7fffffffd990 (line 4) must hold the Heap address where the actual text is allocated! Let’s see what’s inside that memory address.
First, print the stored pointer in hexadecimal:
(gdb) x/xg 0x7fffffffd990
0x7fffffffd990: 0x00005555555a5a10
Then, explore what’s inside that address!
[GDB heap dump omitted]
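Our String is mostly there! But it appears that the beginning of it was overwritten. That makes sense: the temporary was dropped at the end of the statement, so its buffer has already been handed back to the allocator, which is free to reuse those bytes (most likely for its own bookkeeping). No variable owns that value and we can’t access it, so it doesn’t matter what happens to it.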
NOTE: This memory exploration was done using a debug build. I am not really sure what happens when this code is compiled in release mode. I believe the optimizer skips creating the value altogether, because it is never used.
Rule 2: There’s only one owner per value
Consider the following code:
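A sketch of that program, pieced together from the compiler error in this section, might look like the following. The offending line is commented out so that the sketch compiles, and the first two lines plus the final println! are my assumptions.

```rust
fn main() {
    // Assumption: everything not quoted by the compiler error is a guess.
    let s1 = String::from("hello world!");
    // Move ownership from s1 to s2
    let s2 = s1;
    // Oops! compiler error, the value has been moved!
    // println!("{}", s1); // uncomment to reproduce error[E0382]
    println!("{}", s2);
}
```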
When we try to compile this, we get:
error[E0382]: borrow of moved value: `s1`
 --> src/main.rs:7:20
  |
3 |     let s1 = String::from("hello world!");
  |         -- move occurs because `s1` has type `String`, which does not implement the `Copy` trait
4 |     // Move ownership from s1 to s2
5 |     let s2 = s1;
  |              -- value moved here
6 |     // Oops! compiler error, the value has been moved!
7 |     println!("{}", s1);
  |                    ^^ value borrowed here after move
What is happening here is that the ownership of the String "hello world!" is transferred from s1 to s2. Because of that, the compiler invalidates any further access to s1.
The value was moved because the String type does not implement the Copy trait. Copy is meant for types that live entirely on the stack and can be duplicated by simply copying bits without much overhead (duplicating data on the Heap is much more involved). When a type implements the Copy trait, it has “copy semantics” instead of “move semantics”. This is usually the case for primitive types:
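A minimal sketch of such a program, consistent with the output shown in this section, could be the following. The variable names and the helper name copy_example are my own choices.

```rust
// Hypothetical helper for this illustration: copies an integer and
// returns both variables to show that each one is still usable.
fn copy_example() -> (i32, i32) {
    let a = 42;
    println!("{}", a);
    // i32 implements Copy, so the bits are duplicated and a stays valid.
    let b = a;
    println!("{} {}", a, b);
    (a, b)
}

fn main() {
    copy_example();
}
```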
If we run this code…
cargo run
Compiling move_semantics v0.1.0 (/home/rust/blog)
Finished dev [unoptimized + debuginfo] target(s) in 0.30s
Running `target/debug/move_semantics`
42
42 42
It compiles, because the value 42 is copied!
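Copy semantics are not limited to primitives. As a rough sketch, a struct whose fields are all Copy can opt in with a derive, and a String can still be duplicated explicitly with clone. The Point type and the helper name below are made up for this example.

```rust
// Hypothetical type for this illustration: every field is Copy,
// so the struct itself can opt in to copy semantics.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical helper: returns true if cloning a String leaves the
// original usable and gives the clone its own Heap buffer.
fn clone_keeps_original() -> bool {
    let s1 = String::from("hello world!");
    // clone() allocates a new Heap buffer and copies the bytes into it.
    let s2 = s1.clone();
    // Same contents, different Heap addresses.
    s1 == s2 && s1.as_ptr() != s2.as_ptr()
}

fn main() {
    let p1 = Point { x: 1, y: 2 };
    let p2 = p1; // copied, not moved: p1 is still valid
    println!("{:?} {:?} {}", p1, p2, p1.x + p2.y);
    println!("{}", clone_keeps_original());
}
```

The difference is that a Copy happens implicitly on assignment, while clone has to be called by hand because it may be expensive.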
Rule 3: If a variable’s scope ends its value gets freed
Consider the following code:
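A sketch of that program, laid out so that the two println! calls land on lines 4 and 7 as in the GDB session of this section, could look like this. Everything else about the layout is my assumption.

```rust
fn main() {
    {
        let s1 = String::from("hello world!");
        println!("{}", s1);
    } // s1 goes out of scope here
    // Assumption: this comment only pads the listing so the next line is line 7.
    println!("Checking drop with gdb!");
}
```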
The allocation behind s1 will have been freed by the time we reach line 7. This is because the curly braces at the beginning of the main function create a new scope. Once execution reaches the end of that scope, all the variables declared in it get dropped. Let’s check it out in GDB:
On line 4, we can find s1 among the local variables of the scope:
Breakpoint 1, move_semantics::main () at src/main.rs:4
4 println!("{}", s1);
(gdb) info locals
s1 = "hello world!"
Let’s check where the Heap allocation of s1 is and what value it contains (remember that the first field of the Stack representation is the pointer to the Heap):
(gdb) p &s1
$1 = (*mut alloc::string::String) 0x7fffffffd960
(gdb) x/xg 0x7fffffffd960
0x7fffffffd960: 0x00005555555a5ad0
(gdb) x/12c 0x00005555555a5ad0
0x5555555a5ad0: 104 'h' 101 'e' 108 'l' 108 'l' 111 'o' 32 ' ' 119 'w' 111 'o'
0x5555555a5ad8: 114 'r' 108 'l' 100 'd' 33 '!'
But when the scope finishes…
7 println!("Checking drop with gdb!");
(gdb) info locals
No locals.
(gdb) x/12c 0x00005555555a5ad0
0x5555555a5ad0: 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000'
0x5555555a5ad8: 16 '\020' 80 'P' 90 'Z' 85 'U'
All the local variables were dropped and the memory they occupied was freed! That part of the Heap is now filled with something else (probably garbage).
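The same thing can be observed without a debugger by implementing the Drop trait. This is only a sketch: the Noisy type and the log it writes to are invented for this example.

```rust
use std::cell::RefCell;

// Hypothetical type for this illustration: records its name when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the events in the order in which they happened.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Noisy { name: "a", log: &log };
        let _b = Noisy { name: "b", log: &log };
        log.borrow_mut().push("end of inner scope");
    } // _b is dropped first, then _a (reverse declaration order)
    log.borrow_mut().push("after inner scope");
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```

Both values are dropped exactly at the closing brace of the inner scope, before the code after it runs.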
Moving a value: What happens under the hood?
Consider the following code:
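A minimal sketch of that code, based on the function name and the local variables that show up in the GDB session of this section, might be the following. The final println! and the main function are my assumptions.

```rust
fn move_stack_example() {
    let s1 = String::from("hello world!");
    // Move ownership from s1 to s2
    let s2 = s1;
    // Assumption: s2 is printed so that it counts as used.
    println!("{}", s2);
}

// Assumption: main does nothing but call move_stack_example.
fn main() {
    move_stack_example();
}
```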
When we move a value, it does not disappear from memory. Instead, the part of the value that lives on the stack gets copied to the new location, and the compiler simply forbids us from using the old variable afterwards.
Let’s verify what I just said with GDB. We are going to examine the stack frame of the move_stack_example function. First of all, let’s check the local variables:
(gdb) info locals
s2 = "hello world!"
s1 = "hello world!"
Whoa! Looks like s1 and s2 have the same value! Actually, they are pointing to the same value. Let’s now see what the addresses of s1 and s2 are:
(gdb) p &s1
$1 = (*mut alloc::string::String) 0x7fffffffdb08
(gdb) p &s2
$2 = (*mut alloc::string::String) 0x7fffffffdb20
Great! Now we know that s1’s stack representation starts at 0x7fffffffdb08 and s2’s starts at 0x7fffffffdb20. Let’s now see the contents of the stack frame:
[GDB stack frame dump omitted]
What do we have at the s1 and s2 addresses? Let’s check it out:
- ptr:
  - For s1, this value is at 0x7fffffffdb08.
  - For s2, this value is at 0x7fffffffdb20.
- len:
  - For s1, this value is at 0x7fffffffdb10.
  - For s2, this value is at 0x7fffffffdb28.
- capacity:
  - For s1, this value is at 0x7fffffffdb18.
  - For s2, this value is at 0x7fffffffdb30.
As you can see, both ptr values are the same, meaning that both variables are pointing to the same data in the Heap. Let’s print them in hexadecimal to get the correct format to explore it:
(gdb) x/xg 0x7fffffffdb08
0x7fffffffdb08: 0x00005555555a5ad0
(gdb) x/xg 0x7fffffffdb20
0x7fffffffdb20: 0x00005555555a5ad0
So, the ptr value is 0x00005555555a5ad0! Now, take a look at the contents of that address in the Heap:
(gdb) x/12c 0x00005555555a5ad0
0x5555555a5ad0: 104 'h' 101 'e' 108 'l' 108 'l' 111 'o' 32 ' ' 119 'w' 111 'o'
0x5555555a5ad8: 114 'r' 108 'l' 100 'd' 33 '!'
The hello world! string is there!
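The GDB session above can be cross-checked from Rust itself. As a small sketch, we can record the Heap pointer before the move and compare it with the pointer held by the new owner. The helper name is made up for this example.

```rust
// Hypothetical helper for this illustration: returns true if the Heap
// pointer is unchanged after the String is moved to a new variable.
fn heap_pointer_survives_move() -> bool {
    let s1 = String::from("hello world!");
    // as_ptr() returns a raw pointer, so it does not keep s1 borrowed.
    let before = s1.as_ptr();
    // Only the stack part (ptr, len, capacity) is copied by the move.
    let s2 = s1;
    let after = s2.as_ptr();
    before == after
}

fn main() {
    println!("same heap buffer: {}", heap_pointer_survives_move());
}
```

No bytes on the Heap are copied or reallocated by the move; only the three stack fields change location.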
Conclusion
It can take some time to get used to working with ownership and move semantics but, in my opinion, that time is well invested. Managing memory manually (by allocating and freeing it) is not an easy task and is a common source of bugs. With Rust’s approach, many of those bugs are caught at compile time, before the program ever runs!
If you want to read more about this topic, check out the Rust book.