It is commonly believed that it is impossible to leak memory in Rust, but that is not true. Although it is far more difficult to leak memory in Rust than in other languages, it can happen, sometimes by accident and sometimes by design. In this article we will explore some cases where leaking memory is useful.

Sharing memory between threads

Imagine that you have a value that does not change and needs to be shared across several threads. If the value can be initialized with constant functions or values, you can use the static keyword. But what if the value to be shared is not constant? What are the options?

Using Arc<T>

Arc<T> is a smart pointer that allows us to share a value safely between threads. “Arc” stands for “Atomically Reference Counted”. The value owned by the Arc can be read from different threads. Every time an Arc is cloned, the internal counter is incremented by one, and every time an Arc is dropped, it is decremented by one. Once the counter goes from 1 to 0, the resource owned by the Arc is cleaned up. For example:

use std::env;
use std::sync::Arc;
use std::thread;

fn main() {
    let name: String = env::args().collect::<Vec<String>>().remove(1);
    let person_name: Arc<String> = Arc::new(name);

    let p = person_name.clone();
    let t1 = thread::spawn(move || {
        println!("Hello {p}");
    });

    let p = person_name.clone();
    let t2 = thread::spawn(move || {
        println!("Bye {p}");
    });

    let _ = t1.join();
    let _ = t2.join();
}

This program receives a name as an argument and spawns two threads that read that value. There is no leak in this example: each thread drops its clone of the Arc when it finishes, and when main returns, the Arc it owns is the last reference to the name, so the value is cleaned up.
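The reference counting described above can be observed with Arc::strong_count. The following is a minimal sketch rather than part of the original program; the function name observe_counts and the hard-coded name are assumptions made for illustration.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: returns the strong count right after creation,
// after one clone, and after the thread holding the clone has been joined.
fn observe_counts() -> (usize, usize, usize) {
    let name = Arc::new(String::from("John"));
    let initial = Arc::strong_count(&name);

    // Cloning only increments the counter; the String itself is not copied.
    let p = Arc::clone(&name);
    let after_clone = Arc::strong_count(&name);

    // The clone is moved into the thread and dropped when the closure ends.
    let handle = thread::spawn(move || {
        println!("Hello {p}");
    });
    handle.join().unwrap();

    let after_join = Arc::strong_count(&name);
    (initial, after_clone, after_join)
}

fn main() {
    let (initial, after_clone, after_join) = observe_counts();
    println!("{initial} -> {after_clone} -> {after_join}");
}
```

The counter goes from 1 to 2 when the Arc is cloned and returns to 1 once the thread has finished and dropped its clone.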

Leaking memory with Box::leak

Instead of creating an Arc and moving its clones to where we need them, we can leak the memory with Box::leak and share the resulting reference with the threads:

use std::env;
use std::thread;

fn main() {
    let name: String = env::args().collect::<Vec<String>>().remove(1);
    let person_name: &'static String = Box::leak(Box::new(name));

    let t1 = thread::spawn(move || {
        println!("Hello {person_name}");
    });

    let t2 = thread::spawn(move || {
        println!("Bye {person_name}");
    });

    let _ = t1.join();
    let _ = t2.join();
}

On line 6 of the listing (the Box::leak call), we leak the memory, creating a 'static reference, and then move that reference into the spawned threads so that they can read the value (see footnote 1).

The code shown above is just an example. A genuinely useful case for leaking memory and reading it from threads or async tasks is a global configuration that is loaded at runtime and that we know will not change for as long as the program is running.
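As a rough illustration of that configuration use case, here is a minimal sketch. The Config type, its fields, and the load_config and run_workers functions are all hypothetical names invented for this example; a real program would fill the configuration from a file or the environment.

```rust
use std::thread;

// Hypothetical configuration type, used only for illustration.
struct Config {
    greeting: String,
    workers: usize,
}

// Builds the configuration at runtime and leaks it on purpose, so that it
// can be read from any thread for the rest of the program.
fn load_config(greeting: &str, workers: usize) -> &'static Config {
    Box::leak(Box::new(Config {
        greeting: greeting.to_string(),
        workers,
    }))
}

// Spawns `config.workers` threads that each read the shared configuration.
fn run_workers(config: &'static Config) -> Vec<String> {
    let handles: Vec<_> = (0..config.workers)
        .map(|id| thread::spawn(move || format!("{} from worker {id}", config.greeting)))
        .collect();
    // Join in spawn order so the results are returned in a stable order.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let config = load_config("Hello", 2);
    for line in run_workers(config) {
        println!("{line}");
    }
}
```

Because a shared reference is Copy, every worker gets its own copy of the &'static Config without any reference counting. The trade-off is that the allocation is never released.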

If we check this program with valgrind, we get the following report:

valgrind ./target/release/box_leak John
==691== Memcheck, a memory error detector
==691== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==691== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==691== Command: ./target/release/box_leak John
==691==
Bye John
Hello John
==691==
==691== HEAP SUMMARY:
==691==     in use at exit: 28 bytes in 2 blocks
==691==   total heap usage: 27 allocs, 25 frees, 4,229 bytes allocated
==691==
==691== LEAK SUMMARY:
==691==    definitely lost: 24 bytes in 1 blocks
==691==    indirectly lost: 4 bytes in 1 blocks
==691==      possibly lost: 0 bytes in 0 blocks
==691==    still reachable: 0 bytes in 0 blocks
==691==         suppressed: 0 bytes in 0 blocks
==691== Rerun with --leak-check=full to see details of leaked memory
==691==
==691== For lists of detected and suppressed errors, rerun with: -s
==691== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

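We can indeed see that not all the memory was freed. The 24 bytes reported as definitely lost match the size of a String on a 64-bit target (pointer, capacity, and length), and the 4 bytes reported as indirectly lost are the heap buffer holding "John".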

Foreign Function Interfaces (FFI)

Libraries created in Rust can be used from other languages (commonly C) through Foreign Function Interfaces. Other languages often do things differently than Rust does. For example, C has the “Opaque Pointer Pattern”. The objective of this pattern is to hide implementation details behind an opaque pointer. To implement it, we need a function to create the struct with the hidden implementation details, functions to operate on it, and a function to destroy it.

Suppose you want to reimplement a stack data structure, originally written in C, using Rust. Here’s the header file:

#ifndef STACK_H
#define STACK_H

#include <stdbool.h>

// This is the opaque type
typedef struct Stack Stack;

Stack *stack_create(int capacity);
void stack_destroy(Stack *stack);
bool stack_push(Stack *stack, int value);
bool stack_pop(Stack *stack, int *out);
bool stack_peek(const Stack *stack, int *out);
bool stack_is_empty(const Stack *stack);

#endif

Data representation

To re-implement the stack library, we need to represent the stack in Rust and follow the interface from the header file. For this implementation, we’ll use a Vec structure:

type Stack = Vec<i32>;

Allocating memory and leaking it

Following the interface defined by the header file, we need to create a function that takes the capacity of the stack and returns a pointer to it. This means that we need to allocate the data that represents a Stack (in this case a Vec<i32>) on the heap and return a pointer to it. This is implemented as follows:

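#[unsafe(no_mangle)]
pub extern "C" fn stack_create(capacity: std::ffi::c_int) -> *mut Stack {
    // The header declares `int capacity`, so accept a C int and treat
    // negative values as zero.
    let capacity = usize::try_from(capacity).unwrap_or(0);
    Box::into_raw(Box::new(Vec::with_capacity(capacity)))
}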

In this function we create the Vec with the capacity passed as an argument, store it inside a Box, and use Box::into_raw to consume the Box and return a raw pointer to the vector. The vector stays allocated and accessible to our program, but it is now our responsibility to make sure that the pointer still points to a valid Stack whenever we use it, and to release the memory when we no longer need it.

This is analogous to using malloc to allocate memory for the data structure and initialize it with values in C.

Reclaiming the leaked memory to release it

If we do not free the Stack allocated by stack_create, we will indeed have created a memory leak. The header file declares a function to destroy it. Here is the implementation in Rust:

#[unsafe(no_mangle)]
pub unsafe extern "C" fn stack_destroy(stack: *mut Stack) {
    if !stack.is_null() {
        let _ = unsafe { Box::from_raw(stack) };
    }
}

Here, if the stack pointer is not null, we reconstruct the Box to reclaim the ownership of the allocated memory. The created Box is immediately dropped, releasing the memory.
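The functions that operate on the stack borrow the Vec through the raw pointer instead of taking ownership of it. The following is a minimal sketch of stack_push and stack_pop, not necessarily the implementation used in the linked repository. It assumes that pushing fails only on a null pointer (the Vec grows as needed, so the capacity is just an initial allocation), and it takes the capacity as a C int to match the header. The demo function is a hypothetical driver added for illustration.

```rust
use std::ffi::c_int;

// Same representation as in the article; `c_int` is `i32` on mainstream targets.
type Stack = Vec<c_int>;

#[unsafe(no_mangle)]
pub extern "C" fn stack_create(capacity: c_int) -> *mut Stack {
    let capacity = usize::try_from(capacity).unwrap_or(0);
    Box::into_raw(Box::new(Vec::with_capacity(capacity)))
}

#[unsafe(no_mangle)]
pub unsafe extern "C" fn stack_destroy(stack: *mut Stack) {
    if !stack.is_null() {
        drop(unsafe { Box::from_raw(stack) });
    }
}

// Assumption: pushing only fails on a null pointer; the Vec grows as needed.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn stack_push(stack: *mut Stack, value: c_int) -> bool {
    if stack.is_null() {
        return false;
    }
    // Borrow the Vec through the pointer; ownership stays with the caller.
    let stack = unsafe { &mut *stack };
    stack.push(value);
    true
}

#[unsafe(no_mangle)]
pub unsafe extern "C" fn stack_pop(stack: *mut Stack, out: *mut c_int) -> bool {
    if stack.is_null() || out.is_null() {
        return false;
    }
    let stack = unsafe { &mut *stack };
    match stack.pop() {
        Some(value) => {
            // Write the popped value through the out-parameter.
            unsafe { *out = value };
            true
        }
        None => false,
    }
}

// Hypothetical driver: uses the API the way a C caller would and
// returns the popped values in the order they came off the stack.
fn demo() -> Vec<c_int> {
    let stack = stack_create(4);
    let mut popped = Vec::new();
    let mut out: c_int = 0;
    unsafe {
        stack_push(stack, 10);
        stack_push(stack, 20);
        while stack_pop(stack, &mut out) {
            popped.push(out);
        }
        stack_destroy(stack);
    }
    popped
}

fn main() {
    println!("{:?}", demo());
}
```

Only stack_create and stack_destroy transfer ownership. Every other function reborrows the pointer for the duration of the call, so the Box is rebuilt exactly once, when the stack is destroyed.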

If we compile a main.c program that uses the library and check it with valgrind, we see that there is no leak:

$ valgrind ./main
==580== Memcheck, a memory error detector
==580== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==580== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==580== Command: ./main
==580==
Top of stack: 20
Popped: 20
Popped: 10
==580==
==580== HEAP SUMMARY:
==580==     in use at exit: 0 bytes in 0 blocks
==580==   total heap usage: 3 allocs, 3 frees, 1,056 bytes allocated
==580==
==580== All heap blocks were freed -- no leaks are possible
==580==
==580== For lists of detected and suppressed errors, rerun with: -s
==580== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

We can see that there are no leaks! So, technically, we did not leak memory: the leak was only temporary from Rust's perspective, and from C's perspective we were simply allocating memory and freeing it correctly afterward.

You can find the full implementation of both C and Rust here.

Can Box::leak be used instead of Box::into_raw?

Yes, but in my opinion you should not. Semantically, using Box::from_raw to re-create a Box from the reference returned by Box::leak is incorrect because, technically, you never intended to leak the memory if you reclaim it with the objective of releasing it. Check out the following code:

fn reconstruct_and_drop(value: &mut i32) {
    let _ = unsafe { Box::from_raw(value as *mut i32) };
}

fn main() {
    let value = Box::new(42);
    let static_value: &'static mut i32 = Box::leak(value);

    println!("The value is {}", static_value);

    reconstruct_and_drop(static_value);

    println!("The value is {}", static_value);
}

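This results in output like the following. The second read happens after the memory has been freed, which is undefined behavior, so the value printed will vary: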

The value is 42
The value is 431898688

Here, we leak the value 42, and we give it a 'static lifetime. Citing the Rust Book:

But before specifying 'static as the lifetime for a reference, think about whether the reference you have actually lives the entire lifetime of your program or not, and whether you want it to

Additionally, by definition, a reference is valid throughout its entire lifetime, unlike a raw pointer.

Unlike a pointer, a reference is guaranteed to point to a valid value of a particular type for the life of that reference.

And Box::leak returns a reference that, as shown in the example above, we can invalidate. The example is very simple and the error can be spotted right away, but in a larger codebase this could become a bug that is hard to find and fix.

In situations where we need a raw pointer and know that we are going to reclaim the memory in order to release it, it is semantically correct to use the Box::into_raw / Box::from_raw pair of functions.
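For contrast with the Box::leak example, here is a minimal sketch of the Box::into_raw / Box::from_raw pair used as intended. The function name round_trip is an assumption made for illustration.

```rust
// Hands ownership to a raw pointer, uses it, and reclaims it exactly once.
fn round_trip() -> i32 {
    // No reference with a `'static` lifetime is created here, only a raw pointer.
    let raw: *mut i32 = Box::into_raw(Box::new(42));

    // While the allocation is live we may read and write through the pointer.
    unsafe { *raw += 1 };

    // Rebuild the Box exactly once; `raw` must not be used after this point.
    let boxed: Box<i32> = unsafe { Box::from_raw(raw) };

    // Copy the value out; `boxed` is dropped on return, releasing the memory.
    *boxed
}

fn main() {
    println!("The value is {}", round_trip());
}
```

Since the only handle is a raw pointer, the type itself signals that validity is the programmer's responsibility, and no safe reference is left dangling after the Box is rebuilt.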

Conclusion

Even though leaking memory can be a useful tool, it must be used with care. We can't abuse this mechanism, because it can create unintended problems, like running out of memory or consuming too much of it without a specific purpose.

But why re-implement things in Rust and use these tricks? That would require a whole article to explain, but the short answer is that even though you need to use the unsafe keyword from time to time, if it is used correctly you still get a lot of the compile-time checks and safety guarantees that Rust provides out of the box.

Lastly, a real-world scenario where this technique is used is relibc, the Redox OS re-implementation of libc in Rust (for example, the regcomp function uses Box::into_raw, and the regfree function uses Box::from_raw).

References

  1. https://marabos.nl/atomics/basics.html#shared-ownership-and-reference-counting

Footnotes

  1. You may wonder why we need the move keyword if we are creating a 'static reference. Remember that closures capture their environment by reference by default, so without move what we would really capture is a &&'static String rather than a &'static String, and the outer reference is not 'static. With move, we “move” the &'static String into the thread; but since references are Copy, the reference is simply copied into the thread and remains usable outside it.