Undefined Behavior

Undefined Behavior, often abbreviated UB, is the result of a violated assumption of the compiler or optimizer. UB can usually only be encountered if the program or its dependencies use unsafe code incorrectly.

There are a few known bugs that can cause UB without unsafe code in edge cases; they are labeled I-unsound on GitHub. However, you are very unlikely to encounter these bugs on a day-to-day basis.

Example
Consider using an unsafe function to convert an integer that is neither 0 nor 1 to a bool. The problem is that Rust assumes that any bool is either true (represented as 1) or false (represented as 0). This means that such an integer is an invalid value for bool: it is neither true nor false.

As a result, the compiler is free to assume that the code will never run, and it can optimize accordingly, which can have drastic consequences in other parts of the code. This series of articles explains the impacts of UB; it focuses on C but the points also apply to Rust.
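The following is a minimal sketch of the situation described above. It assumes that the unsafe function in question is std::mem::transmute and uses the byte 2 purely as an illustrative invalid value; both are assumptions for the sake of the example. The line that would trigger UB is left commented out, and the hypothetical helper byte_to_bool shows a checked conversion instead.

```rust
/// Hypothetical helper: converts a byte to a bool only if the byte holds
/// one of the two valid bool bit patterns (0 or 1).
fn byte_to_bool(byte: u8) -> Option<bool> {
    match byte {
        0 => Some(false),
        1 => Some(true),
        // Any other bit pattern is not a valid bool, so refuse to convert.
        _ => None,
    }
}

// The following line would be undefined behavior, because 2 is not a
// valid bit pattern for bool (illustrative only, do not uncomment):
//
//     let b: bool = unsafe { std::mem::transmute::<u8, bool>(2) };
```

With the checked helper, an out-of-range byte yields None instead of an invalid bool, so the compiler's assumption about bool is never violated.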

Consequences of Undefined Behavior
The compiler assumes that UB can never happen, and optimizes code under this assumption. However, optimizations that are correct in the absence of UB might be nonsensical and even dangerous if UB is present. What exactly happens is not specified and can't be predicted.

For example, the optimizer is allowed to inline code and move code around, to unroll loops, replace arithmetic with bitwise operations, remove "dead code" and much more. So it is possible that code exhibiting UB is removed, or does something else entirely. This is sometimes humorously called "eating your laundry" or "nasal demons".

Difference between Undefined Behavior and Contract Violations
Library types can define their own contracts that must be upheld.

For example, Vec has the contract that its length must never be greater than its capacity. Its implementation ensures that this never happens, and relies on it within unsafe blocks. If the contract were violated, the unsafe code could cause UB.

Contract Violations are logical errors, but they can cause UB when unsafe code is involved. If a function doesn't ensure that contracts are upheld, and this can cause UB down the line, then that function should be marked unsafe. Furthermore, fields with a contract should be private.
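As a rough sketch of these rules, consider the hypothetical type SmallStack below (not a standard library type). Its contract is that len never exceeds CAPACITY. The fields are private, the safe push method checks the contract, and the one method that could break it is marked unsafe.

```rust
/// Hypothetical fixed-capacity stack. Contract: `len <= CAPACITY`.
pub struct SmallStack {
    // Private fields: safe code outside this type cannot break the contract.
    items: [u8; SmallStack::CAPACITY],
    len: usize,
}

impl SmallStack {
    pub const CAPACITY: usize = 4;

    pub fn new() -> Self {
        SmallStack { items: [0; Self::CAPACITY], len: 0 }
    }

    /// Safe: checks the contract before changing `len`.
    pub fn push(&mut self, value: u8) -> Result<(), u8> {
        if self.len == Self::CAPACITY {
            return Err(value);
        }
        self.items[self.len] = value;
        self.len += 1;
        Ok(())
    }

    /// Unsafe: the caller must guarantee `new_len <= CAPACITY`,
    /// because `last` relies on that without checking it.
    pub unsafe fn set_len(&mut self, new_len: usize) {
        self.len = new_len;
    }

    /// Relies on the contract to skip the bounds check.
    pub fn last(&self) -> Option<u8> {
        if self.len == 0 {
            return None;
        }
        // SAFETY: the contract guarantees `len - 1 < CAPACITY`.
        Some(unsafe { *self.items.get_unchecked(self.len - 1) })
    }
}
```

Calling set_len with a value larger than CAPACITY would not be UB by itself, but a later call to last would then read out of bounds, which is why set_len has to be unsafe.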

Contract Violations somewhat overlap with invalid values. For example, a bool can only have the values true and false; this is called a validity invariant. Note that this is not just relied on by the API, but by the language itself, since bool is a primitive type. Therefore, an invalid value is illegal to produce (even if it is never used). On the other hand, if a library type such as Vec has a violated contract, it causes UB only when it is used (e.g. when the Vec is indexed).

Another example is the str type, which is a slice of text that must be valid UTF-8. However, it has the same validity invariants as [u8]; the encoding is "just" an API contract. This means it is not UB to produce a str that isn't valid UTF-8. However, using an incorrectly encoded str can cause UB.
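A short sketch of how safe code upholds the UTF-8 contract: the standard library function std::str::from_utf8 validates the bytes and returns an error for invalid input. The helper name checked_text is hypothetical.

```rust
/// Hypothetical helper: returns the text if `bytes` is valid UTF-8,
/// or None otherwise.
fn checked_text(bytes: &[u8]) -> Option<&str> {
    // `from_utf8` validates the encoding, so the returned `str`
    // always upholds the UTF-8 contract.
    std::str::from_utf8(bytes).ok()
}

// By contrast, `std::str::from_utf8_unchecked` is an unsafe function:
// it skips validation, and the caller is responsible for the contract.
```

For example, checked_text(b"hello") yields the text, while a byte sequence such as [0xff, 0xfe] is rejected.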

Behavior considered undefined
This list is taken from the Reference.

Data races
Rust assumes that data races can never occur; operations such as accessing a mutable static are therefore unsafe.

Data races may also happen when transmuting a shared reference to an exclusive reference. You should never do that.
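One common way to share mutable state between threads without a data race is an atomic type. The sketch below uses AtomicUsize from the standard library instead of a mutable static; the function name count_in_parallel is hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: increments one shared counter from several threads
/// and returns the final value.
fn count_in_parallel(threads: usize, increments_per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments_per_thread {
                    // Atomic read-modify-write: no data race, no unsafe.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    // Joining each thread makes its increments visible to this thread.
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    counter.load(Ordering::Relaxed)
}
```

Because every increment is atomic, no update is lost regardless of how the threads interleave.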

Dereferencing a dangling/unaligned raw pointer
Raw pointers can be dereferenced with the * prefix operator. Doing this is UB if the pointer is dangling (i.e. no longer valid) or unaligned.
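The sketch below shows two ways to stay within these rules. The first function dereferences a raw pointer that was derived from a live reference, so it is neither dangling nor unaligned. The second reads a u32 from an arbitrary byte offset with std::ptr::read_unaligned, which has no alignment requirement. Both function names are hypothetical.

```rust
use std::ptr;

/// Dereferences a raw pointer derived from a live reference.
fn read_through_raw_pointer(value: &i32) -> i32 {
    let pointer: *const i32 = value;
    // SAFETY: `pointer` comes from a reference that is still alive,
    // so it is valid and correctly aligned.
    unsafe { *pointer }
}

/// Reads a native-endian u32 starting at `offset`, which may be unaligned.
fn read_u32_at(bytes: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    if end > bytes.len() {
        return None;
    }
    // SAFETY: the range `offset..offset + 4` is in bounds (checked above),
    // and `read_unaligned` does not require an aligned pointer.
    let value = unsafe { ptr::read_unaligned(bytes.as_ptr().add(offset).cast::<u32>()) };
    Some(value)
}
```

Using the * operator in read_u32_at instead of read_unaligned would be UB whenever the offset is not a multiple of the alignment of u32.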

Breaking LLVM's pointer aliasing rules
Rust uses LLVM as its compiler backend. References (&T and &mut T) follow LLVM's scoped noalias model, except if the &T contains an UnsafeCell<U>.

Specifically, exclusive references must not alias: While an exclusive reference is valid, no other reference/pointer may access the data it points to.
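When two exclusive references into the same slice are needed at once, the standard library method split_at_mut hands out two non-overlapping halves, so neither reference aliases the other. A small sketch, with the hypothetical function name swap_ends:

```rust
/// Swaps the first and last element using two exclusive references
/// that are guaranteed not to overlap.
fn swap_ends(values: &mut [i32]) {
    if values.len() < 2 {
        return;
    }
    // `head` covers index 0, `tail` covers the rest: the two exclusive
    // references point to disjoint memory.
    let (head, tail) = values.split_at_mut(1);
    let last = tail.len() - 1;
    std::mem::swap(&mut head[0], &mut tail[last]);
}
```

Creating the two exclusive references by casting raw pointers to overlapping memory instead would break the aliasing rule stated above.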

Mutating immutable data
All data inside a constant is immutable. Moreover, all data reached through a shared reference or data owned by an immutable binding is immutable, unless that data is contained within an UnsafeCell<U>.
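The standard library type Cell is a safe wrapper built on UnsafeCell, so it can be mutated through a shared reference without UB. The sketch below uses it in a hypothetical type ReadCounter that counts how often its value is read.

```rust
use std::cell::Cell;

/// Hypothetical type that counts how often its value is read.
struct ReadCounter {
    value: i32,
    // Interior mutability: may change even through a shared reference.
    reads: Cell<u32>,
}

impl ReadCounter {
    fn new(value: i32) -> Self {
        ReadCounter { value, reads: Cell::new(0) }
    }

    /// Takes `&self`, yet updates the counter through the `Cell`.
    fn get(&self) -> i32 {
        self.reads.set(self.reads.get() + 1);
        self.value
    }

    fn reads(&self) -> u32 {
        self.reads.get()
    }
}
```

The plain field value, by contrast, must not be modified through &self; doing so with raw pointer casts would be mutation of immutable data.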

Executing code compiled with platform features that the current platform does not support
See the target_feature attribute.
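A sketch of the usual pattern for avoiding this kind of UB, assuming an x86 or x86_64 target for the specialized path: code compiled with an extra feature is only called after a runtime check with the standard macro is_x86_feature_detected!, and every other platform uses a portable fallback. The function names and the choice of AVX2 are illustrative.

```rust
/// Portable fallback that works on every platform.
fn sum_fallback(values: &[f32]) -> f32 {
    values.iter().sum()
}

/// Same computation, but compiled with AVX2 enabled.
/// Calling this on a CPU without AVX2 is undefined behavior.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(values: &[f32]) -> f32 {
    values.iter().sum()
}

/// Safe entry point: picks the AVX2 version only if the CPU supports it.
fn sum(values: &[f32]) -> f32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: the runtime check above confirmed AVX2 support.
            return unsafe { sum_avx2(values) };
        }
    }
    sum_fallback(values)
}
```

Callers only ever use sum, so the unsafe precondition of sum_avx2 is checked in exactly one place.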

Producing an invalid value
"Producing" a value happens any time a value is assigned to or read from a place, passed to a function/primitive operation or returned from a function/primitive operation. Values must always be valid, even if they're unused or in a private field.

The following values are invalid (at their respective type):

 * A value other than false (0) or true (1) in a bool.
 * A discriminant in an enum not included in the type definition.
 * A null fn pointer.
 * A value in a char which is a surrogate or above char::MAX.
 * A ! (all values are invalid for this type).
 * An integer (i*/u*), floating point value, or raw pointer obtained from uninitialized memory, or uninitialized memory in a str.
 * A reference or Box<T> that is dangling, unaligned, or points to an invalid value.
 * Invalid metadata in a wide reference, Box<T>, or raw pointer:
    * dyn Trait metadata is invalid if it is not a pointer to a vtable for Trait that matches the actual dynamic trait the pointer or reference points to.
    * Slice metadata is invalid if the length is not a valid usize (i.e., it must not be read from uninitialized memory).
 * Invalid values for a type with a custom definition of invalid values. In the standard library, this affects NonNull<T> and NonZero*.
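For several of the types in this list, the standard library offers checked constructors that return None instead of producing an invalid value. The sketch below wraps three of them (char::from_u32, NonZeroU32::new and NonNull::new); the wrapper names are hypothetical and exist only to make the behavior easy to see.

```rust
use std::num::NonZeroU32;
use std::ptr::NonNull;

/// Returns None for surrogates and for values above char::MAX.
fn checked_char(code_point: u32) -> Option<char> {
    char::from_u32(code_point)
}

/// Returns None for zero, which is not a valid NonZeroU32.
fn checked_non_zero(value: u32) -> Option<NonZeroU32> {
    NonZeroU32::new(value)
}

/// Returns None for a null pointer, which is not a valid NonNull.
fn checked_non_null(pointer: *mut i32) -> Option<NonNull<i32>> {
    NonNull::new(pointer)
}
```

Each of these types also has an unsafe unchecked constructor; using one of those with an invalid input produces an invalid value and is therefore UB.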