String Types

Strings are types that represent textual data. There are several string types that have slightly different purposes.

The most common string type, `str`, is a primitive type and is used for string literals. Other string types in the standard library are `OsStr` and `CStr`. All of these types are dynamically sized and can therefore only exist behind a reference or a pointer-like type (e.g. `Box<str>`). However, they have owned counterparts, `String`, `OsString` and `CString`, which are allocated on the heap and are `Sized`. This is similar to the distinction between `[T]` and `Vec<T>`.
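The relationship between the borrowed, boxed and owned forms can be sketched as follows. The helper names `greet` and `freeze` are hypothetical and exist only for illustration.

```rust
/// Builds an owned, growable `String` from a borrowed `&str`.
fn greet(name: &str) -> String {
    let mut owned = String::from("hello, ");
    owned.push_str(name);
    owned
}

/// Moves a `String` into a `Box<str>`, a pointer-like type that owns
/// a dynamically sized `str` on the heap.
fn freeze(s: String) -> Box<str> {
    s.into_boxed_str()
}

fn main() {
    // A string literal is a `&'static str`: a reference to unsized data.
    let literal: &str = "world";

    let owned: String = greet(literal);
    let boxed: Box<str> = freeze(owned);

    // Both forms can be borrowed as `&str` again.
    let borrowed: &str = &boxed;
    println!("{}", borrowed);

    // Slices and vectors are related in the same way.
    let slice: &[u8] = &[1, 2, 3];
    let vector: Vec<u8> = slice.to_vec();
    println!("{:?}", vector);
}
```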

Strings are represented as slices of bytes; however, Rust guarantees that strings are always valid. For instance, the `str` and `String` types must be valid UTF-8; violating this rule causes undefined behavior.

`str` and `String`
`str` and `String` are types containing UTF-8 encoded text. This means that characters have a variable width: a character can be between 1 and 4 bytes long. Therefore, these strings can't be indexed by position, and slicing a string to get a substring uses byte offsets. Slicing a string in the middle of a multi-byte character causes a panic.
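A minimal sketch of slicing by byte offsets. The helper `try_slice` is hypothetical; it uses `str::get`, which returns `None` instead of panicking when a range does not fall on character boundaries.

```rust
/// Returns the substring for a byte range, or `None` if the range is out
/// of bounds or cuts through a multi-byte character.
fn try_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

fn main() {
    // U+00F1 (n with tilde) is encoded in two bytes, at offsets 1 and 2.
    let s = "a\u{f1}b";

    // Ranges that fall on character boundaries work as expected.
    assert_eq!(&s[0..1], "a");
    assert_eq!(&s[1..3], "\u{f1}");

    // `&s[0..2]` would panic, because offset 2 is inside a character.
    assert!(!s.is_char_boundary(2));
    assert_eq!(try_slice(s, 0, 2), None);
}
```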

The `len()` method returns the byte length of the string; this is not the same as the number of characters. To iterate over the characters of a string, the `chars()` and `char_indices()` methods can be used.
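The difference between byte length and character count can be illustrated with a short sketch. The helper `char_count` is a hypothetical name.

```rust
/// Counts the `char`s (Unicode scalar values) in a string.
fn char_count(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    // U+00E9 (e with acute accent) takes two bytes in UTF-8.
    let s = "h\u{e9}llo";

    assert_eq!(s.len(), 6); // bytes
    assert_eq!(char_count(s), 5); // chars

    // `char_indices` yields each char together with its byte offset.
    for (offset, c) in s.char_indices() {
        println!("{} starts at byte {}", c, offset);
    }
}
```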

A `String` or `&str` can be created from a `Vec<u8>` or `&[u8]` with the `String::from_utf8` and `str::from_utf8` family of functions, which validate the input.
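A small sketch of the validating conversions, assuming the input bytes come from an untrusted source. The helper `decode` is hypothetical.

```rust
use std::str;

/// Borrows the bytes as a `&str` if they are valid UTF-8.
fn decode(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

fn main() {
    assert_eq!(decode(b"hello"), Some("hello"));

    // 0xff never occurs in valid UTF-8, so validation fails.
    assert_eq!(decode(&[0xff, 0xfe]), None);

    // The owned variant consumes a `Vec<u8>` without copying it.
    let euro = String::from_utf8(vec![0xe2, 0x82, 0xac]).unwrap();
    assert_eq!(euro, "\u{20ac}");
}
```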

A note about Unicode
In Rust, a `char` is a Unicode scalar value. This is similar to, but not the same as, a Unicode code point: a `char` can't be a high or low surrogate. This means that converting a `u32` to a `char` can fail.
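The fallible conversion can be sketched with `char::from_u32`. The wrapper `scalar` is a hypothetical helper.

```rust
/// Converts a number to a `char` if it is a Unicode scalar value.
fn scalar(n: u32) -> Option<char> {
    char::from_u32(n)
}

fn main() {
    assert_eq!(scalar(0x41), Some('A'));

    // Surrogates (U+D800 to U+DFFF) are code points but not scalar values.
    assert_eq!(scalar(0xD800), None);

    // Values above U+10FFFF are not code points at all.
    assert_eq!(scalar(0x110000), None);

    // The opposite direction always succeeds.
    assert_eq!(u32::from('A'), 0x41);
}
```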

A `char` is not the same as a character in the general sense. When talking about characters, we often mean graphemes, which can consist of multiple `char`s. To iterate over or count the graphemes of a string, external crates such as `unicode-segmentation` should be used. Note that graphemes don't necessarily correspond to what is displayed as a visual unit, since that depends on the text rendering pipeline. For example, fonts can define ligatures, and these can be turned on and off with font features.

Also note that strings that are semantically equal don't necessarily have the same byte representation. Often there are multiple ways to represent a single grapheme, and sometimes different graphemes should be treated as equal. Therefore, when comparing or sorting strings, they should be normalized beforehand to produce correct results. This can be done with the `unicode-normalization` crate, for example.

`OsStr` and `OsString`
`OsStr` and `OsString` are used when interfacing with platform-specific APIs. Their format varies by system and is therefore not exposed to the programmer. One can infallibly convert a `String` to an `OsString`. The reverse is not guaranteed, as an `OsString` may contain values that are not representable in UTF-8, so the conversion can be done fallibly (`into_string`) or lossily (`to_string_lossy`).
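A minimal sketch of these conversions, using only valid UTF-8 input so that it behaves the same on every platform. The helper `to_os` is hypothetical.

```rust
use std::ffi::{OsStr, OsString};

/// Infallibly converts a `String` to an `OsString`.
fn to_os(s: String) -> OsString {
    OsString::from(s)
}

fn main() {
    let os = to_os(String::from("file.txt"));

    // Fallible: on failure, the original `OsString` is handed back.
    let back: Result<String, OsString> = os.clone().into_string();
    assert_eq!(back, Ok(String::from("file.txt")));

    // Lossy: invalid sequences would be replaced with U+FFFD.
    assert_eq!(os.to_string_lossy(), "file.txt");

    // The borrowed form offers a fallible conversion to `&str`.
    let borrowed: &OsStr = os.as_os_str();
    assert_eq!(borrowed.to_str(), Some("file.txt"));
}
```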

`CStr` and `CString`
`CStr` and `CString` are constrained by C language requirements. Namely, they are terminated by a nul byte and can't contain any other nul bytes. They can be created with the `CString::new` method, which fails if the input contains an interior nul byte.

Note that C strings are not constrained to UTF-8. This means that when a `CStr` is converted to a `&str`, it must be validated.
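The nul-byte and UTF-8 constraints can be sketched as follows. The helper `to_c` is a hypothetical name.

```rust
use std::ffi::{CStr, CString};

/// Creates a C string, or `None` if the input contains a nul byte.
fn to_c(s: &str) -> Option<CString> {
    CString::new(s).ok()
}

fn main() {
    // The terminating nul byte is appended automatically.
    let c = to_c("hello").unwrap();
    assert_eq!(c.as_bytes_with_nul(), &b"hello\0"[..]);

    // An interior nul byte is rejected.
    assert!(to_c("he\0llo").is_none());

    // Converting back to `&str` validates UTF-8.
    let borrowed: &CStr = c.as_c_str();
    assert_eq!(borrowed.to_str(), Ok("hello"));

    // A C string may hold bytes that are not valid UTF-8.
    let raw = CString::new(vec![0xff]).unwrap();
    assert!(raw.to_str().is_err());
}
```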

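`Deref`
Owned strings can be dereferenced to their borrowed counterparts:

 * `String` implements `Deref<Target = str>`
 * `OsString` implements `Deref<Target = OsStr>`
 * `CString` implements `Deref<Target = CStr>`

This means that `str` methods are also available for `String` because of auto-deref. For example, `String::new().chars()` is equivalent to `String::new().deref().chars()`.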
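A short sketch of auto-deref and deref coercion in practice. The function `shout` is hypothetical and only accepts `&str`.

```rust
use std::ops::Deref;

/// Takes a borrowed string; owned strings can be passed by reference.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let owned = String::from("abc");

    // `chars` is defined on `str`; both calls resolve to the same method.
    assert_eq!(owned.chars().count(), owned.deref().chars().count());

    // Deref coercion turns `&String` into `&str` at the call site.
    assert_eq!(shout(&owned), "ABC");

    // The dereference can also be written explicitly.
    let explicit: &str = &*owned;
    assert_eq!(explicit, "abc");
}
```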

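`AsRef`
All strings implement the `AsRef` trait to convert them to the borrowed variant:

 * `str` and `String` implement `AsRef<str>`
 * `OsStr` and `OsString` implement `AsRef<OsStr>`
 * `CStr` and `CString` implement `AsRef<CStr>`

This is useful for writing code that is generic over strings when a string reference is enough.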
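A minimal sketch of functions that are generic over string types through `AsRef`. The names `byte_len` and `is_empty_os` are hypothetical.

```rust
use std::ffi::{OsStr, OsString};

/// Accepts `&str`, `String`, `&String` and anything else that
/// implements `AsRef<str>`.
fn byte_len<S: AsRef<str>>(s: S) -> usize {
    s.as_ref().len()
}

/// Accepts any type that can be viewed as an `OsStr`.
fn is_empty_os<S: AsRef<OsStr>>(s: S) -> bool {
    s.as_ref().is_empty()
}

fn main() {
    assert_eq!(byte_len("abc"), 3);
    assert_eq!(byte_len(String::from("abc")), 3);

    assert!(is_empty_os(OsString::new()));
    assert!(!is_empty_os("abc"));
}
```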

For conversions in the opposite direction, the `Into` trait can be used.
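One common pattern, sketched here under the assumption that the callee needs to own the string, is to accept `impl Into<String>`. The `User` type is hypothetical.

```rust
/// A hypothetical type that stores an owned string.
struct User {
    name: String,
}

impl User {
    /// Accepts both `&str` (which is copied) and `String` (which is moved).
    fn new(name: impl Into<String>) -> Self {
        User { name: name.into() }
    }
}

fn main() {
    let a = User::new("ann");
    let b = User::new(String::from("bob"));
    println!("{} {}", a.name, b.name);
}
```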

`FromStr` and `ToString`
These traits are used to convert other values from and into a string.

`FromStr` is fallible, i.e. it returns a `Result`. It is used by `str::parse`.

`ToString` has a blanket implementation for all types that implement `Display`. This means that `ToString` doesn't need to be implemented manually; instead, one should implement the `Display` trait. An implementation for `ToString` is then automatically provided.
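A sketch of both traits for a hypothetical `Point` type with a made-up `x,y` text format. `ParsePointError` is likewise an assumption for illustration.

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

#[derive(Debug, PartialEq)]
struct ParsePointError;

// Implementing `Display` provides `to_string` via the blanket impl.
impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{},{}", self.x, self.y)
    }
}

// Implementing `FromStr` makes `str::parse::<Point>` available.
impl FromStr for Point {
    type Err = ParsePointError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let (x, y) = s.split_once(',').ok_or(ParsePointError)?;
        let x = x.trim().parse::<i32>().map_err(|_| ParsePointError)?;
        let y = y.trim().parse::<i32>().map_err(|_| ParsePointError)?;
        Ok(Point { x, y })
    }
}

fn main() {
    let p: Point = "1,2".parse().unwrap();
    assert_eq!(p, Point { x: 1, y: 2 });
    assert_eq!(p.to_string(), "1,2");
    assert_eq!("nope".parse::<Point>(), Err(ParsePointError));
}
```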

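`Borrow` and `ToOwned`
These traits are used to convert a borrowed string to an owned string and vice versa: `str` implements `ToOwned<Owned = String>`, `OsStr` implements `ToOwned<Owned = OsString>` and `CStr` implements `ToOwned<Owned = CString>`. The owned types in turn implement `Borrow` for their borrowed counterparts.

As a result, strings can be used with the `Cow` type, which stands for clone on write. It can be used to return a value that is either owned or borrowed, avoiding allocations unless necessary.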
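A minimal sketch of returning `Cow<str>`: the hypothetical function `escape_spaces` allocates only when the input actually needs to change.

```rust
use std::borrow::Cow;

/// Replaces spaces with underscores, borrowing the input when it
/// contains no spaces.
fn escape_spaces(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', "_"))
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    // No spaces: the result borrows the input, no allocation happens.
    assert!(matches!(escape_spaces("abc"), Cow::Borrowed(_)));

    // Spaces present: a new `String` is allocated.
    assert!(matches!(escape_spaces("a b"), Cow::Owned(_)));
    assert_eq!(escape_spaces("a b"), "a_b");

    // `into_owned` yields a `String` in either case.
    let owned: String = escape_spaces("abc").into_owned();
    assert_eq!(owned, "abc");
}
```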

Sources of confusion
Rust is more pedantic than other languages when it comes to string handling, which can lead to confusion as to why a certain type or trait is used. Additionally, as strings are such a fundamental type, there are some arguably inelegant or redundant items.

`Cow<str>`
`Cow<str>` is not a special case, since `Cow` can be used with any type that implements `ToOwned`, e.g. `Cow<[T]>`, `Cow<Path>` and many more. However, `Cow<str>` is arguably the most common use case.

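`FromStr` and `ToString`
While `FromStr` deviates from `From<&str>` by returning a `Result`, it seems redundant with `TryFrom<&str>`, and `ToString` seems redundant with `Into<String>`.

Indeed, `TryFrom` was going to make `FromStr` obsolete, but cannot, as it would overlap with the blanket implementation of `TryFrom` for types that implement `Into` (#44174), which violates coherence. `FromStr` is also used by `str::parse`. On the other hand, `ToString` provides `to_string` as a convenience method for types that implement `Display`.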