Macros

From Rust Community Wiki

Macros are a method of using code to write code. They can be used to extend Rust with extra syntax (such as the variable number of arguments in println!), make writing lots of duplicate code easier (e.g. macros are used to generate the identical methods of the primitive integer types in std), or to create entirely different sub-languages within Rust (e.g. the inline-python crate).

There are two types of macro in Rust: declarative macros and procedural macros, or proc-macros. Declarative macros are defined using macro_rules!, and provide a simple method of substituting a macro invocation for code. Procedural macros have to be defined in their own crate, and allow any arbitrary Rust code to run which will transform Rust code into different Rust code. Procedural macros are far more powerful, but are more difficult to write and slow down the compilation process (as they have to be both compiled and run during the compilation process).

Declarative macros

Declarative macros are defined with macro_rules!. They use a match-like syntax to determine which "branch" to use. Here is a simple macro that takes no arguments and expands to the string literal "Hello World!":

macro_rules! hello_world {
    () => { "Hello World!" };
}

This macro has one arm which takes no arguments. Attempting to call the macro with an argument will result in an error as no arms match. Its body is delimited with braces, which aren't included in the expansion. The trailing semicolon is optional for the last match arm, but is required for all other match arms.

It can be invoked in three ways:

  • Using parentheses: hello_world!()
  • Using brackets: hello_world![]
  • Using braces: hello_world! { }

All three are equivalent, and whichever one is used is up to the user. The only difference is that, when a macro is invoked as an item or statement, the parenthesis and bracket forms need a trailing semicolon while the brace form does not. There are conventions, however: function-like macros such as println! are written with parentheses (like a function call), array-like macros such as vec! are written with brackets (like an array literal) and macros that take in larger blocks of code are written with braces (like code blocks).
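As a small illustration, the sketch below invokes the hello_world! macro from above with each delimiter and checks that the results agree. The function name all_three is made up for this example.

```rust
// Same macro as above, repeated so that this sketch compiles on its own.
macro_rules! hello_world {
    () => { "Hello World!" };
}

// Hypothetical helper: invokes the macro once with each delimiter style.
fn all_three() -> (&'static str, &'static str, &'static str) {
    let with_parentheses = hello_world!();
    let with_brackets = hello_world![];
    let with_braces = hello_world! {};
    (with_parentheses, with_brackets, with_braces)
}

fn main() {
    let (a, b, c) = all_three();
    // All three invocations expand to the same string literal.
    assert_eq!(a, b);
    assert_eq!(b, c);
    println!("{}", a);
}
```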

Macro parameters

Any code written in the parameters of a macro will only be accepted if it matches exactly (excluding whitespace); for example (foo) => { ... } will match macro!(foo) and macro!( foo ) but not macro!(foo,) or macro!(f oo). To accept variable parameters to the macro use a dollar sign, followed by the variable's name, followed by a colon, followed by the variable's type; for example, ($value:expr) will accept any expression and bind it to the variable $value. Then, in the body of the macro $value will be expanded to the expression given to the macro. The following types are available:[1][2]

Name | Description | Examples
ident | Any valid Rust identifier, including keywords | foo, fn
block | A brace-delimited block of Rust code | { 4 + 5 }, { x = 10; }
stmt | A statement, without its trailing semicolon (except for items that require one) | let x = 5, if y { x = 10; }
expr | An expression | 2 + 2, -8.abs()
pat | A pattern | (value, _), Range { start, .. }
ty | A type | Vec<String>
lifetime | A lifetime | 'a, 'static
literal | A literal | "Hello World!", 5.6f32
path | A path (a sequence of any number of identifiers separated by ::) | ::std::fmt::Display, foo
meta | A meta item (what goes inside the #[] of an attribute, excluding the hash and brackets) | derive(Debug), warn(clippy::pedantic)
tt | A single token tree: an identifier, literal, lifetime, piece of punctuation, or a sequence of token trees delimited by parentheses, brackets or braces | 'static, (some more tokens)
item | An item; this includes structs, enums, function declarations, modules, et cetera | struct S;
vis | A visibility indicator (possibly empty) | pub, pub(crate), pub(self), pub(in ...)
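To make the matching rules concrete, here is a minimal sketch that uses a literal token, an expr parameter, and a combination of ident, ty and expr parameters. All macro and function names (only_foo, square, make_constant_fn, answer and so on) are invented for illustration.

```rust
// Matches only the exact token `foo` (whitespace aside); anything else is a compile error.
macro_rules! only_foo {
    (foo) => { "matched foo" };
}

// `$value:expr` binds any expression. The bound expression is substituted
// as a single unit, so `square!(1 + 2)` behaves like `(1 + 2) * (1 + 2)`.
macro_rules! square {
    ($value:expr) => { $value * $value };
}

// Several parameters of different types: an identifier, a type and an expression.
macro_rules! make_constant_fn {
    ($name:ident, $ty:ty, $value:expr) => {
        fn $name() -> $ty { $value }
    };
}

// Expands to `fn answer() -> u32 { 40 + 2 }`.
make_constant_fn!(answer, u32, 40 + 2);

fn match_foo() -> &'static str {
    only_foo!( foo )
}

fn three_squared() -> i32 {
    square!(1 + 2)
}

fn main() {
    assert_eq!(match_foo(), "matched foo");
    assert_eq!(three_squared(), 9);
    assert_eq!(answer(), 42);
}
```

Note that square! writes $value twice in its body, so the argument expression is also evaluated twice; a real macro would usually bind it to a local variable first.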

Repeats

Declarative macros can specify parts to be repeated multiple times, by writing $(part)suffix where suffix is one of ?, * or +. ? specifies that the part may appear zero or one times, * specifies zero or more times and + specifies one or more times. With * and +, a separator token can be placed before the suffix, in which case that token must appear between each repeat. Using these features, a common pattern to allow trailing commas is $(list_item),* $(,)?.

In the body of the macro the same rules apply to expand the repeated variables. The source code of the vec! macro can be approximated using this:

macro_rules! vec {
    ($($elem:expr),* $(,)?) => {
        {
            let mut vec = Vec::new();
            $(
                vec.push($elem);
            )*
            vec
        }
    }
}
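The same repetition syntax works with other separators and surrounding tokens. The following sketch, with the made-up macros sum! and map!, shows a * repeat with an optional trailing comma and a + repeat over key => value pairs.

```rust
use std::collections::HashMap;

// Zero or more comma-separated expressions, with an optional trailing comma.
// In the body, `$(+ $x)*` emits `+ <expr>` once per matched expression.
macro_rules! sum {
    ($($x:expr),* $(,)?) => { 0 $(+ $x)* };
}

// One or more `key => value` pairs; because of `+`, an empty invocation is rejected.
macro_rules! map {
    ($($key:expr => $value:expr),+ $(,)?) => {{
        let mut m = HashMap::new();
        $( m.insert($key, $value); )+
        m
    }};
}

fn total() -> i32 {
    sum!(1, 2, 3,)
}

fn empty_total() -> i32 {
    sum!()
}

fn ages() -> HashMap<&'static str, u32> {
    map!("alice" => 30, "bob" => 25)
}

fn main() {
    assert_eq!(total(), 6);
    assert_eq!(empty_total(), 0);
    assert_eq!(ages().get("alice"), Some(&30));
}
```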

Visibility of declarative macros

Declarative macros, unlike all other items, cannot use things like pub and pub(crate) to control their visibility. Instead, macros must be either public or private. Macros can be made public by adding the macro_export attribute to them:

#[macro_export]
macro_rules! my_macro { ... }

It will then be visible at the crate root.

To make the macro behave (somewhat) like any other pub item, you have to do this:

pub mod foo {
    #[macro_export]
    macro_rules! my_macro {
        () => {}
    }

    pub use my_macro as my_macro;
}

my_macro will now be visible at the crate root as well as inside the module it's defined in.

If you only need a visibility of pub(crate) or less, then you can omit the #[macro_export] and re-export the macro with the matching visibility instead, e.g. pub(crate) use my_macro;. In this case, my_macro will not be visible at the crate root.

Macros, unlike other items, must be declared before they are used, i.e. this is invalid:

my_macro!();
macro_rules! my_macro { ... }

If a macro is defined in a child module, you must put the module declaration before the use of the macro, and if it is defined in a parent module the parent module must place the child's mod declaration after the declaration of the macro.

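To use a macro defined in another module of your crate, annotate that module's declaration with #[macro_use]; this keeps the module's macros in scope after the module ends. In the 2015 edition, you must also annotate extern crate declarations with #[macro_use] in order to use the macros inside them; since the 2018 edition, exported macros can instead be imported by path with use. On a module, #[macro_use] brings all of the module's macros into scope. On an extern crate declaration it imports all exported macros unless you list specific ones, as in #[macro_use(my_macro)].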
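The ordering and #[macro_use] rules can be seen in a small single-file sketch. The module helpers, the macro double! and the function quadruple are hypothetical names chosen for this example.

```rust
// Without this attribute, `double!` would go out of scope at the end of `helpers`.
#[macro_use]
mod helpers {
    macro_rules! double {
        ($x:expr) => { $x * 2 };
    }
}

// This works only because it appears after `mod helpers` in the source text;
// moving the function above the module declaration would fail to compile.
fn quadruple(x: i32) -> i32 {
    double!(double!(x))
}

fn main() {
    assert_eq!(quadruple(3), 12);
}
```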

Accessing items in the current crate

If you are exporting a macro, it often needs to access items from the crate it came from. However, simply writing in the crate name won't work, as the user may have renamed the crate to something else. To solve this, you can use the $crate:: prefix in your macro. As an example:

#[macro_export]
macro_rules! my_macro {
    () => { $crate::SomeStruct }
}

pub struct SomeStruct;

Token munching

Token munching is a technique for writing declarative macros where the macro takes in $($tt:tt)*, which will match anything. The macro can then recursively call itself until it finishes parsing. Here is a (horribly inefficient, both at compile-time and run-time) recursive implementation of the vec! macro:

macro_rules! vec {
    () => {
        Vec::new()
    };
    ($elem:expr $(, $($tt:tt)*)?) => {
        {
            let mut vec = $crate::vec![$($($tt)*)?];
            vec.insert(0, $elem);
            vec
        }
    };
}
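Token munching is most useful for parsing input that the built-in parameter types cannot describe directly. As a rough sketch, the hypothetical calc! macro below consumes one "plus N" or "minus N" step per recursive call and carries the result so far in an internal @acc arm; the keyword names and the @acc marker are conventions invented for this example.

```rust
macro_rules! calc {
    // Nothing left to munch: the accumulated expression is the result.
    (@acc $acc:expr;) => { $acc };
    // Munch `plus <literal>` off the front, then recurse on the remaining tokens.
    (@acc $acc:expr; plus $n:literal $($rest:tt)*) => {
        calc!(@acc $acc + $n; $($rest)*)
    };
    // Munch `minus <literal>` off the front, then recurse on the remaining tokens.
    (@acc $acc:expr; minus $n:literal $($rest:tt)*) => {
        calc!(@acc $acc - $n; $($rest)*)
    };
    // Entry point: the first literal seeds the accumulator.
    ($first:literal $($rest:tt)*) => {
        calc!(@acc $first; $($rest)*)
    };
}

fn evaluate_example() -> i32 {
    // Expands step by step to (10 + 5) - 3.
    calc!(10 plus 5 minus 3)
}

fn single_literal() -> i32 {
    calc!(7)
}

fn main() {
    assert_eq!(evaluate_example(), 12);
    assert_eq!(single_literal(), 7);
}
```

Each recursive call handles only the front of the token list, so long inputs can run into the compiler's macro recursion limit.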

Declarative macros 2.0

Currently, declarative macros are hard to write in and have lots of weird edge cases. To solve this, RFC 1584: declarative macros 2.0 was written, describing a new macro system. However, this RFC does not specify many concrete details and importantly the actual syntax of the new macros has yet to be decided. The new syntax will likely look like:

macro foo { ... }
// Or:
pub macro bar(...) => { ... }

All that is definitively stated in the RFC is that the macro keyword will be used instead of the out of place-looking macro_rules! we have today.

Procedural macros

Procedural macros, or proc-macros, are a type of macro that allows for macros to run arbitrary Rust code during compilation. Procedural macros must be made in their own crate, which can contain nothing but procedural macros. To create a procedural macro crate, you must put this somewhere in your Cargo.toml:

[lib]
proc-macro = true

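Public functions in the root of your library that are annotated with #[proc_macro], #[proc_macro_attribute] or #[proc_macro_derive] then become procedural macros. The crate cannot export any other public items.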

Procedural macros operate on token trees. A token tree is the fundamental unit of Rust source code after raw Unicode characters and before the AST. The types that define token trees are in the proc_macro crate; procedural macros are merely functions that take and return sequences of token trees (known as token streams).

There are three types of proc macro: function-like, attribute and derive.

Function-like macros are called the same way declarative macros are. Since Rust 1.45 they can be used in expression, pattern and statement position as well as in item position; before that, they could only be used as items in a crate or module.

Attribute macros are applied as attributes to an item - for example, #[my_attribute_macro] struct S;. They take the item as input and replace it with their output.

Derive macros are applied with the #[derive] attribute - for example, #[derive(MyTrait)]. The standard library contains many derive macros such as Debug, PartialEq, Eq, etc. Derive macros can only add items (usually impl blocks) next to the item they were applied to, whereas attribute macros replace the item entirely.

The item that an attribute or derive macro is applied to must be syntactically valid Rust, but function-like macros can operate on any sequence of tokens.

use proc_macro::TokenStream;

// A function-like proc macro: fn_like_proc_macro!(input)
#[proc_macro]
pub fn fn_like_proc_macro(input: TokenStream) -> TokenStream {
    input // placeholder body: expands to its input unchanged
}

// An attribute proc macro: #[attribute_proc_macro(attr)] item
#[proc_macro_attribute]
pub fn attribute_proc_macro(_attr: TokenStream, item: TokenStream) -> TokenStream {
    item // placeholder body: the returned tokens replace the annotated item
}

// A derive proc macro: #[derive(DeriveProcMacro)] item
#[proc_macro_derive(DeriveProcMacro)]
pub fn derive_proc_macro(_item: TokenStream) -> TokenStream {
    TokenStream::new() // placeholder body: the returned tokens are added after the item
}

All the tokens in proc_macro come with a Span. A span is an opaque structure representing where in the source code the tokens came from. It is good to hold onto them, because they can be used for many things such as nice error messages.

Reporting errors

There are three ways to report errors from within a proc macro. The simplest way is just to panic with an error message. Rustc will catch this panic and display an error message to the user. However, this isn't very good - it won't point the user to where the error occurred, it will just point to the entire macro invocation, making it very hard to fix it for a large macro. Moreover, warnings and additional notes on the error message can't be displayed using this, and only one error can be displayed per compilation.

The second method is using compile_error! - by returning an invocation of compile_error! with the Span of where the error occurred in the input, you can easily report multiple, more precise errors to the user. This is far better than panics, but still doesn't allow for warnings or notes on your error message.

The third method is nightly-only, but it provides the complete power of rustc's error reporting. The Diagnostic type allows for many precise errors, warnings and notes to be displayed to the user. All code should eventually move to using this, but for now using compile_error! is the best option.

Finally, the proc-macro-error crate provides an easy to use API for reporting errors similar to the old panic!-based method, but using compile_error! and Diagnostic under the hood, which leads to very nice error messages.

proc_macro2

The types from the proc_macro crate can only be used from within proc macros. Code that uses them elsewhere will compile, but will panic when run. This is because TokenStream is actually implemented using IPC directly to rustc, and so cannot be used without it. To solve this issue, the proc-macro2 crate was created. It mirrors the API of proc_macro, and first attempts to use the native proc_macro implementation, falling back on its own implementation if that fails. As a result, types from proc_macro2 can be used in any context, proc macro or not. If you are doing anything with proc macros you should be using proc_macro2.

Quoting with quote!

Constructing token trees by hand is hard and verbose. The quote crate helps a lot, by providing a macro which automatically generates a TokenStream from simple Rust code. Here is a comparison of building the expression 2 + 2 by hand and with quote:

use std::iter::FromIterator;

use proc_macro2::{Literal, Punct, Spacing, TokenStream, TokenTree};
use quote::quote;

// Using proc_macro2: build each token by hand and collect them into a stream
TokenStream::from_iter(vec![
    TokenTree::Literal(Literal::u8_unsuffixed(2)),
    TokenTree::Punct(Punct::new('+', Spacing::Alone)),
    TokenTree::Literal(Literal::u8_unsuffixed(2)),
]);

// Using quote
quote!(2 + 2);

See quote's documentation for more.

Working with token trees: syn and absolution

Working directly with token trees is hard. Because token trees are implemented using IPC under the hood, they are not easy to inspect or use. The absolution crate solves this by providing a similar API to proc_macro and proc_macro2 that is much easier to use and more transparent. For simple proc-macros, this is probably what you want to use.

For more complex proc macros, especially ones which take in Rust source code, the syn crate can be used. It provides an AST and a parser for any Rust source code, and is also easily extendable to support your own custom ASTs.

Testing procedural macros

Procedural macro crates can't invoke their own procedural macros, so you have to find workarounds in order to test your macros.

If you have written the main logic of your macro using proc_macro2, then you can test it by calling that logic directly and comparing its result to the expected output, using quote! to easily generate token streams. One hitch is that TokenStream does not implement PartialEq, so you can't use a simple assert_eq! on the streams themselves. To solve this, you can convert both streams to strings via .to_string(), and then assert_eq! those.

Another common pattern to test proc macros is to have your wrapper crate around the proc macros contain tests for them, as the wrapper crate can invoke the implementation crate's proc macros.

Macro hygiene