Rust macros from a beginner's point of view
This tracks what I understood while writing my first real Rust macro.
The goal was to simplify creating test data while writing a text adventure game framework or library in Rust.
The model is a simplified first version of what I want to achieve later.
It starts with just a Book that contains Chapters. The Chapters have an Id, their own text, plus a list of Choices that associate a description with the Id of the chapter that must be read if the player makes that choice.
The current macro allows writing this:
let l = livre![
    chapter_one_id: { // This is a chapter
        "text of chapter one", // This is the text of the chapter
        chapter_one_id: "Make choice one", // This is a choice
        chapter_two_id: "Make choice two",
        chapter_three_id: "Make choice three"
    },
    chapter_two_id: {
        "text of chapter two",
        chapter_one_id: "Make choice one"
    },
    chapter_three_id: {
        "text of chapter three",
        chapter_three_id: "Make choice one"
    }
];
instead of this:
let l = Livre {
    chapitres: HashMap::from([
        (
            "chapter_one_id".into(),
            Chapitre {
                texte: "text of chapter one".into(),
                choix: vec![
                    ("chapter_one_id".into(), "Make choice one".into()),
                    ("chapter_two_id".into(), "Make choice two".into()),
                    ("chapter_three_id".into(), "Make choice three".into()),
                ],
            },
        ),
        (
            "chapter_two_id".into(),
            Chapitre {
                texte: "text of chapter two".into(),
                choix: vec![("chapter_one_id".into(), "Make choice one".into()),],
            },
        ),
        (
            "chapter_three_id".into(),
            Chapitre {
                texte: "text of chapter three".into(),
                choix: vec![("chapter_three_id".into(), "Make choice one".into()),],
            },
        ),
    ]),
};
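The struct literal above only compiles against matching type definitions, which are not shown here. The following is a minimal sketch of what they could look like, inferred from the field names used in the literal. The exact definitions (derives, visibility, extra fields) are assumptions, and `build_small_book` is a hypothetical helper added only to exercise them.

```rust
use std::collections::HashMap;

// Assumed shape of a chapter: its text plus a list of
// (target chapter id, description) pairs.
#[derive(Debug)]
struct Chapitre {
    texte: String,
    choix: Vec<(String, String)>,
}

// Assumed shape of the book: chapters indexed by their id.
#[derive(Debug)]
struct Livre {
    chapitres: HashMap<String, Chapitre>,
}

// Hypothetical helper: builds a one-chapter book by hand.
// Each `.into()` converts a &str to a String because the field
// types above tell the compiler which target type is expected.
fn build_small_book() -> Livre {
    Livre {
        chapitres: HashMap::from([(
            "chapter_one_id".into(),
            Chapitre {
                texte: "text of chapter one".into(),
                choix: vec![("chapter_one_id".into(), "Make choice one".into())],
            },
        )]),
    }
}
```

With types like these, the keys and strings never need an explicit `String::from`, since the expected type flows from the struct fields into each `.into()` call.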
The original code is available on this version of the code.
development environment
I used:
- the Rust By Example book
- the specification of Macros By Example from the Rust Reference
- a nightly version of Rust, which allows the trace_macros feature:
$ rustc --version
rustc 1.84.0-nightly (03ee48451 2024-11-18)
the choice! macro
I tried to start with a macro to generate Chapters, but this one had two separate problems, since the patterns for the Choices and for the text of the chapter were a bit different.
After some attempts, I simplified the problem by first writing a macro for choices only.
It is used like this:
choice! {
    id_chapter_one: "Make first choice",
    id_chapter_three: "Make second choice",
};
and produces Rust code equivalent to:
vec![
    ("id_chapter_one".into(), "Make first choice".into()),
    ("id_chapter_three".into(), "Make second choice".into()),
]
the easy part
You can skip this chapter if you are not interested in learning to read the specification of Macros By Example from the Rust Reference and prefer to just understand how simple macros are written by looking at the Rust By Example book.
Following the specification, we first find (see the real documentation for proper notation and highlighting):
MacroRulesDefinition :
macro_rules ! IDENTIFIER MacroRulesDef
The MacroRulesDefinition starts with macro_rules followed by a !, followed by the name of the macro, which must be a valid Rust identifier.
Then we must provide a MacroRulesDef, which is defined by:
MacroRulesDef :
( MacroRules ) ;
| [ MacroRules ] ;
| { MacroRules }
So basically we must put the MacroRules between parentheses or brackets (in both cases followed by a semicolon), or between curly braces without a semicolon.
Interestingly, what you use for the definition makes no difference to the resulting macro.
You can call the macro with any of those delimiters, regardless of how you declared it.
macro_rules! MyMacroName // Identifier of the macro
( // Starts the MacroRulesDef (could have been a `[` or a `{`)
    // put the MacroRules here
); // Ends the MacroRulesDef. Must match the opening delimiter. The `;` is needed since we did not delimit with `{}`
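To check the claim that the declaration delimiter does not constrain the call site, here is a small sketch. The three macro names and the `delimiter_demo` function are hypothetical, made up for this illustration only.

```rust
// Three hypothetical macros, each declared with a different delimiter.
// The `( )` and `[ ]` forms need a trailing semicolon, the `{ }` form does not.
macro_rules! with_parens ( () => { 1 } );
macro_rules! with_brackets [ () => { 2 } ];
macro_rules! with_braces { () => { 3 } }

fn delimiter_demo() -> [i32; 3] {
    // Each macro is deliberately called with a delimiter that differs
    // from the one used in its declaration.
    [with_parens![], with_brackets! {}, with_braces!()]
}
```

Calling `delimiter_demo()` returns `[1, 2, 3]`: every invocation is accepted, whatever delimiter the declaration used.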
The MacroRules is defined by:
MacroRules :
MacroRule ( ; MacroRule )* ;?
Which means that the MacroRules is made of:
- a MacroRule (with no trailing s)
- optionally, some repetitions of a semicolon followed by an additional MacroRule
- optionally, one final semicolon
(It may be time to read chapter 1, Notation, of the Rust Reference if you can't read the grammar easily yet, since I won't detail the other rules that much.)
So we can't start writing a MacroRules without knowing what a MacroRule looks like:
MacroRule :
MacroMatcher => MacroTranscriber
It is just a MacroMatcher followed by a => arrow, then a MacroTranscriber.
Still nothing we can start writing.
Let's see how a MacroMatcher is written:
MacroMatcher :
( MacroMatch* )
| [ MacroMatch* ]
| { MacroMatch* }
It is zero or more MacroMatches between one of the three possible pairs of delimiters.
On the other side of the => arrow, the MacroTranscriber is made of a DelimTokenTree.
This one is not defined on the same page but on the Macros page (not Macros By Example).
Its definition is that it is a set of zero or more TokenTrees enclosed in one of the three pairs of delimiters.
So we can now expand our macro with some MacroRules having empty MacroMatchers and MacroTranscribers:
macro_rules! MyMacroName // Identifier of the macro
( // Starts the MacroRulesDef (could have been a `[` or a `{`)
    // this is the first MacroRule
    () // MacroMatcher with zero MacroMatch
    =>
    () // MacroTranscriber with zero TokenTree
    ; // mandatory semicolon if we want to add another MacroRule
    // this is the second MacroRule, without the optional semicolon
    () => ()
); // Ends the MacroRulesDef. Must match the opening delimiter. The `;` is needed since we did not delimit with `{}`
This macro is not really useful: it replaces nothing with nothing.
We need to capture some tokens in MetaVariables.
We can match tokens using the first type of MacroMatch (which is a Token except $ and delimiters), but to capture them we need to use the third type:
$ ( IDENTIFIER_OR_KEYWORD except crate | RAW_IDENTIFIER | _ ) : MacroFragSpec
That means a $ sign followed by either an identifier, a keyword that is not crate, a raw identifier, or an underscore, then a colon and a MacroFragSpec.
Note that the parentheses here do not have a black background. They are not characters that must be matched; like the | characters, they are part of the grammar notation of the specification. They are used to group together the three possible variants of what follows the $ sign.
The MacroFragSpec specifies the type of what needs to be captured, and has fifteen different possible values, which are detailed in the specification.
We will use two of them for the choice! macro:
- ident, which is a valid Rust identifier. (Note that we would also want to accept numbers, but using tt would also allow strings as identifiers, which would cause problems with later code. I will probably need to handle the number and identifier cases separately and merge them later.)
- literal, which allows matching the strings.
So we want to capture an identifier in a MetaVariable, followed by a colon and a string.
This is done like this:
( // starts a MacroMatcher
$i:ident // first MacroMatch captures an identifier in $i
: // second MacroMatch matches a colon
$n:literal // third one captures the string
)
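To see this matcher at work on its own, here is a minimal sketch of a macro that accepts exactly one identifier, a colon and a literal. The name `one_choice!`, the `one_choice_demo` function and the transcriber are hypothetical, written only for this illustration.

```rust
// Hypothetical single-pair macro built around the matcher above.
macro_rules! one_choice {
    ($i:ident : $n:literal) => {
        // stringify! turns the captured identifier into a &'static str;
        // the captured literal is pasted into the output unchanged.
        (stringify!($i), $n)
    };
}

fn one_choice_demo() -> (&'static str, &'static str) {
    one_choice!(id_chapter_one: "Make first choice")
}
```

Here `one_choice_demo()` returns the pair `("id_chapter_one", "Make first choice")`. Passing a string where the identifier is expected, as in `one_choice!("id": "text")`, is rejected at compile time because a string literal does not match the ident fragment.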
And we want to repeat that several times separated by commas, using the fourth type of MacroMatch, which is:
$ ( MacroMatch+ ) MacroRepSep? MacroRepOp
with a comma as MacroRepSep and a MacroRepOp of * to allow zero or more repetitions.
We add an optional comma at the end to match Rust habits, which gives:
( $($i:ident: $n:literal),* $(,)? )
And our matcher is done.
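The transcriber side is not detailed here, so the following is only a sketch of how the finished matcher could be paired with one. It assumes the transcriber uses stringify! on the identifier and .into() on both parts, which produces the vec! shown at the start of this chapter. The `choice_demo` function is hypothetical.

```rust
macro_rules! choice {
    // The matcher written above: comma-separated `ident: literal` pairs,
    // with an optional trailing comma.
    ( $($i:ident: $n:literal),* $(,)? ) => {
        // Assumed transcriber: repeat one tuple per captured pair.
        vec![ $( (stringify!($i).into(), $n.into()) ),* ]
    };
}

fn choice_demo() -> Vec<(String, String)> {
    // The type annotation is what lets each `.into()` resolve to String.
    let choices: Vec<(String, String)> = choice! {
        id_chapter_one: "Make first choice",
        id_chapter_three: "Make second choice",
    };
    choices
}
```

The repetition `$( ... ),*` in the transcriber mirrors the one in the matcher, so each captured `$i` and `$n` pair produces one tuple.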
the second macro
matching delimiters
There are four possible variants for a MacroMatch:
MacroMatch :
Token except $ and delimiters
| MacroMatcher
| $ ( IDENTIFIER_OR_KEYWORD except crate | RAW_IDENTIFIER | _ ) : MacroFragSpec
| $ ( MacroMatch+ ) MacroRepSep? MacroRepOp
The simplest one is just a token, that is some text. The definition of a token is rather long, but for now we can say it is a "tiny part" of code, like a variable name, a keyword, a string, a number...
It should not contain a delimiter (parenthesis, bracket, curly brace).
I first thought it was not possible to match delimiters, and that we needed a MacroFragSpec of type block to parse curly braces.
But this was a misunderstanding.
If the MacroMatch (what is inside the delimiters of the MacroMatcher) is a Token, it cannot contain delimiters.
But the MacroMatch can also be a MacroMatcher, which is itself surrounded with delimiters.
So you can write something like this:
#[test]
fn macro1() {
    macro_rules! MyMacroName {
        { // This starts the MacroMatcher before the `=>`
            ( // This starts a MacroMatch which happens to be a parenthesis-delimited MacroMatcher
                "a string" // This is a MacroMatch of type token inside the nested MacroMatcher
            )
        } => ("another string") ;
        // this is the second MacroRule, without the optional semicolon
        () => ()
    }
    println!("{}", MyMacroName!{ ( "a string" ) });
}
The exception is that the outer delimiters for the matcher will match any pair of delimiters.
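A small sketch can make the difference between inner and outer delimiters visible. The macro name `which_delimiter!` and the `inner_delimiter_demo` function are hypothetical. As far as I understand, only nested delimiters can be told apart this way, since the outer pair of the invocation is never compared against the outer pair of the matcher.

```rust
macro_rules! which_delimiter {
    // Nested delimiters are matched exactly: one rule per kind.
    ( ( $($t:tt)* ) ) => { "parentheses" };
    ( [ $($t:tt)* ] ) => { "brackets" };
    ( { $($t:tt)* } ) => { "braces" };
}

fn inner_delimiter_demo() -> [&'static str; 3] {
    [
        // The outer delimiters vary between calls and never match the
        // `( ... )` written in the rules, yet every call is accepted.
        which_delimiter![(1 2)],
        which_delimiter! {[1 2]},
        which_delimiter!({1 2}),
    ]
}
```

Calling `inner_delimiter_demo()` returns `["parentheses", "brackets", "braces"]`: the rule is selected by the inner delimiters only.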
things still to talk about:
- stringify!()
- how the compiler manages to find the target type of into(), and how I was able to drop the String::from()
- the macro can also be called with brackets or curly braces (question: can we write different matchers for the three cases?)