defmt
defmt
("de format", short for "deferred formatting") is a highly efficient logging framework that targets resource-constrained devices, like microcontrollers.
Features
println!
-like formatting- Multiple logging levels:
error
,info
,warn
,debug
,trace
- Compile-time
RUST_LOG
-like filtering of logs: include/omit logging levels with module-level granularity - Timestamped logs
Current limitations
- Output object format must be ELF
- Custom linking (linker script) is required
- Single, global logger instance (but using multiple channels is possible)
Intended use
In its current iteration defmt
mainly targets tiny embedded devices that have no mean to display information to the developer, e.g. a screen.
In this scenario logs need to be transferred to a second machine, usually a PC/laptop, before they can be displayed to the developer/end-user.
defmt
operating principles, however, are applicable to beefier machines and could be use to improve the logging performance of x86 web server applications and the like.
Operating principle
defmt
achieves high performance using deferred formatting and string compression.
Deferred formatting means that formatting is not done on the machine that's logging data but on a second machine.
That is, instead of formatting 255u8
into "255"
and sending the string, the single-byte binary data is sent to a second machine, the host, and the formatting happens there.
defmt
's string compression consists of building a table of string literals, like "Hello, world"
or "The answer is {:?}"
, at compile time.
At runtime the logging machine sends indices instead of complete strings.
Support
defmt
is part of the Knurling project, Ferrous Systems' effort at
improving tooling used to develop for embedded systems.
If you think that our work is useful, consider sponsoring it via GitHub Sponsors.
commit aa731a3 on 2024-05-13 17:49:16
User guide
Unless indicated otherwise these sections apply to the use of defmt
both in libraries and applications.
Setup
To use defmt
in a library you only need to add defmt
as a dependency.
$ cargo add defmt
To use defmt
in an application you need the additional steps documented below.
For applications
π‘ The prefferred way to create a new
defmt
application is to use our app-template. Tag along if you want to adddefmt
to an existing application.
Linker script
The application must be linked using a custom linking process that includes the defmt.x
linker script.
Custom linking is usual for embedded applications and configured in the .cargo/config
file.
To pass defmt.x
to the linker add the -C link-arg=-Tdefmt.x
flag to the rustflags section of .cargo/config.toml
.
# .cargo/config.toml
[target.thumbv7m-none-eabi]
rustflags = [
# --- KEEP existing `link-arg` flags ---
"-C", "link-arg=-Tlink.x",
"-C", "link-arg=--nmagic",
# --- ADD following new flag ---
"-C", "link-arg=-Tdefmt.x",
]
#[global_logger]
The application must link to or define a global_logger
.
The global_logger
specifies how logs are sent from the device running the app to the host where the logs are displayed.
The application must link to exactly one global_logger
.
The global_logger
can appear anywhere in the dependency graph and usually it will be its own crate.
The following global_logger
s are provided as part of the project:
defmt-rtt
, logs over RTT. Note that this crate can not be used together withrtt-target
.defmt-itm
, logs over ITM (Instrumentation Trace Macrocell) stimulus port 0.defmt-semihosting
, logs over semihosting. Meant only for testingdefmt
on a virtual Cortex-M device (QEMU).
Information about how to write a global_logger
can be found in the #[global_logger]
section.
Enabling logging
By default, only ERROR level messages are logged. To learn how to enable other logging levels and filters logs per modules read the Filtering section.
Memory use
When in a tight memory situation and logging over RTT, the buffer size (default: 1024 bytes) can be configured with the DEFMT_RTT_BUFFER_SIZE
environment variable. Use a power of 2 for best performance.
Logging macros
Logging is done using the
error!
,warn!
,info!
,debug!
andtrace!
macros.There is also the
println!
macro that always prints the content.
Each macro logs at the logging level indicated by its name.
The syntax of these macros is roughly the same as the println!
-macro.
#![allow(unused)] fn main() { extern crate defmt; let len = 80u8; // -> INFO: message arrived (length=80) defmt::info!("message arrived (length={})", len); struct Message; impl Message { fn header(&self) -> u8 { 0 } } let message = Message; // -> DEBUG: Header { source: 2, destination: 3, sequence: 16 } defmt::debug!("{:?}", message.header()); }
The Format
trait
Unlike core::fmt
which has several formatting traits (Debug
, Display
), defmt
has a single formatting trait called Format
.
The {}
formatting parameter indicates that the Format
trait will be used, meaning the argument must implement the Format
trait.
#![allow(unused)] fn main() { extern crate defmt; let x = 0; defmt::trace!("{}", x); // ^ must implement the `Format` trait }
Type and display hints
The defmt
grammar is similar to core::fmt
, but not the same. The syntax of a formatting parameter is shown below:
{[pos][=Type][:Display]}
Type hint
The Type
hint always starts with a =
.
For one it enables the framework to further compress the logs resulting in higher throughput.
Secondly it also typechecks the supplied value to fit the specified type and possibly coerces the value into the specified type.
The type hint can be a primitive or one, of, the special types. Read on to learn more about the type hints.
Display hint
The Display
hint, always starts with a :
and specifies the printing on the host side.
Read more about it here.
Positional parameter
The pos
parameter lets you specify the position of the value to format (see "Positional parameters").
Primitives
The following primitive types are available:
type hint | name |
---|---|
=bool | boolean |
={i,u}{8,16,32,64,128} | standard integer types |
=f{32, 64} | 32-bit / 64-bit floating point type |
=[u8; N] | byte array |
=[u8] | byte slice |
=str | string slice |
They can be used like this:
#![allow(unused)] fn main() { extern crate defmt; let enabled = false; let ready = false; let timeout = false; // arguments can be compressed into a single byte defmt::info!( "enabled: {=bool}, ready: {=bool}, timeout: {=bool}", enabled, ready, timeout, ); let x = 0u16; // arguments will be type checked defmt::trace!("{=u16}", x); // ^ must have type `u16` }
Additionally there are some special types:
type hint | name |
---|---|
=M..N | Bitfields |
=istr | Interned Strings |
=[?] | Format slices |
=[?; N] | Format arrays |
Read more about them in the following chapters.
Bitfields
:M..N
is the bitfield formatting parameter.
The bitfield argument is expected to be a unsigned integer that's large enough to contain the bitfields.
For example, if bitfield ranges only cover up to bit 11
(e.g. =8..12
) then the argument must be at least u16
.
When paired with a positional parameter it can be used to display the bitfields of a register.
#![allow(unused)] fn main() { extern crate defmt; let pcnf1 = 0u32; // -> TRACE: PCNF1 { MAXLEN: 125, STATLEN: 3, BALEN: 2 } defmt::trace!( "PCNF1: {{ MAXLEN: {0=0..8}, STATLEN: {0=8..16}, BALEN: {0=16..19} }}", // ^ ^ ^ same argument pcnf1, // <- type must be `u32` ); }
Bitfields are not range inclusive, e.g. following statement will evaluate to 5
(0b110
):
#![allow(unused)] fn main() { extern crate defmt; // -> TRACE: first three bits: 110 defmt::trace!("first three bits: {0=0..3}", 254u32); }
β οΈ You can not reuse the same argument in a bitfield- and a non bitfield parameter.
This will not compile:
#![allow(unused)] fn main() { extern crate defmt; defmt::trace!("{0=5..13} {0=u16}", 256u16); }
Interned strings
The
{=istr}
formatting parameter is used for interned strings.
Compared to the {=str}
parameter, which transmits a complete string, {=istr}
saves bandwidth by sending only a string index.
The {=istr}
parameter expects an argument with type defmt::Str
.
A Str
value is created using the intern!
macro; the argument to this macro must be a string literal.
#![allow(unused)] fn main() { extern crate defmt; let s = "The quick brown fox jumps over the lazy dog"; defmt::info!("{=str}", s); // ^ bandwidth-use = 43 bytes use defmt::Str; let interned: Str = defmt::intern!("The quick brown fox jumps over the lazy dog"); defmt::info!("{=istr}", interned); // ^^^^^^^^^ bandwidth-use <= 2 bytes }
β οΈ This was a contrived example to show the difference in bandwidth use.
In practice you should use the following, which also interns the log string and uses as little bandwidth as the
{=istr}
version:#![allow(unused)] fn main() { extern crate defmt; defmt::info!("The quick brown fox jumps over the lazy dog"); }
Format slices / arrays
The
{=[?]}
parameter can be used to log a slice of values that implement theFormat
trait.
The expected argument is a slice.
#![allow(unused)] fn main() { extern crate defmt; use defmt::Format; #[derive(Format)] struct X { y: u16, z: u8, } let xs: &[X] = &[ /* .. */ ]; defmt::info!("xs={=[?]}", xs); }
[T] where T: Format
also implements the Format
trait so it's possible to format [T]
with {=?}
but {=[?]}
uses slightly less bandwidth.
π‘ Note that for slices of bytes,
{=[u8]}
should be preferred as it's better compressed.
Arrays
If you have an array of types that implement the Format
trait, you should use
the {=[?; N]}
parameter (where N
is the number of elements); this saves bandwidth compared to {=[?]}
.
#![allow(unused)] fn main() { extern crate defmt; use defmt::Format; #[derive(Format)] struct X { y: u16, z: u8, } let xs: [X; 2] = [ X { y: 1, z: 2 }, X { y: 3, z: 4 }, // .. ]; defmt::info!("xs={=[?; 2]}", xs); }
Display hints
A hint can be applied to each formatting parameter to change how it's printed on the host side.
The hint follows the syntax :Display
and must come after the type within the braces, for example:
- typed
{=u8:x}
- untyped
{:b}
The following display hints are currently supported:
hint | name |
---|---|
:x | lowercase hexadecimal |
:X | uppercase hexadecimal |
:? | core::fmt::Debug -like |
:b | binary |
:a | ASCII |
:ms | timestamp in seconds (input in milliseconds) |
:us | timestamp in seconds (input in microseconds) |
:ts | timestamp in human-readable time (input in seconds) |
:tms | timestamp in human-readable time (input in milliseconds) |
:tus | timestamp in human-readable time (input in microseconds) |
The first 4 display hints resemble what's supported in core::fmt
, for example:
#![allow(unused)] fn main() { extern crate defmt; defmt::info!("{=u8:x}", 42); // -> INFO 2a defmt::info!("{=u8:X}", 42); // -> INFO 2A defmt::info!("{=u8:#x}", 42); // -> INFO 0x2a defmt::info!("{=u8:b}", 42); // -> INFO 101010 defmt::info!("{=str}", "hello\tworld"); // -> INFO hello world defmt::info!("{=str:?}", "hello\tworld"); // -> INFO "hello\tworld" }
The ASCII display hint formats byte slices (and arrays) using Rust's byte string syntax.
#![allow(unused)] fn main() { extern crate defmt; let bytes = [104, 101, 255, 108, 108, 111]; defmt::info!("{=[u8]:a}", bytes); // -> INFO b"he\xffllo" }
Alternate printing
Adding #
in front of a binary and hexadecimal display hints, precedes these numbers with a base indicator.
#![allow(unused)] fn main() { extern crate defmt; defmt::info!("{=u8:b}", 42); // -> INFO 101010 defmt::info!("{=u8:#b}", 42); // -> INFO 0b101010 defmt::info!("{=u8:x}", 42); // -> INFO 2a defmt::info!("{=u8:#x}", 42); // -> INFO 0x2a defmt::info!("{=u8:X}", 42); // -> INFO 2A defmt::info!("{=u8:#X}", 42); // -> INFO 0x2A }
Zero padding
Padding numbers with leading zeros is supported, for example:
#![allow(unused)] fn main() { extern crate defmt; defmt::info!("{=u8}", 42); // -> INFO 42 defmt::info!("{=u8:04}", 42); // -> INFO 0042 defmt::info!("{=u8:08X}", 42); // -> INFO 0000002A defmt::info!("{=u8:#08X}", 42); // -> INFO 0x00002A }
When the alternate form is used for hex and binary, the 0x
/0b
length is subtracted from the leading zeros. This matches core::fmt
behavior.
Display hints for byte slice and byte array elements
Besides ASCII hints, byte slice and array elements can be formatted using hexadecimal or binary hints:
#![allow(unused)] fn main() { extern crate defmt; let bytes = [4, 101, 5, 108, 6, 111]; defmt::info!("{=[u8]}", bytes); // -> INFO [4, 101, 5, 108, 6, 111] defmt::info!("{=[u8]:x}", bytes); // -> INFO [4, 65, 5, 6c, 6, 6f] defmt::info!("{=[u8]:#04x}", bytes); // -> INFO [0x04, 0x65, 0x05, 0x6c, 0x06, 0x6f] defmt::info!("{=[u8]:#010b}", bytes); // -> INFO [0b00000100, 0b01100101, 0b00000101, 0b01101100, 0b00000110, 0b01101111] }
Propagation
Display hints "propagate downwards" and apply to formatting parameters that specify no display hint.
#![allow(unused)] fn main() { extern crate defmt; #[derive(defmt::Format)] struct S { x: u8 } let x = S { x: 42 }; defmt::info!("{}", x); // -> INFO S { x: 42 } defmt::info!("{:#x}", x); // -> INFO S { x: 0x2a } }
#![allow(unused)] fn main() { extern crate defmt; struct S { x: u8, y: u8 } impl defmt::Format for S { fn format(&self, f: defmt::Formatter) { // `y`'s display hint cannot be overriden (see below) defmt::write!(f, "S {{ x: {=u8}, y: {=u8:x} }}", self.x, self.y) } } let x = S { x: 42, y: 42 }; defmt::info!("{}", x); // -> INFO S { x: 42, y: 2a } defmt::info!("{:b}", x); // -> INFO S { x: 101010, y: 2a } }
Implementing Format
#[derive(Format)]
The easiest way to implement the Format
trait for a struct
or enum
is to use the derive
attribute.
#![allow(unused)] fn main() { extern crate defmt; use defmt::Format; #[derive(Format)] struct Header { source: u8, destination: u8, sequence: u16, } #[derive(Format)] struct Descriptor; #[derive(Format)] enum Request { GetDescriptor { descriptor: Descriptor, length: u16 }, SetAddress { address: u8 }, } }
Like built-in derives (e.g. #[derive(Debug)]
), #[derive(Format)]
will add Format
bounds to the generic type parameters of the struct.
β οΈ Do not use the API used by the expansion of the
derive(Format)
macro; it is unstable.
Feature-gated #[derive(Format)]
It is also possible to feature-gate the implementation by defining
derive
implementation via cfg_attr(feature = ...)
:
#![allow(unused)] fn main() { #[derive(Copy, Clone)] #[cfg_attr(feature = "defmt", derive(defmt::Format))] struct Header { // ... } }
Manual implementation with write!
It is also possible to implement the Format
trait manually.
This trait has a single required method: format
.
In this method you need to format the value (self
) into the given Formatter
argument using the defmt::write!
macro.
Example below:
#![allow(unused)] fn main() { extern crate defmt; // value read from a MMIO register named "CRCCNF" struct CRCCNF { bits: u32, } impl defmt::Format for CRCCNF { fn format(&self, f: defmt::Formatter) { // format the bitfields of the register as struct fields defmt::write!( f, "CRCCNF {{ LEN: {0=0..2}, SKIPADDR: {0=8..10} }}", self.bits, ) } } }
Newtypes
If you need to implement Format
for some "newtype" struct you can delegate the formatting to the inner type.
Example below:
#![allow(unused)] fn main() { extern crate defmt; struct MyU8 { inner: u8 } impl defmt::Format for MyU8 { fn format(&self, f: defmt::Formatter) { self.inner.format(f) } } }
Uncompressed adapters
If you quickly want to get some code running and do not care about it being efficient you can use the two adapter types Display2Format
and Debug2Format
.
β οΈ These adapters disable compression and use the
core::fmt
code on-device! You should always preferdefmt::Format
overDebug
whenever possible!
Note that this always uses {:?}
to format the contained value, meaning that any provided defmt display hints will be ignored.
When using #[derive(Format)]
you may use the #[defmt()]
attribute on specific fields to use these adapter types.
Example below:
#![allow(unused)] fn main() { extern crate defmt; use defmt::Format; mod serde_json { #[derive(Debug)] pub enum Error {} } #[derive(Format)] enum Error { Serde(#[defmt(Debug2Format)] serde_json::Error), ResponseTooLarge, } struct Celsius(); impl std::fmt::Display for Celsius { fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result { Ok(()) } } #[derive(Format)] struct Reading { #[defmt(Display2Format)] temperature: Celsius, } }
Filtering
defmt
supports 5 different logging levels listed below from lowest severity to highest severity:
TRACE
DEBUG
INFO
WARN
ERROR
The logging macros trace!
, debug!
, info!
, warn!
and error!
match these logging levels.
By default, only ERROR
level messages are emitted.
All other logging levels are disabled.
Note that defmt::println!
statements cannot be filtered and are always included in the output.
DEFMT_LOG
if you are already familiar with
env_logger
andRUST_LOG
,defmt
's filtering mechanism works very similarly
To change which logging levels are enabled use the DEFMT_LOG
environment variable.
$ export DEFMT_LOG=warn
$ cargo build --bin app
$ # OR if using probe-run as the cargo runner you can use
$ DEFMT_LOG=warn cargo run --bin app
DEFMT_LOG
accepts the following logging levels: error
, warn
, info
, debug
, trace
.
Enabling a logging level also enables higher severity logging levels.
For example,
#![allow(unused)] fn main() { extern crate defmt; defmt::trace!("trace"); defmt::debug!("debug"); defmt::info!("info"); defmt::warn!("warn"); defmt::error!("error"); }
$ DEFMT_LOG=warn cargo run --bin all-logging-levels
WARN warn
ERROR error
$ DEFMT_LOG=trace cargo run --bin all-logging-levels
TRACE trace
DEBUG debug
INFO info
WARN warn
ERROR error
Modules
A different logging level filter can be applied to different modules using logging directives.
A logging directive has the syntax krate::module::path=level
.
DEFMT_LOG
can contain a list of comma separated logging directives.
#![allow(unused)] fn main() { extern crate defmt; // crate-name = app mod important { pub fn function() { defmt::debug!("important debug"); defmt::info!("important info"); defmt::warn!("important warn"); defmt::error!("important error"); } } mod noisy { pub fn function() { defmt::warn!("noisy warn"); defmt::error!("noisy error"); inner::function(); } mod inner { pub fn function() { defmt::warn!("inner warn"); defmt::error!("inner error"); } } } important::function(); noisy::function(); }
$ DEFMT_LOG=app::important=info,app::noisy=error cargo run --bin app
INFO important info
WARN important warn
ERROR important error
ERROR noisy error
ERROR inner error
Note that the app::noisy=error
directive also applies to the internal module app::noisy::inner
.
Hyphens
Crate names can have hyphens (-
) in Cargo metadata, and file paths, but when they appear in logging directives all hyphens must be converted into underscores (_
).
Packages vs crates
Do not confuse Cargo package names with crate names.
A Cargo package can contain more than one crate.
By default, the main crate has the same name as the package but this can be overridden in Cargo.toml
(e.g. in the [lib]
and [[bin]]
sections).
$ cargo new --lib my-package
$ tree my-package
my-package # package-name = my-package
βββ Cargo.toml
βββ src
βββ bin
β βββ my-binary.rs # crate-name = my_binary
βββ lib.rs # crate-name = my_package
Overrides
Logging directives that appear later in the list override preceding instances.
#![allow(unused)] fn main() { extern crate defmt; // crate-name = app pub fn function() { defmt::trace!("root trace"); } mod inner { pub fn function() { defmt::trace!("inner trace"); defmt::error!("inner error"); } } function(); inner::function(); }
$ DEFMT_LOG=trace,app::inner=error cargo run --bin app
TRACE root trace
ERROR inner error
This is equivalent to saying:
app::inner
emits ERROR level log messages and everything else emits TRACE level log messages.
Disabling logs
The "pseudo" logging level off
can be used to disable logs globally, per crate or per module.
$ # globally disable logs
$ DEFMT_LOG=off cargo run --bin app
$ # disable logs from the `noisy` crate (dependency)
$ DEFMT_LOG=trace,noisy=off cargo run --bin app
$ # disable logs from the `noisy` module
$ DEFMT_LOG=trace,app::noisy=off cargo run --bin app
Recompilation
It should be noted that DEFMT_LOG
is a compile-time mechanism.
Changing the contents of DEFMT_LOG
will cause all crates that depend on defmt
to be recompiled.
Default logging level for a crate
At the moment it's not possible to set a default logging level, other than ERROR, for a crate.
Timestamps
Applications that, directly or transitively, use any of
defmt
logging macros may use thetimestamp!
macro to define additional data to be included in every log frame.
The timestamp!
macro may only be used once throughout the crate graph. Its syntax is the same as for the other logging macros, except that timestamp!
is global and so cannot access any local variables.
By default, no timestamp is provided or transferred over the defmt sink.
Atomic timestamp
A simple timestamp
function that does not depend on device specific features and is good enough for development is shown below:
#![allow(unused)] fn main() { extern crate defmt; use std::sync::atomic::{AtomicUsize, Ordering}; // WARNING may overflow and wrap-around in long lived apps static COUNT: AtomicUsize = AtomicUsize::new(0); defmt::timestamp!("{=usize}", COUNT.fetch_add(1, Ordering::Relaxed)); }
Hardware timestamp
A timestamp
function that uses a device-specific monotonic timer can directly read a MMIO register.
It's OK if the function returns 0
while the timer is disabled.
The us
display hint can be used to format an integer value as a time in microseconds (eg. 1_000_000
may be displayed as 1.000000
).
extern crate defmt; fn monotonic_timer_counter_register() -> *mut u32 { static mut X: u32 = 0; unsafe { &mut X as *mut u32 } } // WARNING may overflow and wrap-around in long lived apps defmt::timestamp!("{=u32:us}", { // NOTE(interrupt-safe) single instruction volatile read operation unsafe { monotonic_timer_counter_register().read_volatile() } }); fn enable_monotonic_counter() {} fn main() { defmt::info!(".."); // timestamp = 0 defmt::debug!(".."); // timestamp = 0 enable_monotonic_counter(); defmt::info!(".."); // timestamp >= 0 // .. }
64-bit extension
Microcontrollers usually have only 32-bit counters / timers. Some of them may provide functionality to make one 32-bit counter increase the count of a second 32-bit counter when the first one wraps around. Where that functionality is not available, a 64-bit counter can be emulated using interrupts:
#![allow(unused)] fn main() { use std::sync::atomic::{AtomicU32, Ordering}; // the hardware counter is the "low (32-bit) counter" // this atomic variable is the "high (32-bit) counter" static OVERFLOW_COUNT: AtomicU32 = AtomicU32::new(0); // this is an interrupt handler running at highest priority fn on_first_counter_overflow() { let ord = Ordering::Relaxed; OVERFLOW_COUNT.store(OVERFLOW_COUNT.load(ord) + 1, ord); } }
To read the 64-bit value in a lock-free manner the following algorithm can be used (pseudo-code):
do {
high1: u32 <- read_high_count()
low : u32 <- read_low_count()
high2 : u32 <- read_high_count()
} while (high1 != high2)
count: u64 <- (high1 << 32) | low
The loop should be kept as tight as possible and the read operations must be single-instruction operations.
#[global_logger]
Applications that, directly or transitively, use any of
defmt
logging macros need to define a#[global_logger]
or include one in their dependency graph.
This is similar to how the alloc
crate depends on a #[global_allocator]
.
The global_logger
defines how data is moved from the device, where the application runs, to the host, where logs will be formatted and displayed.
global_logger
is transport agnostic: you can use a serial interface, serial over USB, RTT, semihosting, Ethernet, 6LoWPAN, etc. to transfer the data.
The global_logger
interface comprises the trait Logger
and the #[global_logger]
attribute.
The Logger
trait
Logger
specifies how to acquire and release a handle to a global logger, as well as how the data is put on the wire.
#![allow(unused)] fn main() { extern crate defmt; struct Logger; unsafe impl defmt::Logger for Logger { fn acquire() { // ... } unsafe fn flush() { // ... } unsafe fn release() { // ... } unsafe fn write(bytes: &[u8]) { // ... } } }
The write
method is not allowed to fail.
Buffering, rather than waiting on I/O, is recommended.
If using buffering write
should not overwrite old data as this can corrupt log frames and most printers cannot deal with incomplete log frames.
See the API documentation for more details about the safety requirements of the acquire-release mechanism.
The #[global_logger]
attribute
#[global_logger]
specifies which Logger
implementation will be used by the application.
#[global_logger]
must be used on a unit struct, a struct with no fields, which must implement the Logger
trait.
It's recommended that this struct is kept private.
#![allow(unused)] fn main() { extern crate defmt; #[defmt::global_logger] struct Logger; unsafe impl defmt::Logger for Logger { // ... fn acquire() {} unsafe fn flush() {} unsafe fn release() {} unsafe fn write(bytes: &[u8]) {} } }
β οΈ Only a single
#[global_logger]
struct can appear in the dependency graph of an application.Therefore the
global_logger
should be selected at the top of the dependency graph, that is in the application crate.
There are two general ways to implement a global_logger
.
Single logging channel
The first form uses a single channel.
This means that all execution contexts (i.e. threads OR main
+ interrupt handlers) use the same logging channel.
In an application that uses interrupts this means that acquire
must disable interrupts and release
must re-enable interrupts.
This synchronizes access to the single channel, from contexts running at different priority levels.
defmt-semihosting
is an example of this single logging channel approach.
Multiple logging channels
The other approach uses multiple logging channels: e.g. one for each priority level in an application that uses interrupts. With this approach logging can be made lock-free: interrupts are not disabled while logging data. This approach requires channel multiplexing in the transport layer. RTT, for example, natively supports multiple channels so this is not an issue, but other transports, like ITM, will require that each log frame to be tagged with the channel it belongs to (e.g. one logging channel = ITM channel).
The trade-offs of using more channels are:
- Lock-freedom
- higher memory usage on the target, for buffering
- lower overall throughput, as either different channels need to be polled from the host or the log frames need to be tagged with the channel they belong to
panic!
and assert!
The
defmt
crate provides its own version ofpanic!
-like andassert!
-like macros.
The defmt
version of these macros will log the panic message using defmt
and then call core::panic!
(by default).
Because the panic message is formatted using defmt!
the format string must use the same syntax as the logging macros (e.g. info!
).
#[defmt::panic_handler]
You can use the
#[defmt::panic_handler]
to override the panicking behavior of thedefmt::panic!
anddefmt::assert!
macros.
This attribute must be placed on a function with signature fn() -> !
.
This is for example useful, because defmt::panic!
invokes core::panic!
which can result in the panic message being printed twice if your #[core::panic_handler]
also prints the panic message.
This is the case if you use panic-probe
with the print-defmt
feature enabled but not an issue if you are using the panic-abort
crate, for example.
To avoid this issue you can replicate the panicking behavior of the Rust #[panic_handler]
but leave out the part that prints the panic message.
For example:
#[panic_handler] // built-in ("core") attribute
fn core_panic(info: &core::panic::PanicInfo) -> ! {
print(info); // e.g. using RTT
reset()
}
#![allow(unused)] fn main() { extern crate defmt; fn reset() -> ! { todo!() } #[defmt::panic_handler] // defmt's attribute fn defmt_panic() -> ! { // leave out the printing part here reset() } }
If you are using the panic-probe
crate then you should "abort" (call cortex_m::asm::udf
) from #[defmt::panic_handler]
to match its behavior.
π‘ Even if you don't run into the "double panic message printed" issue you may still want to use
#[defmt::panic_handler]
because this waydefmt::panic
anddefmt::assert
will not go through thecore::panic
machinery and that may reduce code size (we recommend you measure the effect of the change).
β οΈ The
#[panic_handler]
attribute cannot be used together with theexport_name
orno_mangle
attributes
Printers
Printers are host programs that receive log data, format it and display it. The following printers are currently available:
-
probe-run
, parses data sent over RTT (ARM Cortex-M only).π‘ If you are using the git version of defmt, make sure you also install the tool from git and not crates.io.
Since v0.3.3,
probe-run
has now a--json
flag to format the output. The main goal of--json
is to produce machine readable output, that can be used to changing the human-readable format, a question addressed here for example. -
defmt-print
, a generic command-line tool that decodes defmt data passed into its standard input. -
qemu-run
, parses data sent by QEMU over semihosting (ARM Cortex-M only).π‘ Used for internal testing and won't be published to crates.io
Encoding
π‘ Most users won't need to change the encoding so this section is mainly informative.
defmt
data can be encoded using one of these 2 formats:
rzcobs
- Reverse Zero-compressing COBS encoding (rzCOBS). This is the default encoding.raw
- raw data, that is no encoding.
In comparison to not using any encoding, rzcobs
compresses the data (uses less transport bandwidth),
and adds some degree of error detection thanks to its use of frames.
The encoding is selected via a Cargo feature on the defmt
crate.
These Cargo features are named encoding-{encoder_name}
, e.g. encoding-rzcobs
and encoding-raw
.
[package]
name = "my-application"
[dependencies.defmt]
version = "0.3.0"
features = ["encoding-rzcobs"] # <- encoding
β οΈ Note that libraries (dependencies) MUST not select the encoder so that applications (top-level crates) can.
If no enocding-*
feature is enabled then the default encoding is used.
The encoding is included in the output binary artifact as metadata so printers will detect it and use the appropriate decoder automatically.
When the rzcobs
encoding is used the printers will skip malformed frames (decoding errors) and continue decoding the rest of the defmt
data.
In contrast, printers handling the raw
encoding will exit on any decoding error.
Customizing the log output
The way a printer outputs logs can be customized by providing a format string to defmt
. The format string takes a set of metadata and format specifiers which can be used to include or exclude certain information when printing the logs.
Basics
The following log will be used as reference in the examples below:
defmt::debug!("hello");
The simplest format string is "{s}"
. This prints the log and nothing else:
hello
Arbitrary text can be added to the format string, which will be printed as specified with each log.
For example, "Log: {s}"
:
Log: hello
Multiple specifiers can be included within a format string, in any order. For example "[{L}] {s}"
prints:
[DEBUG] hello
Metadata specifiers
There are several metadata specifiers available that can be used in a format string.
Log - {s}
This specifier prints the actual log contents. For defmt::info!("hello");
, this specifier prints hello
.
Crate name - {c}
This specifier prints the name of the crate where the log is coming from.
File name - {f}
For a log coming from a file /path/to/crate/src/foo/bar.rs
, this specifier prints bar.rs
.
This specifier can be used to print more detailed parts of the file path. The number of f
s in the specifier determines how many levels up the path should be printed. For example, {ff}
prints foo/bar.rs
, and {fff}
prints src/foo/bar.rs
.
File path - {F}
For a log coming from a file /path/to/crate/src/foo/bar.rs
, this specifier prints /path/to/crate/src/foo/bar.rs
.
Line number - {l}
This specifier prints the line number where the log is coming from.
Log level - {L}
This specifier prints the log level. The log level is padded to 5 characters by default, for alignment purposes. For defmt::info!("hello);
, this prints INFO
.
Module path - {m}
This specifier prints the module path of the function where this log is coming from. This prints my_crate::foo::bar
for the log shown below:
// crate: my_crate
mod foo {
fn bar() {
defmt::info!("hello");
}
}
Timestamp - {t}
This specifier prints the timestamp at which a log was logged, as formatted by defmt::timestamp!
.
Customizing log segments
The way a metadata specifier is printed can be customized by providing additional, optional format specifiers.
Format parameters are provided by adding the formatting parameters after the metadata specifier, separated by colons (:
), like this: "{L:bold:5} {f:white:<10} {s}"
.
Color
A log segment can be specified to be colored by providing a color in the format parameters.
There are three different options for coloring a log segment:
- using a supported color such as
red
orgreen
. severity
colors the log segment using the predefined color for the log level of the log.werror
is similar toseverity
, but it only applies the color if the log level isWARN
orERROR
.
Only one coloring option can be provided in format parameters for a given log segment, i.e. {L:red:green}
is not supported.
The following colors are supported in the format parameters:
black
red
green
yellow
blue
magenta
purple
cyan
white
bright black
bright red
bright green
bright yellow
bright blue
bright magenta
bright purple
bright cyan
bright white
Styles
A log segment can be specified to be printed with a given style by providing a style specifier in the format parameters.
The style specifier must be one of the following strings:
bold
italic
underline
strike
dimmed
Multiple styles can be applied to a single log segment, but they must not be repeated, i.e.
"{s:bold:underline:italic}"
is allowed, but "{s:bold:bold}"
isn't.
Width, alignment and padding
A log segment can be specified to be printed with a given minimum width and alignment by providing a format parameter.
The alignment can be specified to be left (<
), right (>
), or center-aligned (^
). If no alignment is specified, left alignment is used by default.
The minimum width is specified after the alignment. For example, "{L} {f:>10}: {s}"
is printed as follows:
[DEBUG] main.rs: hello
The log segment is padded with spaces by default in order to fill the specified segment width. A specifier can be padded with zeros by prefixing the width specifier with a zero, e.g. {l:03}
prints a line number 24 as 024
.
Nested formatting
Log segments can be grouped and formatted together by nesting formats. Format parameters for the grouped log segments must be provided after the group, separated by %
.
Nested formats allow for more intricate formatting. For example, "{[{L:bold}]%underline} {s}"
prints
[DEBUG] hello
where only DEBUG
is formatted bold, and [DEBUG]
is underlined.
Formats can be nested several levels. This provides a great level of flexibility to customize the logger formatting.
For example, the width and alignment of a group of log segments can be specified with nested formats.
"{{[{L}]%bold} {f}:{l}%35} {s}"
prints something like this:
[DEBUG] main.rs:20 hello
[DEBUG] goodbye.rs:304 goodbye
Restrictions
- Format strings must include the
{s}
metadata specifier. - At the moment it is not possible to escape curly brackets (i.e.
{
,}
) in the format string, therefore curly brackets cannot be printed as part of the logger format. - The same restriction exists for the
%
character.
Passing log format to printers
The format string can be passed to probe-rs
, probe-run
and defmt-print
using the --log-format
option.
This option can be passed to the printer in .cargo/config.toml
, but due to limitations in cargo
, the command has to be split as follows:
# .cargo/config.toml
runner = [
"probe-rs",
"run",
"--chip",
"nRF52840_xxAA",
"--log-format",
"{L} {s}",
]
The format of the logs printed by the host can also be customized using the --host-log-format
option.
JSON output
Structured logging for
probe-run
.
As an alternative to its human-focused output, probe-run
offers structured JSON output. There are two use-cases:
- building software on top of
probe-run
- storing
probe-run
's output, in order to analyze it over time
β οΈ probe-run
v0.3.3+ is necessary to use this feature!
How to use it?
π: Sounds great, how can I use it?
To activate the JSON output, just add the --json
flag to your invocation of probe-run
. If you are using our app-template
edit .cargo/config.toml
like this:
[target.'cfg(all(target_arch = "arm", target_os = "none"))']
- runner = "probe-run --chip $CHIP"
+ runner = "probe-run --chip $CHIP --json"
Now probe-run
will output one line with a JSON object for each log-statement to stdout
, and your output will look similar to this:
$ DEFMT_LOG=debug cargo run --bin levels
{"schema_version":1}
(HOST) INFO flashing program (2 pages / 8.00 KiB)
ββ probe_run @ src/main.rs:93
(HOST) INFO success!
ββ probe_run @ src/main.rs:126
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
{"data":"info","host_timestamp":1643113115873940726,"level":"INFO","location":{"file":"src/bin/levels.rs","line":10,"module_path":{"crate_name":"levels","modules":[],"function":"__cortex_m_rt_main"}},"target_timestamp":"0"}
{"data":"warn","host_timestamp":1643113115873952269,"level":"WARN","location":{"file":"src/bin/levels.rs","line":12,"module_path":{"crate_name":"levels","modules":[],"function":"__cortex_m_rt_main"}},"target_timestamp":"1"}
{"data":"debug","host_timestamp":1643113115873957827,"level":"DEBUG","location":{"file":"src/bin/levels.rs","line":13,"module_path":{"crate_name":"levels","modules":[],"function":"__cortex_m_rt_main"}},"target_timestamp":"2"}
{"data":"error","host_timestamp":1643113115873981443,"level":"ERROR","location":{"file":"src/bin/levels.rs","line":14,"module_path":{"crate_name":"levels","modules":[],"function":"__cortex_m_rt_main"}},"target_timestamp":"3"}
{"data":"println","host_timestamp":1643113115873987212,"level":null,"location":{"file":"src/bin/levels.rs","line":15,"module_path":{"crate_name":"levels","modules":[],"function":"__cortex_m_rt_main"}},"target_timestamp":"4"}
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
(HOST) INFO device halted without error
ββ probe_run::backtrace @ src/backtrace/mod.rs:108
π€―: But wait a moment?! ... That is not only JSON! How am I supposed to process that?
That is easy. As mentioned, the JSON output goes to stdout
. All the other output, like host logs and backtraces go to stderr
and therefore can be processed separately.
For example, you can redirect the JSON output to a file and still see the host logs in the terminal:
$ DEFMT_LOG=debug cargo rb levels > levels.json
(HOST) INFO flashing program (2 pages / 8.00 KiB)
ββ probe_run @ src/main.rs:93
(HOST) INFO success!
ββ probe_run @ src/main.rs:126
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
(HOST) INFO device halted without error
ββ probe_run::backtrace @ src/backtrace/mod.rs:108
Afterwards levels.json
looks like this:
{"schema_version":1}
{"data":"info","host_timestamp":1643113389707243978,"level":"INFO","location":{"file":"src/bin/levels.rs","line":10,"module_path":{"crate_name":"levels","modules":[],"function":"__cortex_m_rt_main"}},"target_timestamp":"0"}
{"data":"warn","host_timestamp":1643113389707290115,"level":"WARN","location":{"file":"src/bin/levels.rs","line":12,"module_path":{"crate_name":"levels","modules":[],"function":"__cortex_m_rt_main"}},"target_timestamp":"1"}
{"data":"debug","host_timestamp":1643113389707299759,"level":"DEBUG","location":{"file":"src/bin/levels.rs","line":13,"module_path":{"crate_name":"levels","modules":[],"function":"__cortex_m_rt_main"}},"target_timestamp":"2"}
{"data":"error","host_timestamp":1643113389707306961,"level":"ERROR","location":{"file":"src/bin/levels.rs","line":14,"module_path":{"crate_name":"levels","modules":[],"function":"__cortex_m_rt_main"}},"target_timestamp":"3"}
{"data":"println","host_timestamp":1643113389707313290,"level":null,"location":{"file":"src/bin/levels.rs","line":15,"module_path":{"crate_name":"levels","modules":[],"function":"__cortex_m_rt_main"}},"target_timestamp":"4"}
π€: That seems convenient, but what is this schema version in the first line?
It indicates the version of the json format you are using. probe-run
will always output it as a header at the beginning of each stream of logs. We anticipate that the format will slightly change while probe-run
and defmt
evolve. Using this version you always know which revision is in use and can act upon that.
π€: Sounds great!
Data transfer objects
π€: So, what can I do with the JSON output?
There really are no boundaries. You can process the JSON with any programming language you like and also store it any data store of your choice to process and analyze it later. If you are saving the output for later, it might make sense to store the schema version together with additional metadata like e.g. a device id or firmware version. One option is to use a program like jq to extract the parts of interest.
If you wish to deserialize the entire data back into a Rust program, you will need to be able to decode the SchemaVersion
object at the start of the stream, as well as the JsonFrame
objects which follow after the schema version. To do that, we supply a few things in defmt_json_schema
:
- a
SchemaVersion
struct indefmt_json_schema::SchemaVersion
, - a versioned
JsonFrame
struct indefmt_json_schema::{schema_version}::JsonFrame
and - a
SCHEMA_VERSION
constant for each version of theJsonFrame
indefmt_json_schema::{version}::SCHEMA_VERSION
.
You can use all of this together with serde_json
like following:
extern crate defmt_json_schema; extern crate serde_json; use defmt_json_schema::{v1, SchemaVersion}; const DATA: &str = r#"{"schema_version":1} {"data":"Hello, world!","host_timestamp":1642698490360848721,"level":null,"location":{"file":"src/bin/hello.rs","line":9,"module_path":{"crate_name":"hello","modules":[],"function":"__cortex_m_rt_main"}},"target_timestamp":"0"} {"data":"S { a: 8 }","host_timestamp":1642698490361019228,"level":"INFO","location":{"file":"src/bin/hello.rs","line":26,"module_path":{"crate_name":"hello","modules":["{impl#0}"],"function":"abc"}},"target_timestamp":"1"}"#; fn main() { let mut data = DATA.lines().collect::<Vec<_>>(); // first we decode the schema version let schema_version: SchemaVersion = serde_json::from_str(data[0]).unwrap(); // and then handle the rest of the data (depending on the schema version) match schema_version { v1::SCHEMA_VERSION => handle_v1(&data[1..]), // v2::SCHEMA_VERSION => handle_v2(&data[1..]), _ => unreachable!(), }; } fn handle_v1(data: &[&str]) { println!("Detected version \"1\" of JsonFrame!"); use v1::JsonFrame; for &data in data.iter() { let json_frame: JsonFrame = serde_json::from_str(data).unwrap(); println!("{:?}", json_frame); } }
You can find an example with reading the content from a file here.
Migrating from v0.2.x
to v0.3.0
This guide covers how to upgrade a library or application using defmt v0.2.x
to version v0.3.0
.
Cargo.toml
Update the version of defmt
to "0.3"
(or "0.3.0"
, which is equivalent).
Additionally please remove the defmt-*
cargo features from your [features]
section.
[dependencies]
- defmt = "0.2"
+ defmt = "0.3"
[features]
default = [
"other-feature",
- "defmt-default",
- "dependency-a/defmt-trace",
]
other-feature = []
- defmt-default = []
- defmt-trace = []
- defmt-debug = []
- defmt-info = []
- defmt-warn = []
- defmt-error = []
Set the log-level with DEFMT_LOG
Setting the log-level via cargo features is superseded by the new DEFMT_LOG
environment variable.
To log everything on the INFO
level and above, run your application like following:
$ DEFMT_LOG=info cargo run
For more details how to configure the log-level using DEFMT_LOG
see the user docs.
This new mechanism is modelled to be similar to the well-known RUST_LOG
and now also supports log-configuration down to the module-level!
Upgrade display hints
π‘ Easily change the display hints, using the search-and-replace feature of your editor. (vs code)
Rename display hint Β΅s
to us
Due to ambiguity in-between Β΅
(micro sign) and ΞΌ
(small mu), the display hint for microseconds changed to be us
.
Therefore you likely need to update your timestamp definition.
- defmt::timestamp!("{=u32:Β΅s}", {
+ defmt::timestamp!("{=u32:us}", {
// ...
});
As well as all other logging calls where you were using Β΅s
.
- defmt::info!("{=u8:Β΅s}", time)
+ defmt::info!("{=u8:us}", time)
Drop u24
type hint
The u24
type hint got dropped, cause it was confusing users and complicated internal parser logic.
Therefore replace it with u32
in all logging calls.
- defmt::info!("{=u24}", 42);
+ defmt::info!("{=u32}", 42);
Adapt manual trait Logger
implementations
The Logger
trait has seen a couple of big changes, for one the function signatures of a few methods have changed, the previous Write
trait got removed while its write
method is part of Logger
now and the method flush
was added.
π‘ If you are using one of our loggers,
defmt-rtt
anddefmt-itm
, no action is required!
Let's see what a new implementation of the Logger
in crate defmt-itm
looks like compared to the previous implementation in version 0.2
. The following abbreviated example code shows how the Logger
worked before.
#[defmt::global_logger]
struct Logger;
static TAKEN: AtomicBool = AtomicBool::new(false);
unsafe impl defmt::Logger for Logger {
fn acquire() -> Option<NonNull<dyn defmt::Write>> {
// disable interrupts
if !TAKEN.load(Ordering::Relaxed) {
// acquire the lock
TAKEN.store(true, Ordering::Relaxed);
Some(NonNull::from(&Logger as &dyn defmt::Write))
} else {
None
}
}
unsafe fn release(_: NonNull<dyn defmt::Write>) {
// release the lock
TAKEN.store(false, Ordering::Relaxed);
// re-enable interrupts
}
}
impl defmt::Write for Logger {
fn write(&mut self, bytes: &[u8]) {
unsafe { itm::write_all(&mut (*ITM::ptr()).stim[0], bytes) }
}
}
And here is how it conceptually works now:
#![allow(unused)] fn main() { extern crate cortex_m; extern crate defmt; use std::sync::atomic::{AtomicBool,Ordering}; use cortex_m::{itm,peripheral::ITM}; #[defmt::global_logger] struct Logger; static TAKEN: AtomicBool = AtomicBool::new(false); static mut ENCODER: defmt::Encoder = defmt::Encoder::new(); unsafe impl defmt::Logger for Logger { fn acquire() { // disable interrupts if TAKEN.load(Ordering::Relaxed) { panic!("defmt logger taken reentrantly") } // acquire the lock TAKEN.store(true, Ordering::Relaxed); unsafe { ENCODER.start_frame(do_write) } } unsafe fn flush() { // Omitted. We will come to this in a second. } unsafe fn release() { ENCODER.end_frame(do_write); // release the lock TAKEN.store(false, Ordering::Relaxed); // re-enable interrupts } unsafe fn write(bytes: &[u8]) { ENCODER.write(bytes, do_write); } } fn do_write(bytes: &[u8]) { unsafe { itm::write_all(&mut (*ITM::ptr()).stim[0], bytes) } } }
Let us go through the changes step-by-step:
- Drop
trait Write
:- Extract the
fn write
from thetrait Write
and name itfn do_write
. - Remove the
impl defmt::Write for Logger
. - Remove the the first argument of
&mut self
fromfn do_write
.
- Extract the
- Add a new
static mut ENCODER: defmt::Encoder = defmt::Encoder::new()
outside theimpl defmt::Logger for Logger
-block. - Adapt
fn acquire
:- Remove the return type
Option<NonNull<dyn defmt::Write>>
fromfn acquire
. - Replace all
return None
with an explicitpanic!
, with a descriptive error message. - Call
unsafe { ENCODER.start_frame(do_write) }
, after you acquired the lock.
- Remove the return type
- Adapt
fn release
:- Call
ENCODER.end_frame(do_write);
, before releasing the lock.
- Call
- Add new method
unsafe fn write
toimpl defmt::Logger for Logger
:extern crate defmt; static mut ENCODER: defmt::Encoder = defmt::Encoder::new(); fn do_write(bytes: &[u8]) {} unsafe fn write(bytes: &[u8]) { ENCODER.write(bytes, do_write); }
And that is already it!
Flush
One final thing is left before your custom trait Logger
implementation works again: You need to implement fn flush
.
This functionality is what gets used when calling defmt::flush
, whose docs say:
Block until host has read all pending data.
The flush operation will not fail, but might not succeed in flushing all pending data. It is implemented as a βbest effortβ operation.
The idea is to ensure that all data is read by the host. Take defmt-rtt
as an example:
#![allow(unused)] fn main() { extern crate defmt; use std::sync::atomic::{AtomicUsize,Ordering}; #[defmt::global_logger] struct Logger; static READ: AtomicUsize = AtomicUsize::new(0); static WRITE: AtomicUsize = AtomicUsize::new(0); unsafe impl defmt::Logger for Logger { fn acquire() { // ... } unsafe fn flush() { // busy wait, until the read- catches up with the write-pointer let read = || READ.load(Ordering::Relaxed); let write = || WRITE.load(Ordering::Relaxed); while read() != write() {} } unsafe fn release() { // ... } unsafe fn write(bytes: &[u8]) { // ... } } }
If your transport doesn't allow to ensure that all data got read, flush
should at least flush as much data as possible. Take defmt-itm
as an example for this:
#![allow(unused)] fn main() { extern crate defmt; #[defmt::global_logger] struct Logger; unsafe impl defmt::Logger for Logger { fn acquire() { // ... } unsafe fn flush() { // wait for the queue to be able to accept more data while !stim_0().is_fifo_ready() {} // delay "a bit" to drain the queue // This is a heuristic and might be too short in reality. // Please open an issue if it is! asm::delay(100); } unsafe fn release() { // ... } unsafe fn write(bytes: &[u8]) { // ... } } // mock cortex_m-crate struct STIM0; impl STIM0 { fn is_fifo_ready(&self) -> bool { true } } fn stim_0() -> STIM0 { STIM0 } mod asm { pub fn delay(cycles: usize) {} } }
defmt::flush
can be used before a hard-reset of the device, where you would loose data if you do not flush.
Since you are the expert of your transport, implement the method now!
Unified probe-run
backtrace options
The new --backtrace
and --backtrace-limit
of probe-run
should simplify the configuration.
cargo run --bin panic --backtrace=always --backtrace-limit=5
Using --backtrace
you can now specify if you want to have a backtrace never
, always
or only in case of an exception (the auto
option, which is the default). For the latter two options you can specify the maximum backtrace length, which defaults to 50
and can be set to unlimited with 0
.
See the probe-run
-README for all the options.
Congrats! π
If you followed all the steps in this guide, your application should work flawlessly again and make use of all the internal and user-facing improvements shipped in this release!
Design & impl notes
Unstructured, braindump-ish notes about the design and implementation of defmt
WARNING the notes here may not accurately reflect the current implementation. This document is synchronized with the implementation at a best effort basis.
Optimization goals
defmt
optimizes for data throughput first and then for runtime cost.
Constraints
No double compilation
Say you want print logs from target/device app that uses crate foo
.
That crate foo
uses the Format
trait on some of its data structures.
In this scenario we want to avoid having to compile foo
for the host.
In other words, foo
should only be (cross) compiled for the target device.
This is the biggest difference between defmt
and some serde
library that does binary serialization.
The serde
library requires a Deserialize
step that requires compiling foo
for the host (see derive(Serialize)
).
defmt
avoids this by placing all the required information for formatting in a "side table" (see the interning section).
This comes with the downside that the host can only perform limited actions on the data it receives from the device: namely formatting.
The host cannot invoke foo::Struct.method()
for example but that may not even be a sensible operation on the host anyways, e.g. foo::USB::RegisterValue.store_volatile()
would make the host crash.
We want to avoid this "double" compilation (cross compile for the target and compile for the host) because:
- it doubles compilation time (wait time)
- compiling device-specific code for the host can become a headache quickly: see inline/external assembly, build scripts, etc.
Interning
All string literals are interned in a custom ELF section. This has proven to be the way that requires the less post-processing and implementation work. It is not without downsides as we'll see.
The basic pattern for interning a string is this:
#[export_name = "the string that will be interned"]
#[link_section = ".my_custom_section.some_unique_identifier"]
// ^ this is the INPUT linker section
static SYM: u8 = 0;
// index of the interned string
let index = &SYM as *const u8 as usize;
A linker script is required to group all these strings into a single OUTPUT linker section:
SECTIONS
{
/* NOTE: simplified */
.my_custom_section /* <- name of the OUTPUT linker section */
(INFO) /* <- metadata section: not placed in Flash */
: 0 /* <- start address of this section */
{
*(.my_custom_section.*); /* <- name of the INPUT linker section */
/*^ ^ glob pattern */
/*^ from any object file (~= crate) */
}
}
With this linker script the linker will tightly pack all the interned strings in the chosen linker section. The linker will also discard strings that end no being used in the final binary AKA "garbage collection". Garbage collection will only work correctly if every string is placed in a different INPUT linker section.
After you have linked the program you can display the interned strings using the nm
tool.
$ arm-none-eabi-nm -CSn elf-file
00000000 00000001 N USB controller is ready
00000001 00000001 N entering low power mode
00000002 00000001 N leaving low power mode
(..)
The nm
shows all the symbols in the ELF file.
In ELF files one function = one symbol and one static variable = one symbol.
So function foo
will show as crate_name::module_name::foo
in the nm
output; same thing with a static variable X
.
The four columns in the output, from left to right, contain:
- the address of the symbol
- the size of the symbol in bytes
- the type of the symbol
- the symbol name
As you can see the interned string is the symbol name. Although we cannot write:
static "USB controller is ready": u8 = 0;
We can write:
#![allow(unused)] fn main() { #[export_name = "USB controller is ready"] static SYM: u8 = 0; }
The next thing to note is that each interned string symbol is one byte in size (because static SYM
has type u8
).
Thanks to this the addresses of the symbols are consecutive: 0, 1, 2, etc.
Dealing with duplicates
The linker hates it when it finds two symbol that have the same name. For example, this is an error:
#![allow(unused)] fn main() { #[no_mangle] static X: u32 = 0; #[export_name = "X"] static Y: u32 = 0; //~ error: symbol `X` is already defined }
This produces two symbols with the name "X".
rustc
catches this issue early and reports an error at compile time.
How can this occur in logging? The user may write:
#![allow(unused)] fn main() { extern crate defmt; fn foo() { defmt::info!("foo started .."); // .. defmt::info!(".. DONE"); // <- } fn bar() { defmt::info!("bar started .."); // .. defmt::info!(".. DONE"); // <- } }
Because macros are expanded in isolation each info!(".. DONE")
statement will produce this to intern its string:
#[export_name = ".. DONE"]
#[link_section = ".."]
static SYM: u8 = 0;
which results in a collision.
To avoid this issue we store each interned string as a JSON object with 3 fields: the message itself, the name of the crate that invoked the macro, and a 64-bit integer "disambiguator". The disambiguator is a hash of the source code location of the log statement so it should be unique per crate. Now these two macro invocations will produce something like this:
// first info! invocation
{
#[export_name = "{ \"package\": \"my-app\", \"data\": \".. DONE\", \"disambiguator\": \"1379186119\" }"]
#[link_section = ".."]
static SYM: u8 = 0;
}
// ..
// second info! invocation
{
#[export_name = "{ \"package\": \"my-app\", \"data\": \".. DONE\", \"disambiguator\": \"346188945\" }"]
#[link_section = ".."]
static SYM: u8 = 0;
}
These symbols do not collide because their disambiguator fields are different so the program will link correctly.
Because duplicate strings are kept in the final binary this linker-based interner is not really an interner. A proper interner returns the same index when the same string is interned several times.
However, two log statements that log the same string will often have different source code locations. Assigning a different interner index to each log statement means we can distinguish between the two thus we can report their correct source code location.
Logging levels
defmt
supports several logging levels.
To avoid serializing the logging level at runtime (that would reduce throughput), interned strings are clustered by logging level.
The defmt
linker script looks closer to this:
SECTIONS
{
.defmt (INFO) : 0
{
*(.defmt.error.*); /* cluster of ERROR level log strings */
_defmt_warn = .; /* creates a symbol between the clusters */
*(.defmt.warn.*); /* cluster of WARN level log strings */
_defmt_info = .;
*(.defmt.info.*);
_defmt_debug = .;
*(.defmt.debug.*);
_defmt_trace = .;
*(.defmt.trace.*);
}
}
And the string interning that each logging macro does uses a different input linker section. So this code:
#![allow(unused)] fn main() { extern crate defmt; defmt::warn!("Hello"); defmt::warn!("Hi"); defmt::error!("Good"); defmt::error!("Bye"); }
Would expand to this:
// first warn! invocation
{
#[export_name = "{\"package\":\"my-app\",\"tag\":\"defmt_warn\",\"data\":\"Hello\",\"disambiguator\":\"8864866341617976971\"}"]
#[link_section = ".defmt.{\"package\":\"my-app\",\"tag\":\"defmt_warn\",\"data\":\"Hello\",\"disambiguator\":\"8864866341617976971\"}"]
static SYM: u8 = 0;
}
// ..
// first error! invocation
{
#[export_name = "{\"package\":\"my-app\",\"tag\":\"defmt_error\",\"data\":\"Bye\",\"disambiguator\":\"2879057613697528561\"}"]
#[link_section = ".defmt.{\"package\":\"my-app\",\"tag\":\"defmt_error\",\"data\":\"Bye\",\"disambiguator\":\"2879057613697528561\"}"]
static SYM: u8 = 0;
}
Then after linking we'll see something like this in the output of nm
:
$ arm-none-eabi-nm -CSn elf-file
00000000 00000001 N Bye
00000001 00000001 N Good
00000002 00000000 N _defmt_warn
00000002 00000001 N Hi
00000003 00000001 N Hello
00000003 00000000 N _defmt_info
(..)
There you can see that ERROR level logs are clustered at the beginning.
After that cluster comes the cluster of WARN level logs.
Between the two clusters you see the zero-sized _defmt_warn
symbol.
We know before-hand the name of the _defmt_*
symbols which are used as delimiters.
We can look their addresses first and when we lookup a string index in this table we can compare it to those addresses to figure the logging level it corresponds to.
- if
index < indexof(_defmt_warm)
then ERROR log level - if
indexof(_defmt_warn) <= index < indexof(_defmt_info)
then WARN log level
And so on so forth.
Serialization
In this section we'll see how log data is "put on the wire".
Interned strings
Let's ignore timestamps for now and also ignore how access to the global logger is synchronized. This is the simplest case: logging a string literal with no formatting.
#![allow(unused)] fn main() { extern crate defmt; defmt::info!("Hello, world!"); }
As we saw in the previous section this string will get interned.
Interning converts the string into a usize
index.
This usize
index will be further compressed using LEB128.
Some examples: (values on the right are u8
arrays)
1usize
->[1]
127usize
->[127]
128usize
->[128, 1]
255usize
->[255, 1]
Because string indices start at zero and it's unlikely that a program will intern more than 2^14 strings string literals will be serialized as 1 or 2 bytes indices.
Integers
Integers will be serialized in little endian order using to_le_bytes()
.
usize
and isize
values will be subject to LEB128 compression.
#![allow(unused)] fn main() { extern crate defmt; defmt::error!("The answer is {=i16}!", 300); // on the wire: [3, 44, 1] // string index ^ ^^^^^ `300.to_le_bytes()` // ^ = intern("The answer is {=i16}!") defmt::error!("The answer is {=u32}!", 131000); // on the wire: [4, 184, 255, 1, 0] // ^^^^^^^^^^^^^^^ 131000.to_le_bytes() defmt::error!("The answer is {=usize}!", 131000); // on the wire: [4, 184, 255, 1] // ^^^^^^^^^^^ 131000.to_le_bytes()[..3] }
NOTE(japaric) unclear to me if LEB128 encoding (more compression but more)
u16
andu32
is worth the trade-off
TODO(japaric) evaluate zigzag encoding for
isize
?
Slices
For slices ({=[u8]}
) the length is LEB128 encoded and serialized first and then followed by the slice data.
#![allow(unused)] fn main() { extern crate defmt; defmt::error!("Data: {=[u8]}!", [0, 1, 2]); // on the wire: [1, 3, 0, 1, 2] // string index ^ ^ ^^^^^^^ the slice data // LEB128(length) ^ }
Strings
Strings that are passed directly (i.e. not as indices of interned strings) as format string parameters ({:str}
) must be prefixed with their LEB128 encoded length.
This behavior is analogous to that of Slices.
#![allow(unused)] fn main() { extern crate defmt; defmt::error!("Hello, {=str}!", "world"); // on the wire: [1, 5, 199, 111, 114, 108, 100] // string index ^ ^ ^^^^^^^^^^^^^^^^^^^^^^^ the slice data // LEB128(length) ^ }
Arrays
For arrays ({=[u8; N]}
) the length is not serialized.
#![allow(unused)] fn main() { extern crate defmt; defmt::error!("Data: {=[u8; 3]}!", [0, 1, 2]); // on the wire: [1, 0, 1, 2] // string index ^ ^^^^^^^ the slice data }
Bitfield
The integer argument is serialized in little endian format (to_le_bytes
).
#![allow(unused)] fn main() { extern crate defmt; let x = 0u16; defmt::error!("l: {0=0..8}, m: {0=8..12}, h: {0=12..16}", x /*: u16*/); // on the wire: [1, 1, 2] // string index ^ ^^^^ `u16::to_le_bytes(x)` }
Leading or trailing bytes that are not needed to display a bitfield are removed during serialization:
#![allow(unused)] fn main() { extern crate defmt; defmt::error!("m: {0=8..12}", 0b0110_0011_0000_1111_u32); // on the wire: [1, 0b0110_0011] // string index ^ ^^^^^^^^^^ argument truncated into u8: // leading and trailing byte are irrelevant }
Bool
Booleans are compressed in bytes, bitflags-style.
#![allow(unused)] fn main() { extern crate defmt; defmt::error!("x: {=bool}, y: {=bool}, z: {=bool}", false, false, true); // on the wire: [1, 0b001] // string index ^ ^^^^^ the booleans: `0bxyz` }
When mixed with other data, the first {=bool}
allocates an output byte that
fits up to 7 more bools.
{=bool}
s in the formatting string are batched together as follows: Any non-{=bool}
arguments are emitted as-is, while {=bool}
arguments are collected into a byte and emitted when 8 {=bool}
s have been collected. This means that for every set of 8 {=bool}
s, the byte containing them will be serialized at the position of their last member.
If more than 0 but less than 8 {=bool}
s have been encountered at the end of the log frame, a byte containing them will be emitted last.
#![allow(unused)] fn main() { extern crate defmt; defmt::error!("x: {=bool}, y: {=u8}, z: {=bool}", false, 0xff, true); // on the wire: [1, 0xff, 0b01] // string index ^ ^^^^^ ^^^^ the booleans: `0bxz` // | // u8 }
β οΈ If the final parameter is not a {=bool}
but there are yet to be compressed {=bool}
s present in the format string beforehand, the final output byte containing all compressed booleans will be at the end.
#![allow(unused)] fn main() { extern crate defmt; defmt::error!("x: {=bool}, y: {=u8}", false, 0xff); // on the wire: [1, 0xff, 0b0,] // string index ^ ^^^^^ ^^^^ the booleans: `0bx` // | // u8 }
β οΈ If some {=bool}
s are nested inside a struct, they will still be compressed as if they were passed as regular arguments.
#![allow(unused)] fn main() { extern crate defmt; use defmt::Format; #[derive(Format)] struct Flags { a: bool, b: bool, } defmt::error!("x: {=bool}, {=?}", false, Flags { a: true, b: false }); // on the wire: [1, 2, 0b010,] // string index ^ ^ ^^^^ all booleans: `0bxab` // | // index of "Flags { a: {=bool}, b: {=bool}} " }
Format
The untyped argument (=?
) requires one level of indirection during serialization.
First let's see how a primitive implements the Format
trait:
#![allow(unused)] fn main() { extern crate defmt; macro_rules! internp { ($l:literal) => { defmt::export::make_istr(0) } } trait Format { fn format(&self, fmt: defmt::Formatter); } impl Format for u8 { fn format(&self, fmt: defmt::Formatter) { let t = internp!("{=u8}"); defmt::export::istr(&t); defmt::export::u8(self) // on the wire: [1, 42] // string index ^ ^^ `self` } } }
Format
will use the write!
macro.
This will send the string index of {=u8}
followed by the one-byte data.
In general, write!
can use {=?}
so Format
nesting is possible.
Now let's look into a log invocation:
#![allow(unused)] fn main() { extern crate defmt; defmt::error!("The answer is {=?}!", 42u8); // on the wire: [2, 1, 42] // string index ^ ^^^^^ `42u8.format(/*..*/)` // ^ = intern("The answer is {=?}!") }
This will send the string index of "The answer is {:?}!" and invoke the argument's Format::format
method.
Format Slices
{=[?]}
will serialize the length (LEB128 compressed) first, then the first element will be serialized in (recursively) tagged format. The rest of elements will be serialized untagged.
"Tagged" means that the data will be preceded by the string indices that indicate how to format the data.
Example:
#![allow(unused)] fn main() { extern crate defmt; use defmt::Format; #[derive(Format)] struct X { y: Y, } #[derive(Format)] struct Y { z: u8, } fn serialize() { let xs = [X { y: Y { z: 42 }}, X { y: Y { z: 24 }}]; defmt::info!("{=[?]}", &xs[..]); // on-the-wire: [ // 1, // "{=[?]}" // 2, // `leb(xs.len())` // 2, // "X {{ y: {=?} }}" / outer tag with format nesting through `=?` // 3, // "Y {{ z: {=u8} }}" / inner tag // 42, // xs[0].y.z // (no tags for the second element) // 24, // xs[1].y.z // ] } }
Single Format trait
core::fmt
has several formatting traits, like Hex
and Bin
.
These appear as different formatting parameters, like :x
and :b
, in format strings and change how integers are formatted: 15
vs 0xF
vs 0b1111
.
defmt
does not have all these formatting traits.
The rationale is that the device should not make the decision about how an integer is formatted.
The formatting is done in the host so the host should pick the format.
With interactive displays, e.g. web UI, it even becomes possible to change the format on demand, e.g. click the number to change from decimal to hexadecimal representation.
Timestamp
defmt::timestamp!
needs to be as efficient as possible, because it is implicitly invoked on every single log invocation.
The timestamp format string index is not transmitted over the wire.
Instead, it is marked with a defmt_timestamp
tag and the decoder loads it from the ELF file.
Linker magic is used to make sure that it doesn't get defined twice, and that the symbol doesn't get discarded (which can happen since its address is never used).
The us
format specifier was introduced to allow replicating the timestamp format of previous defmt versions, which always used a LEB128-encoded u64
timestamp and treated it as a number of microseconds.
Global logger
The global logger needs to operate correctly (be memory safe and not interleave log data) in presence of race conditions and re-entrant invocations. Race conditions can be avoided with mutexes but re-entrancy can occur even if mutexes are used and shouldn't result in deadlocks.
Re-entrancy
Where can re-entrancy occur? Turns out that with global singletons it can occur about anywhere; you don't need interrupts (preemption) to cause re-entrancy. See below:
#![allow(unused)] fn main() { extern crate defmt; let x = 0u8; defmt::info!("The answer is {=?}!", x /*: Struct */); }
As you have seen before this will first send the string index of "The answer is {=?}!" and then call x
's Format::format
method.
The re-entrancy issue arises if the Format
implementation calls a logging macro:
#![allow(unused)] fn main() { extern crate defmt; struct X; impl defmt::Format for X { fn format(&self, f: defmt::Formatter) { // ^ this is a handle to the global logger defmt::info!("Hello!"); // .. } } }
f
is a handle to the global logger.
The info!
call inside the format
method is trying to access the global logger again.
If info!
succeeds then you have two exclusive handles (Formatter
) to the logger and that's UB.
If info!
uses a spinlock to access the logger then this will deadlock.
Acquire-release
One solution to the re-entrancy issue that's deadlock-free is to make the log macros take the logger and hold it until it's done with it. In case of nesting any inner take attempt will silently fail.
So the log macros may expand to something like this: (let's ignore data races / race conditions for now)
#![allow(unused)] fn main() { struct Logger; impl Logger { fn acquire() -> Option<Self> { None } fn serialize_interned_string_and_etc(&self) {} } fn release<T>(_: T) {} if let Some(logger) = Logger::acquire() { logger.serialize_interned_string_and_etc(); release(logger); // <- logger can be acquired again after this } else { // silent failure: do nothing here } }
This means that invoking logging macros from Format
implementations will silently fail.
But note that allowing such operation would result in interleaving of log frames.
To a decoder/parser interleaved log frames are the same as corrupted log frames.
So we actually want to forbid this operation.
Evaluation order
Consider this log invocation:
#![allow(unused)] fn main() { extern crate defmt; defmt::info!("x={=?}", foo()); fn foo() -> u8 { defmt::info!("Hello"); 42 } }
Depending on when foo
is invoked this can result in potential re-entrancy / nesting and cause info!("Hello")
to be lost.
So we'll make the macro evaluate format arguments before the acquire operation.
Something like this:
(core::fmt
does a similar match
operation)
#![allow(unused)] fn main() { struct Logger; impl Logger { fn acquire() -> Option<Self> { None } } fn foo() -> u8 { 0 } match (foo()) { // evaluate formatting arguments (_0) => { if let Some(logger) = Logger::acquire() { // serialize `_0`, etc. } } } }
Preemption
Preemption can also result in re-entrancy. How to deal with it? Assuming single-core systems there are two approaches:
-
Disable interrupts in
acquire
; re-enable them inrelease
. This means that the logging macros block higher priority interrupts. -
Have a separate logger per priority level.
acquire
andrelease
are now lock-free and don't block interrupts. This requires multiplexing in the transport.
Deserialization
The host has received the log data (binary data). How to make sense of it?
Let's assume:
- no data loss during transport (reliable transport)
- no interleaving of log frames (no nesting of logging macros)
With these assumptions the decoder can expect the stream of log data to be a series of log frames.
Log frames
Each log statement produces one log frame. Consider this log call: (let's include the timestamp this time)
#![allow(unused)] fn main() { extern crate defmt; defmt::info!("answer={=u8}", 42); // on the wire: [2, 125, 42] <- arguments // string index ^ ^^^ timestamp }
A log frame will consist of:
- A string index that must be either of the error, warn, info, debug or trace kind.
- String indices generated by
write!
(used inFormat
implementations) are of a different kind
- String indices generated by
- A timestamp (LEB128 encoded)
- Zero or more formatting arguments
To be able to decode the last component the host will have to lookup the format string, whose index is the first part of the log frame, and parse it. Parsing that string will tell the host how many and how big (in bytes) the formatting arguments are.
Lookup
We have so far looked at the string table using nm
.
Programmatically the table can be found in the .symtab
section.
Each entry in this table represents a symbol and each entry has:
shndx
, a section header index (?). This should match the index of the.defmt
section.value
, this is the address of the symbol. For.defmt
, this is the string indexname
, an index into some data structure full of strings.get_name
returns the interned string- the other info is not relevant