— Gavin Gray
Rough data hit my eyes recently and the takeaway was unsurprising: Rust associated types are misunderstood. In Chapter 17 of the Rust Book Experiment Will Crichton decided to inject questions on API design. The questions have been up long enough for us to see the preliminary data, and so far, the results are not great. Many questions involving associated types on traits were answered incorrectly. I’ve been knee-deep in Rust traits recently and feel I have enough energy to channel some passion into a post on associated types. This post serves two purposes
The second purpose, really a goal, may sound scary to you at first, but it’s my job to convince you
Associated types are used all over the Rust standard library. If you’re unfamiliar with Rust, it may shock you to find that common operations
are implemented with traits and associated types, but let’s not get ahead of ourselves. Before we dig into the broader use of associated types let’s look at one common example, the Mul trait.
pub trait Mul<Rhs> {
type Output;
fn mul(self, rhs: Rhs) -> Self::Output;
}
pub trait BadMul<Rhs, Output> {
fn mul(self, rhs: Rhs) -> Output;
}
In the trait definition for BadMul there are three type parameters: Self, Rhs, and Output. Self is an implicit type parameter to every trait and denotes the type for which the trait will be implemented. In the context of multiplication, Self is the type of the left-hand side operand. This means that the method mul has the type signature fn mul(self: Self, rhs: Rhs) -> Output. The implementation for BadMul between two integers is as follows
impl BadMul<i32, i32> for i32 {
fn mul(self, rhs: i32) -> i32 {
self * rhs
}
}
Edit, in Rust, like any language with sized integers, multiplication between two i32s may overflow. I’ve chosen to ignore overflow for this post.In this trait implementation we specified that all three type parameters, Self, Rhs, and Output, equal i32. In elementary school you memorized integer multiplication tables and as you got older learned that we can multiply much more. Let’s take a breath and jot down some of the types we could substitute for our type parameters.
The last type signature makes no sense. When you multiply an integer and a float you get a float, not a two-dimensional matrix, at least in traditional multiplication. So what went wrong? The type for Output is dependent on the actual types for Self and Rhs. We aren’t free to choose the type for Output if we’ve chosen Self and Rhs—there exists one natural choice for Output once the others have been fixed. The dependency between type parameters indicates that Output should be an associated type on the trait BadMul. Of course, after moving Output to be an associated type we should also rename the trait to GoodMul, or to std::ops::Mul. I claim that if such a dependency exists between type parameters, the dependent type parameter should become an associated type.
Before moving forward, let’s correct the record by providing a better Mul trait definition with Output as an associated type and an example implementation for multiplying integers and floats.
pub trait Mul<Rhs> {
type Output;
fn mul(self, rhs: Rhs) -> Self::Output;
}
impl Mul<f32> for i32 {
type Output = f32;
fn mul(self, rhs: f32) -> Self::Output {
self as f32 * rhs
}
}
If you were to stop reading here the rule-of-thumb for whether or not a type parameter should be an associated type would read
A type parameter dependent on other type parameters should be an associated type.
Though I do believe this is an OK rule-of-thumb, it’s more hand-wavy than I’d like. Let’s make it more specific by considering one more change to the Mul trait definition. Consider the type parameter Rhs, should Rhs be an associated type? Is Rhs dependent on the type of Self? To help you decide, let’s assign Self to be the concrete type Vec<i32>. Provided this assignment to Self, let’s list types we could assign to Rhs.
We should not assign Rhs = Vec<i32> because Vec<i32> * Vec<i32> is an ambiguous operation. Multiplication between vectors could mean element-wise multiplication, an inner vector product, or an outer vector product. Though some languages/libraries do choose a default operation for vector multiplication, choosing a default is, in my opinion, poor design and for this post we will not consider it valid. Therefore, the choice to assign Vec<i32> to Self has reduced the possible types assignable to Rhs. So yes, Rhs is dependent on Self, but should it be an associated type? The answer is no. To see why, let’s write another trait definition that declares Rhs an associated type. This version shall be called RestrictedMul.
trait RestrictedMul {
type Rhs;
type Output;
fn mul(self, rhs: Self::Rhs) -> Self::Output;
}
impl RestrictedMul for i32 {
type Rhs = i32;
type Output = i32;
fn mul(self, rhs: Self::Rhs) -> Self::Output {
self * rhs
}
}
impl RestrictedMul for i32 {
type Rhs = f32;
type Output = f32;
fn mul(self, rhs: Self::Rhs) -> Self::Output {
self * rhs
}
}
The above code fails to type check. See for yourself in this playground
error[E0119]: conflicting implementations of trait `RestrictedMul` for type `i32`
--> src/main.rs:17:1
|
8 | impl RestrictedMul for i32 {
| -------------------------- first implementation here
...
17 | impl RestrictedMul for i32 {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation for `i32`
The core issue is that for one left-hand side type, for example Self = i32, there are multiple types we want to assign to the Rhs. After all, you can multiply i32 with many other types: i32, i64, and Matrix<i32>, etc. When a type is a type parameter, it signals that there is more than one way to instantiate the type. (I.e., assign the parameter a type.) When a type is an associated type, it signals that it can be instantiated with exactly one type per implementation block. Associated types aren’t just those dependent on other types, but specify a mapping from many types to one associated type. Our rule-of-thumb should read
A type parameter should be an associated type if for all implementation blocks there exists a relationship mapping the other type parameters to it. (I.e., you can create a table mapping the other parameters to the associated type, like we did for Mul.)
This rule-of-thumb is better than the prior notion of dependence: Not only are associated types dependent on the trait’s type parameters, but they are instantiated with a single type. In a previous post on Rust’s standard library, Pretzel Hammer puts forth a rule-of-thumb that reads
Use associated types when there should only be a single impl of the trait per type. Use generic types when there can be many possible impls of the trait per type.
We are trying to say the same thing. I think their rule lacks a bit of specificity, especially when traits have both type parameters and associated types. Take for example the Mul trait, there are several implementations with i32 as the Self type. Though both of these rules-of-thumb are intuition-based and require foresight into how future developers will use your trait. This foresight is a little scary, especially if you’re new to Rust or traits. But I argue that you engage in this reasoning frequently; every time you write a function you perform similar reasoning. We can piggyback on your engineering instinct about functions to help you reason about associated types.
I don’t want to get lost in type-theoretic weeds, so let’s allow ourselves to forget static types for a moment. Let’s write a multiplication function in a dynamically typed language, I prefer Julia, so Julia is what I will use. (If you’re unfamiliar, the syntax is quite easy.)
function mul(a, b)
a * b
end
The function mul has two inputs and one output, C = mul(A, B). I shall ask again, what are some possible values we can pick for A, B, and C? Here’s some examples
Just as before, we find that the set for C is not the same as the others. If you pick a random number from each of the sets and assign the value to our variables, the line 2 = mul(0, 1) would be nonsense; we know that 0 * 1 is equal to 0, not 2. The value of C depends on the values chosen for A and B. You know this because you spent all that time in elementary school memorizing multiplication tables. These tables map two operand values to the multiplied output value.
⨉ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
2 | 2 | 4 | 6 | 8 | 10 | 12 | 14 | 16 | 18 | 20 | 22 | 24 |
3 | 3 | 6 | 9 | 12 | 15 | 18 | 21 | 24 | 27 | 30 | 33 | 36 |
4 | 4 | 8 | 12 | 16 | 20 | 24 | 28 | 32 | 36 | 40 | 44 | 48 |
5 | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 | 55 | 60 |
6 | 6 | 12 | 18 | 24 | 30 | 36 | 42 | 48 | 54 | 60 | 66 | 72 |
⨉ | i32 | i64 | f32 | Matrix<i32> |
---|---|---|---|---|
i32 | i32 | i64 | f32 | Matrix<i32> |
i64 | i64 | i64 | ? | Matrix<i64> |
f32 | f32 | ? | f32 | Matrix<f32> |
Matrix<i32> | Matrix<i32> | Matrix<i64> | Matrix<f32> | Matrix<i32> |
This table shows that if you provide two input types, which are represented as the row and column header, you get back an output type. This behavior is what I call a type-level function: You provide two input types and get back an output type, just how functions on values take input values (parameters) and return an output value (the return).
By now, you may start to see why I describe associated types as the output of a type-level function. A multiplication function on values maps two input numbers to a third and it’s natural to define this mapping as a function. For the same two input values the mul function always returns the same output value.
For traits, the same type parameters always map to the same associated type. One benefit for type parameters is that they guarantee this. For example, multiplying a Matrix<i32> with a i32 always returns Matrix<i32>—no questions asked. In Rust you access the output type with
<i32 as Mul<i32>>::Output
/ \ \
Self Rhs Output
but if we wanted to mimic the structure of the Julia mul function, we could write
i32 = Mul<i32, i32>
/ / \
Output Self Rhs
By moving Output from a type parameter to an associated type in the Mul trait we statically guarantee that for two input types there will exist only one implementation of the multiply method. The definition of BadMul did not guarantee this. Remember that we provided an implementation for fn mul(self: i32, rhs: i32) -> i32, using BadMul we could provide a second implementation that defines multiplication as fn mul(self: i32, rhs: i32) -> f32. This poor decision shall have some negative implications, as we will see.
// A refresher on the definition
pub trait BadMul<Rhs, Output> {
fn mul(self, rhs: Rhs) -> Output;
}
// The only correct implementation
impl BadMul<i32, i32> for i32 {
fn mul(self, rhs: i32) -> i32 {
self * rhs
}
}
// Some engineer's poor decision to provide another
impl BadMul<i32, f32> for i32 {
fn mul(self, rhs: i32) -> f32 {
(self * rhs) as f32
}
}
In this version of the Rust universe there are now two different implementations for multiplication between integers. The Rust compiler would need to infer the output type to correctly pick which implementation block to use. Even if there is only a single implementation block, the Rust compiler cannot pick the return type based on the implementation block—it still needs to be inferred. This would require developers to annotate more types than would otherwise be necessary. Edit, I just learned about the type inference change in Rust 1.80 that broke any crate with time as a dependency. Pre-1.80, Rust allowed inference to resolve a type if there was a single implementation block, despite this being unsound. Post-1.80 disallowed this behavior, thus breaking inference for any crate that relied on the pattern, one example being the time crate.
fn main() {
println!("{}", 3 * 2);
println!("{}", <i32 as BadMul<i32, _>>::mul(3, 2));
}
If the definition of Mul in the Rust standard library were written like BadMul, the first line in the above code would behave like the second line of code does. This is undesirable because the code fails to type check.
error[E0283]: type annotations needed
--> src/main.rs:20:18
|
20 | println!("{}", <i32 as BadMul<i32, _>>::mul(3, 2));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| cannot infer type of the type parameter `Output`
| declared on the trait `BadMul
The above code doesn’t compile because the Rust compiler doesn’t know how to infer the type parameter Output. It needs to be specified. The broader implication is that type parameters that should be associated types require developers to annotate more types. To recap, associated types guarantee that there exists a single type for the implementation block, thus helping the compiler to infer types rather than requiring it to infer more than necessary.
Reasoning about associated types may be easier with the Mul trait, after all, it fits my proposed mental model very well and multiplication is something with which we’re all familiar. Lots of traits in the Rust standard library use associated types, and some may not be as obvious. Let’s look at two more traits: Iterator, and Deref, and specifically why they have associated types.
Iterators: Even if you don’t use iterators explicitly, for and while loops are syntactic sugar for iterator-based code. Needless to say they’re used quite a bit. Below is the trait definition for Iterator. Read it and ask yourself, why is Item an associated type?
pub trait Iterator {
type Item;
fn next(&mut self) -> Option<Self::Item>;
}
Iterators have only one type parameter, Self, the type implementing the iterator. For all of the types that Self could be, each one has a single specific type for Item. OK, not a table in this case, just a plain map. Again, we can write a table describing the relationship between Self and Item.
👾 | Iter<'a, T> | IterMut<'a, T> | Chars<'a> | Chunks<'a, T> |
---|---|---|---|---|
&'a T | &'a mut T | char | &'a [T] |
Item is an associated type because there exists a mapping from Self types to Item
So that everyone is on the same page, below is the definition for the Java Iterator interface.
public interface Iterator<Item> {
// Returns `true` if the iteration has more elements.
boolean hasNext();
// Returns the next element in the iteration.
// @throws `NoSuchElementException` if the iteration has no more elements
E next();
}
In Rust the type of iterator elements Item is an associated type and I want you to understand why that needs to be the case in Rust while Java gets away with using a type parameter. A naïve response might be that Java doesn’t have type inference for variable assignments. Historically this was true, but Java 10 introduced the var binding that enables local inference. For example, creating a vector and iterating over the contents is shown below in Rust and Java.
// create a vector with two zeros
let v = vec![0; 2];
// iterate over the vector and print the contents
for i in &v {
println!("{i}");
}
// create a vector with two zeros
var v = new Vector<>() {{ add(0); add(0); }};
// iterate over the vector and print the contents
for (var i : v) {
System.out.println(i);
}
Each code snippet behaves the same and neither specifies the type of the iterated elements. So the real reason can’t be for mere type inference. The real reason is that Rust’s type system affects how the code runs in a deeper way than Java’s. And it all boils down to the fact that Java has a runtime system, the Java Virtual Machine (JVM).
Only caring about the existence of a method at runtime differs drastically from Rust, the Rust compiler needs to know the exact address of the functions to be called; there is no runtime to figure this out. In the above code, here are some things Rust needs to know
All of these individual pieces of Rust’s type machinery are in place so that the compiler can replace calls like into_iter and fmt with specific function addresses. Unlike Java, there is no runtime system that will find the methods later. Rust needs to know now at compile time!
This is how Java can get away with using a type parameter for iterator items, and why Rust developers should learn how and when to properly use associated types. Let’s return to the Rust standard library and look at the last example, pointer dereferencing.
Dereferences: Pointer dereferencing is a well-known operation to systems programmers. If you’ve used a C-like language at some point, you’re acquainted with pointer dereferencing and may have even dereferenced the null pointer. One feature of Rust that you may be unfamiliar with is called deref coercion. This means that Rust will automatically insert dereference operations where possible. It’s the feature that enables the following weird piece of code to type check.
use std::sync::Arc;
use std::rc::Rc;
fn takes_a_ref(x: &i32) { /* ... */ }
fn weird_func(y: &&&&&Box<Arc<Rc<&&&&&i32>>>) {
takes_a_ref(y)
}
If you don’t like counting, there’s five references to the Box, and five inside the Rc to the integer, for a total of thirteen references. However, when we call takes_a_ref(y) we don’t have to insert dereferences, the compiler does it for us. Et voilà, deref coercion in action! If you ever find yourself with thirteen references to an integer, please reevaluate your life decisions that led you to that point. How is the compiler able to do this without accidentally screwing up our types? Let’s look at the trait definition Deref that defines how a type is dereferenced.
pub trait Deref {
type Target;
fn deref(&self) -> &Self::Target;
}
// Example implementation for immutable references
impl<T> Deref for &T {
type Target = T;
// &&T -> &T
fn deref(&self) -> &Self::Target {
*self
}
}
Observe that Target, the resulting type of the dereference, is an associated type. This is a very important design choice for the trait. With Target as an associated type it is statically guaranteed that a type always dereferences to the same type. If Target were instead a type parameter, deref coercion could potentially insert dereferences resulting in code that doesn’t type check, due to ambiguous types like we saw with the BadMul trait. Just like all the other traits we’ve looked at so far, we can write a table that describes the relationship between the Self type and Target.
* | String | &T | &mut T | Box<T> | Arc<T> | Vec<T> |
---|---|---|---|---|---|---|
str | T | T | T | T | [T] |
Implementing Deref is especially helpful for pointers, or smart pointers like Box and Arc. Implementing Deref allows us programers to treat these types like pointers and leaves the deref gymnastics for the compiler to insert.
I’ve provided a general rule-of-thumb to help you determine when a type parameter should be an associated type.
A type parameter should be an associated type if for all implementation blocks there exists a relationship mapping the other type parameters to it. (I.e., you can create a table mapping the other parameters to the associated type, like we did for Mul.)
The important difference is that associated types can only be specified once per implementation, type parameters can vary. This relationship has implications on how behaves, previously we saw how making Output a type parameter on the BadMul trait would require us to write more annotations—the compiler couldn’t infer the type for us. There are other motivating factors behind associated types, specified well in the RFC, I will summarize a couple here.
you want to read other’s writing on associated types, consider the following:
If you have questions or would like to correct this post, as always, my email is open.