Skip to main content

Modeling Finite State Machines with Rust

· 7 min read

A core concept in Rust is the ownership rule that "governs how a Rust program manages memory". This distinctive feature in Rust allows it to obviate the need for an automatic garbage collector (GC), support fearless concurrency, and enable its use in low-resource environments. While avoiding the need for GC is immediately apparent, it also allows expressing more constraints that can be enforced at compile-time. Therefore, even languages that use GC for memory management such as Haskell and Scala may implement a similar concept in the future (Haskell linear type proposal, Scala's Capabilities for Uniqueness and Borrowing). In the same vein, languages such as Dahlia use the same concept to model hardware constraints to express FPGA designs.

Rust is also a statically typed language, which like other similar languages, catches errors such as calling a non-existent function or passing a wrong type of argument. Combining these two features leads to some interesting patterns. Here, I will explore how these two features can be combined to model a finite state machine (FSM).

FSMs model two aspects:

  • A machine defines legal stimuli for each of its finite states. For example, if you model a simple light switch using a state machine, it is legal to send "turn off" only when the current state is "on". Upon receiving a stimulus, the machine will change its state to another or the same state. For example, if the current state is "on", the machine will transition to the "off" state upon receiving "turn off".
  • A machine is only in one state at a time (we are leaving out orthogonal regions since that is well–-orthogonal to our discussion!). In our light switch example, the state machine will be either in the "on" or "off" state, but never both.

Traditional statically typed systems (Java, Go, Scala--basically all statically typed languages that are not Rust) can enforce the first aspect at compile time, but leave the second aspect to a runtime check. Rust can help go a step further to enforce even the second aspect at compile time.

Consider the file abstraction expressed in the following state machine:

FSM for file access

This can be implemented with the following code (I am using Scala and only simple constructs from it to avoid distractions, but the core idea is the same in any statically typed language):

class File(val name: String) {
def open(): OpenedFile = {
val f = new java.io.File(name)
new OpenedFile(new FileInputStream(f))
}
}

class OpenedFile(stream: FileInputStream) {
def read(): Array[Byte] = {
val buf = new Array[Byte](1024)
stream.read(buf)
buf
}

def close(): ClosedFile = {
stream.close()
new ClosedFile
}
}

class ClosedFile()

Essentially, we are modeling that one may read or close only an opened file. Each class represents a distinct state in the machine and expresses the stimuli it can respond to for that state through methods. This ensures that the following program will not typecheck.

val file = File()
file.close() // ❌ Error: File doesn't define a close method

val openedFile = file.open()
openedFile.open(); // ❌ Error: OpenedFile doesn't define an open method

openedFile.read()

val closedFile = openedFile.close()
closedFile.close() // ❌ Error: ClosedFile doesn't define a close method

However, since all the objects (file, openedFile, closedFile) are alive simultaneously, this code doesn't model the aspect of the state machine that it may only be in one state at a given time. So the following code snippets that don't make sense typechecks just fine.

openedFile.close()
openedFile.close()

Or even worse:

val openedFile = file.open()
openedFile.read()

openedFile.close()

openedFile.read() // Reading a closed file

However, Rust's ownership system allows modeling even this exclusivity constraint. But first, quickly about the ownership system. In Rust, an object may have only one owner at a time. So if we were to write the following code, it would not typecheck.

let hello1 = String::from("Hello");
println!("{hello1}"); // prints "Hello"
let hello2 = hello1; // `hello2` now owns the string and crucially `hello1` doesn't
println!("{hello1}"); // ❌ compile-time error: value borrowed here after move

The string object "Hello" is owned by hello1 until it is assigned to hello2 which becomes its owner, at which point hello1 is no longer usable.

However, Rust allows any number of immutable references to an object or a single mutable reference, but not both. So the following code compiles fine:

let hello = String::from("Hello");
println!("{hello}"); // prints "Hello"
let hello_ref1 = &hello; // immutable reference
let hello_ref2 = &hello; // immutable reference
println!("{hello}"); // prints "Hello"
println!("{hello_ref1}"); // prints "Hello"
println!("{hello_ref2}"); // also prints "Hello"

Let's use this ownership system to model the state machine in Rust. Here is the code (again, using only simple constructs from Rust to avoid distractions).

struct File {
name: String,
}

impl File {
fn open(self) -> OpenedFile {
OpenedFile {
file: std::fs::File::open(self.name).unwrap(),
}
}
}

struct OpenedFile {
file: std::fs::File,
}

impl OpenedFile {
fn read(&mut self) -> [u8; 1024] {
let mut buf = [0; 1024];
let _ = self.file.read(&mut buf).unwrap();
buf
}

fn close(self) -> ClosedFile {
ClosedFile
}
}

struct ClosedFile;

The look and feel of the code is similar to the Scala code, but the use of self vs &self/&mut self makes a crucial difference.

  • The open, and close methods take self. Invocation of those methods is akin to let hello2 = hello1; shown earlier; specifically, the ownership of an object on such a method is invoked is moved to the self of the called method. This also means that the object on which those methods are invoked is no longer usable after the method call. This allows us to model transition to a new state.
  • On the other hand, the read method takes a mutable reference to OpenedFile (&mut self), which is akin to let hello_ref1 = &hello1; shown earlier. The object on which the read method is called continues to be alive and this models the transition to the same state.

As an aside, note that there is no call to close the file. The closing of the file happens automatically when the OpenedFile object goes out of scope, taking along with it the file field. More precisely, the std::io::File type implements the Drop trait to close the file, which is invoked when the std::io::File value goes out of scope.

Unsurprisingly, the following code gives all the right compile time errors for non-existent methods.

let file = File {
name: "foo.txt".to_string(),
};
file.close(); // ❌ Error: File doesn't define a close method

let openedFile = file.open();
openedFile.open(); // ❌ Error: OpenedFile doesn't define an open method

openedFile.read();

let closedFile = openedFile.close();
closedFile.close(); // ❌ Error: ClosedFile doesn't define a close method

But the following code also gives all the right compile time errors for the violation of the one-owner rule, which for us means only-in-one-state rule.

let file = File {
name: "foo.txt".to_string(),
};
let openedFile = file.open();
let closedFile = openedFile.close(); // `openedFile` moved due to this method call
let closedFile = openedFile.close(); // ❌ Error: value used here after move

as well as:

let file = File {
name: "foo.txt".to_string(),
};
let mut openedFile = file.open();
openedFile.read();

openedFile.close();

openedFile.read(); // ❌ Error: value borrowed here after move

In essence, each class represents a state, each method implements a transition, and each method invocation represents sending a stimulus. The return value of each method is the new state. Even methods such as read (which returns just the data) can be thought to be logically equivalent to taking the ownership of the object and then returning the same object:

    fn read(mut self) -> (OpenedFile, [u8; 1024]) {
let mut buf = [0; 1024];
let _ = self.file.read(&mut buf).unwrap();
(self, buf)
}

and using it as:

    let open_file = f.open();
let (open_file, contents) = open_file.read();

The point of a type system is to reject programs that don't make sense. Rust's ownership system stretches the boundary of type systems to reject more programs that would have been accepted by other type systems. Pretty neat!

Twitter share