How to Make Your Python Packages Really Fast with Rust

Goodbye, slow code

Towards Data Science
Photo by Chris Liverani on Unsplash

Python is… slow. This isn't a revelation. Many dynamic languages are. In fact, Python is so slow that many authors of performance-critical Python packages have turned to another language: C. But C isn't fun, and it has enough footguns to cripple a centipede.

Introducing Rust.

Rust is a memory-efficient language with no runtime or garbage collector. It's incredibly fast, super reliable, and has a really great community around it. Oh, and it's also super easy to embed into your Python code thanks to excellent tools like PyO3 and maturin.

Sound exciting? Great! Because I'm about to show you how to create a Python package in Rust, step by step. And if you don't know any Rust, don't worry: we're not going to do anything too crazy, so you should still be able to follow along. Are you ready? Let's oxidise this snake.

Pre-requisites

Before we start, you'll need to install Rust on your machine. You can do this by heading to rustup.rs and following the instructions there. I'd also recommend creating a virtual environment that you can use for testing your Rust package.

Script overview

Here’s a script that, given a number n, will calculate the nth Fibonacci number 100 times and time how long it takes to achieve this.
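
Such a script might look something like the sketch below. It assumes a plain recursive fib function, a RUNS constant of 100 and a hard-coded n of 10; these names and defaults are illustrative and are not taken from the original listing.

```python
from timeit import timeit

RUNS = 100  # how many times the call is repeated while timing (assumed constant name)


def fib(n: int) -> int:
    """Return the nth Fibonacci number using naive recursion."""
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)


def main(n: int) -> None:
    # Print the result once, then time RUNS repeated calls.
    print(f"fib({n}) = {fib(n)}")
    elapsed = timeit(lambda: fib(n), number=RUNS)
    print(f"Pure Python: {elapsed:.4f} seconds for {RUNS} runs")


if __name__ == "__main__":
    main(10)
```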

This is a very naive, totally unoptimised function, and there are plenty of ways to make it faster using Python alone, but I'm not going to go into those today. Instead, we're going to take this code and use it to create a Python package in Rust.

Maturin setup

The first step is to install maturin, a build system for building and publishing Rust crates as Python packages. You can do this with pip install maturin.

Next, create a directory for your package. I've called mine fibbers. The final setup step is to run maturin init from your new directory. At this point, you'll be prompted to choose which Rust bindings to use. Select pyo3.

Now, if you take a look at your fibbers directory, you'll see a few files. Maturin has created some config files for us, namely a Cargo.toml and a pyproject.toml. The Cargo.toml file is the configuration for Rust's build tool, cargo, and contains some default metadata about the package, some build options and a dependency on pyo3. The pyproject.toml file is fairly standard, but it's set to use maturin as the build backend.

Maturin will also create a GitHub Actions workflow for releasing your package. It's a small thing, but it makes life so much easier when you're maintaining an open source project. The file we mostly care about, however, is the lib.rs file in the src directory.

Here's an overview of the resulting file structure.

fibbers/
├── .github/
│   └── workflows/
│       └── CI.yml
├── .gitignore
├── Cargo.toml
├── pyproject.toml
└── src/
    └── lib.rs

Writing the Rust

Maturin has already created the scaffold of a Python module for us using the PyO3 bindings we mentioned earlier.

The main parts of this code are the sum_as_string function, which is marked with the pyfunction attribute, and the fibbers function, which represents our Python module. All the fibbers function is really doing is registering our sum_as_string function with our fibbers module.

If we installed this now, we'd be able to call fibbers.sum_as_string() from Python, and it would all work as expected.

However, what I'm going to do first is replace the sum_as_string function with our fib function.

This has exactly the same implementation as the Python we wrote earlier: it takes an unsigned integer n and returns the nth Fibonacci number. I've also registered our new function with the fibbers module, so we're good to go!

Benchmarking our function

To install our fibbers package, all we have to do is run maturin develop in our terminal. This will download the dependencies, compile our Rust package and install it into our virtual environment.

Now, back in our fib.py file, we can import fibbers, print out the result of fibbers.fib() and then add a timeit case for our Rust implementation.

If we run this now for the 10th Fibonacci number, you can see that our Rust function is about 5 times faster than Python, despite the fact that we're using the same implementation!
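
The updated fib.py could look roughly like the sketch below. It assumes the compiled module exposes the function as fibbers.fib, as registered earlier. The time_call helper and the try/except guard are additions for illustration, so the script still runs (timing only the Python version) when fibbers has not been built yet.

```python
from timeit import timeit

RUNS = 100  # assumed constant name, matching the 100 runs described earlier

try:
    import fibbers  # only available after building the package with maturin
except ImportError:
    fibbers = None  # fall back to timing the pure Python version only


def fib(n: int) -> int:
    """Naive recursive Fibonacci, the pure Python baseline."""
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)


def time_call(func, n: int, runs: int = RUNS) -> float:
    """Return the total seconds taken to call func(n) `runs` times."""
    return timeit(lambda: func(n), number=runs)


if __name__ == "__main__":
    n = 10
    print(f"Python: fib({n}) = {fib(n)}")
    print(f"Python: {time_call(fib, n):.6f} seconds for {RUNS} runs")

    if fibbers is not None:
        print(f"Rust:   fib({n}) = {fibbers.fib(n)}")
        print(f"Rust:   {time_call(fibbers.fib, n):.6f} seconds for {RUNS} runs")
    else:
        print("fibbers is not installed; skipping the Rust timing")
```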

If we run it for the 20th and 30th Fibonacci numbers, we can see that Rust gets up to about 15 times faster than Python.

20th fib number results. Image by author.
30th fib number results. Image by author.

But what if I told you that we’re not even at maximum speed?

You see, by default, maturin develop will build the dev version of your Rust crate, which forgoes many optimisations to reduce compile time, meaning the program isn't running as fast as it could. If we head back into our fibbers directory and run maturin develop again, but this time with the --release flag, we'll get the optimised, production-ready version of our binary.

If we now benchmark our 30th Fibonacci number, we see that Rust gives us a whopping 40 times speed improvement over Python!

30th fib number, optimised. Image by author.

Rust limitations

However, we do have a problem with our Rust implementation. If we try to get the 50th Fibonacci number using fibbers.fib(), you'll see that we actually hit an overflow error and get a different answer to Python.

Rust experiences integer overflow. Image by author.

This is because, unlike Python, Rust has fixed-size integers, and a 32-bit integer isn't large enough to hold our 50th Fibonacci number.

We can get around this by changing the type in our Rust function from u32 to u64, but that will use more memory and may be slower on machines without native 64-bit arithmetic. We could also solve it by using a crate like num_bigint, but that's outside the scope of this article.
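
The arithmetic behind this limit can be checked from Python, whose integers grow as needed. The sketch below uses an iterative helper to find the first n whose Fibonacci number no longer fits in an unsigned integer of a given width. The names fib_iter and first_overflow are illustrative and are not part of the fibbers package.

```python
def fib_iter(n: int) -> int:
    """Return the nth Fibonacci number iteratively (fast enough for large n)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def first_overflow(bits: int) -> int:
    """Return the smallest n whose Fibonacci number exceeds the largest unsigned `bits`-bit value."""
    limit = 2**bits - 1
    n = 0
    while fib_iter(n) <= limit:
        n += 1
    return n


U32_MAX = 2**32 - 1  # 4294967295, the largest value a Rust u32 can hold

print(fib_iter(50))            # 12586269025
print(fib_iter(50) > U32_MAX)  # True
print(first_overflow(32))      # 48
print(first_overflow(64))      # 94
```

In other words, a u32 is enough up to n = 47 and a u64 up to n = 93, so widening the type postpones the overflow rather than removing it.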

Another small limitation is that there's some overhead to using the PyO3 bindings. You can see that here, where I'm just getting the 1st Fibonacci number, and Python is actually faster than Rust because of this overhead.

Python is faster for n=1. Image by author.

Things to remember

The numbers in this article weren't recorded on a perfect machine. The benchmarks were run on my personal machine and may not reflect real-world problems. Please take care with micro-benchmarks like this one in general, as they are often imperfect and don't emulate many aspects of real-world programs.
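
One common way to make a micro-benchmark a little less noisy, which the measurements above did not do, is to repeat the timing several times and keep the fastest run. The sketch below uses timeit.repeat from the standard library on the same naive fib. The repeat count of 5 is an arbitrary choice.

```python
from timeit import repeat


def fib(n: int) -> int:
    """Naive recursive Fibonacci, used here only as something to time."""
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)


# Take 5 separate measurements, each timing 100 calls of fib(10).
timings = repeat(lambda: fib(10), repeat=5, number=100)

# The minimum is the run least disturbed by other activity on the machine.
best = min(timings)
print(f"best of {len(timings)}: {best:.6f} seconds for 100 runs")
```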
