Rust applications

Hi,
Lately, I have been researching programming languages that are more efficient than Python, with the aim of using one in ML applications, from data gathering to preprocessing and beyond. From what I have found so far, Rust looks like a great candidate, and great work has already been done:

  • datafusion
  • rust-bert
  • and, of course, the fast version of the tokenizer!

I would like to hear other people’s opinions in the forum on Rust, its applications, and its future.
Thanks.
4 Likes

Apologies for bumping an older post, but I have a little bit of expertise in this area and wanted to share my experience.

I’ve used Rust for a handful of ML projects, one to do image sorting (like Google Photos) and one to embed GPT-2 in a game. I’m partial to Tract (sonos/tract) because it allows one to embed an ONNX model file in the executable and doesn’t have any dependency on PyTorch DLLs, so you get a single small executable that “just works”. Tch-rs I think may have better ergonomics because you don’t have to do as much fiddling with the data before running the model, but you have an extra gigabyte or so of DLLs and dependencies that need to be deployed on the target system. Not so friendly.

In general, I like Python for training models and doing the interactive data science components, but prefer Rust for the ability to deploy standalone applications that run in real-time scenarios.

7 Likes

Thanks for the reply, @JosephCatrambone.
I was curious about your project embedding GPT-2 in a game. If possible, can you share what you’ve done?
I recently did some minor research on how to speed up data engineering and came across Polars; its benchmarks were amazing!

1 Like

Sure. It was nothing complicated. I was trying something for the AI and Games Jam 2022. I used Godot as the base engine and built a GDNative script in Rust which embedded Tract and the ONNX distribution of GPT-2. It was too slow to be fun, but it worked.

Here’s some of the code I used: Embedding GPT-2 in Godot via Rust · GitHub

The project would compile into a self-contained DLL that could be referenced from Godot.

2 Likes

I am a newbie. What is the biggest difference between these programming languages?

1 Like

Apologies for the late reply. Python and Rust are fairly different in a lot of ways. I think it’s fair to say they’re about as different as apples and oranges. They’re not as different as Lisp and C or Haskell and Ruby, but they’re rather different.

It can be challenging to describe what makes languages different from each other in a way that’s meaningful (or at least not surface-level) and approachable, but it can help to start with “what does a language treat as a primary goal, and what does it treat as a non-goal?” Python cares about ‘readability’ and ‘productivity’. It does not care about ‘speed’ or ‘portability’. A project written in Python is harder to deploy on another system because it requires a whole constellation of dependencies that can’t be packaged with a given file. Python is “fast enough”, but not particularly fast, especially compared with C or Rust. Python has a LOT built into the standard library and language; this is what people mean by “batteries included”. By comparison, C and JavaScript are extremely barebones. Rust cares about safety and speed, and accepts longer compile times and development time to get them.
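As a rough illustration of what “safety over development time” can look like in practice, here is a minimal sketch using only the Rust standard library. The function name `parse_counts` and the input format are hypothetical, made up for this example. The point is that a failure is part of the return type, so the caller has to deal with it before getting at the values.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parse a comma-separated list of non-negative integers.
// The Result return type makes the possibility of failure explicit.
fn parse_counts(input: &str) -> Result<Vec<u32>, ParseIntError> {
    input
        .split(',')
        .map(|piece| piece.trim().parse::<u32>())
        .collect() // stops at the first parse error and returns it
}

fn main() {
    // The compiler will not let us use the values without handling Err.
    for input in ["3, 5, 8", "3, five, 8"] {
        match parse_counts(input) {
            Ok(values) => println!("{:?} parsed to {:?}", input, values),
            Err(err) => println!("{:?} rejected: {}", input, err),
        }
    }
}
```

The equivalent Python would typically be shorter to write and would raise an exception at runtime; neither approach is wrong, they just reflect different priorities.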

On the surface, Rust and Python have a fair number of differences, too:

| Trait \ Language | Rust | Python |
|---|---|---|
| Objects/Classing | Trait-based | Inheritance-based |
| Syntax | Curly braces | Whitespace-delimited |
| Runtime | Compiled | Interpreted |
| Typing | Strong + static typing | Dynamic (duck) typing |
| Standard Library Philosophy | Minimalist | Complete |
| Programming Paradigm | Mostly imperative | Mostly imperative |
| Environment and Build Tooling | Great (Cargo) | Great (pip + pyenv) |

(Before folks yell at me about the ‘mostly imperative’ or the typing comments, I know these are simplifications.)
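To make the “trait-based” and “static typing” rows a little more concrete, here is a small sketch in plain Rust. All of the names (`Shape`, `Circle`, `Square`, `total_area`) are hypothetical and chosen only for illustration. Shared behavior comes from implementing a trait rather than inheriting from a base class, and the compiler checks at build time that every value passed in implements that trait.

```rust
// A trait describes behavior; there is no base class to inherit from.
trait Shape {
    fn area(&self) -> f64;

    // Default method: implementors get this for free unless they override it.
    fn describe(&self) -> String {
        format!("shape with area {:.2}", self.area())
    }
}

struct Circle {
    radius: f64,
}

struct Square {
    side: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

// Static typing: passing anything that does not implement Shape
// is a compile error, not a runtime surprise.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Circle { radius: 1.0 }),
        Box::new(Square { side: 2.0 }),
    ];
    for shape in &shapes {
        println!("{}", shape.describe());
    }
    println!("total area: {:.2}", total_area(&shapes));
}
```

In Python the same idea would usually be a base class with subclasses, or just duck typing, where any object with an `area()` method works and a missing method only shows up when the code runs.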


Updates to the original discussion, while we’re here:

In the time since the original reply was written there has been a veritable Cambrian explosion of learning solutions in Rust. I still personally enjoy using Python as the first place to develop models before moving them to Rust, but there are great options now:

  • dfdx - A pure-Rust CUDA accelerated learning toolkit.
  • Burn (GitHub: burn-rs/burn) - A flexible and comprehensive deep learning framework in Rust.
  • tch-rs (GitHub: LaurentMazare/tch-rs) - Rust bindings for the C++ API of PyTorch.
  • tract - A tiny, self-contained, no-nonsense library for using pre-trained models in Rust.
  • Candle (GitHub: huggingface/candle) - A minimalist ML framework for Rust by Hugging Face. Very new.

Of these, I’ve only used tch, tract, and Candle. tch has a more friendly interface and can load PyTorch models, but includes the whole of the Torch library making for HUGE (near-gigabyte, last I checked) executables. Tract is the one I use most often for integrating trained ONNX models. Candle is a relative newcomer and doesn’t load architectures the same way that ONNX loaders will, but is still a quite promising candidate and I’ll probably spend some time playing with it in the months to come.

4 Likes

@JosephCatrambone, are you on Twitter? If so, what is your username so I can follow you?

Newbie here.