Lately, I have been researching programming languages that are more efficient than Python, with the goal of using one in ML applications, from data gathering to preprocessing and beyond. From what I have found so far, Rust looks like a great candidate, and great work has already been done:
Apologies for bumping an older post, but I have a little bit of expertise in this area and wanted to share my experience.
I’ve used Rust for a handful of ML projects, one to do image sorting (like Google Photos) and one to embed GPT-2 in a game. I’m partial to Tract (sonos/tract) because it allows one to embed an ONNX model file in the executable and doesn’t have any dependency on PyTorch DLLs, so you get a single small executable that “just works”. Tch-rs I think may have better ergonomics because you don’t have to do as much fiddling with the data before running the model, but you have an extra gigabyte or so of DLLs and dependencies that need to be deployed on the target system. Not so friendly.
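To give a feel for the kind of data fiddling mentioned above, here is a minimal sketch of turning raw image bytes into the flat float buffer a vision model typically wants. It assumes an interleaved RGB input and a model that expects planar NCHW float32 values scaled to [0, 1]; the function name `hwc_u8_to_nchw_f32` is made up for illustration and is not part of Tract or tch-rs. It uses only the standard library, so wrapping the result in whatever tensor type your inference crate uses is left out.

```rust
/// Hypothetical helper (not from Tract or tch-rs): convert an interleaved
/// RGB image (height x width x 3, one u8 per channel) into a planar
/// channel-first f32 buffer scaled to [0, 1].
/// Assumption: the model wants NCHW layout with a batch size of 1.
pub fn hwc_u8_to_nchw_f32(pixels: &[u8], width: usize, height: usize) -> Vec<f32> {
    const CHANNELS: usize = 3;
    assert_eq!(
        pixels.len(),
        width * height * CHANNELS,
        "pixel buffer does not match width * height * 3"
    );

    let plane = width * height; // number of values in one channel plane
    let mut out = vec![0.0f32; CHANNELS * plane];

    for y in 0..height {
        for x in 0..width {
            // Index of the first channel of this pixel in the interleaved input.
            let src = (y * width + x) * CHANNELS;
            for c in 0..CHANNELS {
                // Each channel gets its own contiguous plane in the output.
                out[c * plane + y * width + x] = f32::from(pixels[src + c]) / 255.0;
            }
        }
    }
    out
}
```

The returned `Vec<f32>` has `3 * width * height` elements and would then be reshaped to `[1, 3, height, width]` before being handed to the model. Real models often also need mean/std normalization, which is omitted here.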
In general, I like Python for training models and doing the interactive data science components, but prefer Rust for the ability to deploy standalone applications that run in real-time scenarios.
Thanks, @JosephCatrambone, for the reply.
I was curious about your "embed GPT-2 in a game" project. If possible, can you share what you’ve done?
I recently did a little research on how to speed up data engineering and came across Polars, and the benchmark was amazing!
Sure. It was nothing complicated. I was trying something for the AI and Games Jam 2022. I used Godot as the base engine and built a GDNative script in Rust which embedded Tract and the ONNX distribution of GPT-2. It was too slow to be fun, but it worked.
Here’s some of the code I used: Embedding GPT-2 in Godot via Rust · GitHub
The project compiled into a self-contained DLL that could be referenced from Godot.