From the course: Rust LLMOps
Rust CLI inference
- [Narrator] Here's a diagram that shows the flow of building and running a Rust command-line interface that performs inference using a Hugging Face transformer model. First, the model is downloaded from the Hugging Face Hub to get access to a pre-trained transformer, and you could use a model like BERT or GPT-2. Then the model is loaded in Rust using the Rust-BERT or transformers crate, which would provide the Rust bindings for Hugging Face models. Another option is to use the Rust Candle framework, which also allows the model to be used for inference in Rust code. The next step is building the CLI itself, and you could do that with the clap crate, which allows for easy argument parsing and flexible user input to the CLI. You could even customize the prompts. For example, you might want a prelude on every single prompt, maybe something you know would get you better results. You could also put that into the…
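The argument-parsing and prompt-prelude steps described above can be sketched roughly as follows. This is a minimal illustration, not the course's code: it uses only the standard library instead of clap so that it runs without dependencies, and the `--prelude` flag, the `Options` struct, and the `parse_args`, `build_prompt`, and `run_inference` functions are all hypothetical names. `run_inference` is a placeholder where a real program would call a model loaded through Rust-BERT or Candle.

```rust
use std::env;

/// Hypothetical CLI options: an optional prelude plus the user's prompt.
#[derive(Debug, PartialEq)]
struct Options {
    prelude: Option<String>,
    prompt: String,
}

/// Parse arguments of the assumed form `[--prelude TEXT] PROMPT...`.
/// A real CLI would likely delegate this to the clap crate.
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Options, String> {
    let mut prelude = None;
    let mut words = Vec::new();
    let mut iter = args.into_iter();
    while let Some(arg) = iter.next() {
        if arg == "--prelude" {
            // The flag consumes the next argument as its value.
            let value = iter
                .next()
                .ok_or_else(|| "--prelude needs a value".to_string())?;
            prelude = Some(value);
        } else {
            words.push(arg);
        }
    }
    if words.is_empty() {
        return Err("usage: infer [--prelude TEXT] PROMPT...".to_string());
    }
    Ok(Options { prelude, prompt: words.join(" ") })
}

/// Put the prelude, if any, on its own line ahead of the user's prompt.
fn build_prompt(opts: &Options) -> String {
    match &opts.prelude {
        Some(p) => format!("{}\n{}", p, opts.prompt),
        None => opts.prompt.clone(),
    }
}

/// Placeholder only: a real implementation would run the loaded model here.
fn run_inference(prompt: &str) -> String {
    format!("[model output for a {}-byte prompt]", prompt.len())
}

fn main() {
    // Skip argv[0], which is the program name.
    match parse_args(env::args().skip(1)) {
        Ok(opts) => println!("{}", run_inference(&build_prompt(&opts))),
        Err(msg) => eprintln!("{}", msg),
    }
}
```

Keeping `build_prompt` separate from parsing means the same prelude logic could be reused unchanged if the argument handling were later swapped for clap.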