From the course: Rust LLMOps
Invoke an LLM on an AWS G5 instance, part 1 - Rust Tutorial
- So this week, I'm going to dive into an interesting scenario where you can use the package management system in Rust to invoke large language models. Everybody's talking about large language models and how cool they are, but what's interesting is that you'll see people presenting really complex workflows and applications, all this complexity around running a large language model. Because of the beauty of Rust and the Cargo package management system, you don't need any of that. You can just use Cargo to invoke models. I'm going to do this with Hugging Face Candle, and I'm going to do it on an AWS GPU. That's a very powerful GPU. So let's go ahead and take a look at how we would do that. First up, I've got Hugging Face Candle. Let's look at what it does. This is a minimalist ML framework for Rust, and you can see the code is really simple. But the biggest thing here, and I think this is why it's so exciting, and I would say…
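To make "use Cargo to invoke models" concrete, here is a minimal sketch of what such an invocation might look like when assembled from Rust. It only builds and prints a `cargo run` command line; it does not run a model. The helper name `candle_example_command` is hypothetical, and the example name `bigcode`, the `cuda` feature flag, and the `--prompt` argument are assumptions based on the example programs in the Candle repository, so check them against the checkout you are using.

```rust
use std::process::Command;

/// Build (but do not run) a `cargo run` command for an example program
/// in a local checkout of the Candle repository.
/// Hypothetical helper: the example name and flags are assumptions.
fn candle_example_command(example: &str, prompt: &str, use_cuda: bool) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("run").arg("--release");
    if use_cuda {
        // Assumed Cargo feature that enables GPU support on a CUDA machine.
        cmd.args(["--features", "cuda"]);
    }
    // Everything after `--` is passed to the example binary itself.
    cmd.args(["--example", example, "--", "--prompt", prompt]);
    cmd
}

fn main() {
    let cmd = candle_example_command("bigcode", "fn fibonacci(n: u32) -> u32 {", true);
    let args: Vec<String> = cmd
        .get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect();
    // Print the command line instead of executing it.
    println!("{} {}", cmd.get_program().to_string_lossy(), args.join(" "));
}
```

To actually run the model, you would call `cmd.status()` from inside the repository checkout on a machine with a suitable GPU and toolchain. The sketch stops short of that so it stays runnable anywhere.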
Contents
- Function: The essence of programming (6m 48s)
- Operationalizing microservices (1m 57s)
- Continuous integration for microservices (6m 54s)
- What is a Makefile and how do you use it? (2m 41s)
- What is DevOps? (2m 29s)
- Kaizen methodology (4m 6s)
- Infrastructure as code for continuous delivery (2m 50s)
- Responding to compromised resources and workloads (4m 16s)
- Monitoring and logging (1m 47s)
- Auditing networks (1m 6s)
- Rust: Secure by design (4m 52s)
- Preventing data races with the Rust compiler (3m 29s)
- AWS config for security (4m 26s)
- Demo: AWS Security Hub (3m 39s)
- Securing accounts with 2FA (3m 11s)
- Access permissions overview (4m 4s)
- Repository permission levels (2m 37s)
- Repository privacy settings (2m 52s)
- Key concepts in the GitHub ecosystem (3m 43s)
- Demo: GitHub Actions (3m 50s)
- Demo: Codespaces (6m 8s)
- Demo: Copilot (8m 9s)
- Candle framework in Rust (2m 58s)
- GitHub Codespaces with GPU (5m 55s)
- VS Code SSH to AWS accelerated (5m 14s)
- Candle hello world (2m 56s)
- Exploring StarCoder in Rust (5m 54s)
- Whisper Candle transcriber (5m 51s)
- Exploring remote development on AWS (2m 10s)
- Rust for large language models (LLMs) (1m 56s)
- Serverless inference (1m 52s)
- Rust CLI inference (2m 2s)
- Rust chat inference (1m 59s)
- The continuous build binary (2m 6s)
- The chat loop with StarCoder (2m 4s)
- Invoke an LLM on an AWS G5 instance, part 1 (4m 36s)
- Invoke an LLM on an AWS G5 instance, part 2 (3m 1s)
- Rust-BERT introduction (1m 51s)
- Installing and setting up Rust-BERT (5m 38s)
- Basic syntax and model loading in Rust-BERT (2m 4s)
- Rust sentiment analysis in the CLI (4m 13s)
- Rust-PyTorch introduction (1m 54s)
- Rust-PyTorch hello world (2m 28s)
- PyTorch pretrained models (3m 39s)
- Running pretrained PyTorch models in Rust (6m 41s)
- Introduction to ONNX (1m 25s)
- ONNX conversions (2m 5s)
- Extending Google Bard (4m 22s)
- Exploring Google Colab with Bard (4m 22s)
- Exploring the Colab AI (4m 56s)
- Exploring the Google Cloud Generative AI App Builder (2m 29s)
- AWS Bedrock for responsible AI (4m 39s)
- AWS Bedrock with Claude (7m 20s)
- Summarizing text with Claude (5m 28s)
- Using the AWS Bedrock API (1m 39s)