From the course: Hands-On AI: RAG using LlamaIndex
Loading data
- [Instructor] So we know that we can't build a RAG system without external data, but in order to use our external data, we need to load it, and that's what we're going to learn how to do in this module. Note that this module makes use of html2text. You can run this cell to install it, or just install it from the command line like I've done here. Make sure that you have connected to the environment, and now we can go ahead and get right to it. Preparing data for an LLM involves creating an ingestion pipeline. This is similar to traditional ML, where we have data cleaning or an ETL process, and ingestion happens in three stages: loading the data, transforming the data, and finally indexing and storing the data. So let's walk through how to load data. To use data with an LLM, we load it using a data connector, which is known as a reader in LlamaIndex. A reader formats the data, in this case a text document, into a LlamaIndex document object…
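As a minimal sketch of the loading step described above, assuming a recent LlamaIndex package layout (llama-index-core plus the separate llama-index-readers-web package) and an illustrative local ./data folder that is not part of the course files:

```python
# Assumed installs (run in a terminal or notebook cell):
#   pip install llama-index html2text llama-index-readers-web

from llama_index.core import SimpleDirectoryReader, Document

# SimpleDirectoryReader is a built-in data connector (a "reader"):
# it turns every file in a folder into LlamaIndex Document objects.
# "./data" is an assumed example path, not from the transcript.
documents = SimpleDirectoryReader("./data").load_data()
print(f"Loaded {len(documents)} document(s)")

# You can also build a Document object directly from raw text.
doc = Document(text="Readers format raw data into Document objects.")

# html2text comes into play with web readers, which convert fetched
# HTML pages to plain text before wrapping them in Document objects.
from llama_index.readers.web import SimpleWebPageReader

web_docs = SimpleWebPageReader(html_to_text=True).load_data(
    ["https://docs.llamaindex.ai/"]
)
print(web_docs[0].text[:200])
```

The exact import paths depend on which LlamaIndex version you have installed, so treat this as a sketch of the reader workflow rather than the exact cells used in the course notebook.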