From the course: Advanced RAG Applications with Vector Databases
Embedding examples
- [Presenter] Let's look at some examples of how you can embed data. There are many ways to embed, and there are many things that you can embed. The three primary methods we'll cover in this section are basic embeddings, small to big, and big to small, and we'll also briefly discuss non-English examples.

The most basic method of embedding is to embed the chunk directly. Sometimes this works for your most basic tasks. However, when it comes to advanced RAG use cases and putting things into production, you're going to need something a little more involved.

Small to big is a term coined by former LlamaIndex head of TypeScript and Partnerships, Yi Ding, and he coined it at one of my first events in San Francisco. The idea behind small to big is that you embed a sentence, but you store the whole paragraph as text. Why would you do this? Well, it's good for increased context. Some texts have very short sentences, and it's helpful to retrieve not just the sentence, or the one sentence preceding or following it, but the entire paragraph in which that sentence was used. This is another way to help ensure semantic coherence, like we covered in chunking.

Big to small is the opposite of small to big. Instead of embedding a sentence and storing a paragraph, we embed a paragraph and store a sentence. Why would we do this? Sentences by themselves don't always make sense, and sentence-level chunking tactics may leave some sentences broken. For example, if we split on the period following "Mr.", we may end up with a broken sentence. Embedding a whole paragraph and retrieving all the sentences separately lets us do some post-processing before feeding the chunks to an LLM, to ensure that we get the right context.

Finally, we're looking at non-English embeddings. Here's a special case. If you're not working with English data, you'll need an embedding model that was trained on non-English data. You have a few options. One of the easiest, but perhaps not the most efficient or cost-effective, methods is to use an LLM that was trained on multilingual data. Examples include GPT models beyond 3.5, Mixtral, and Qwen. If you're looking for a more compute-friendly option, you can search the MTEB leaderboards for models in different languages such as French, Polish, Chinese, and more. The sketches below illustrate each of these approaches.
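First, basic embedding: a minimal sketch of embedding each chunk directly, assuming the sentence-transformers library; the model name (all-MiniLM-L6-v2) and the sample chunks are illustrative choices, not ones the course specifies.

```python
# Basic embedding: one vector per chunk, with the chunk itself stored
# as the retrievable text. Model choice here is just an example.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "Vector databases store embeddings for similarity search.",
    "RAG pipelines retrieve relevant chunks before calling an LLM.",
]

vectors = model.encode(chunks)
records = [{"vector": v, "text": c} for v, c in zip(vectors, chunks)]
```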
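Next, small to big: a minimal sketch of embedding each sentence while storing the enclosing paragraph as the retrievable text. It uses the same assumed library and model, and a deliberately naive period split; a production pipeline would use a proper sentence splitter such as nltk or spaCy.

```python
# Small to big: embed the small unit (sentence), store the big unit
# (paragraph) so retrieval returns the full surrounding context.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

paragraphs = [
    "Short sentences lose context. The paragraph restores it. "
    "Retrieval returns the paragraph, not just the matched sentence.",
]

records = []
for paragraph in paragraphs:
    # Naive split; real pipelines should use a dedicated sentence splitter.
    sentences = [s.strip() for s in paragraph.split(".") if s.strip()]
    for sentence in sentences:
        records.append({
            "vector": model.encode(sentence),  # search key: the sentence
            "text": paragraph,                 # returned text: the paragraph
        })
```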
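Then big to small, the inverse: a sketch of embedding the paragraph while storing its sentences individually so they can be post-processed before reaching the LLM. The fragment filter at the end is an assumed heuristic for illustration; it drops the broken "Mr" piece that the naive splitter produces.

```python
# Big to small: embed the big unit (paragraph), store the small units
# (sentences) for post-processing at retrieval time.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

paragraph = "Mr. Smith arrived late. The meeting had already started."
sentences = [s.strip() for s in paragraph.split(".") if s.strip()]

record = {
    "vector": model.encode(paragraph),  # search key: the paragraph
    "sentences": sentences,             # stored text: individual sentences
}

# Post-processing before building the LLM context: filter out fragments
# like "Mr" left behind by splitting on the abbreviation's period.
clean = [s for s in record["sentences"] if len(s.split()) > 1]
context = ". ".join(clean) + "."
```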
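Finally, non-English embeddings: a sketch using a multilingual sentence-transformers model. The model named here is one example of a compute-friendly multilingual option; check the MTEB leaderboards for models ranked on your target language.

```python
# Non-English embedding: a multilingual model embeds text from many
# languages into a shared vector space. Model choice is an example.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

chunks = [
    "Les bases de données vectorielles stockent des embeddings.",  # French
    "向量数据库存储嵌入。",                                          # Chinese
]
vectors = model.encode(chunks)
```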