Hands-on lab: Creating a Global Secondary Index (GSI) - Amazon Web Services (AWS) Video Tutorial | LinkedIn Learning, formerly Lynda.com


From the course: AWS Certified Developer - Associate (DVA-C02) Cert Prep


Hands-on lab: Creating a Global Secondary Index (GSI)


- [Instructor] Now let's see how a global secondary index can be applied to our example from the previous lecture. So again, here is the DynamoDB table of our music streaming application, and what we want is to obtain the 10 most-followed 50-track playlists. Instead of scanning the table and iterating over each of the items, we can create a GSI on the table that uses tracks as the partition key and followers as the sort key. I'll show you how this can be done.

First, let's go to the tables section and click the name of the table that we're interested in. From here, click on indexes, then click the create index button. First, we'll fill out the index details. For the partition key, we'll enter the tracks attribute of the base table, and the data type will be number. The sort key is optional, but we'll add one because items that share a partition key value are kept in order of their sort key value, so we don't have to do any sorting in our code anymore. The JSON data that's returned to us is already sorted. Let's use the followers attribute for our sort key; its data type is also number. Next, we'll provide an index name. Let's just call it myGSI. Let's leave the index capacity as it is. Next, we'll choose an attribute projection. For now, let's just project all attributes from the base table into the index. Finally, let's click on create index. Creating a GSI takes time, so I'll get back to you once it's completed.

Okay, now that the GSI has been created, let's go to the items section and select the playlist table. There are two options here: the query and scan operations. We'll do a query this time. First, let's select the GSI that we've created. Click on this dropdown box and click myGSI. Next, let's enter 50 as the tracks partition key value. I won't provide a value for the sort key. Remember, we only created a sort key to take advantage of its sorting capabilities.
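As a recap of the index definition before the query is run, the same GSI could in principle be created from code rather than the console. The following is only a sketch, not the instructor's method: it assumes the table is named `playlist`, that the attributes are spelled `tracks` and `followers`, and that the table uses on-demand capacity (a provisioned table would also need a `ProvisionedThroughput` entry inside `Create`). The helper names `build_gsi_request` and `create_gsi` are hypothetical.

```python
def build_gsi_request(table_name="playlist", index_name="myGSI"):
    """Build keyword arguments for DynamoDB's UpdateTable call that adds a GSI.

    Table and attribute names are assumptions taken from the narration.
    """
    return {
        "TableName": table_name,
        # Every attribute used in the index key schema must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": "tracks", "AttributeType": "N"},
            {"AttributeName": "followers", "AttributeType": "N"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    "KeySchema": [
                        {"AttributeName": "tracks", "KeyType": "HASH"},      # partition key
                        {"AttributeName": "followers", "KeyType": "RANGE"},  # sort key
                    ],
                    # Project all base-table attributes into the index.
                    "Projection": {"ProjectionType": "ALL"},
                }
            }
        ],
    }


def create_gsi():
    """Send the request. Needs boto3 installed and AWS credentials configured."""
    import boto3  # imported here so the request builder works without boto3

    client = boto3.client("dynamodb")
    return client.update_table(**build_gsi_request())
```

Calling `create_gsi()` would start index creation, which, as noted above, takes some time to complete.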
I'll tick the sort descending box because I want the returned items to be sorted in descending order, with the first playlist having the most followers. Let's run this query. As you can see, same as before with the scan operation, it returns the 17 50-track playlists in this table, but this time the results are sorted by number of followers. So basically, the first 10 items that we see here are the playlists that we want to show in our leaderboard application. So this is how you do a query in the console.

This time, I'll show you how to do it using the Query API in Python. First, we need to import the boto3 module. If you don't have it, just install it on your computer using pip install. I'll also import a second module called pprint, short for pretty print. This works similarly to print; the difference is that it formats the output in a more readable manner. Next, let's create a DynamoDB client object. To do this, we'll use the client function of boto3. Let's type boto3.client and specify dynamodb as the service that we'll be working with. We'll use the DynamoDB client to make a query against the playlist table. We want to be able to print the response, so we'll store it in a response variable. So let's create that variable. Next, type dynamodb_client.query, followed by parentheses.

The query function takes several parameters. First, let's specify the table name. For the value, I'll type playlist, followed by a comma. Next is the index name. Let's provide the name of our GSI, which is myGSI. Next is the key condition expression parameter, KeyConditionExpression. This is a string that determines the items to be read from a table or index, so it's what defines our search criteria. These expressions use placeholders instead of actual values. The way we do it, we'll specify the partition key name and value as an equality condition. First, let's specify the name of our partition key, followed by the equals operator.
Afterward, we need to provide a placeholder for the value of the tracks attribute. A placeholder begins with the colon character. Then we'll just type tv here, which stands for track value, but you can use a different name. Next is the expression attribute values parameter, ExpressionAttributeValues. This parameter takes values structured as a dictionary. This is where we'll provide the value for the tracks attribute. Type a pair of curly braces. For the key, we'll use the tracks placeholder. And for the value, we need to specify two things: the data type and the value itself. This is another dictionary. I'll type N as the key to denote the number data type and type 50 as the value.

For my leaderboard, I only want to display the names of the playlists and their number of followers, so I'll use a projection expression to identify the attributes that I want to receive. Let's type the ProjectionExpression parameter. This parameter takes a string value, so we'll just list the attributes that we want to retrieve, separated by commas. First is the playlist name, followed by the followers attribute.

I want the items to be returned in descending order, so I'll use the ScanIndexForward parameter and set it to False. This parameter specifies the order of index traversal. If set to True, the traversal is performed in ascending order, which is also the default behavior. If set to False, the traversal is performed in descending order.

Lastly, I don't want to get all 17 items, since I won't need all of them and fetching them would just be a waste of data. So I'll limit the returned items to 10, which, because of the descending order, are the top 10 most popular playlists that I need for my leaderboard. To do that, let's use the Limit parameter and set the value to 10. Next, let's print the response. We'll use pprint instead of Python's standard print function. So this is all that we need to make a query. Let's save this script and run it.
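Put together, the script narrated above might look roughly like the sketch below. It is a reconstruction from the narration, not the instructor's actual file: the attribute name `playlist_name` in the projection is a guess (the lecture only says "the playlist name"), and the helper names `build_query_params` and `run_query` are hypothetical. Note that the low-level client expects number values as strings, hence `"50"`.

```python
from pprint import pprint


def build_query_params():
    """Keyword arguments for the Query call described in the walkthrough."""
    return {
        "TableName": "playlist",
        "IndexName": "myGSI",
        # Equality condition on the index partition key, using a placeholder.
        "KeyConditionExpression": "tracks = :tv",
        # The placeholder maps to a typed value; N (number) values are strings.
        "ExpressionAttributeValues": {":tv": {"N": "50"}},
        # Assumed attribute names; adjust to the real table schema.
        "ProjectionExpression": "playlist_name, followers",
        # False traverses the index in descending sort-key order.
        "ScanIndexForward": False,
        # Stop after the first 10 matching items.
        "Limit": 10,
    }


def run_query():
    """Run the query. Needs boto3 installed and AWS credentials configured."""
    import boto3  # imported here so the parameter builder works without boto3

    dynamodb_client = boto3.client("dynamodb")
    response = dynamodb_client.query(**build_query_params())
    return response


# To execute against a real table:
# pprint(run_query())
```

Keeping the parameters in a plain dictionary makes it easy to inspect them with `pprint(build_query_params())` before sending anything to AWS.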
As you can see, only 10 items are returned to us, and these items are the 10 most-followed playlists. We can verify this by comparing these values with the result that we got in the console. We see here that the most-followed playlist is Limitless Radio, followed by Sweet Crossroad, then Asset Motivation, and so on. So we have indeed confirmed that the results are identical. Since we're using a query operation based on a partition key value, instead of scanning the whole table, retrievals will be a lot faster and more efficient. This is how you make a query in DynamoDB. I've shown you how to do it in the console and via the Query API, and I hope that you've learned something from this demo. See you in the next lecture.
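The response printed by the script is in DynamoDB's low-level format, where each attribute is wrapped in a type descriptor such as `{"N": "..."}`. A minimal sketch of turning such a response into leaderboard rows follows. The function name `to_leaderboard`, the attribute name `playlist_name`, and the sample data are all made up for illustration; they are not the lab's actual results.

```python
def to_leaderboard(response, name_attr="playlist_name"):
    """Convert a low-level Query response into (name, followers) tuples.

    The order of the items is preserved, so a descending query stays descending.
    """
    rows = []
    for item in response.get("Items", []):
        name = item[name_attr]["S"]              # string attribute
        followers = int(item["followers"]["N"])  # numbers arrive as strings
        rows.append((name, followers))
    return rows


# Made-up sample shaped like a Query response (not real data from the lab).
sample_response = {
    "Items": [
        {"playlist_name": {"S": "Playlist A"}, "followers": {"N": "900"}},
        {"playlist_name": {"S": "Playlist B"}, "followers": {"N": "750"}},
    ],
    "Count": 2,
    "ScannedCount": 2,
}

for rank, (name, followers) in enumerate(to_leaderboard(sample_response), start=1):
    print(f"{rank}. {name}: {followers} followers")
```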
