From the course: TensorFlow: Working with NLP
Tokenizers
- [Tutor] Let's head over to the Colab notebook to confirm our understanding of tokenization in code. In the first couple of cells, we're installing TensorFlow Text and the TensorFlow Models Official package. We then go ahead and import these Python packages, and then we load a BERT model from TensorFlow Hub. We're using a standard BERT model with uncased weights: it has 12 layers and a vocabulary of about 30,000 tokens. Our input sentence is going to be, "I like NLP." This is tokenized, and the tokens are then converted to IDs. Because "NLP" isn't in the vocabulary, it's split into the subword tokens "nl" and "##p," so "I like NLP" becomes the tokens "i," "like," "nl," "##p," and each token is mapped to its token ID. If I enter two sentences, "I like NLP." and "What about you?" and pass them into the BERT model, you can see that we get this output. So let's look…
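To make the splitting of "NLP" into "nl" and "##p" concrete, here is a minimal sketch of the greedy longest-match WordPiece algorithm that BERT's tokenizer uses. The tiny vocabulary and ID mapping below are toy stand-ins for illustration, not the real 30,000-token BERT vocabulary, and the punctuation handling is deliberately crude.

```python
# Toy vocabulary standing in for BERT's ~30,000-token uncased vocab.
# "##" marks a continuation piece (a subword that doesn't start a word).
TOY_VOCAB = {"i", "like", "nl", "##p", "what", "about", "you", "[UNK]"}

# Toy token-to-ID mapping (real BERT ships a fixed vocab file instead).
TOY_IDS = {tok: idx for idx, tok in enumerate(sorted(TOY_VOCAB))}

def wordpiece_tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match WordPiece sketch for an uncased model:
    lowercase, split on whitespace, then repeatedly take the longest
    vocabulary prefix of what remains, prefixing continuations with '##'."""
    tokens = []
    for word in text.lower().split():
        word = word.strip(".,?!")  # crude punctuation stripping for the sketch
        start, sub_tokens = 0, []
        while start < len(word):
            end, match = len(word), None
            while start < end:
                piece = word[start:end]
                if start > 0:
                    piece = "##" + piece  # continuation of the current word
                if piece in vocab:
                    match = piece
                    break
                end -= 1  # shrink the candidate and try again
            if match is None:
                sub_tokens = ["[UNK]"]  # no prefix matched at all
                break
            sub_tokens.append(match)
            start = end
        tokens.extend(sub_tokens)
    return tokens

def tokens_to_ids(tokens, ids=TOY_IDS):
    """Map each token string to its integer ID via the (toy) vocab table."""
    return [ids[tok] for tok in tokens]

tokens = wordpiece_tokenize("I like NLP.")
print(tokens)                # ['i', 'like', 'nl', '##p']
print(tokens_to_ids(tokens))
```

In the notebook this work is done for us by the BERT tokenizer loaded alongside the TensorFlow Hub model; the sketch only shows why "NLP" comes back as "nl" and "##p" while "i" and "like" stay whole.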