From the course: TensorFlow: Working with NLP
Fine-tuning BERT
- [Instructor] As part of the pre-training step, when Google trained BERT on the next sentence prediction task, which is a text classification task, a linear layer was added at the end of the BERT model. The only input fed into that linear layer was the embedding of the CLS token. So in order to perform well, the BERT model learned that it needed to capture all the required information in the CLS token. This means that when we want to fine-tune BERT, say on movie reviews, all we need to do is add a linear classifier layer and use the final embedding of the CLS token as its input. In addition to the linear classifier, we often add a dropout layer to reduce overfitting. We then train, or fine-tune, the model on a labeled dataset. Using the movie review example, this means training the linear classifier with the movie review texts and their associated labels, either positive or negative.…
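The head described above can be sketched in Keras. This is a minimal illustration, not the course's exercise code: the BERT encoder itself is replaced by a placeholder input of final hidden states (in practice it would come from a library such as Hugging Face's transformers), and the hidden size of 768 assumes BERT-base.

```python
import tensorflow as tf

# Hypothetical sketch: a dropout + linear classifier head on top of BERT.
hidden_size = 768  # BERT-base embedding width (assumption)

# Placeholder for BERT's final hidden states: (batch, seq_len, hidden_size).
sequence_output = tf.keras.Input(shape=(None, hidden_size), name="bert_output")

# Use only the final embedding of the [CLS] token (position 0 in the sequence).
cls_embedding = sequence_output[:, 0, :]

# Dropout to reduce overfitting, then a linear classifier producing
# logits for the two movie-review labels (positive / negative).
x = tf.keras.layers.Dropout(0.1)(cls_embedding)
logits = tf.keras.layers.Dense(2, name="classifier")(x)

head = tf.keras.Model(inputs=sequence_output, outputs=logits)
```

During fine-tuning, this head would be compiled with a cross-entropy loss and trained on the labeled reviews, with gradients flowing back into the BERT encoder as well.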