From the course: TensorFlow: Working with NLP
Transfer learning
- [Instructor] Transfer learning is made up of two components: pre-training and fine-tuning. So what does pre-training involve? Well, we're training a model from scratch. This means the model's weights are randomly initialized, and the model is of no use at this point. The model is then trained on large amounts of data and becomes useful. Now, let's compare the pre-training for some of the larger models. So BERT was released in 2018. The number of parameters was 109 million. It took Google 12 days to train BERT, and I've put an asterisk by the 8 times V100s because BERT wasn't trained on GPUs, but rather on Google's equivalent, TPUs or tensor processing units. The size of the dataset used for training was 16 gigabytes, and the training tokens were 250 billion. The data sources that were used to train BERT were Wikipedia and the BookCorpus. RoBERTa was developed by Facebook in 2019. The number of parameters was 125…
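To make the pre-training versus fine-tuning split concrete, here is a minimal sketch of loading a pre-trained BERT checkpoint and fine-tuning it on a tiny downstream task. It assumes the Hugging Face transformers library and a made-up two-example sentiment dataset; neither is taken from the video, which may use a different setup.

```python
# Minimal sketch: fine-tuning a pre-trained BERT checkpoint with TensorFlow.
# Assumes the Hugging Face "transformers" library is installed (pip install transformers).
import tensorflow as tf
from transformers import BertTokenizer, TFBertForSequenceClassification

# Pre-training has already been done for us: these weights are NOT randomly
# initialized, they come from Google's large-scale pre-training run.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Hypothetical toy dataset standing in for a real downstream task.
texts = ["great movie", "terrible plot"]
labels = [1, 0]
encodings = tokenizer(texts, truncation=True, padding=True, return_tensors="tf")

# Fine-tuning: continue training the pre-trained weights on the small labeled dataset.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
model.fit(dict(encodings), tf.constant(labels), epochs=1, batch_size=2)
```

The key point of the sketch is that fine-tuning reuses the pre-trained weights and only needs a small task-specific dataset and a few epochs, whereas pre-training from scratch required days of TPU time and billions of tokens.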