From the course: AWS Certified Machine Learning Engineer Associate (MLA-C01) Cert Prep

Intro: Exploratory data analysis

- [Instructor] Hello, guys. In this section, we will dive into exploratory data analysis, focusing on understanding the data through visualization, identifying data types, and examining data distributions. We'll also explore feature engineering techniques and data transformations for numerical, categorical, text, and image data. We'll talk about strategies for handling missing data, unbalanced data, and outliers. We'll then introduce key tools for data processing on AWS, including Amazon EMR along with Apache Hadoop and Spark, and we'll discuss EMR architectures and serverless options. We also have a hands-on lab, which will walk you through launching an EMR cluster, transforming streaming data using Lambda and Spark, and leveraging AWS Glue for data cataloging, crawling, and transformations. Finally, we'll explore AWS Glue DataBrew for no-code data preparation, Amazon SageMaker Feature Store for managing engineered features, and Amazon Athena for querying data efficiently.
