From the course: Azure AI Fundamentals (AI-900) Cert Prep: 2 Principles of Machine Learning on Azure
What is machine learning? - Azure Tutorial
What is machine learning?
- [Instructor] Okay, so let's start by defining what machine learning is, how it fits into an AI framework, and which services on Azure provide these capabilities. A discussion about machine learning demands an understanding of artificial intelligence, so let's start with that. AI is a field of study focused on enabling machines to perform tasks previously only done by humans. These include reasoning, which is the capability of drawing conclusions from imperfect data; understanding, which is being able to interpret the meaning of data such as images, video, or voice; and interacting, which allows computers to communicate with people in more natural ways, such as voice or text. Think of Siri, Alexa, or Cortana. That's a huge contrast to algorithmic computing, where at the end of the day the computer is just executing the instructions that you have given it, line by line. AI enables capabilities never possible before, such as object detection, speech recognition, or even wonders such as self-driving vehicles.
Machine learning, on the other hand, is a subset of AI responsible for the process of teaching or training a computer system to make predictions based on the available data. For example, if I have a data set of bank loans, I could train an ML model on this data to predict which customers might default on their payments. Keep in mind that machine learning is itself a gigantic field at this point. It contains several algorithms for more specific uses, such as linear and logistic regression, support-vector machines, decision trees, and neural networks, just to name a few. But don't worry, you don't need to memorize any of these exotic names, as they're out of scope for AI-900.
What you do need to know for the exam, though, is what machine learning can do. Trust me, this is the most important slide in this course, as roughly one third of the ML questions are related to this topic.
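To make the bank-loan example a bit more concrete, here is a minimal sketch in plain Python. It is only an illustration, not how Azure trains models: the records are made up, the two features (debt-to-income ratio and missed payments) are assumptions, and the technique is a simple nearest-centroid classifier chosen because it fits in a few lines.

```python
# Hypothetical, made-up loan records (not real data).
# Features: (debt-to-income ratio, missed payments last year)
# Label: 1 = customer defaulted, 0 = customer repaid
LOANS = [
    ((0.10, 0), 0),
    ((0.20, 1), 0),
    ((0.15, 0), 0),
    ((0.60, 4), 1),
    ((0.70, 5), 1),
    ((0.65, 3), 1),
]


def train(records):
    """'Training' here means learning one centroid (mean feature vector) per label."""
    sums, counts = {}, {}
    for features, label in records:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((f - c) ** 2 for f, c in zip(features, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


model = train(LOANS)
new_customer = (0.55, 4)  # hypothetical applicant
print("likely to default" if predict(model, new_customer) == 1 else "likely to repay")
```

The point is the shape of the workflow: historical data with known outcomes goes in, a model comes out, and the model is then asked about a customer it has never seen.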
Despite all the hype about machine learning, it can essentially answer five questions.
Classification: is this A or B? For example, is this person going to develop heart disease in the next five years, yes or no? This would be an example of two-class classification, but if there are more than two options, then it's called multiclass classification. In general, every time you need to decide between two or more options, you're talking about classification.
Regression: how much, or how many? In regression, you're trying to predict values, so you're looking for a number. For example, what is the price of a two-bedroom apartment in San Francisco going to be by 2030, or what are Microsoft's total sales going to be next year?
Anomaly detection: is this weird? Here, we're basically looking for data that stands out from a pattern, an outlier. These solutions can be quite useful for detecting IT security breaches and bank fraud, where you could be looking for suspicious logins or credit card transactions.
Clustering: how is this organized? Here, we're trying to group items together based on common characteristics that they share. A good example of this is recommendation systems, which group people based on their preferences or purchasing habits. Whenever you see a "recommended for you" section on your favorite streaming or shopping website, that's probably a clustering implementation. These techniques are also often used by marketing professionals to perform market segmentation.
Reinforcement: what's next? This is reinforcement learning, where machines try to decide their next action on their own, without human intervention, such as teaching a computer to play a board game. Note that the exam does not focus much on reinforcement learning, as this is a more advanced topic.
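As a rough illustration of three of these questions (how much, is this weird, how is this organized), here is a small plain-Python sketch. All of the numbers are made up, the function names are hypothetical, and the techniques (a least-squares line, a z-score threshold, and a one-dimensional two-cluster k-means) are just simple stand-ins chosen for brevity, not what any Azure service necessarily uses.

```python
from statistics import mean, pstdev

# --- Regression: "how much?" (predict a number) ---
# Hypothetical data: ad spend vs. sales, both in thousands.
AD_SPEND = [1, 2, 3, 4]
SALES = [3, 5, 7, 9]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def predict_sales(spend):
    slope, intercept = fit_line(AD_SPEND, SALES)
    return intercept + slope * spend


# --- Anomaly detection: "is this weird?" (find outliers) ---
# Hypothetical card transaction amounts.
AMOUNTS = [20, 22, 19, 21, 20, 23, 500]


def find_outliers(values, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]


# --- Clustering: "how is this organized?" (group similar items) ---
# Hypothetical hours of streaming per week for six viewers.
HOURS = [1, 2, 3, 20, 21, 22]


def two_means(values, iterations=10):
    """One-dimensional k-means with k=2, seeded with the min and max values."""
    centroids = [min(values), max(values)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            nearest = 0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ended up empty.
        centroids = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return clusters


print(predict_sales(5))        # a number: regression
print(find_outliers(AMOUNTS))  # the odd one out: anomaly detection
print(two_means(HOURS))        # light vs. heavy viewers: clustering
```

Notice that only the regression example uses known outcomes (the sales figures); the anomaly and clustering examples work directly on unlabeled values.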
Don't be afraid of these questions, though, as they're generally quite simple to answer. As long as you remember that options mean classification, numbers mean regression, outliers mean anomaly detection, and grouping means clustering, you should do great.