From the course: Introduction to Data Science
Demystifying data science
There are many definitions of data science. I like the definition given by Joseph Gonzalez, a professor at UC Berkeley. According to him, data science is the application of data-centric, computational, and inferential thinking to understand the world and solve problems. Data scientists have a unique role in industry, government, and other organizations, and that role is different from the roles of others, such as data engineers, statisticians, and business analysts.

The first difference I want to help you understand is the difference between data scientists and data engineers. Data engineers work to make sure data flows smoothly between the source and the destination. Now, the source is where data is collected, and the destination is where data is extracted and processed. Data scientists work to make sure that value is extracted from data smoothly. So data engineers optimize data flow, while data scientists optimize data processing. Data scientists work with data engineers as well as business people to define metrics, establish how data is collected, and ensure that data science processes work well with enterprise data systems. And when data scientists and data engineers work together, it's important for the data scientists to write code that's reusable by the data engineers.

The next difference I want to help you understand is the difference between data scientists and statisticians. Now, the amount of data that data scientists work with is often massive, so they spend a lot of time on tasks like large-scale data collection and data cleaning. Meanwhile, statisticians rely on more traditional and smaller-scale methods of data collection, such as surveys, polls, and experiments. Data scientists try out different methods to create machine learning models, and then they choose the method that results in the best model.
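To make the idea of trying several methods and keeping the best one concrete, here is a minimal sketch in Python. It is an illustration only, not something shown in the course: the toy data, the two candidate methods, and the helper names (`fit_mean`, `fit_line`, `choose_best`) are all assumptions made up for this example, and the score used is mean squared error on a small held-out set.

```python
from statistics import mean

# Hypothetical toy data: (x, y) pairs that roughly follow y = 2x + 1.
train = [(0, 1.1), (1, 2.9), (2, 5.2), (3, 6.8), (4, 9.1), (5, 11.0)]
holdout = [(6, 13.1), (7, 14.8), (8, 17.2)]


def fit_mean(points):
    """Baseline method: always predict the average training target."""
    avg = mean(y for _, y in points)
    return lambda x: avg


def fit_line(points):
    """Ordinary least squares with one feature: y = slope * x + intercept."""
    x_bar = mean(x for x, _ in points)
    y_bar = mean(y for _, y in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def mean_squared_error(model, points):
    """Average squared gap between predictions and actual values."""
    return mean((model(x) - y) ** 2 for x, y in points)


def choose_best(methods, train_points, holdout_points):
    """Fit every candidate on the training data, score each one on the
    held-out data, and return the name with the lowest error plus all scores."""
    scores = {}
    for name, fit in methods.items():
        model = fit(train_points)
        scores[name] = mean_squared_error(model, holdout_points)
    best = min(scores, key=scores.get)
    return best, scores


methods = {"mean_baseline": fit_mean, "linear_regression": fit_line}
best_name, scores = choose_best(methods, train, holdout)
print("Best method:", best_name)
print("Held-out errors:", scores)
```

On this toy data the straight-line model should score far better than the baseline, because the data was built to follow a line. In practice the candidates would be real machine learning methods and the comparison would use much more data, but the loop of fit, score, and pick is the same.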
Statisticians, on the other hand, work on improving one simple model to best fit the data. Data scientists also do more than just analyze data. They implement algorithms that process data automatically, and this enables data scientists to provide automated predictions and actions.

To help bring this to life, I want to give you some examples of the types of things that data scientists can automate through data analysis and algorithms, such as analyzing NASA pictures to find new planets or asteroids, or automatically piloting planes, cars, and more. You can automate book recommendations on Amazon or friend recommendations on Facebook. You can use automation and computational chemistry to simulate new molecules for cancer treatment. Automation can help with early detection of an epidemic, with estimating the value of houses in the US in real time, like on Zillow, or with matching a Google ad with a user and a web page to maximize the chances of a conversion. Automation can help return highly relevant results to Google searches or detect credit card fraud and tax fraud, and automation even helps with weather forecasts.

The next difference I want to help you understand is the difference between data scientists and business analysts. Business analysts focus on database design and ROI assessment. Some business analysts work on finance planning, optimization, and risk management, and others manage projects at a high level. Now, data scientists can help business analysts. For example, data scientists can help automate the production of reports and speed up data extraction. According to data scientist Vincent Granville, the collaboration between data scientists and business analysts has helped business analysts extract data that's 100 times larger than what they're used to, and do it ten times faster.

So there you have it. The power of data science is changing our world, and that's a great reason for you to continue learning.