From the course: Learning Data Science
Collecting unstructured data
- We've gone through a lot, so let's recap a little. In general, your data science team will work with three data types. First, there's structured data. That's data that most likely looks like a spreadsheet, ordered in a consistent format. It's usually stored in a relational database. Next, you have semi-structured data. That's data with some structure, but with added flexibility to change some of the field names. Finally, there's the most common type of data: everything else. That's unstructured data. Some analysts estimate that 80% of data is unstructured. When you think about it, this makes a lot of sense. Think about the data that you create every day: every voicemail you leave, every picture you upload to Facebook, every Microsoft memo or PowerPoint presentation you create at work. Even when you search on the web, the results are mostly unstructured data. A search for cats will bring up videos, songs, books, and even music. So what does all this data have in…