The Role of Statistics in Data Science

Sharing some thoughts on these two related fields.

Photo credit: @lira_n4 via Twenty20

Statistics is an established field that describes and analyzes data. Data science is a relatively new field, but it resembles statistics in that it also looks at patterns and trends in what we know about a sample or a population.

Trends and patterns are used to tell a story; this is how we communicate findings to our audience. Data science applies statistical tools and methods to describe the data and to make predictions from it.
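To make "descriptions and predictions" a bit more concrete, here is a minimal sketch using only Python's standard library. The numbers are made up for illustration (hours studied and exam scores for five imaginary students), and the straight-line fit is just one simple way to predict, not the only one.

```python
import statistics

# Hypothetical sample: hours studied and exam scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 67, 72]

# Description: summarize what the sample looks like.
mean_hours = statistics.mean(hours)
mean_score = statistics.mean(scores)
stdev_score = statistics.stdev(scores)  # sample standard deviation

# Prediction: fit a least-squares line, score = slope * hours + intercept.
sxy = sum((x - mean_hours) * (y - mean_score) for x, y in zip(hours, scores))
sxx = sum((x - mean_hours) ** 2 for x in hours)
slope = sxy / sxx
intercept = mean_score - slope * mean_hours

# Estimate the score for a student who studies 6 hours.
predicted = slope * 6 + intercept

print(f"mean score: {mean_score}, spread: {stdev_score:.2f}")
print(f"predicted score for 6 hours: {predicted:.1f}")
```

The mean and standard deviation describe the sample we have, while the fitted line lets us say something about a case we have not seen. With only five points, that prediction should be read as a rough guess.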

Data needs to be explored before we can work out how to analyze it for the insights we need. The quality of the data we have determines how the whole analysis goes.

This is where data preparation plays a crucial role. It's one of the most time-consuming phases in the data value chain, if not the most time-consuming one. Data preparation helps protect the integrity of the data.
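As a small illustration of what data preparation can involve, the sketch below cleans a handful of made-up records. The field names, the `prepare` helper, and the rule of dropping bad rows are all assumptions for this example; a real project might fill in missing values instead of discarding them.

```python
# Hypothetical raw records, as they might arrive from a form or a CSV export.
raw_records = [
    {"name": "Ana", "age": "29"},
    {"name": "Ben", "age": ""},       # missing age
    {"name": "Ana", "age": "29"},     # exact duplicate
    {"name": "Cy", "age": "abc"},     # not a number
    {"name": "Dee", "age": " 41 "},   # stray whitespace
]


def prepare(records):
    """Return cleaned records: trimmed, typed, and de-duplicated."""
    cleaned = []
    seen = set()
    for record in records:
        name = record["name"].strip()
        age_text = record["age"].strip()
        # Skip rows with a missing name or an age that is not a whole number.
        if not name or not age_text.isdigit():
            continue
        key = (name, int(age_text))
        # Skip rows we have already kept.
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "age": int(age_text)})
    return cleaned


clean_records = prepare(raw_records)
print(clean_records)
```

Only two of the five rows survive here, which hints at why this phase takes so long: each rule about what counts as valid is a decision that affects every later result.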

A Linux user currently exploring the wonders of data science. Curious as a cat and sleeps like one. She’s also into skygazing, anime, and gamification.
