Myths and pitfalls

At this early stage of data science, different and sometimes contradictory views commonly appear across communities. Sharing and discussing the myths and realities, memes, and pitfalls is essential to the healthy development of the field. Several myths and pitfalls have emerged from observing the relevant communities and from experience and lessons learned in data science and analytics research, education, and services.

Read more in [1-6] about the following pitfalls:

  • Pitfalls about data science concepts
  • Data volume pitfalls
  • Data infrastructure pitfalls
  • Analytics pitfalls
  • Pitfalls about capabilities and roles
  • Other matters

[1] L. Cao, “Data Science: Nature and Pitfalls,” IEEE Intelligent Systems, vol. 31, no. 5, 2016, pp. 66–75. (The above contents were excerpted from this paper.)

[2] H.V. Jagadish, “Big Data and Science: Myths and Reality,” Big Data Research, vol. 2, no. 2, 2015, pp. 49–52.

[3] D. Donoho, “50 Years of Data Science,” 2015; http://courses. docs/50YearsDataScience.pdf.

[4] K. Broman, “Data Science Is Statistics,” blog, 2013; http://kbroman.wordpress.com/2013/04/05/data-science-is-statistics

[5] P.J. Diggle, “Statistics: A Data Science for the 21st Century,” J. Royal Statistical Society: Series A, vol. 178, no. 4, 2015, pp. 793–813.

[6] I. Wladawsky-Berger, “Why Do We Need Data Science When We’ve Had Statistics for Centuries?” Wall Street J., 2014;