Data science educational framework

The purpose of data science and analytics education is to create the data science profession [1], [2], to train and generate the necessary data and analytics know-how and proficiency in managing capability and capacity gaps, and to achieve the goals of data science innovation and the data economy. Accordingly, different levels of education and training are necessary: from attendance at public courses, corporate training, and undergraduate courses, to undertaking a Master of Data Science and/or PhD in Data Science program.

  • Public courses are designed for general communities to lift their understanding, skill level, professional approach and specialism in data science through multi-level short courses. These range from basic courses to intermediate and advanced courses. The knowledge map consists of such components as data science, data mining, machine learning, statistics, data management, computing, programming, system analysis and design, and modules related to case studies, hands-on practice, project management, communication, and decision support.
  • Corporate training and workshops can be customized to upgrade and foster corporate-wide thinking, knowledge, capability and practice for an entire enterprise, and are necessary to encourage innovation and raise productivity. This requires courses and workshops to be offered for the entire workforce, from senior executives, business owners, and business analysts to data modelers, data scientists, data engineers, and deployment and enterprise strategists. The scope and topics of such courses include data science, data engineering, analytics science, decision science, data and analytics software engineering, project management, communications, and case management.
  • Undergraduate courses may be offered either on a general data science basis that focuses on building the foundations of data science and of computing for data and analytics, or on specific areas such as data engineering, predictive modeling, and visualization. Double degrees or majors may be offered to train professionals who acquire knowledge and capabilities across such disciplines as business and analytics, statistics and computing.
  • Master of Data Science and Analytics aims to train specialists and talented individuals who can develop a deep understanding of data and undertake analytics tasks in data mining, knowledge discovery and machine learning-based advanced analytics. Interdisciplinary experts may be trained, such as those who have a solid foundation in statistics, business, social science or another discipline and are able to integrate data-driven exploration technologies with disciplinary expertise and techniques. A critical area in which data science and analytics should be incorporated is the classic Master of Business Administration course. This is where new-generation business leaders can be trained for the new economy and a global view of economic growth.
  • PhD in Data Science and Analytics aims to train high-level talented individuals and specialists who are capable of independent thinking, leadership, research, best practice, and theoretical innovation to manage the current significant knowledge and capability gaps, and to achieve substantial theoretical breakthroughs, economic innovation, and productivity elevation. Interdisciplinary research is encouraged to train leaders who have a systematic and strategic understanding of all aspects of data and economic innovation.


Figure 2 shows the level, objective, capability set and outcomes of hierarchical data science and analytics education and training.

Data science course framework

Fig. 4. Data science course framework.


Note: Excerpted from Longbing Cao, “Data Science: Profession and Education.”

[1] M. A. Walker, “The professionalisation of data science,” Int. J. of Data Science, vol. 1, no. 1, pp. 7–16, 2015.

[2] A. Manieri, S. Brewer, R. Riestra, Y. Demchenko, M. Hemmje, T. Wiktorski, T. Ferrari, and J. Frey, “Data science professional uncovered: How the EDISON project will contribute to a widely accepted profile for data scientists,” in 2015 IEEE 7th International Conference on Cloud Computing Technology and Science (CloudCom), 2015, pp. 588–593.