Data Analysis for Decision-Making

Lecture Notes

Author: © Prof. Dr. Stephan Huber
Published: November 5, 2024

Preface

About the notes

A PDF version of these notes is available here.

Please note that while the PDF contains the same content, it has not been optimized for PDF format. Therefore, some parts may not appear as intended.

About the author

Figure 1: Prof. Dr. Stephan Huber

I am a Professor of International Economics and Data Science at HS Fresenius, holding a Diploma in Economics from the University of Regensburg and a Doctoral Degree (summa cum laude) from the University of Trier. I completed postgraduate studies at the Interdisciplinary Graduate Center of Excellence at the Institute for Labor Law and Industrial Relations in the European Union (IAAEU) in Trier. Prior to my current position, I worked as a research assistant to Prof. Dr. Dr. h.c. Joachim Möller at the University of Regensburg, a post-doc at the Leibniz Institute for East and Southeast European Studies (IOS) in Regensburg, and a freelancer at Charles University in Prague.

Throughout my career, I have also worked as a lecturer at various institutions, including the TU Munich, the University of Regensburg, Saarland University, and the Universities of Applied Sciences in Frankfurt and Augsburg. Additionally, I have had the opportunity to teach abroad for the University of Cordoba in Spain, the University of Perugia in Italy, and the Petra Christian University in Surabaya, Indonesia. My published work can be found in international journals such as the Canadian Journal of Economics and the Stata Journal. For more information on my work, please visit my private homepage at hubchev.github.io.

Prof. Dr. Stephan Huber
Hochschule Fresenius für Wirtschaft & Medien GmbH
Im MediaPark 4c
50670 Cologne

Office: 4e OG-3
Phone: +49 221 973199-523
Mail: stephan.huber@hs-fresenius.de
Private homepage: www.hubchev.github.io
Github: https://github.com/hubchev

I was always fascinated by data and statistics. For example, in 1992 I could name all soccer players in Germany’s first division, including how many goals they had scored. Later, in 2003, I joined the introductory statistics course of Daniel Rösch. Among other things, I learned that probabilities often play a role when analyzing data. I continued my data science journey with Harry Haupt’s Introductory Econometrics course, where I studied the infamous textbook by Jeffrey M. Wooldridge (2002). It got me hooked, so I took all the courses Rolf Tschernig offered at his Chair of Econometrics at the University of Regensburg, where I became a tutor and a research assistant to Joachim Möller. Although everything we did was about making sense of data, we never actually used the term data science, which is also absent from the more than 850-page textbook by Wooldridge (2002). The book also remains silent about machine learning and artificial intelligence. These terms became popular only after I graduated. The Harvard Business Review article by Davenport & Patil (2012), which called data scientist “The Sexiest Job of the 21st Century”, may have boosted their popularity.

Wooldridge, J. M. (2002). Introductory econometrics: A modern approach (2nd ed.). South-Western.
Davenport, T. H., & Patil, D. (2012). Data scientist: The sexiest job of the 21st century. Harvard Business Review, 90(5), 70–76.

The term “data scientist” has become remarkably popular, and many people are eager to adopt this title. Although I am a professor of data science, my professional identity is more like that of an applied, empirically-oriented international economist. My hesitation to adopt the title “data scientist” also stems from the deep respect I have developed through my interactions with econometricians and statisticians. Considering their in-depth expertise, I feel like a passionate amateur.

Ultimately, I poke around in data to find something interesting, much like my ten-year-old self, who analyzed soccer statistics to gain a deeper understanding of the sport. The only thing that has changed since then is that I know more promising methods and can efficiently use tools for data processing and data analysis.

About the Course

The course totals 125 hours over one semester at the Master’s level, granting 5 ECTS points. It consists of 3 weekly contact hours (42 hours in total) and 83 hours of private study.

Abstract

This module provides essential skills for transforming data into actionable business insights. Upon completion of the module, students will be able to summarize the importance of analyzing business data and engage in discussions about different workflows for leveraging data analytics for new business trends. The module systematically develops skills to create plans for data collection and management. Students learn to recognize the challenges and opportunities of different quantitative empirical strategies. Decision principles, frameworks and tools, including decision trees and payoff tables, are covered in depth, focusing on the central role of decision support systems (DSS). Students will gain an in-depth understanding of the different roles in business analysis and a comprehensive overview of the entire data analysis workflow in a business context.

Learning outcomes/competences

After successful completion of the module, students are able to:

  • summarize the significance of business data analysis for decision-making and demonstrate the ability to justify and articulate diverse workflows to convert data into actionable information,
  • explain the role of data analytics in emerging business trends and how organizations can use data analytics and decision support systems to solve problems,
  • identify and contrast the competencies required to solve business problems with data and be able to assign the various tasks of a data scientific workflow to professionals with the appropriate profile,
  • set up a plan for collecting, managing, analyzing, and applying data, reflectively applying quantitative methods,
  • distinguish and discuss empirical research strategies to identify causal mechanisms, causes, and effects,
  • identify the need for decision support systems and assess the possibilities of data-driven methods to improve decision-making by humans and organizations.

Module content

Introduction

  • Significance of business data analysis for decision-making.
  • Emerging trends: Evolution of computers and data processing, digitalization, artificial intelligence, machine learning, deep learning, big data, internet of things, cloud computing, blockchain, industry 4.0, and remote working.
  • The role of business analytics and intelligence in converting data into actionable information for decision-makers.
  • Overview of various job roles for business analytics: data engineer, data analyst, machine learning engineer, business intelligence analyst, database administrator, data product manager, market research analyst, fraud analyst, …
  • Analytics applications in strategic insights.
  • Types of analytics: Descriptive, predictive, and prescriptive analytics.
  • Workflows and data science life cycles: OSEMN, DSLC, CRISP-DM, Kanban, TDSP, …

Data literacy competencies

  • Types of data: Cross-section, panel, time-series, georeferenced, …
  • Types of variables: Continuous, count, ordinal, categorical, …
  • Conceptual framework: Knowledge and understanding of data and applications of data.
  • Data collection: Identify, collect, and assess data.
  • Data management: Organize, clean, convert, curate, and preserve data.
  • Data evaluation: Plan, conduct, evaluate, and assess data analyses.
  • Data application: Share, reflect, and evaluate results of analyses, comparing them with other findings while considering ethical issues and scientific standards.

Quantitative research design

  • Techniques for measuring socio-economic and business phenomena.
  • Strategies for identifying causes of effects and effects of causes.
  • The fundamental problem of causal inference.
  • Techniques to establish causation: matching, natural experiments, field experiments, and laboratory experiments.

Introduction to decision-making

  • Overview of the principles guiding rational decision-making.
  • Frameworks for structuring and representing the decision-making process.
  • Utilization of decision trees, payoff tables, Lagrange multipliers, and expected utility theory.
  • Exploration of bounded rationality in human decision-making behaviors.
  • The necessity, concept, and evolution of
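
As a rough illustration of the payoff tables listed above, the following minimal Python sketch picks the action with the highest expected payoff. All actions, states, payoffs, and probabilities are hypothetical values chosen for this example and are not taken from the course material. The sketch also assumes a risk-neutral decision-maker, so it maximizes expected value rather than the expected utility of a particular utility function.

```python
# Hypothetical payoff table: one row per action, one column per state of the world.
payoffs = {
    "large plant": {"high demand": 200, "low demand": -120},
    "small plant": {"high demand": 90, "low demand": -10},
    "do nothing": {"high demand": 0, "low demand": 0},
}

# Hypothetical probabilities of the states (they must sum to 1).
probabilities = {"high demand": 0.6, "low demand": 0.4}


def expected_payoff(action_payoffs, probs):
    """Probability-weighted sum of the payoffs of a single action."""
    return sum(probs[state] * value for state, value in action_payoffs.items())


def best_action(table, probs):
    """Return the action with the highest expected payoff."""
    return max(table, key=lambda action: expected_payoff(table[action], probs))


for action, row in payoffs.items():
    print(f"{action}: {expected_payoff(row, probabilities):.1f}")
print("Best action:", best_action(payoffs, probabilities))
```

With these made-up numbers, the large plant has the highest expected payoff even though it also carries the largest possible loss, which is the kind of trade-off a utility function would weigh differently.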

Assessment Methods and Criteria

Students complete this module with an academic presentation. The presentation takes place during the lecture period; the exact date is set by the lecturer. The presentation lasts 10-15 minutes per student. In addition, a handout (3-5 pages per student) should be produced outlining the key features of the project and the literature on which these decisions are based (project outline). The handout should be submitted to the lecturer by the date of the presentation at the latest.

Group work is permitted. The maximum group size is 5 students. In case of group work, it must be possible to clearly define and assess each student’s individual performance on the basis of specified sections, page numbers, or other objective criteria.

The presentation contributes 65% to the module grade and the handout contributes 35%. A passing grade in this module is achieved when the overall grade is 4.0 or better. To pass the course, a student must give a 10-minute presentation and submit a handout of 3-5 pages.
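
The weighting can be sketched as a simple weighted average. The component grades in this Python example are hypothetical, and the sketch assumes that no rounding rule is applied, which the text does not specify.

```python
# Weights as stated in the text: presentation 65%, handout 35%.
PRESENTATION_WEIGHT = 0.65
HANDOUT_WEIGHT = 0.35


def module_grade(presentation, handout):
    """Weighted average of the two component grades (no rounding applied)."""
    return PRESENTATION_WEIGHT * presentation + HANDOUT_WEIGHT * handout


# Hypothetical component grades, for illustration only.
print(f"{module_grade(presentation=2.0, handout=3.0):.2f}")  # prints 2.35
```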

Literature

There are tons of books around that are both insightful and entertaining and support the lecture. In Figure 2, I present a short list of books I recommend: Bergstrom & West (2021), Chivers & Chivers (2021), Dougherty & Ilyankou (2021), Ellenberg (2015), Harford (2020), Huff (1954), Huntington-Klein (2022), and Jones (2020).

Bergstrom, C. T., & West, J. D. (2021). Calling bullshit: The art of skepticism in a data-driven world. Penguin Books.
Chivers, T., & Chivers, D. (2021). How to read numbers: A guide to statistics in the news (and knowing when to trust them). Weidenfeld & Nicolson.
Dougherty, J., & Ilyankou, I. (2021). Hands-on data visualization: Interactive storytelling from spreadsheets to code. O’Reilly. Accessed January 30, 2023. https://handsondataviz.org/
Ellenberg, J. (2015). How not to be wrong: The power of mathematical thinking. Penguin Books.
Harford, T. (2020). How to make the world add up: Ten rules for thinking differently about numbers. The Bridge Street Press.
Huff, D. (1954). How to lie with statistics. W. W. Norton & Company.
Huntington-Klein, N. (2022). The effect: An introduction to research design and causality. CRC Press. Accessed January 30, 2023. https://theeffectbook.net
Jones, B. (2020). Avoiding data pitfalls: How to steer clear of common blunders when working with data and presenting analysis and visualizations. John Wiley & Sons.
Figure 2: Books for data literacy

In addition, I highly recommend the two books mentioned in Figure 3. Vaughan (2020) teaches important skills for working effectively with data. For a more technical approach that focuses on statistics, Spiegelhalter (2019) is an excellent choice. Both books are easy to read even without advanced math skills. We will use these books in the course and some of your presentations will refer to them as well.

Vaughan, D. (2020). Analytical skills for AI and data science. O’Reilly Media.
Spiegelhalter, D. (2019). The art of statistics: Learning from data. Penguin UK.
Figure 3: Books for skills and statistics