A Primer on Data

Technical Sharing Session #3

Title: A Primer on Data

Presenter: Dr Tan Hong Ming

Lecturer, Department of Analytics & Operations

Research Fellow, Work Package 2

Date: 28 January 2022, Friday
Synopsis

Gathering insights and training machine learning models both rely on data. A poor understanding of the data at hand, or the use of non-representative data, can lead to false conclusions, particularly when we are interested in cause and effect. Gathering high-quality, useful data is therefore of paramount importance.

In this talk, we discuss some of the biases and pitfalls that can arise when working with data, using real-life examples and case studies. Finally, we discuss data dredging, also known as p-hacking, and one method that attempts to mitigate it.