TITLE: Causal Inference and the Data-Fusion Problem
ABSTRACT: Machine learning is usually divided into two categories, passive (e.g., supervised learning) and active (e.g., reinforcement learning), which, by and large, are studied separately.
Reality is more demanding. Passive and active modes of operation are but two extremes of a rich spectrum of data-collection modes (also called research designs) that generate the bulk of the data available in practical, large-scale situations. For example, a baby learns from its environment both by passively observing others and by actively performing interventions on it. In robotics, data from multiple observations and interventions are collected, coming from distinct experimental setups, different sampling conditions, and structurally different domains.
The goal of this tutorial is to introduce the principles and tools available for understanding and exploiting different data-collection modes that generate rich heterogeneous datasets. I will start the tutorial by reviewing the fundamental results in causal inference relating passive and active modes of operation. I will then introduce the data-fusion problem, which is concerned with piecing together multiple datasets collected under heterogeneous conditions (to be formally defined) so as to answer causal and counterfactual queries. I will present a general non-parametric solution to the data-fusion problem in which problems of confounding, sampling selection bias, and generalizability are solved.
I will finish by discussing some recent results on the fundamental relationship between causal inference, autonomy, and decision-making.
SPEAKER: Elias Bareinboim
TUTORIAL WEB PAGE: Link