Hacker News
Causal inference in Python (github.com/akelleh)
87 points by aleyan on Feb 1, 2017 | 6 comments


I'd love for someone to explain this "inferring causality from observational data". I've read a little about it before, and it sounded really exciting, but I wondered if it was just the same as instrumental variables with multiple instruments. Anyone know more?


Statistical modeling often involves two parts, model selection and model inference. As it turns out, many models have a causal interpretation. For example, a simple linear model could be interpreted as "x causes y" via the equation y = b*x + error.

If you take a bunch of variables and relate them through linear equations, where each variable is caused by some of the others plus an error term, then different patterns of causal relations imply different covariance matrices. Classically, people have used these covariance matrices to choose between possible causal models.
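
To make that concrete, here is a rough sketch (my own toy setup, not anything from the linked library) that simulates two hypothetical causal patterns, a chain X -> Y -> Z and a collider X -> Z <- Y, and compares the covariance between X and Y that each one produces. The coefficient 0.8, the unit Gaussian noise, and the function names are all assumptions chosen for illustration.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (len(u) - 1)


def simulate_chain(n, rng, b=0.8):
    """X -> Y -> Z: each arrow is a linear effect b plus unit Gaussian noise."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [b * xi + rng.gauss(0, 1) for xi in x]
    z = [b * yi + rng.gauss(0, 1) for yi in y]
    return x, y, z


def simulate_collider(n, rng, b=0.8):
    """X -> Z <- Y: X and Y are generated independently of each other."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [rng.gauss(0, 1) for _ in range(n)]
    z = [b * xi + b * yi + rng.gauss(0, 1) for xi, yi in zip(x, y)]
    return x, y, z


rng = random.Random(0)  # fixed seed so the run is repeatable
chain_x, chain_y, chain_z = simulate_chain(20000, rng)
coll_x, coll_y, coll_z = simulate_collider(20000, rng)

# The chain model implies cov(X, Y) = b * var(X) = 0.8;
# the collider model implies cov(X, Y) = 0.
chain_cov_xy = cov(chain_x, chain_y)
collider_cov_xy = cov(coll_x, coll_y)
print(round(chain_cov_xy, 2), round(collider_cov_xy, 2))
```

The two printed covariances should land near the values each model implies, which is the kind of difference you can use to tell the two patterns apart.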

There are different approaches, but a common one in the behavioral sciences is to choose a few causal models to represent theories, and then perform model selection (like choosing between multiple regression models with different variables).

To answer your question, though, instrumental variables are a specific causal pattern in a model, but there can be other models, such as those with latent variables.
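
As a minimal sketch of the instrumental-variable pattern, assume a hypothetical linear model with one instrument z, one latent confounder u that affects both x and y, and a true effect of 2.0 of x on y. All of these numbers and variable names are made up for the illustration. The plain regression slope absorbs the confounding, while the ratio cov(z, y) / cov(z, x) should recover something close to the true effect.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (len(u) - 1)


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20000
true_effect = 2.0  # assumed causal effect of x on y

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument: affects y only through x
u = [rng.gauss(0, 1) for _ in range(n)]  # latent confounder: affects x and y
x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_effect * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

# Ordinary least-squares slope of y on x: biased upward by the confounder.
ols_slope = cov(x, y) / cov(x, x)

# Instrumental-variable (Wald) estimate: uses only the variation in x driven by z.
iv_slope = cov(z, y) / cov(z, x)

print(round(ols_slope, 2), round(iv_slope, 2))
```

In this setup the regression slope is pulled toward 7/3 by the confounder, whereas the instrument-based ratio should sit close to 2.0.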


There are actually two sides to what is referred to as causal inference: (a) inferring a causal graph from the data, or (b) given a graph and data, measuring the causal effects of variables on one another.

The broad idea in (a) is to start with a fully connected graph and eliminate edges between nodes that test as independent, or as conditionally independent given other nodes. This gives you an undirected graph, which can then be oriented by several methods (e.g. identifying V-structures, or comparing the residuals of regressing X on Y vs Y on X).
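
The edge-elimination step can be sketched roughly as below, assuming linear Gaussian data from a hypothetical chain x -> y -> z. This is a simplification: it only tries conditioning sets of size zero and one, and it uses a fixed cutoff on the (partial) correlation instead of a proper significance test. The `skeleton` function and the 0.05 threshold are my own illustrative choices, not the API of any library mentioned here.

```python
import itertools
import math
import random


def corr(u, v):
    """Pearson correlation of two equal-length sequences."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def partial_corr(u, v, w):
    """Correlation of u and v after linearly controlling for w."""
    r_uv, r_uw, r_vw = corr(u, v), corr(u, w), corr(v, w)
    return (r_uv - r_uw * r_vw) / math.sqrt((1 - r_uw ** 2) * (1 - r_vw ** 2))


def skeleton(data, threshold=0.05):
    """Start fully connected; drop edges that look (conditionally) independent."""
    names = sorted(data)
    edges = {frozenset(pair) for pair in itertools.combinations(names, 2)}
    for a, b in itertools.combinations(names, 2):
        # Conditioning set of size 0: marginal independence.
        if abs(corr(data[a], data[b])) < threshold:
            edges.discard(frozenset((a, b)))
            continue
        # Conditioning sets of size 1: independence given one other node.
        for c in names:
            if c in (a, b):
                continue
            if abs(partial_corr(data[a], data[b], data[c])) < threshold:
                edges.discard(frozenset((a, b)))
                break
    return edges


rng = random.Random(3)  # fixed seed so the run is repeatable
n = 20000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.8 * xi + rng.gauss(0, 1) for xi in x]
z = [0.8 * yi + rng.gauss(0, 1) for yi in y]

skeleton_edges = skeleton({"x": x, "y": y, "z": z})
print(sorted(sorted(edge) for edge in skeleton_edges))
```

Here x and z are correlated marginally but nearly uncorrelated once y is controlled for, so the x-z edge is the one that should get removed. Orienting the remaining edges is a separate step that this sketch does not attempt.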

The theory in (b) actually generalizes instrumental variables: it lays out the graphical configurations in which you can measure the causal effect of one variable on another, and how to compute that effect.
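
One of the simplest such configurations is a single observed confounder, where adjusting for it (the back-door adjustment) identifies the effect. A minimal sketch, assuming a hypothetical graph C -> T, C -> Y, T -> Y with a binary confounder C, a binary treatment T, and a true effect of 1.0 (all numbers invented for the example):

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable
n = 20000
true_effect = 1.0  # assumed causal effect of treatment t on outcome y

rows = []
for _ in range(n):
    c = 1 if rng.random() < 0.5 else 0                   # confounder
    t = 1 if rng.random() < (0.8 if c else 0.2) else 0   # treatment depends on c
    y = true_effect * t + 2.0 * c + rng.gauss(0, 1)      # outcome depends on t and c
    rows.append((c, t, y))


def mean_y(rows, t, c=None):
    """Mean outcome among rows with treatment t (and confounder c, if given)."""
    vals = [y for (ci, ti, y) in rows if ti == t and (c is None or ci == c)]
    return sum(vals) / len(vals)


# Naive contrast: mixes the treatment effect with the effect of c.
naive = mean_y(rows, 1) - mean_y(rows, 0)

# Back-door adjustment: average the within-stratum contrasts, weighted by P(c).
adjusted = 0.0
for c in (0, 1):
    p_c = sum(1 for (ci, _, _) in rows if ci == c) / n
    adjusted += p_c * (mean_y(rows, 1, c) - mean_y(rows, 0, c))

print(round(naive, 2), round(adjusted, 2))
```

In this toy model the naive difference is inflated to roughly 2.2 by the confounder, while the adjusted estimate should come out close to 1.0.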

A great reference: https://www.amazon.com/Causality-Reasoning-Inference-Judea-P...

A nice introduction: https://www.youtube.com/watch?v=RPgvfSeQB8A


There is also Tetrad. It's written in Java and has a GUI. https://github.com/cmu-phil/tetrad


Haha, was wondering if this was by Adam Kelleher. I've been to a few of the meetups he hosted about causal inference. Really smart guy.


I always wish the readme included a real-world example to help make the libraries more accessible.



