{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# DoWhy: Different estimation methods for causal inference\n", "This is a quick introduction to the DoWhy causal inference library.\n", "We will load a sample dataset and use different methods to estimate the causal effect of a (pre-specified) treatment variable on a (pre-specified) outcome variable.\n", "\n", "First, let us add the path that Python needs to find the DoWhy code, and then load all required packages." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import os, sys\n", "sys.path.append(os.path.abspath(\"../../../\"))" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import logging\n", "\n", "import dowhy\n", "from dowhy import CausalModel\n", "import dowhy.datasets" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let us load a dataset. For simplicity, we simulate a dataset with linear relationships between the common causes and the treatment, and between the common causes and the outcome.\n", "\n", "Beta is the true causal effect." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | Z0 | Z1 | W0 | W1 | W2 | W3 | W4 | v0 | y |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 0.856829 | 0.871424 | -0.792461 | -0.336331 | 0.386621 | -0.068865 | True | 9.124501 |
| 1 | 1.0 | 0.491077 | 0.197358 | -0.505399 | -0.424140 | 0.367762 | 0.168461 | True | 8.622930 |
| 2 | 1.0 | 0.665795 | 0.945841 | -0.288969 | 0.274395 | -1.312587 | 2.382897 | True | 17.977266 |
| 3 | 1.0 | 0.902905 | 1.268346 | -0.059530 | 0.315513 | -0.932715 | -1.360252 | True | 8.367090 |
| 4 | 1.0 | 0.104740 | -1.342788 | -1.935350 | -0.649980 | -0.852453 | 0.843568 | True | -1.326686 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 9995 | 1.0 | 0.577368 | 1.846929 | 0.755214 | -2.979011 | 1.525415 | -0.225743 | True | 9.653687 |
| 9996 | 1.0 | 0.131065 | 1.880914 | -1.314365 | -0.538280 | -0.303415 | 0.863559 | True | 11.305263 |
| 9997 | 1.0 | 0.739417 | -0.974042 | -0.707890 | -0.028049 | -1.371608 | 0.100693 | True | 2.620035 |
| 9998 | 1.0 | 0.489953 | -0.363797 | -0.590689 | -1.905395 | -0.374315 | 0.622429 | True | 1.844830 |
| 9999 | 1.0 | 0.484942 | 1.118425 | -0.414818 | -1.112958 | 0.608269 | 1.865714 | True | 15.116874 |

10000 rows × 9 columns
\n", "