dodiscover.cd.BaseConditionalDiscrepancyTest
- class dodiscover.cd.BaseConditionalDiscrepancyTest
 Abstract class for any conditional discrepancy test.
Conditional discrepancy (CD) tests are used in constraint-based causal discovery algorithms. The interface is intentionally lightweight, so that any function implementing a CD test can be wrapped in a class with a consistent API.
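As a rough illustration of wrapping a plain function behind this kind of interface, the sketch below uses a hypothetical stand-in base class (`ToyBaseCDTest`) that mirrors only the documented `test(df, group_col, y_vars, x_vars)` signature. To stay dependency-free it passes a list of row dictionaries instead of a `pd.DataFrame`. The statistic (`mean_gap`), the dictionary return value, and the fact that `x_vars` is ignored are all assumptions made for brevity, so this is not a real conditional test and not the library's implementation.

```python
from abc import ABC, abstractmethod


class ToyBaseCDTest(ABC):
    """Hypothetical stand-in mirroring the documented test() signature."""

    @abstractmethod
    def test(self, df, group_col, y_vars, x_vars):
        ...


def mean_gap(rows, group_col, y_col):
    """Plain function: absolute gap in the mean of y_col between groups 0 and 1."""
    g0 = [r[y_col] for r in rows if r[group_col] == 0]
    g1 = [r[y_col] for r in rows if r[group_col] == 1]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))


class MeanGapTest(ToyBaseCDTest):
    """Wraps mean_gap behind the class interface (x_vars is ignored here)."""

    def test(self, df, group_col, y_vars, x_vars=None):
        # One statistic per outcome column; a real CD test would also
        # condition on x_vars rather than ignore them.
        return {y: mean_gap(df, group_col, y) for y in sorted(y_vars)}


# Toy data: plain rows standing in for a dataframe.
rows = [
    {"group": 0, "y": 1.0, "x": 0.2},
    {"group": 0, "y": 3.0, "x": 0.4},
    {"group": 1, "y": 4.0, "x": 0.1},
    {"group": 1, "y": 6.0, "x": 0.5},
]
result = MeanGapTest().test(rows, group_col="group", y_vars={"y"}, x_vars={"x"})
```

Because `test` is abstract on the stand-in base class, instantiating `ToyBaseCDTest` directly raises a `TypeError`; only subclasses that implement `test` can be used.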
Methods
compute_null(e_hat, X, Y[, null_reps, ...]): Estimate null distribution using propensity weights.
test(df, group_col, y_vars, x_vars): Compute conditional discrepancy test.
- compute_null(e_hat, X, Y, null_reps=1000, random_state=None)
 Estimate null distribution using propensity weights.
- Parameters:
 - e_hat: Array-like of shape (n_samples,)
  The predicted propensity score for group_ind == 1.
 - X: Array-like of shape (n_samples, n_features_x)
  The X (covariates) array.
 - Y: Array-like of shape (n_samples, n_features_y)
  The Y (outcomes) array.
 - null_reps: int, optional
  Number of times to sample null, by default 1000.
 - random_state: int, optional
  Random generator, or random seed, by default None.
- Returns:
 - null_dist: Array-like of shape (n_samples,)
 The null distribution of test statistics.
- abstract test(df, group_col, y_vars, x_vars)
 Compute conditional discrepancy test.
Tests the null hypothesis \(P(Y | X, group) = P(Y | X)\), that is, whether Y is conditionally independent of the group indicator (which denotes the distribution) given X.
Another way of viewing this test is testing whether or not \(P_i(Y|X) = P_j(Y|X)\), where \(P_i(.)\) and \(P_j(.)\) denote distributions from different groups or environments denoted by the group_col.
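The two formulations can be connected through the law of total probability. In the short derivation below, \(G\) denotes the group indicator and \(P_g(Y|X)\) is shorthand for \(P(Y | X, G = g)\). It assumes both groups have positive probability given X, an overlap condition that is not stated above.

\[
P(Y \mid X) = \sum_{g \in \{0, 1\}} P(Y \mid X, G = g) \, P(G = g \mid X)
\]

If \(P_0(Y|X) = P_1(Y|X)\), the mixture on the right collapses to that common conditional, so \(P(Y | X, G = g) = P(Y | X)\) for each \(g\). Conversely, if \(P(Y | X, G = g) = P(Y | X)\) for both values of \(g\), then \(P_0(Y|X)\) and \(P_1(Y|X)\) both equal \(P(Y | X)\) and therefore each other.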
- Parameters:
 - df: pd.DataFrame
  The dataframe containing the dataset.
 - y_vars: Set of column
  A set of columns in df.
 - group_col: column
  A column in df that indicates which group (distribution) each sample belongs to, encoded as ‘0’ or ‘1’.
 - x_vars: Set of column, optional
  A set of columns in df.
 - Returns: