econml.grf._base_grf.BaseGRF

class econml.grf._base_grf.BaseGRF(n_estimators=100, *, criterion='mse', max_depth=None, min_samples_split=10, min_samples_leaf=5, min_weight_fraction_leaf=0.0, min_var_fraction_leaf=None, min_var_leaf_on_val=False, max_features='auto', min_impurity_decrease=0.0, max_samples=0.45, min_balancedness_tol=0.45, honest=True, inference=True, fit_intercept=True, subforest_size=4, n_jobs=-1, random_state=None, verbose=0, warm_start=False)[source]

Bases: BaseEnsemble

Base class for Generalized Random Forests.

Solves linear moment equations of the form:

E[J * theta(x) - A | X = x] = 0

where J is a (d, d) random matrix, A is a (d, 1) random vector, and theta(x) is a local parameter to be estimated, which might contain both relevant and nuisance parameters.
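As a rough illustration of what solving this moment equation means, the sketch below ignores the conditioning on X and solves the sample analogue mean(J) @ theta = mean(A) with NumPy. The choice J = T T^T and A = T y (a least-squares moment) and all variable names are assumptions made for this example; derived classes define their own J and A, and the forest replaces the plain mean with a locally weighted one.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, d = 500, 2

T = rng.normal(size=(n_samples, d))
theta_true = np.array([1.5, -0.5])
y = T @ theta_true  # noiseless outcome, so the moment has an exact solution

# Per-sample moment ingredients (illustrative choice, not the library's code):
J = T[:, :, None] * T[:, None, :]   # shape (n_samples, d, d)
A = T * y[:, None]                  # shape (n_samples, d)

# Solve mean(J) @ theta = mean(A), the sample analogue of E[J * theta - A] = 0.
theta_hat = np.linalg.solve(J.mean(axis=0), A.mean(axis=0))
```

With noiseless data, theta_hat recovers theta_true up to floating-point error.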

Warning: This class should not be used directly. Use derived classes instead.

__init__(n_estimators=100, *, criterion='mse', max_depth=None, min_samples_split=10, min_samples_leaf=5, min_weight_fraction_leaf=0.0, min_var_fraction_leaf=None, min_var_leaf_on_val=False, max_features='auto', min_impurity_decrease=0.0, max_samples=0.45, min_balancedness_tol=0.45, honest=True, inference=True, fit_intercept=True, subforest_size=4, n_jobs=-1, random_state=None, verbose=0, warm_start=False)[source]

Methods

__init__([n_estimators, criterion, ...])

apply(X)

Apply trees in the forest to X, return leaf indices.

decision_path(X)

Return the decision path in the forest.

feature_importances([max_depth, ...])

Get feature importances based on the amount of parameter heterogeneity they create.

fit(X, T, y, *[, sample_weight])

Build a forest of trees from the training set (X, T, y) and any other auxiliary variables.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

get_subsample_inds()

Re-generate the exact same sample indices as those at fit time, using the same pseudo-randomness.

oob_predict(Xtrain)

Return the relevant output predictions for each data point, using the trees that didn't use that data point.

predict(X[, interval, alpha])

Return the prefix of relevant fitted local parameters for each x in X, i.e. theta(x)[1..n_relevant_outputs].

predict_alpha_and_jac(X[, slice, parallel])

Get the predicted alpha and jacobian values.

predict_and_var(X)

Return the prefix of relevant fitted local parameters and their covariance matrix for each x in X.

predict_full(X[, interval, alpha])

Return the fitted local parameters for each x in X, i.e. theta(x).

predict_interval(X[, alpha])

Return the confidence interval for the relevant fitted local parameters for each x in X.

predict_moment_and_var(X, parameter[, ...])

Return the value of the conditional expected moment vector and variance at each sample.

predict_projection(X, projector)

Return the product of the prefix of relevant fitted local parameters with a projector vector.

predict_projection_and_var(X, projector)

Return the product of the prefix of relevant fitted local parameters with a projector vector, and its variance.

predict_projection_var(X, projector)

Return the variance of the product of the prefix of relevant fitted local parameters with a projector vector.

predict_tree_average(X)

Return the prefix of relevant fitted local parameters for each X, i.e. theta(X)[1..n_relevant_outputs].

predict_tree_average_full(X)

Return the fitted local parameters for each X, i.e. theta(X).

predict_var(X)

Return the covariance matrix of the prefix of relevant fitted local parameters for each x in X.

prediction_stderr(X)

Return the standard deviation of each coordinate of the prefix of relevant fitted local parameters.

set_fit_request(*[, T, sample_weight])

Request metadata passed to the fit method.

set_params(**params)

Set the parameters of this estimator.

set_predict_request(*[, alpha, interval])

Request metadata passed to the predict method.

Attributes

feature_importances_

apply(X)[source]

Apply trees in the forest to X, return leaf indices.

Parameters:

X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

Returns:

X_leaves – For each datapoint x in X and for each tree in the forest, return the index of the leaf x ends up in.

Return type:

ndarray of shape (n_samples, n_estimators)

decision_path(X)[source]

Return the decision path in the forest.

Parameters:

X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

Returns:

  • indicator (sparse matrix of shape (n_samples, n_nodes)) – A node indicator matrix whose non-zero elements indicate that a sample goes through the corresponding node. The matrix is in CSR format.

  • n_nodes_ptr (ndarray of shape (n_estimators + 1,)) – The columns from indicator[n_nodes_ptr[i]:n_nodes_ptr[i+1]] give the indicator values for the i-th estimator.
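The way n_nodes_ptr partitions the columns of indicator can be sketched with a small dense array. The matrix values and the two-tree layout below are made up for illustration; the real indicator is a sparse CSR matrix, and the column slicing works the same way on it.

```python
import numpy as np

# Hypothetical forest with 2 trees: tree 0 has 3 nodes, tree 1 has 5 nodes.
indicator = np.array([
    [1, 1, 0, 1, 0, 1, 0, 1],   # nodes visited by sample 0
    [1, 0, 1, 1, 1, 0, 0, 0],   # nodes visited by sample 1
])
n_nodes_ptr = np.array([0, 3, 8])

# Column block belonging to each tree.
per_tree = [
    indicator[:, n_nodes_ptr[i]:n_nodes_ptr[i + 1]]
    for i in range(len(n_nodes_ptr) - 1)
]
```

Here per_tree[0] has 3 columns and per_tree[1] has 5, one column per node of the corresponding tree.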

feature_importances(max_depth=4, depth_decay_exponent=2.0)[source]

Get feature importances based on the amount of parameter heterogeneity they create.

The higher the value, the more important the feature. The importance of a feature is computed as the (normalized) total heterogeneity that the feature creates. For each tree, every split on the feature adds:

parent_weight * (left_weight * right_weight)
    * mean((value_left[k] - value_right[k])**2) / parent_weight**2

to the importance of the feature. Each such quantity is also weighted by the depth of the split. These importances are normalized at the tree level and then averaged across trees.

Parameters:
  • max_depth (int, default 4) – Splits of depth larger than max_depth are not used in this calculation

  • depth_decay_exponent (double, default 2.0) – The contribution of each split to the total score is re-weighted by 1 / (1 + depth)**depth_decay_exponent, i.e. 1 / (1 + depth)**2.0 by default.

Returns:

feature_importances_ – Normalized total parameter heterogeneity inducing importance of each feature

Return type:

ndarray of shape (n_features,)
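The per-split contribution described for feature_importances can be written out as a small function. This is a sketch of the documented formula only; the function name and argument names are hypothetical, and the per-tree normalization and averaging across trees are left out.

```python
def split_contribution(parent_weight, left_weight, right_weight,
                       value_left, value_right,
                       depth, depth_decay_exponent=2.0):
    """Contribution of one split, following the documented formula."""
    sq_diffs = [(l - r) ** 2 for l, r in zip(value_left, value_right)]
    mean_sq_diff = sum(sq_diffs) / len(sq_diffs)
    raw = (parent_weight * (left_weight * right_weight)
           * mean_sq_diff / parent_weight ** 2)
    # Deeper splits are down-weighted by 1 / (1 + depth)**depth_decay_exponent.
    return raw / (1 + depth) ** depth_decay_exponent

# One split at depth 1 with child weights 4 and 6 and two parameters.
contribution = split_contribution(
    10.0, 4.0, 6.0, value_left=[1.0, 2.0], value_right=[0.0, 0.0], depth=1
)
# contribution == 1.5
```

The raw value here is 10 * 24 * 2.5 / 100 = 6.0, which the depth weight 1 / (1 + 1)**2 reduces to 1.5.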

fit(X, T, y, *, sample_weight=None, **kwargs)[source]

Build a forest of trees from the training set (X, T, y) and any other auxiliary variables.

Parameters:
  • X (array_like of shape (n_samples, n_features)) – The training input samples. Internally, its dtype will be converted to dtype=np.float64.

  • T (array_like of shape (n_samples, n_treatments)) – The treatment vector for each sample

  • y (array_like of shape (n_samples,) or (n_samples, n_outcomes)) – The outcome values for each sample.

  • sample_weight (array_like of shape (n_samples,), default None) – Sample weights. If None, then samples are equally weighted. Splits that would create child nodes with net zero or negative weight are ignored while searching for a split in each node.

  • **kwargs (dictionary of array_like items of shape (n_samples, d_var)) – Auxiliary random variables that go into the moment function (e.g. instrument, censoring, etc.). Any of these variables will be passed on as is to the get_pointJ and get_alpha methods of the child classes.

Returns:

self

Return type:

object

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing – A MetadataRequest encapsulating routing information.

Return type:

MetadataRequest

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

get_subsample_inds()[source]

Re-generate the exact same sample indices as those at fit time, using the same pseudo-randomness.
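The idea of replaying fit-time draws from the same pseudo-randomness can be sketched with the standard library. The helper draw_subsample and its arguments are hypothetical and do not reproduce the library's actual sampling scheme; the point is only that an identically seeded generator yields identical indices.

```python
import random

def draw_subsample(n_samples, n_draw, seed):
    # A fresh generator seeded identically replays the same draws.
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_samples), n_draw))

at_fit_time = draw_subsample(100, 45, seed=123)
regenerated = draw_subsample(100, 45, seed=123)
# at_fit_time == regenerated, so nothing needs to be stored at fit time
```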

oob_predict(Xtrain)[source]

Return the relevant output predictions for each data point, using the trees that didn’t use that data point.

This method is not available if the estimator was trained with warm_start=True.

Parameters:

Xtrain ((n_training_samples, n_features) matrix) – Must be the same exact X matrix that was passed to the forest at fit time.

Returns:

oob_preds – The out-of-bag predictions of the relevant output parameters for each of the training points

Return type:

(n_training_samples, n_relevant_outputs) matrix
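The out-of-bag idea can be sketched as follows: for each training point, average only over the trees whose subsample did not contain it. Averaging per-tree predictions directly is a simplifying assumption made for this example, and oob_average, per_tree_preds, and subsample_inds are hypothetical names rather than library objects.

```python
import numpy as np

def oob_average(per_tree_preds, subsample_inds):
    """per_tree_preds: (n_trees, n_samples); subsample_inds: indices used by each tree."""
    n_trees, n_samples = per_tree_preds.shape
    used = np.zeros((n_trees, n_samples), dtype=bool)
    for t, inds in enumerate(subsample_inds):
        used[t, inds] = True
    oob = ~used                              # True where tree t never saw sample i
    counts = oob.sum(axis=0)
    sums = (per_tree_preds * oob).sum(axis=0)
    # Points seen by every tree have no out-of-bag estimate.
    return np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)

per_tree_preds = np.array([[1.0] * 4, [2.0] * 4, [3.0] * 4])
subsample_inds = [[0, 1], [1, 2], [2, 3]]
oob_preds = oob_average(per_tree_preds, subsample_inds)
# oob_preds is [2.5, 3.0, 1.0, 1.5]
```

Sample 0 was used only by tree 0, so its estimate averages trees 1 and 2; sample 1 was used by trees 0 and 1, so only tree 2 contributes.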

predict(X, interval=False, alpha=0.05)[source]

Return the prefix of relevant fitted local parameters for each x in X, i.e. theta(x)[1..n_relevant_outputs].

Parameters:
  • X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

  • interval (bool, default False) – Whether to return a confidence interval too

  • alpha (float in (0, 1), default 0.05) – The confidence level of the confidence interval. Returns a symmetric (alpha/2, 1-alpha/2) confidence interval.

Returns:

  • theta(X)[1, .., n_relevant_outputs] (array_like of shape (n_samples, n_relevant_outputs)) – The estimated relevant parameters for each row of X

  • lb(x), ub(x) (array_like of shape (n_samples, n_relevant_outputs)) – The lower and upper end of the confidence interval for each parameter. Return value is omitted if interval=False.
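A symmetric (alpha/2, 1-alpha/2) interval can be sketched from a point estimate and its standard error. The sketch assumes a normal approximation, which is an assumption of this example rather than something stated above; normal_interval is a hypothetical helper.

```python
from statistics import NormalDist

def normal_interval(point, stderr, alpha=0.05):
    # Quantile of the standard normal at 1 - alpha/2 (about 1.96 for alpha=0.05).
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return point - z * stderr, point + z * stderr

lb, ub = normal_interval(point=2.0, stderr=0.5, alpha=0.05)
```

With alpha=0.05 the bounds sit roughly 1.96 standard errors on either side of the point estimate, so a smaller alpha gives a wider interval.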

predict_alpha_and_jac(X, slice=None, parallel=True)[source]

Get the predicted alpha and jacobian values.

The value of the conditional jacobian E[J | X=x] and the conditional alpha E[A | X=x] use the forest as kernel weights, i.e.:

alpha(x) = (1/n_trees) sum_{trees} (1/ |leaf(x)|) sum_{val sample i in leaf(x)} w[i] A[i]
jac(x) = (1/n_trees) sum_{trees} (1/ |leaf(x)|) sum_{val sample i in leaf(x)} w[i] J[i]

where w[i] is the sample weight (1.0 if sample_weight is None).

Parameters:
  • X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

  • slice (list of int or None, default None) – If not None, then only the trees whose index is in slice will be used to calculate the mean and the variance.

  • parallel (bool, default True) – Whether the averaging should happen using parallelism or not. Parallelism adds some overhead but makes it faster with many trees.

Returns:

  • alpha (array_like of shape (n_samples, n_outputs)) – The estimated conditional A, alpha(x) for each sample x in X

  • jac (array_like of shape (n_samples, n_outputs, n_outputs)) – The estimated conditional J, jac(x) for each sample x in X
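The kernel-weight formulas for alpha(x) and jac(x) can be sketched for a single query point. The sketch assumes |leaf(x)| is the number of samples in the leaf and treats every sample as a "val" sample; forest_average and the leaf-assignment arrays are hypothetical inputs, not library outputs.

```python
import numpy as np

def forest_average(values, weights, train_leaves, query_leaves):
    """values: (n, ...) per-sample A[i] or J[i]; train_leaves: (n, n_trees);
    query_leaves: (n_trees,) leaf index of the query point in each tree."""
    n_trees = train_leaves.shape[1]
    total = np.zeros(values.shape[1:])
    for t in range(n_trees):
        in_leaf = train_leaves[:, t] == query_leaves[t]
        leaf_size = in_leaf.sum()  # assumed non-zero for this sketch
        # sum_i w[i] * values[i] over the samples sharing the query's leaf
        weighted_sum = np.tensordot(weights[in_leaf], values[in_leaf], axes=([0], [0]))
        total += weighted_sum / leaf_size
    return total / n_trees

A = np.array([[1.0], [2.0], [3.0], [4.0]])
w = np.ones(4)
train_leaves = np.array([[0, 0], [0, 1], [1, 1], [1, 1]])
query_leaves = np.array([0, 1])
alpha_x = forest_average(A, w, train_leaves, query_leaves)
# alpha_x is [2.25]: tree 0 averages samples {0, 1}, tree 1 averages {1, 2, 3}
```

Passing per-sample matrices of shape (n, d, d) as values gives the analogous jac(x).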

predict_and_var(X)[source]

Return the prefix of relevant fitted local parameters and their covariance matrix for each x in X.

Parameters:

X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

Returns:

  • theta(x)[1, .., n_relevant_outputs] (array_like of shape (n_samples, n_relevant_outputs)) – The estimated relevant parameters for each row of X

  • var(theta(x)) (array_like of shape (n_samples, n_relevant_outputs, n_relevant_outputs)) – The covariance of theta(x)[1, .., n_relevant_outputs]

predict_full(X, interval=False, alpha=0.05)[source]

Return the fitted local parameters for each x in X, i.e. theta(x).

Parameters:
  • X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

  • interval (bool, default False) – Whether to return a confidence interval too

  • alpha (float in (0, 1), default 0.05) – The confidence level of the confidence interval. Returns a symmetric (alpha/2, 1-alpha/2) confidence interval.

Returns:

  • theta(x) (array_like of shape (n_samples, n_outputs)) – The estimated parameters (relevant and nuisance) for each row x of X

  • lb(x), ub(x) (array_like of shape (n_samples, n_outputs)) – The lower and upper end of the confidence interval for each parameter. Return value is omitted if interval=False.

predict_interval(X, alpha=0.05)[source]

Return the confidence interval for the relevant fitted local parameters for each x in X.

Parameters:
  • X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

  • alpha (float in (0, 1), default 0.05) – The confidence level of the confidence interval. Returns a symmetric (alpha/2, 1-alpha/2) confidence interval.

Returns:

lb(x), ub(x) – The lower and upper end of the confidence interval for each parameter.

Return type:

array_like of shape (n_samples, n_relevant_outputs)

predict_moment_and_var(X, parameter, slice=None, parallel=True)[source]

Return the value of the conditional expected moment vector and variance at each sample.

For the given parameter estimate for each sample:

M(x; theta(x)) := E[J | X=x] theta(x) - E[A | X=x]

where conditional expectations are estimated based on the forest weights, i.e.:

M_tree(x; theta(x)) := (1/ |leaf(x)|) sum_{val sample i in leaf(x)} w[i] (J[i] theta(x) - A[i])
M(x; theta(x)) = (1/n_trees) sum_{trees} M_tree(x; theta(x))

where w[i] is the sample weight (1.0 if sample_weight is None), as well as the variance of the local moment vector across trees:

Var(M_tree(x; theta(x))) = (1/n_trees) sum_{trees} M_tree(x; theta(x)) @ M_tree(x; theta(x)).T

Parameters:
  • X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

  • parameter (array_like of shape (n_samples, n_outputs)) – An estimate of the parameter theta(x) for each sample x in X

  • slice (list of int or None, default None) – If not None, then only the trees whose index is in slice will be used to calculate the mean and the variance.

  • parallel (bool, default True) – Whether the averaging should happen using parallelism or not. Parallelism adds some overhead but makes it faster with many trees.

Returns:

  • moment (array_like of shape (n_samples, n_outputs)) – The estimated conditional moment M(x; theta(x)) for each sample x in X

  • moment_var (array_like of shape (n_samples, n_outputs)) – The variance of the conditional moment Var(M_tree(x; theta(x))) across trees for each sample x
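The two averages in the formulas for predict_moment_and_var can be sketched for a single query point, given the per-tree moment vectors M_tree(x; theta(x)). The sketch follows the formulas as written (an uncentered outer-product average); the function and its input are hypothetical, and the shapes returned by the library method may differ from this per-point illustration.

```python
import numpy as np

def moment_and_second_moment(per_tree_moments):
    """per_tree_moments: (n_trees, n_outputs) array of M_tree(x; theta(x))."""
    per_tree_moments = np.asarray(per_tree_moments, dtype=float)
    n_trees = per_tree_moments.shape[0]
    moment = per_tree_moments.mean(axis=0)
    # (1/n_trees) * sum over trees of M_tree @ M_tree.T
    second = np.einsum("ti,tj->ij", per_tree_moments, per_tree_moments) / n_trees
    return moment, second

moment, second = moment_and_second_moment([[1.0, 0.0], [-1.0, 2.0]])
# moment is [0, 1]; second is [[1, -1], [-1, 2]]
```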

predict_projection(X, projector)[source]

Return the product of the prefix of relevant fitted local parameters with a projector vector.

That is:

mu(x) := <theta(x)[1..n_relevant_outputs], projector(x)>

Parameters:
  • X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

  • projector (array_like of shape (n_samples, n_relevant_outputs)) – The projector vector for each sample x in X

Returns:

mu(x) – The estimated inner product of the relevant parameters with the projector for each row x of X

Return type:

array_like of shape (n_samples, 1)

predict_projection_and_var(X, projector)[source]

Return the product of the prefix of relevant fitted local parameters with a projector vector, and its variance.

That is:

mu(x) := <theta(x)[1..n_relevant_outputs], projector(x)>

as well as the variance of mu(x).

Parameters:
  • X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

  • projector (array_like of shape (n_samples, n_relevant_outputs)) – The projector vector for each sample x in X

Returns:

  • mu(x) (array_like of shape (n_samples, 1)) – The estimated inner product of the relevant parameters with the projector for each row x of X

  • var(mu(x)) (array_like of shape (n_samples, 1)) – The variance of the estimated inner product
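The projection and its variance can be sketched with row-wise inner products. For a fixed projector p, the variance of the linear combination <theta, p> is p^T Sigma p, where Sigma is the covariance of theta; the sketch relies on that identity and is not the library's internal computation. All arrays below are made-up inputs.

```python
import numpy as np

theta = np.array([[1.0, 2.0], [0.5, -1.0]])        # (n_samples, n_relevant_outputs)
projector = np.array([[1.0, 1.0], [2.0, 0.0]])     # same shape as theta
cov = np.array([[[0.04, 0.01], [0.01, 0.09]],      # (n_samples, n_rel, n_rel)
                [[0.25, 0.00], [0.00, 0.16]]])

# mu(x) = <theta(x), projector(x)>, one inner product per row.
mu = np.einsum("ij,ij->i", theta, projector)[:, None]

# var(mu(x)) = projector(x)^T cov(x) projector(x) for a fixed projector.
var_mu = np.einsum("ij,ijk,ik->i", projector, cov, projector)[:, None]
# mu is [[3.0], [1.0]]; var_mu is [[0.15], [1.0]]
```

Both results have shape (n_samples, 1), matching the documented return shapes.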

predict_projection_var(X, projector)[source]

Return the variance of the product of the prefix of relevant fitted local parameters with a projector vector.

That is:

Var(mu(x)) for mu(x) := <theta(x)[1..n_relevant_outputs], projector(x)>

Parameters:
  • X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

  • projector (array_like of shape (n_samples, n_relevant_outputs)) – The projector vector for each sample x in X

Returns:

var(mu(x)) – The variance of the estimated inner product

Return type:

array_like of shape (n_samples, 1)

predict_tree_average(X)[source]

Return the prefix of relevant fitted local parameters for each X, i.e. theta(X)[1..n_relevant_outputs].

This method simply returns the average of the parameters estimated by each tree. predict should be preferred over predict_tree_average, as it performs a more stable averaging across trees.

Parameters:

X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

Returns:

theta(X)[1, .., n_relevant_outputs] – The estimated relevant parameters for each row of X

Return type:

array_like of shape (n_samples, n_relevant_outputs)
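Why averaging per-tree parameter estimates can be less stable is easiest to see in a scalar case. The sketch assumes that the more stable alternative averages the per-tree Jacobians and alphas first and solves once afterwards, as the formulas in predict_alpha_and_jac suggest; the numbers are made up.

```python
# Per-tree scalar Jacobian and alpha at one query point (hypothetical values).
per_tree_J = [1.0, 0.01]    # the second tree has a nearly singular Jacobian
per_tree_A = [2.0, 0.05]

# Average of per-tree solutions: each tree solves J * theta = A on its own.
tree_average = sum(a / j for a, j in zip(per_tree_A, per_tree_J)) / len(per_tree_J)

# Pool first, solve once: (mean J) * theta = (mean A).
pooled = sum(per_tree_A) / sum(per_tree_J)
```

The tree with the tiny Jacobian contributes an estimate of about 5 and pulls tree_average to about 3.5, while pooled stays close to 2 because that tree carries little weight in the pooled Jacobian.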

predict_tree_average_full(X)[source]

Return the fitted local parameters for each X, i.e. theta(X).

This method simply returns the average of the parameters estimated by each tree. predict_full should be preferred over predict_tree_average_full, as it performs a more stable averaging across trees.

Parameters:

X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

Returns:

theta(X) – The estimated parameters (relevant and nuisance) for each row of X

Return type:

array_like of shape (n_samples, n_outputs)

predict_var(X)[source]

Return the covariance matrix of the prefix of relevant fitted local parameters for each x in X.

Parameters:

X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

Returns:

var(theta(x)) – The covariance of theta(x)[1, .., n_relevant_outputs]

Return type:

array_like of shape (n_samples, n_relevant_outputs, n_relevant_outputs)

prediction_stderr(X)[source]

Return the standard deviation of each coordinate of the prefix of relevant fitted local parameters.

Parameters:

X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.

Returns:

std(theta(x)) – The standard deviation of each theta(x)[i] for i in {1, .., n_relevant_outputs}

Return type:

array_like of shape (n_samples, n_relevant_outputs)
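The relationship between the covariance returned by predict_var and the per-coordinate standard deviations can be sketched as the square root of each covariance matrix's diagonal. The covariance array below is made up, and the sketch illustrates the relationship rather than quoting the library's code.

```python
import numpy as np

# Hypothetical covariance matrices, shape (n_samples, n_rel, n_rel).
cov = np.array([[[0.04, 0.01], [0.01, 0.09]],
                [[0.25, 0.00], [0.00, 0.16]]])

# Standard deviation of each coordinate: sqrt of the per-sample diagonal.
stderr = np.sqrt(np.diagonal(cov, axis1=1, axis2=2))
# stderr is [[0.2, 0.3], [0.5, 0.4]]
```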

set_fit_request(*, T: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') BaseGRF

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • T (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for T parameter in fit.

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

estimator instance

set_predict_request(*, alpha: bool | None | str = '$UNCHANGED$', interval: bool | None | str = '$UNCHANGED$') BaseGRF

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • alpha (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for alpha parameter in predict.

  • interval (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for interval parameter in predict.

Returns:

self – The updated object.

Return type:

object