
curvefit.core.residual_model._ResidualModel

A model for describing out-of-sample residuals

This model describes how out-of-sample residuals vary with a set of covariates. The goal is to understand what the residuals of future predictions will look like based on key covariates (e.g. how much data is currently available, how far ahead the prediction is being made, etc.).

Ultimately, this class simulates residuals into the future. It does this in coefficient of variation space: when predicting in log space it uses absolute residuals, and when predicting in linear space it uses relative residuals.
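As a rough illustration of why the two residual types are comparable, the sketch below computes both for the same prediction. The function name `residual` and the sign convention (prediction minus observation) are assumptions made for this example and are not taken from curvefit.

```python
import numpy as np


def residual(prediction, observation, log_space):
    """Hypothetical helper: residual on a scale-free footing.

    Assumes that in log space the inputs are already log-transformed,
    so a plain difference is used; in linear space the difference is
    divided by the observation to make it relative.
    """
    prediction = np.asarray(prediction, dtype=float)
    observation = np.asarray(observation, dtype=float)
    if log_space:
        # absolute residual of log-transformed values
        return prediction - observation
    # relative residual of untransformed values
    return (prediction - observation) / observation


# The same 10% over-prediction, expressed in both spaces.
linear_resid = residual([110.0], [100.0], log_space=False)
log_resid = residual(np.log([110.0]), np.log([100.0]), log_space=True)
```

For small errors the two quantities are close to each other, since log(1 + x) is approximately x.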

This is the BASE class and should not be used directly. It should be subclassed and its methods overridden based on the type of residual model.

See the subclass descriptions, including SmoothResidualModel, for details on each of their methods.

Arguments

  • cv_bounds (List[float]): a 2-element list of bounds on the coefficient of variation. The first element is the lower bound and the second is the upper bound.
  • covariates (Dict[str, Optional[str]]): a dictionary of covariates to use in the model. The keys are the names of the covariates to include in the residual model fitting. The value for each key is optional (it may be None) and specifies the subset of that covariate to use in the fitting (e.g. only use the subset of data where covariate1 > 10.).
  • exclude_groups (List[str]): a list of groups to exclude from the fitting process. These groups are still included when making predictions.
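The arguments above might look like the following in practice. This is a minimal sketch only: the covariate names `num_data` and `far_out`, the `group` column, the bound values, and the helper `subset_for_fitting` are all hypothetical, and applying the subset strings with `pandas.DataFrame.query` is an assumption about how such conditions could be evaluated, not a statement of how the class does it.

```python
import pandas as pd

# Hypothetical argument values, for illustration only.
cv_bounds = [1e-4, 1.0]                      # [lower, upper]
covariates = {"num_data": "num_data > 10",   # fit only where num_data > 10
              "far_out": None}               # no subsetting on far_out
exclude_groups = ["c"]

# Toy residual data frame with one row per residual observation.
residual_df = pd.DataFrame({
    "group": ["a", "a", "b", "c", "c"],
    "num_data": [5, 12, 20, 30, 8],
    "far_out": [1, 2, 3, 4, 5],
    "residual": [0.30, 0.10, -0.05, 0.02, 0.40],
})


def subset_for_fitting(df, covariates, exclude_groups):
    """Hypothetical helper: rows that would remain for fitting."""
    # Drop excluded groups (they would still receive predictions).
    out = df[~df["group"].isin(exclude_groups)]
    # Apply each optional subset condition in turn.
    for condition in covariates.values():
        if condition is not None:
            out = out.query(condition)
    return out


fit_df = subset_for_fitting(residual_df, covariates, exclude_groups)
```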

Methods

fit_residuals

Fits the residual model to the residual data frame passed in.

  • residual_df (pd.DataFrame): a data frame that contains all of the covariates and a residual observation that will be used for fitting the model (the design matrix)

simulate_residuals

Simulates residuals from the fitted residual model for particular covariate values. Returns an array of simulated residuals of shape (num_simulations, num_covs), where num_covs is the product of the lengths of the value arrays in covariate_specs (use _ResidualModel._expand_grid to get all covariate value combinations).

  • covariate_specs (Dict[str, np.array]): a dictionary of covariate values to create residual simulations for
  • num_simulations (int): number of residual simulations to produce
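To make the output shape concrete, the sketch below builds all covariate value combinations and draws one placeholder column of simulations per combination. The function `expand_grid` here is a stand-in written with `itertools.product`; it is not the library's `_expand_grid`, and the standard normal draws are placeholders rather than the output of a fitted model. The covariate names are also hypothetical.

```python
import itertools

import numpy as np


def expand_grid(covariate_specs):
    """Stand-in helper: every combination of the covariate values.

    Returns the covariate names and an array with one row per
    combination and one column per covariate.
    """
    names = list(covariate_specs)
    combos = itertools.product(*(covariate_specs[n] for n in names))
    return names, np.array(list(combos))


covariate_specs = {
    "num_data": np.array([10, 20, 30]),
    "far_out": np.array([1, 2]),
}
num_simulations = 5

names, grid = expand_grid(covariate_specs)
num_covs = grid.shape[0]  # 3 * 2 = 6 combinations

# Placeholder draws only; a real residual model would draw from its fit.
rng = np.random.default_rng(0)
simulated = rng.normal(size=(num_simulations, num_covs))
```

Each column of `simulated` corresponds to one row of `grid`, so the array has shape (num_simulations, num_covs).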