weave.smoother

Smooth data across multiple dimensions using weighted averages.

class weave.smoother.Smoother(dimensions)

Callable smoother that computes weighted averages across dimensions.

dimensions

Smoothing dimensions.

Type:

list of Dimension

inverse_weights

Whether or not to use inverse-distance weights.

Type:

bool

Create smoother function.

Parameters:

dimensions (Dimension or list of Dimension) – Smoothing dimensions.

Examples

Create a space-time smoother to smooth data across age, year, and location.

>>> from weave.dimension import Dimension
>>> from weave.smoother import Smoother
>>> age = Dimension(
...     name='age_id',
...     coordinates='age_mean',
...     kernel='exponential',
...     radius=1
... )
>>> year = Dimension(
...     name='year_id',
...     kernel='tricubic',
...     exponent=0.5
... )
>>> location = Dimension(
...     name='location_id',
...     coordinates=['super_region', 'region', 'country'],
...     kernel='depth',
...     radius=0.9
... )
>>> dimensions = [age, year, location]
>>> smoother = Smoother(dimensions)

__call__(data, observed, stdev=None, smoothed=None, fit=None, predict=None, down_weight=1)

Smooth data across dimensions with weighted averages.

For each point in predict, smooth values in observed using a weighted average of points in fit, where weights are calculated based on proximity across dimensions. Return a data frame of points in predict with column smoothed containing smoothed values.
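The weighted-average idea described above can be illustrated with a minimal sketch. This is not weave's implementation: the helper names exponential_weights and smooth_1d are hypothetical, only a single dimension is handled, and the exponential decay used here is an assumed kernel form chosen for illustration.

```python
import numpy as np
import pandas as pd


def exponential_weights(distance, radius):
    """Hypothetical kernel: weights decay exponentially with distance."""
    return np.exp(-np.asarray(distance, dtype=float) / radius)


def smooth_1d(data, observed, coordinate, radius=1.0, fit=None, predict=None):
    """Weighted-average smoother over one dimension (illustration only)."""
    n = len(data)
    # When no column is named, every point is used (mirrors the documented defaults).
    fit_mask = data[fit].to_numpy(dtype=bool) if fit else np.ones(n, dtype=bool)
    pred_mask = data[predict].to_numpy(dtype=bool) if predict else np.ones(n, dtype=bool)

    fit_x = data.loc[fit_mask, coordinate].to_numpy(dtype=float)
    fit_y = data.loc[fit_mask, observed].to_numpy(dtype=float)
    pred_x = data.loc[pred_mask, coordinate].to_numpy(dtype=float)

    # Pairwise distances: one row per prediction point, one column per fit point.
    distance = np.abs(pred_x[:, None] - fit_x[None, :])
    weights = exponential_weights(distance, radius)
    # Normalize each row so the weights for a prediction point sum to 1.
    weights /= weights.sum(axis=1, keepdims=True)

    result = data.loc[pred_mask].reset_index(drop=True)
    result[f"{observed}_smooth"] = weights @ fit_y
    return result
```

Because each row of weights sums to 1, every smoothed value in this sketch lies between the smallest and largest observed value among the fit points.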

Parameters:
  • data (pandas.DataFrame) – Input data structure.

  • observed (str) – Column name of values to smooth.

  • stdev (str, optional) – Column name of standard deviations. Required for inverse-distance kernels.

  • smoothed (str, optional) – Column name of smoothed values. If None, append ‘_smooth’ to observed.

  • fit (str, optional) – Column name indicating points to include in weighted averages. If None, all points in data are used.

  • predict (str, optional) – Column name indicating where to predict smoothed values. If None, predictions are made for all points in data.

  • down_weight (int or float in [0, 1], optional) – Down-weight neighbors for in-sample points. Default is 1, which corresponds to no down-weighting. If 0, in-sample points are not smoothed.
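The defaults for smoothed and down_weight can be pictured with a short sketch. Both helpers below are hypothetical and not part of weave. The treatment of down_weight is an assumed reading of the description above: the weights of a point's neighbors are scaled while the point's own weight is kept, so a value of 0 leaves the point unsmoothed.

```python
import numpy as np


def default_smoothed_name(observed, smoothed=None):
    """Return the output column name, appending '_smooth' when none is given."""
    return f"{observed}_smooth" if smoothed is None else smoothed


def apply_down_weight(weights, index, down_weight=1.0):
    """Scale the weights of all neighbors of point `index` by `down_weight`.

    Assumed interpretation: the point's own weight is left untouched,
    and the result is renormalized to sum to 1.
    """
    if not 0 <= down_weight <= 1:
        raise ValueError("down_weight must be in [0, 1]")
    scaled = np.asarray(weights, dtype=float) * down_weight
    scaled[index] = weights[index]  # restore the point's own weight
    return scaled / scaled.sum()
```

Under this reading, down_weight=1 returns the normalized weights unchanged, and down_weight=0 puts all weight on the point itself.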

Returns:

Points in predict, with smoothed values in column smoothed.

Return type:

pandas.DataFrame

Examples

Using the smoother created in the previous example, smooth data across age, year, and location. Create a smoothed version of column count for all points, using all points.

>>> from pandas import DataFrame
>>> data = DataFrame({
...     'age_id': [1, 2, 3, 4, 4],
...     'age_mean': [0.5, 1.5, 2.5, 3.5, 3.5],
...     'year_id': [1980, 1990, 2000, 2010, 2020],
...     'location_id': [5, 5, 6, 7, 9],
...     'super_region': [1, 1, 1, 1, 2],
...     'region': [3, 3, 3, 4, 8],
...     'country': [5, 5, 6, 7, 9],
...     'count': [1.0, 2.0, 3.0, 4.0, 5.0]
... })
>>> smoother(data, 'count')
   age_id  ...  count  count_smooth
0       1  ...    1.0      1.250974
1       2  ...    2.0      2.084069
2       3  ...    3.0      2.919984
3       4  ...    4.0      3.988642
4       4  ...    5.0      5.000000

Create a smoothed version of one column for all points using a subset of points.

>>> data['train'] = [True, False, False, True, True]
>>> smoother(data, 'count', fit='train')
   age_id  ...  count  train  count_smooth
0       1  ...    1.0   True      1.032967
1       2  ...    2.0  False      1.032967
2       3  ...    3.0  False      1.300000
3       4  ...    4.0   True      3.967033
4       4  ...    5.0   True      5.000000

Create a smoothed version of one column for a subset of points using all points.

>>> data['test'] = [False, True, True, False, False]
>>> smoother(data, 'count', predict='test')
   age_id  ...  count  test  count_smooth
0       2  ...    2.0  True      2.084069
1       3  ...    3.0  True      2.919984