weave.smoother
Smooth data across multiple dimensions using weighted averages.
- class weave.smoother.Smoother(dimensions)
Smoother function.
- inverse_weights
Whether or not to use inverse-distance weights.
- Type:
bool
See also
Create smoother function.
Examples
Create a space-time smoother to smooth data across age, year, and location.
>>> from weave.dimension import Dimension
>>> from weave.smoother import Smoother
>>> age = Dimension(
...     name='age_id',
...     coordinates='age_mean',
...     kernel='exponential',
...     radius=1
... )
>>> year = Dimension(
...     name='year_id',
...     kernel='tricubic',
...     exponent=0.5
... )
>>> location = Dimension(
...     name='location_id',
...     coordinates=['super_region', 'region', 'country'],
...     kernel='depth',
...     radius=0.9
... )
>>> dimensions = [age, year, location]
>>> smoother = Smoother(dimensions)
- __call__(data, observed, stdev=None, smoothed=None, fit=None, predict=None, down_weight=1)
Smooth data across dimensions with weighted averages.
For each point in predict, smooth values in observed using a weighted average of points in fit, where weights are calculated based on proximity across dimensions. Return a data frame of points in predict with column smoothed containing smoothed values.
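The weighted-average idea described above can be illustrated with a minimal one-dimensional sketch. This is not weave's implementation: the helper names `weighted_average` and `smooth_1d` are hypothetical, and the exponential-decay weight `exp(-distance / radius)` is an assumed kernel form chosen only for illustration.

```python
import numpy as np


def weighted_average(values, weights):
    """Return the weighted average of values given non-negative weights."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * values) / np.sum(weights))


def smooth_1d(coords, values, fit, radius=1.0):
    """Smooth values along one dimension (hypothetical helper, not weave's API).

    Every point receives a weighted average of the points flagged in fit.
    Weights decay with distance: exp(-distance / radius) is an assumption.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    fit = np.asarray(fit, dtype=bool)
    smoothed = np.empty(len(coords))
    for i, x in enumerate(coords):
        # Distance from the prediction point to each point used for fitting.
        distance = np.abs(coords[fit] - x)
        weights = np.exp(-distance / radius)
        smoothed[i] = weighted_average(values[fit], weights)
    return smoothed


result = smooth_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0], [True, True, True])
```

In this sketch the middle point keeps its value of 2.0 because its two neighbors are equidistant and pull equally in opposite directions, while the end points are pulled toward the center.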
- Parameters:
data (pandas.DataFrame) – Input data structure.
observed (str) – Column name of values to smooth.
stdev (str, optional) – Column name of standard deviations. Required for inverse-distance kernels.
smoothed (str, optional) – Column name of smoothed values. If None, append ‘_smooth’ to observed.
fit (str, optional) – Column name indicating points to include in weighted averages. If None, all points in data are used.
predict (str, optional) – Column name indicating where to predict smoothed values. If None, predictions are made for all points in data.
down_weight (int or float in [0, 1], optional) – Down-weight neighbors for in-sample points. Default is 1, which corresponds to no down-weighting. If 0, in-sample points are not smoothed.
- Returns:
Points in predict, with smoothed values in column smoothed.
- Return type:
pandas.DataFrame
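One way to read the `down_weight` parameter is sketched below. This is an interpretation of the parameter description, not weave's actual code: the function `apply_down_weight` is hypothetical, and it assumes that an in-sample point keeps its own weight while the weights of all other points are multiplied by `down_weight` before normalization.

```python
import numpy as np


def apply_down_weight(weights, self_index, down_weight=1.0):
    """Scale neighbor weights for an in-sample point (hypothetical helper).

    The point's own weight is left unchanged, every other weight is
    multiplied by down_weight, and the result is normalized to sum to 1.
    Assumes the point's own weight is positive.
    """
    if not 0 <= down_weight <= 1:
        raise ValueError("down_weight must be in [0, 1]")
    adjusted = np.asarray(weights, dtype=float) * down_weight
    adjusted[self_index] = weights[self_index]
    return adjusted / adjusted.sum()


raw = [1.0, 0.5, 0.25]
unchanged = apply_down_weight(raw, self_index=0, down_weight=1.0)
own_only = apply_down_weight(raw, self_index=0, down_weight=0.0)
```

Under this reading, `down_weight=1` leaves the relative weights as they were, and `down_weight=0` puts all of the weight on the point itself, so its value would pass through unsmoothed, which matches the behavior described for the two endpoints of the allowed range.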
Examples
Using the smoother created in the previous example, smooth data across age, year, and location. Create a smoothed version of column count for all points using all points.
>>> from pandas import DataFrame
>>> data = DataFrame({
...     'age_id': [1, 2, 3, 4, 4],
...     'age_mean': [0.5, 1.5, 2.5, 3.5, 3.5],
...     'year_id': [1980, 1990, 2000, 2010, 2020],
...     'location_id': [5, 5, 6, 7, 9],
...     'super_region': [1, 1, 1, 1, 2],
...     'region': [3, 3, 3, 4, 8],
...     'country': [5, 5, 6, 7, 9],
...     'count': [1.0, 2.0, 3.0, 4.0, 5.0]
... })
>>> smoother(data, 'count')
   age_id  ...  count  count_smooth
0       1  ...    1.0      1.250974
1       2  ...    2.0      2.084069
2       3  ...    3.0      2.919984
3       4  ...    4.0      3.988642
4       4  ...    5.0      5.000000
Create a smoothed version of one column for all points using a subset of points.
>>> data['train'] = [True, False, False, True, True]
>>> smoother(data, 'count', fit='train')
   age_id  ...  count  train  count_smooth
0       1  ...    1.0   True      1.032967
1       2  ...    2.0  False      1.032967
2       3  ...    3.0  False      1.300000
3       4  ...    4.0   True      3.967033
4       4  ...    5.0   True      5.000000
Create a smoothed version of one column for a subset of points using all points.
>>> data['test'] = [False, True, True, False, False]
>>> smoother(data, 'count', predict='test')
   age_id  ...  count  test  count_smooth
0       2  ...    2.0  True      2.084069
1       3  ...    3.0  True      2.919984