Overview
CurveFit is an extendable nonlinear mixed effects model for fitting curves. The main application in this development is COVID-19 forecasting, so the curves we consider are variants of logistic models. However, the interface allows any user-specified parametrized family.
Parametrized curves have several key features that make them useful for forecasting:
- We can capture key signals from noisy data.
- Parameters are interpretable, and can be modeled using covariates in a transparent way.
- Parametric forms allow for more stable inversion approaches, for current and future work.
- Parametric functions impose rigid assumptions that make forecasting more stable.
COVID-19 functional forms
We have considered two functional forms so far when modeling the COVID-19 epidemic.

Generalized Logistic:

$$f(t; \alpha, \beta, p) = \frac{p}{1 + \exp\left(-\alpha(t - \beta)\right)}$$

Generalized Gaussian Cumulative Distribution Function (ERF):

$$f(t; \alpha, \beta, p) = \frac{p}{2}\left(1 + \operatorname{erf}\left(\alpha(t - \beta)\right)\right)$$
Each form has comparable fundamental parameters:
- Level $p$: controls the ultimate level.
- Slope $\alpha$: controls the speed of infection.
- Inflection $\beta$: the time at which the rate of change is maximal.
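As a concrete sketch, the two forms can be written as below. The function names are illustrative, not CurveFit's API; the parameterization (level $p$, slope $\alpha$, inflection $\beta$) follows the list above.

```python
import numpy as np
from scipy.special import erf, expit


def generalized_logistic(t, alpha, beta, p):
    """Generalized logistic curve: level p, slope alpha, inflection beta."""
    return p * expit(alpha * (t - beta))


def generalized_gaussian_cdf(t, alpha, beta, p):
    """Generalized Gaussian CDF (ERF) curve with the same parameterization."""
    return 0.5 * p * (1.0 + erf(alpha * (t - beta)))


# Both curves rise from ~0 toward the level p, and equal p/2 at t = beta.
t = np.linspace(0, 60, 61)
curve = generalized_logistic(t, alpha=0.2, beta=30.0, p=1000.0)
```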
We can fit these parameters to data, but this by itself does not account for covariates, and cannot connect different locations together. The next section therefore specifies statistical models that do this.
Statistical Model
Statistical assumptions link covariates across locations. Key aspects are the following:

- Parameters may be influenced by covariates, e.g. those that reflect social distancing.

- Parameters may be modeled in a different space, e.g. constrained to be nonnegative.

- Parameters and covariate multipliers may be location-specific, with assumptions placed on their variation.
The CurveFit specification is tailored to these three requirements. Every parameter in any functional form can be specified through a link function, covariates, and fixed and random effects. The final estimation problem is a nonlinear mixed effects model, with user-specified priors on fixed and random effects.
For example, consider the ERF functional form with covariates $c_j$. Assume we are fitting data in log-cumulative-death-rate space. Input data are:
- $c_j$: social distancing covariate value at location $j$
- $y_{jt}$: cumulative death rate in location $j$ at time $t$
We specify the statistical model as follows:

Measurement model:

$$\log(y_{jt}) = \log\left(f(t; \alpha_j, \beta_j, p_j)\right) + \epsilon_{jt}, \qquad \epsilon_{jt} \sim N(0, \sigma^2)$$

$\alpha$ model specification: $\alpha_j = \exp(\alpha + u_j^\alpha)$
$\beta$ model specification: $\beta_j = \beta + \gamma c_j + u_j^\beta$
$p$ model specification: $p_j = \exp(p + u_j^p)$
In this example, the user specifies:
- a prior mean for the covariate multiplier $\gamma$
- variance parameters for the measurement noise and for each random effect.

CurveFit estimates:
- the fixed effects $\alpha$, $\beta$, $p$, and $\gamma$
- the random effects $u_j^\alpha$, $u_j^\beta$, $u_j^p$ for each location $j$.
Exponential link functions are used to model the nonnegative parameters $\alpha_j$ and $p_j$.
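The parameter construction above can be sketched as follows. The tuple layouts and function name are hypothetical, not CurveFit's API; the exponential links and the covariate on the inflection parameter follow the example.

```python
import numpy as np


def location_parameters(fe, re_j, c_j):
    """Build location-specific curve parameters from fixed effects `fe`,
    random effects `re_j`, and a covariate value `c_j`.

    Hypothetical layout: fe = (alpha, beta, p, gamma),
    re_j = (u_alpha, u_beta, u_p)."""
    alpha, beta, p, gamma = fe
    u_alpha, u_beta, u_p = re_j
    alpha_j = np.exp(alpha + u_alpha)     # exp link -> alpha_j > 0
    beta_j = beta + gamma * c_j + u_beta  # identity link, covariate on inflection
    p_j = np.exp(p + u_p)                 # exp link -> p_j > 0
    return alpha_j, beta_j, p_j
```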
Constraints
Simple bound constraints on parameters can be used to make the model more robust. For any fixed or random effect $\theta$, the user can enter simple bound constraints of the form $l \le \theta \le u$. The parameters returned by CurveFit are guaranteed to satisfy these simple bounds.
Optimization Procedure
The optimization problem we obtain from specifying functional forms, priors, and constraints on all parameters is a boundconstrained nonlinear least squares problem. We explain the solver, derivative computation, and initialization procedure below.
Solver
We solve the problem using L-BFGS-B. The L-BFGS-B algorithm uses gradients to build a limited-memory Hessian approximation, and combines that approximation with projected gradient steps onto the bound constraints to identify the subspace of free parameters over which solutions can be efficiently found; see the paper. It is a standard, robust algorithm that is well suited to the task.
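A minimal illustration of a bound-constrained nonlinear least squares fit solved with L-BFGS-B, here via SciPy rather than CurveFit itself; the toy data, starting point, and bounds are invented for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

# Toy data from a logistic curve (alpha=0.2, beta=30, p=1000) plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 60, 40)
y = 1000.0 * expit(0.2 * (t - 30.0)) + rng.normal(0.0, 5.0, t.size)


def objective(theta):
    """Nonlinear least squares objective in the parameters (alpha, beta, p)."""
    alpha, beta, p = theta
    resid = y - p * expit(alpha * (t - beta))
    return 0.5 * np.sum(resid ** 2)


# Simple bounds l <= theta <= u on each parameter, as in the Constraints section.
bounds = [(1e-3, 1.0), (0.0, 60.0), (1.0, 5000.0)]
result = minimize(objective, x0=np.array([0.1, 20.0, 500.0]),
                  method="L-BFGS-B", bounds=bounds)
```

The returned `result.x` respects the bounds by construction of the projected-gradient steps.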
Derivatives
We do not explicitly compute derivatives of the nonlinear least squares objective induced by the problem specification. Instead, we use the complex step method. The complex step method is a simple example of automatic differentiation: it provides machine-precision derivatives at the cost of one (complex) function evaluation per parameter. This is very useful given the flexibility in functional forms.
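The complex step trick itself is easy to sketch. The helper name is ours, and it assumes the function is implemented with complex-safe operations:

```python
import numpy as np


def complex_step_derivative(f, x, h=1e-20):
    """Derivative of a real analytic function via the complex step:
    f'(x) ~= Im(f(x + i*h)) / h.  There is no subtractive cancellation,
    so h can be taken extremely small, giving machine-precision results."""
    return np.imag(f(x + 1j * h)) / h


# Example: d/dx exp(sin(x)) at x = 1, whose exact value is cos(1) * exp(sin(1)).
f = lambda x: np.exp(np.sin(x))
deriv = complex_step_derivative(f, 1.0)
```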
Uncertainty
Currently CurveFit uses model-based uncertainty, with out-of-sample approaches under development.
Predictive Validity-Based Uncertainty
We have a tool that evaluates out-of-sample predictive validity for the model forecasts. It iteratively holds out data points, starting with only one data point used for fitting and adding the rest back in one by one, comparing the predictions with the observed data. The standard deviations of these residuals, indexed along two dimensions (how much data the model sees, and how far into the future it must predict), are then used to simulate draws (random realizations of the mean function) that can be used to construct uncertainty intervals. This approach is orthogonal to the model-based uncertainty described below.
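The hold-out loop can be sketched as follows. The `fit_fn` interface is hypothetical, standing in for a CurveFit model refit, and the grouping of residuals by forecast horizon is one of the two dimensions described above:

```python
import numpy as np


def pv_residual_std(fit_fn, t, y, min_points=1):
    """Out-of-sample residual spread: refit on the first n points for
    n = min_points .. len(y)-1, predict the held-out tail, and collect
    residuals by forecast horizon.  `fit_fn(t, y)` is any routine that
    returns a prediction function t -> y_hat (hypothetical interface)."""
    by_horizon = {}
    for n in range(min_points, len(y)):
        predict = fit_fn(t[:n], y[:n])
        for h, (tt, yy) in enumerate(zip(t[n:], y[n:]), start=1):
            by_horizon.setdefault(h, []).append(yy - predict(tt))
    return {h: float(np.std(r)) for h, r in by_horizon.items()}
```

Draws can then be simulated as the fitted mean function plus noise scaled by these horizon-dependent standard deviations.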
ModelBased Uncertainty
We partition model-based uncertainty into estimates coming from the fixed and random components. Fixed effects uncertainty captures the variation of the mean effects, and random effects uncertainty captures the variation across locations.
 Fixed Effects
For any estimator $\hat\theta$ obtained by solving a nonlinear least squares problem, we can use the Fisher information matrix to get an asymptotic approximation to the uncertainty. Let

$$\Sigma = \begin{bmatrix} \Sigma_{\text{prior}} & 0 \\ 0 & \Sigma_{\text{obs}} \end{bmatrix}$$

where $\Sigma_{\text{prior}}$ is any prior variance and $\Sigma_{\text{obs}}$ is the variance of the observations. Then our approximation for the variance matrix of the estimate is given by

$$\mathbb{V}(\hat\theta) \approx \left(J^\top \Sigma^{-1} J\right)^{-1}$$

where $J$ is the Jacobian matrix of the residuals evaluated at $\hat\theta$. The Jacobian is also computed using the complex step method.
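A sketch of this computation, simplified to a single observation variance $\sigma^2$ and no prior rows (the helper names are ours):

```python
import numpy as np


def complex_step_jacobian(residual, theta, h=1e-20):
    """Jacobian of a residual vector via the complex step, one column
    (parameter) at a time."""
    theta = np.asarray(theta, dtype=float)
    cols = []
    for k in range(theta.size):
        step = np.zeros(theta.size, dtype=complex)
        step[k] = 1j * h
        cols.append(np.imag(residual(theta + step)) / h)
    return np.stack(cols, axis=1)


def asymptotic_covariance(residual, theta_hat, sigma2):
    """Fisher-information approximation (J^T Sigma^{-1} J)^{-1}, here with
    Sigma = sigma2 * I; prior terms would simply add rows to the residual."""
    J = complex_step_jacobian(residual, theta_hat)
    return np.linalg.inv(J.T @ J / sigma2)
```

For a linear residual $r(\theta) = y - X\theta$ this reduces to the familiar $\sigma^2 (X^\top X)^{-1}$.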
 Random Effects
To obtain the variance of the random effects, we derive an empirical variance matrix across locations. Given a set of zero-mean random effect estimates $u_1, \dots, u_J$, with each $u_j$ a vector of the random effect types, we get an empirical matrix by

$$\hat\Sigma_u = \frac{1}{J} \sum_{j=1}^J u_j u_j^\top.$$

To obtain posterior uncertainty for each specific location, we use the empirical $\hat\Sigma_u$ as a prior and any data at the location as the measurement model, and refit the location:

$$\hat u_j = \arg\min_{u_j} \; \frac{1}{2\sigma^2}\left\|y_j - f_j(u_j)\right\|^2 + \frac{1}{2} u_j^\top \hat\Sigma_u^{-1} u_j.$$

Within each location, this is analogous to the fixed effects analysis. The location-specific uncertainty is then estimated from the same Fisher information analysis:

$$\mathbb{V}(\hat u_j) \approx \left(J_j^\top \Sigma_j^{-1} J_j\right)^{-1}.$$
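The empirical covariance step is a one-liner (the helper name is ours):

```python
import numpy as np


def empirical_re_covariance(U):
    """Empirical covariance of zero-mean random effect estimates.
    U has shape (n_locations, n_effects); returns (1/J) * sum_j u_j u_j^T."""
    U = np.asarray(U, dtype=float)
    return U.T @ U / U.shape[0]
```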