polykin.copolymerization¤

fit_copo_data ¤

fit_copo_data(
    data_Ff: list[CopoDataset_Ff] = [],
    data_fx: list[CopoDataset_fx] = [],
    data_Fx: list[CopoDataset_Fx] = [],
    r_guess: tuple[float, float] = (1.0, 1.0),
    method: Literal["NLLS", "ODR"] = "NLLS",
    alpha: float = 0.05,
    plot_data: bool = True,
    JCR_linear: bool = True,
    JCR_exact: bool = False,
    JCR_npoints: int = 200,
    JCR_rtol: float = 0.01,
) -> CopoFitResult

Fit copolymer composition data and estimate reactivity ratios.

This function employs rigorous nonlinear algorithms to estimate the reactivity ratios from experimental copolymer composition data of type \(F(f)\), \(f(x;f_0)\), and \(F(x,f_0)\).

The fitting is performed using one of two methods: nonlinear least squares (NLLS) or orthogonal distance regression (ODR). NLLS considers only observational errors in the dependent variable, whereas ODR takes into account observational errors in both the dependent and independent variables. Although the ODR method is statistically more general, it is also more complex and can (at present) only be used for fitting \(F(f)\) data. Whenever composition drift data is provided, NLLS must be utilized.

The joint confidence region (JCR) of the reactivity ratios is generated using approximate (linear) and/or exact methods. In most cases, the linear method should be sufficiently accurate. Nonetheless, for these types of fits, the exact method is computationally inexpensive, making it perhaps a preferable choice.

Reference

Van Herk, A.M. and Dröge, T. (1997), Nonlinear least squares fitting applied to copolymerization modeling. Macromol. Theory Simul., 6: 1263-1276.
Boggs, Paul T., et al. "Algorithm 676: ODRPACK: software for weighted orthogonal distance regression." ACM Transactions on Mathematical Software (TOMS) 15.4 (1989): 348-364.

PARAMETER	DESCRIPTION
`data_Ff`	F(f) instantaneous composition datasets. TYPE: `list[CopoDataset_Ff]` DEFAULT: `[]`
`data_fx`	f(x) composition drift datasets. TYPE: `list[CopoDataset_fx]` DEFAULT: `[]`
`data_Fx`	F(x) composition drift datasets TYPE: `list[CopoDataset_Fx]` DEFAULT: `[]`
`r_guess`	Initial guess for the reactivity ratios. TYPE: `tuple[float, float]` DEFAULT: `(1.0, 1.0)`
`method`	Optimization method. `NLLS` for nonlinear least squares or `ODR` for orthogonal distance regression. The `ODR` method is only available for F(f) data. TYPE: `Literal['NLLS', 'ODR']` DEFAULT: `'NLLS'`
`alpha`	Significance level. TYPE: `float` DEFAULT: `0.05`
`plot_data`	If `True`, comparisons between experimental and fitted data will be plotted. TYPE: `bool` DEFAULT: `True`
`JCR_linear`	If `True`, the JCR will be computed using the linear method. TYPE: `bool` DEFAULT: `True`
`JCR_exact`	If `True`, the JCR will be computed using the exact method. TYPE: `bool` DEFAULT: `False`
`JCR_npoints`	Number of points where the JCR is evaluated. The computational effort increases linearly with `npoints`. TYPE: `int` DEFAULT: `200`
`JCR_rtol`	Relative tolerance for the determination of the JCR. The default value (1e-2) should be adequate in most cases, as it implies a 1% accuracy in the JCR coordinates. TYPE: `float` DEFAULT: `0.01`

RETURNS	DESCRIPTION
`CopoFitResult`	Dataclass with all fit results.

See Also

confidence_ellipse: linear method used to calculate the joint confidence region.
confidence_region: exact method used to calculate the joint confidence region.
fit_Finemann_Ross: alternative method.

Source code in src/polykin/copolymerization/fitting.py

def fit_copo_data(
    data_Ff: list[CopoDataset_Ff] = [],
    data_fx: list[CopoDataset_fx] = [],
    data_Fx: list[CopoDataset_Fx] = [],
    r_guess: tuple[float, float] = (1.0, 1.0),
    method: Literal["NLLS", "ODR"] = "NLLS",
    alpha: float = 0.05,
    plot_data: bool = True,
    JCR_linear: bool = True,
    JCR_exact: bool = False,
    JCR_npoints: int = 200,
    JCR_rtol: float = 1e-2,
) -> CopoFitResult:
    r"""Fit copolymer composition data and estimate reactivity ratios.

    This function employs rigorous nonlinear algorithms to estimate the
    reactivity ratios from experimental copolymer composition data of type
    $F(f)$, $f(x;f_0)$, and $F(x,f_0)$.

    The fitting is performed using one of two methods: nonlinear least squares
    (NLLS) or orthogonal distance regression (ODR). NLLS considers only
    observational errors in the dependent variable, whereas ODR takes into
    account observational errors in both the dependent and independent
    variables. Although the ODR method is statistically more general, it is
    also more complex and can (at present) only be used for fitting $F(f)$
    data. Whenever composition drift data is provided, NLLS must be utilized.

    The joint confidence region (JCR) of the reactivity ratios is generated
    using approximate (linear) and/or exact methods. In most cases, the linear
    method should be sufficiently accurate. Nonetheless, for these types of
    fits, the exact method is computationally inexpensive, making it perhaps a
    preferable choice.

    **Reference**

    *   Van Herk, A.M. and Dröge, T. (1997), Nonlinear least squares fitting
        applied to copolymerization modeling. Macromol. Theory Simul.,
        6: 1263-1276.
    *   Boggs, Paul T., et al. "Algorithm 676: ODRPACK: software for weighted
        orthogonal distance regression." ACM Transactions on Mathematical
        Software (TOMS) 15.4 (1989): 348-364.

    Parameters
    ----------
    data_Ff : list[CopoDataset_Ff]
        F(f) instantaneous composition datasets.
    data_fx : list[CopoDataset_fx]
        f(x) composition drift datasets.
    data_Fx : list[CopoDataset_Fx]
        F(x) composition drift datasets
    r_guess : tuple[float, float]
        Initial guess for the reactivity ratios.
    method : Literal['NLLS', 'ODR']
        Optimization method. `NLLS` for nonlinear least squares or `ODR` for
        orthogonal distance regression. The `ODR` method is only available for
        F(f) data.
    alpha : float
        Significance level.
    plot_data : bool
        If `True`, comparisons between experimental and fitted data will be
        plotted.
    JCR_linear : bool, optional
        If `True`, the JCR will be computed using the linear method.
    JCR_exact : bool, optional
        If `True`, the JCR will be computed using the exact method.
    JCR_npoints : int
        Number of points where the JCR is evaluated. The computational effort
        increases linearly with `npoints`.
    JCR_rtol : float
        Relative tolerance for the determination of the JCR. The default value
        (1e-2) should be adequate in most cases, as it implies a 1% accuracy in
        the JCR coordinates.

    Returns
    -------
    CopoFitResult
        Dataclass with all fit results.

    See Also
    --------
    * [`confidence_ellipse`](../math/misc/confidence_ellipse.md): linear method
      used to calculate the joint confidence region.
    * [`confidence_region`](../math/misc/confidence_region.md): exact method
      used to calculate the joint confidence region.
    * [`fit_Finemann_Ross`](fit_Finemann_Ross.md): alternative method.

    """
    # Calculate and check 'ndata'
    npar = 2
    ndata = 0
    for dataset in data_Ff:
        ndata += len(dataset.f1)
    for dataset in data_fx:
        ndata += len(dataset.x)
    for dataset in data_Fx:
        ndata += len(dataset.x)
    if ndata < npar:
        raise FitError("Not enough data to estimate reactivity ratios.")

    # Check method choice
    if method == "ODR" and (len(data_fx) > 0 or len(data_Fx) > 0):
        raise FitError("ODR method not implemented for drift data.")

    # Fit data
    if method == "NLLS":
        r_opt, cov, sse = _fit_copo_NLLS(data_Ff, data_fx, data_Fx, r_guess, ndata)
    elif method == "ODR":
        r_opt, cov, sse = _fit_copo_ODR(data_Ff, r_guess)
    else:
        raise ValueError(f"Method {method} is not valid.")

    # Standard parameter errors and confidence intervals
    if ndata > npar and cov is not None:
        se_r = np.sqrt(np.diag(cov))
        ci_r = se_r * tdist.ppf(1.0 - alpha / 2, ndata - npar)
    else:
        se_r = [np.nan, np.nan]
        ci_r = [np.nan, np.nan]

    # Plot data vs model
    plots = {}
    if plot_data:
        xmax = 1.0
        npoints = 500
        # Plot F(f) data
        if data_Ff:
            plots["Ff"] = plt.subplots()
            ax = plots["Ff"][1]
            ax.set_xlabel(r"$f_1$")
            ax.set_ylabel(r"$F_1$")
            ax.set_xlim(0.0, 1.0)
            ax.set_ylim(0.0, 1.0)
            for dataset in data_Ff:
                ax.scatter(dataset.f1, dataset.F1, label=dataset.name)
            f1 = np.linspace(0.0, 1.0, npoints)
            F1_est = inst_copolymer_binary(f1, *r_opt)
            ax.plot(f1, F1_est)
            ax.legend(loc="best")

        # Plot f(x) data
        if data_fx:
            plots["fx"] = plt.subplots()
            ax = plots["fx"][1]
            ax.set_xlabel(r"$x$")
            ax.set_ylabel(r"$f_1$")
            ax.set_xlim(0.0, 1.0)
            ax.set_ylim(0.0, 1.0)
            x = np.linspace(0.0, xmax, npoints)
            for dataset in data_fx:
                ax.scatter(dataset.x, dataset.f1, label=dataset.name)
                f1_est = monomer_drift_binary(dataset.f10, x, *r_opt)
                ax.plot(x, f1_est)
            ax.legend(loc="best")

        # Plot F(x) data
        if data_Fx:
            plots["Fx"] = plt.subplots()
            ax = plots["Fx"][1]
            ax.set_xlabel(r"$x$")
            ax.set_ylabel(r"$F_1$")
            ax.set_xlim(0.0, 1.0)
            ax.set_ylim(0.0, 1.0)
            x = np.linspace(0.0, xmax, npoints)
            for dataset in data_Fx:
                f10 = dataset.f10
                ax.scatter(dataset.x, dataset.F1, label=dataset.name)
                f1_est = monomer_drift_binary(f10, x, *r_opt)
                F1_est = f1_est + (f10 - f1_est) / (x + 1e-10)
                F1_est[0] = inst_copolymer_binary(f10, *r_opt)
                ax.plot(x, F1_est)
            ax.legend(loc="best")

        # Parity plot
        plots["parity"] = plt.subplots()
        ax = plots["parity"][1]
        ax.set_xlabel("Observed value")
        ax.set_ylabel("Predicted value")
        ax.set_xlim(0.0, 1.0)
        ax.set_ylim(0.0, 1.0)
        ax.plot([0.0, 1.0], [0.0, 1.0], color="black", linewidth=0.5)
        for dataset in data_Ff:
            F1_est = inst_copolymer_binary(dataset.f1, *r_opt)
            ax.scatter(dataset.F1, F1_est, label=dataset.name)
        for dataset in data_fx:
            f1_est = monomer_drift_binary(dataset.f10, dataset.x, *r_opt)
            ax.scatter(dataset.f1, f1_est, label=dataset.name)
        for dataset in data_Fx:
            f1_est = monomer_drift_binary(dataset.f10, dataset.x, *r_opt)
            F1_est = f1_est + (dataset.f10 - f1_est) / (dataset.x + 1e-10)
            ax.scatter(dataset.F1, F1_est, label=dataset.name)
        ax.legend(loc="best")

    # Joint confidence region
    if (JCR_linear or JCR_exact) and cov is not None:

        plots["JCR"] = plt.subplots()
        ax = plots["JCR"][1]
        ax.set_xlabel(r"$r_1$")
        ax.set_ylabel(r"$r_2$")
        colors = ["tab:blue", "tab:orange"]
        idx = 0

        if JCR_linear:
            confidence_ellipse(
                ax,
                center=r_opt,
                cov=cov,
                ndata=ndata,
                alpha=alpha,
                color=colors[idx],
                label="linear JCR",
            )
            idx += 1

        if JCR_exact:
            jcr = confidence_region(
                center=r_opt,
                sse=sse,
                ndata=ndata,
                alpha=alpha,
                width=2 * ci_r[0],
                rtol=JCR_rtol,
                npoints=JCR_npoints,
            )

            # ax.scatter(r1, r2, c=colors[idx], s=5)
            ax.plot(*jcr, color=colors[idx], label="exact JCR")

        ax.legend(loc="best")

    result = CopoFitResult(
        method=method,
        r1=r_opt[0],
        r2=r_opt[1],
        alpha=alpha,
        ci_r1=ci_r[0],
        ci_r2=ci_r[1],
        se_r1=se_r[0],
        se_r2=se_r[1],
        cov=cov,
        plots=plots,
    )

    return result