Skip to content

polykin.copolymerization¤

fit_copo_data ¤

fit_copo_data(
    data_Ff: list[CopoDataset_Ff] = [],
    data_fx: list[CopoDataset_fx] = [],
    data_Fx: list[CopoDataset_Fx] = [],
    r_guess: tuple[float, float] = (1.0, 1.0),
    method: Literal["NLLS", "ODR"] = "NLLS",
    alpha: float = 0.05,
    plot_data: bool = True,
    JCR_linear: bool = True,
    JCR_exact: bool = False,
    JCR_npoints: int = 200,
    JCR_rtol: float = 0.01,
) -> CopoFitResult

Fit copolymer composition data and estimate reactivity ratios.

This function employs rigorous nonlinear algorithms to estimate the reactivity ratios from experimental copolymer composition data of type \(F(f)\), \(f(x;f_0)\), and \(F(x,f_0)\).

The fitting is performed using one of two methods: nonlinear least squares (NLLS) or orthogonal distance regression (ODR). NLLS considers only observational errors in the dependent variable, whereas ODR takes into account observational errors in both the dependent and independent variables. Although the ODR method is statistically more general, it is also more complex and can (at present) only be used for fitting \(F(f)\) data. Whenever composition drift data is provided, NLLS must be utilized.

The joint confidence region (JCR) of the reactivity ratios is generated using approximate (linear) and/or exact methods. In most cases, the linear method should be sufficiently accurate. Nonetheless, for these types of fits, the exact method is computationally inexpensive, making it perhaps a preferable choice.

Reference

  • Van Herk, A.M. and Dröge, T. (1997), Nonlinear least squares fitting applied to copolymerization modeling. Macromol. Theory Simul., 6: 1263-1276.
  • Boggs, Paul T., et al. "Algorithm 676: ODRPACK: software for weighted orthogonal distance regression." ACM Transactions on Mathematical Software (TOMS) 15.4 (1989): 348-364.
PARAMETER DESCRIPTION
data_Ff

F(f) instantaneous composition datasets.

TYPE: list[CopoDataset_Ff] DEFAULT: []

data_fx

f(x) composition drift datasets.

TYPE: list[CopoDataset_fx] DEFAULT: []

data_Fx

F(x) composition drift datasets

TYPE: list[CopoDataset_Fx] DEFAULT: []

r_guess

Initial guess for the reactivity ratios.

TYPE: tuple[float, float] DEFAULT: (1.0, 1.0)

method

Optimization method. NLLS for nonlinear least squares or ODR for orthogonal distance regression. The ODR method is only available for F(f) data.

TYPE: Literal['NLLS', 'ODR'] DEFAULT: 'NLLS'

alpha

Significance level.

TYPE: float DEFAULT: 0.05

plot_data

If True, comparisons between experimental and fitted data will be plotted.

TYPE: bool DEFAULT: True

JCR_linear

If True, the JCR will be computed using the linear method.

TYPE: bool DEFAULT: True

JCR_exact

If True, the JCR will be computed using the exact method.

TYPE: bool DEFAULT: False

JCR_npoints

Number of points where the JCR is evaluated. The computational effort increases linearly with npoints.

TYPE: int DEFAULT: 200

JCR_rtol

Relative tolerance for the determination of the JCR. The default value (1e-2) should be adequate in most cases, as it implies a 1% accuracy in the JCR coordinates.

TYPE: float DEFAULT: 0.01

RETURNS DESCRIPTION
CopoFitResult

Dataclass with all fit results.

See also
Source code in src/polykin/copolymerization/fitting.py
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
def fit_copo_data(data_Ff: list[CopoDataset_Ff] = [],
                  data_fx: list[CopoDataset_fx] = [],
                  data_Fx: list[CopoDataset_Fx] = [],
                  r_guess: tuple[float, float] = (1.0, 1.0),
                  method: Literal['NLLS', 'ODR'] = 'NLLS',
                  alpha: float = 0.05,
                  plot_data: bool = True,
                  JCR_linear: bool = True,
                  JCR_exact: bool = False,
                  JCR_npoints: int = 200,
                  JCR_rtol: float = 1e-2
                  ) -> CopoFitResult:
    r"""Fit copolymer composition data and estimate reactivity ratios.

    This function employs rigorous nonlinear algorithms to estimate the
    reactivity ratios from experimental copolymer composition data of type
    $F(f)$, $f(x;f_0)$, and $F(x,f_0)$. 

    The fitting is performed using one of two methods: nonlinear least squares
    (NLLS) or orthogonal distance regression (ODR). NLLS considers only
    observational errors in the dependent variable, whereas ODR takes into
    account observational errors in both the dependent and independent
    variables. Although the ODR method is statistically more general, it is
    also more complex and can (at present) only be used for fitting $F(f)$
    data. Whenever composition drift data is provided, NLLS must be utilized.

    The joint confidence region (JCR) of the reactivity ratios is generated
    using approximate (linear) and/or exact methods. In most cases, the linear
    method should be sufficiently accurate. Nonetheless, for these types of
    fits, the exact method is computationally inexpensive, making it perhaps a
    preferable choice.

    **Reference**

    *   Van Herk, A.M. and Dröge, T. (1997), Nonlinear least squares fitting
        applied to copolymerization modeling. Macromol. Theory Simul.,
        6: 1263-1276.
    *   Boggs, Paul T., et al. "Algorithm 676: ODRPACK: software for weighted
        orthogonal distance regression." ACM Transactions on Mathematical
        Software (TOMS) 15.4 (1989): 348-364.

    Parameters
    ----------
    data_Ff : list[CopoDataset_Ff]
        F(f) instantaneous composition datasets.
    data_fx : list[CopoDataset_fx]
        f(x) composition drift datasets.
    data_Fx : list[CopoDataset_Fx]
        F(x) composition drift datasets
    r_guess : tuple[float, float]
        Initial guess for the reactivity ratios.
    method : Literal['NLLS', 'ODR']
        Optimization method. `NLLS` for nonlinear least squares or `ODR` for
        orthogonal distance regression. The `ODR` method is only available for
        F(f) data.
    alpha : float
        Significance level.
    plot_data : bool
        If `True`, comparisons between experimental and fitted data will be
        plotted.
    JCR_linear : bool, optional
        If `True`, the JCR will be computed using the linear method.
    JCR_exact : bool, optional
        If `True`, the JCR will be computed using the exact method.
    JCR_npoints : int
        Number of points where the JCR is evaluated. The computational effort
        increases linearly with `npoints`.
    JCR_rtol : float
        Relative tolerance for the determination of the JCR. The default value
        (1e-2) should be adequate in most cases, as it implies a 1% accuracy in
        the JCR coordinates. 

    Returns
    -------
    CopoFitResult
        Dataclass with all fit results.

    See also
    --------
    * [`confidence_ellipse`](../math/confidence_ellipse.md): linear method
      used to calculate the joint confidence region.
    * [`confidence_region`](../math/confidence_region.md): exact method
      used to calculate the joint confidence region.
    * [`fit_Finemann_Ross`](fit_Finemann_Ross.md): alternative method.  

    """

    # Calculate and check 'ndata'
    npar = 2
    ndata = 0
    for dataset in data_Ff:
        ndata += len(dataset.f1)
    for dataset in data_fx:
        ndata += len(dataset.x)
    for dataset in data_Fx:
        ndata += len(dataset.x)
    if ndata < npar:
        raise FitError("Not enough data to estimate reactivity ratios.")

    # Check method choice
    if method == 'ODR' and (len(data_fx) > 0 or len(data_Fx) > 0):
        raise FitError("ODR method not implemented for drift data.")

    # Fit data
    if method == 'NLLS':
        r_opt, cov, sse = _fit_copo_NLLS(data_Ff, data_fx, data_Fx, r_guess,
                                         ndata)
    elif method == 'ODR':
        r_opt, cov, sse = _fit_copo_ODR(data_Ff, r_guess)
    else:
        raise ValueError(f"Method {method} is not valid.")

    # Standard parameter errors and confidence intervals
    if ndata > npar and cov is not None:
        se_r = np.sqrt(np.diag(cov))
        ci_r = se_r*tdist.ppf(1. - alpha/2, ndata - npar)
    else:
        se_r = [np.nan, np.nan]
        ci_r = [np.nan, np.nan]

    # Plot data vs model
    plots = {}
    if plot_data:
        xmax = 1.
        npoints = 500
        # Plot F(f) data
        if data_Ff:
            plots['Ff'] = plt.subplots()
            ax = plots['Ff'][1]
            ax.set_xlabel(r"$f_1$")
            ax.set_ylabel(r"$F_1$")
            ax.set_xlim(0., 1.)
            ax.set_ylim(0., 1.)
            for dataset in data_Ff:
                ax.scatter(dataset.f1, dataset.F1, label=dataset.name)
            f1 = np.linspace(0., 1., npoints)
            F1_est = inst_copolymer_binary(f1, *r_opt)
            ax.plot(f1, F1_est)
            ax.legend(loc="best")

        # Plot f(x) data
        if data_fx:
            plots['fx'] = plt.subplots()
            ax = plots['fx'][1]
            ax.set_xlabel(r"$x$")
            ax.set_ylabel(r"$f_1$")
            ax.set_xlim(0., 1.)
            ax.set_ylim(0., 1.)
            x = np.linspace(0., xmax, npoints)
            for dataset in data_fx:
                ax.scatter(dataset.x, dataset.f1, label=dataset.name)
                f1_est = monomer_drift_binary(dataset.f10, x, *r_opt)
                ax.plot(x, f1_est)
            ax.legend(loc="best")

        # Plot F(x) data
        if data_Fx:
            plots['Fx'] = plt.subplots()
            ax = plots['Fx'][1]
            ax.set_xlabel(r"$x$")
            ax.set_ylabel(r"$F_1$")
            ax.set_xlim(0., 1.)
            ax.set_ylim(0., 1.)
            x = np.linspace(0., xmax, npoints)
            for dataset in data_Fx:
                f10 = dataset.f10
                ax.scatter(dataset.x, dataset.F1, label=dataset.name)
                f1_est = monomer_drift_binary(f10, x, *r_opt)
                F1_est = f1_est + (f10 - f1_est)/(x + 1e-10)
                F1_est[0] = inst_copolymer_binary(f10, *r_opt)
                ax.plot(x, F1_est)
            ax.legend(loc="best")

        # Parity plot
        plots['parity'] = plt.subplots()
        ax = plots['parity'][1]
        ax.set_xlabel("Observed value")
        ax.set_ylabel("Predicted value")
        ax.set_xlim(0., 1.)
        ax.set_ylim(0., 1.)
        ax.plot([0., 1.], [0., 1.], color='black', linewidth=0.5)
        for dataset in data_Ff:
            F1_est = inst_copolymer_binary(dataset.f1, *r_opt)
            ax.scatter(dataset.F1, F1_est, label=dataset.name)
        for dataset in data_fx:
            f1_est = monomer_drift_binary(dataset.f10, dataset.x, *r_opt)
            ax.scatter(dataset.f1, f1_est, label=dataset.name)
        for dataset in data_Fx:
            f1_est = monomer_drift_binary(dataset.f10, dataset.x, *r_opt)
            F1_est = f1_est + (dataset.f10 - f1_est)/(dataset.x + 1e-10)
            ax.scatter(dataset.F1, F1_est, label=dataset.name)
        ax.legend(loc="best")

    # Joint confidence region
    if (JCR_linear or JCR_exact) and cov is not None:

        plots['JCR'] = plt.subplots()
        ax = plots['JCR'][1]
        ax.set_xlabel(r"$r_1$")
        ax.set_ylabel(r"$r_2$")
        colors = ['tab:blue', 'tab:orange']
        idx = 0

        if JCR_linear:
            confidence_ellipse(ax,
                               center=r_opt,
                               cov=cov,
                               ndata=ndata,
                               alpha=alpha,
                               color=colors[idx], label='linear JCR')
            idx += 1

        if JCR_exact:
            jcr = confidence_region(center=r_opt,
                                    sse=sse,
                                    ndata=ndata,
                                    alpha=alpha,
                                    width=2*ci_r[0],
                                    rtol=JCR_rtol,
                                    npoints=JCR_npoints)

            # ax.scatter(r1, r2, c=colors[idx], s=5)
            ax.plot(*jcr, color=colors[idx], label='exact JCR')

        ax.legend(loc="best")

    result = CopoFitResult(method=method,
                           r1=r_opt[0], r2=r_opt[1],
                           alpha=alpha,
                           ci_r1=ci_r[0], ci_r2=ci_r[1],
                           se_r1=se_r[0], se_r2=se_r[1],
                           cov=cov,
                           plots=plots)

    return result