Distributions¶
Overview¶
The polykin.distributions module offers a number of classes to deal with analytical and experimental (data) distributions commonly encountered in polymerization kinetics.
| Class | Required arguments | Optional arguments | Category |
|---|---|---|---|
Flory |
DPn | M0, name | analytical |
Poisson |
DPn | M0, name | analytical |
LogNormal |
DPn, PDI | M0, name | analytical |
SchulzZimm |
DPn, PDI | M0, name | analytical |
DataDistribution |
size_data, pdf_data | kind, M0, name, ... | data |
All analytical distributions have the number-average degree of polymerization (DPn) as first required positional argument. Additionally, the LogNormal and SchulzZimm distributions take the polydispersity index (PDI) as second required positional argument. In the case of DataDistribution, the required arguments are the measured/simulated values of the distribution curve. The average molar-mass of the repeating units (M0) and the distribution name (name) are optional keyword arguments.
Analytical distributions¶
# %pip install polykin
from polykin.distributions import Flory, Poisson, LogNormal, SchulzZimm
To instantiate an analytical distribution, we call the respective class constructor with the desired argument values. Here are some examples.
# Poisson distribution with DPn=100, and default M0=0.100 kg/mol and name=''
p = Poisson(100)
# Flory with DPn=120, and user-defined M0 and name
f = Flory(120, M0=0.065, name='polymer-F')
# Log-Normal with DPn=150, PDI=1.5, default M0 and user-defined name
g = LogNormal(150, 1.5, name='polymer-L')
# Schulz-Zimm with DPn=100, PDI=3, and user-defined M0 and name
s = SchulzZimm(100, 3, M0=0.80, name='polymer-Z')
The most important properties of a distribution can be displayed by evaluating the object directly.
f
type: Flory name: polymer-F DPn: 120.0 DPw: 239.0 DPz: 358.5 PDI: 1.99 M0: 0.065 kg/mol Mn: 7.800 kg/mol Mw: 15.535 kg/mol Mz: 23.302 kg/mol
or, equivalently, by using the print() function.
print(g)
type: LogNormal name: polymer-L DPn: 150.0 DPw: 225.0 DPz: 337.5 PDI: 1.50 M0: 0.100 kg/mol Mn: 15.000 kg/mol Mw: 22.500 kg/mol Mz: 33.750 kg/mol
The independent properties of the distribution (i.e., the variables which are arguments of the respective class constructor) can be modified after instantiation, if desired.
f.DPn = 100
f.M0 = 0.084
f.name = 'polymer-Y'
print(f)
type: Flory name: polymer-Y DPn: 100.0 DPw: 199.0 DPz: 298.5 PDI: 1.99 M0: 0.084 kg/mol Mn: 8.400 kg/mol Mw: 16.716 kg/mol Mz: 25.074 kg/mol
Any of these properties can be accessed by its name.
f.Mw/f.Mn # this should equal PDI
1.9899999999999995
Probability values¶
Probability density function (pdf) as well as cumulative distribution function (cfd) values can be obtained for any chain-length by using the methods pdf() and cdf(), respectively.
# mass pdf for a single chain-length
f.pdf(120)
0.003628853228028265
# mass pdf for a list/array of chain-lengths
f.pdf([110, 120, 130])
array([0.00367815, 0.00362885, 0.00355536])
# number pdf for a list/array of chain molar masses
p.pdf([9., 10., 11.], kind='number', sizeasmass=True) #
array([0.25042941, 0.40061472, 0.23418417])
# mass cdf at the first 3 characteristic DP averages
f.cdf([f.DPn, f.DPw, f.DPz], kind='mass')
array([0.26793532, 0.59535432, 0.80159978])
The pdf and cdf values are, of course, internally consistent, as illustrated by the example below.
import numpy as np
x = np.arange(1, p.DPn)
pdf = p.pdf(x, kind='number')
cdf = p.cdf(x[-1], kind='number')
print('sum(pdf):', np.sum(pdf))
print('cdf: ', cdf)
sum(pdf): 0.486634197669204 cdf: 0.4866341976692096
Random samples¶
Random samples of chain-lengths - for use in stochastic simulations, etc. - can be generated using the method random().
p.random() # single value
91
p.random(5) # vector of length 5
array([92, 90, 87, 94, 86], dtype=int64)
g.random((3, 4)) # array of shape (3,4)
array([[688., 290., 150., 190.],
[128., 530., 93., 88.],
[ 59., 122., 216., 234.]])
The random values are, of course, consistent with the distribution properties, as illustrated below for DPw.
x = f.random(10**5)
DPw_random = np.sum(x**2)/np.sum(x)
print("DPw random:", DPw_random)
print("DPw 'true':", f.DPw)
DPw random: 199.36937223959595 DPw 'true': 198.9999999999998
Plots¶
The method plot() allows for a quick visualization of the corresponding number, mass or GPC probability density function (pdf) or cumulative distribution function (cdf). If no arguments are specified, the mass pdf is displayed.
f.plot() # same as f.plot('mass')
Many other types of plots can be generated. For example, we can draw a plot overlaying the number and mass pdf, with an x-axis based on molar mass rather than chain-length.
f.plot(['number','mass'], sizeasmass=True)
If we select a GPC-type plot, the x-axis scale is automatically switched to 'log'.
g.plot('gpc', sizeasmass=True)
The plot() method tries to automatically adjust the range of the x-axis. The built-in algorithm works rather well for kind='number' and kind='mass', but less so for kind='gpc'. Whenever required, the range can be adjusted with the keyword argument xrange=(xmin, xmax).
g.plot('gpc', sizeasmass=True, xrange=(1., 300.))
The cdf can also be displayed by setting the corresponding keyword cdf to the desired y-axis. If cdf=1, the cdf is displayed on the primary axis (replacing the pdf). If cdf=2, the cdf is displayed on the secondary axis, while the pdf remains on the primary axis.
g.plot('gpc', sizeasmass=True, cdf=1) # cdf on primary y-axis (replaces pdf)
g.plot('gpc', sizeasmass=True, cdf=2) # cdf on secondary y-axis
If we want to save the plot or further manipulate it, we can set return_objects=True, which will give us handles to the figure and axes objects.
# fig, ax = g.plot('gpc', sizeasmass=True, cdf=2, return_objects=True)
# fig.savefig(g.name, bbox_inches='tight') # save the plot to `polymer-L.png`
Lastly, the function plotdists() can be used to overlay multiple distributions on the same plot. The optional keywords arguments are those of the plot() method.
from polykin.distributions import plotdists
_ = plotdists([f, g, s], kind='gpc', xrange=(1, 1e4))
Mixture distributions¶
Any of the abovementioned individual distributions can be combined in any proportion to obtain a resulting mixture distribution. This feature can be used, for instance, to compute the distribution of a polymer blend, as illustrated below.
First, let's define some sample polymer distributions.
a = Poisson(10, name="A")
b = Flory(50, name="B")
c = LogNormal(1000, 2, name="C")
d = SchulzZimm(10**4, 2, name="D")
Now, we define the blend using common algebra language. The (positive) numerical factors are the mass parts used to prepare the blend.
blend = 0.5*a + 1*b + 0.5*c + 0.2*d
As illustrated above, we can the object directly or use the print() command to display the key values of the distribution and of its components.
blend
type: MixtureDistribution name: A+B+C+D DPn: 31.2 DPw: 2320.2 DPz: 24295.4 PDI: 74.37 M0: 0.100 kg/mol Mn: 3.120 kg/mol Mw: 232.020 kg/mol Mz: 2,429.542 kg/mol # Weight Distribution DPn DPw PDI ------------------------------------------------------- 1 0.500 Poisson 1.00e+01 1.09e+01 1.09 2 1.000 Flory 5.00e+01 9.90e+01 1.98 3 0.500 LogNormal 1.00e+03 2.00e+03 2.00 4 0.200 SchulzZimm 1.00e+04 2.00e+04 2.00
print(blend.components_table)
# Weight Distribution DPn DPw PDI ------------------------------------------------------- 1 0.500 Poisson 1.00e+01 1.09e+01 1.09 2 1.000 Flory 5.00e+01 9.90e+01 1.98 3 0.500 LogNormal 1.00e+03 2.00e+03 2.00 4 0.200 SchulzZimm 1.00e+04 2.00e+04 2.00
And the built-in plot() method is also available for mixtures.
blend.plot('gpc', cdf=2)
Data distributions¶
from polykin.distributions import DataDistribution
Data distributions are constructed from numerical values of the distribution curve. A typical example, would be a molar mass distribution curve determined by size-exclusion chromatography (SEC). Since we do not have experimental data at hand, we'll use sample data provided with the package.
from polykin.distributions import sample_mmd
sampleX = DataDistribution(sample_mmd['size_data'], sample_mmd['pdf_data'],
kind=sample_mmd['kind'], name='sample-X')
sampleX.plot('gpc')
Most of the methods illustrated above for analytical and mixture distributions are also available for DataDistribution(s).
A data distribution can be numerically "deconvoluted" to find the combination of weights and individual distributions which best describe it using the method fit(). For example, let's try deconvoluting the above distribution as a combination of 4 distributions of type LogNormal.
sampleX_fit = sampleX.fit(LogNormal, 4)
# Weight Distribution DPn DPw PDI ------------------------------------------------------- 1 0.493 LogNormal 1.06e+02 5.26e+02 4.98 2 0.246 LogNormal 1.06e+03 1.49e+03 1.40 3 0.201 LogNormal 5.46e+03 7.67e+03 1.40 4 0.060 LogNormal 9.82e+04 1.71e+05 1.75
_ = plotdists([sampleX, sampleX_fit], 'gpc')
The fit is quite good, but it can be made even better. Go ahead, give it a try!