polykin.math.derivatives¤

hessian_forward ¤

hessian_forward(
    f: Callable[[FloatVector], float],
    x: FloatVector,
    *,
    fx: float | None = None,
    sclx: FloatVector | None = None,
    epsf: float | None = None
) -> FloatSquareMatrix

Calculate the numerical Hessian of a scalar function \(f(\mathbf{x})\) using the forward finite-difference scheme.

\[ H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{ f(\mathbf{x} + \mathbf{e}_i h_i + \mathbf{e}_j h_j) - f(\mathbf{x} + \mathbf{e}_i h_i) - f(\mathbf{x} + \mathbf{e}_j h_j) + f(\mathbf{x})}{h_i h_j} \]

The step size \(h_j\) is optimally determined according to the machine precision of the function values. Typically, the Hessian is accurate to about a third of the number of reliable digits returned by the function.

References

J.E. Dennis Jr., R.B. Schnabel, "Numerical Methods for Unconstrained Optimization and Nonlinear Equations", SIAM, 1996, p. 321.

PARAMETER	DESCRIPTION
`f`	Function to be differentiated. TYPE: `Callable[[FloatVector], float]`
`x`	Differentiation point. TYPE: `FloatVector`
`fx`	Function value at `x`, if available. TYPE: `float \| None` DEFAULT: `None`
`sclx`	Scaling factors for `x`. Ideally, `x[i]sclx[i]` is close to 1. By default, the factors are set internally based on the magnitudes of `x`. TYPE:* `FloatVector \| None` DEFAULT: `None`
`epsf`	Machine precision of the function values. If `None`, machine precision of 64-bit floating-point type is assumed. If the number of reliable base-10 digits in the results returned by the function is \(n\), then `epsf` is approximately \(10^{-n}\). TYPE: `float \| None` DEFAULT: `None`

RETURNS	DESCRIPTION
`FloatSquareMatrix`	Hessian matrix.

Examples:

Evaluate the numerical hessian of f(x) = x1**2 * x2**3 at (2, -2).

>>> from polykin.math import hessian_forward
>>> import numpy as np
>>> def f(x): return x[0]**2 * x[1]**3
>>> hessian_forward(f, np.array([2.0, -2.0]))
array([[-16.00001242,  47.99984347],
       [ 47.99984347, -47.99972236]])

Source code in src/polykin/math/derivatives/ndiff.py

def hessian_forward(
    f: Callable[[FloatVector], float],
    x: FloatVector,
    *,
    fx: float | None = None,
    sclx: FloatVector | None = None,
    epsf: float | None = None,
) -> FloatSquareMatrix:
    r"""Calculate the numerical Hessian of a scalar function $f(\mathbf{x})$
    using the forward finite-difference scheme.

    $$ H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}
        = \frac{
          f(\mathbf{x} + \mathbf{e}_i h_i + \mathbf{e}_j h_j)
        - f(\mathbf{x} + \mathbf{e}_i h_i) - f(\mathbf{x} + \mathbf{e}_j h_j)
        + f(\mathbf{x})}{h_i h_j} $$

    The step size $h_j$ is optimally determined according to the machine precision of the
    function values. Typically, the Hessian is accurate to about a third of the number of
    reliable digits returned by the function.

    **References**

    *   J.E. Dennis Jr., R.B. Schnabel, "Numerical Methods for Unconstrained Optimization
        and Nonlinear Equations", SIAM, 1996, p. 321.

    Parameters
    ----------
    f : Callable[[FloatVector], float]
        Function to be differentiated.
    x : FloatVector
        Differentiation point.
    fx : float | None
        Function value at `x`, if available.
    sclx : FloatVector | None
        Scaling factors for `x`. Ideally, `x[i]*sclx[i]` is close to 1. By
        default, the factors are set internally based on the magnitudes of `x`.
    epsf : float | None
        Machine precision of the function values. If `None`, machine precision of 64-bit
        floating-point type is assumed. If the number of reliable base-10 digits in the
        results returned by the function is $n$, then `epsf` is approximately $10^{-n}$.

    Returns
    -------
    FloatSquareMatrix
        Hessian matrix.

    Examples
    --------
    Evaluate the numerical hessian of f(x) = x1**2 * x2**3 at (2, -2).
    >>> from polykin.math import hessian_forward
    >>> import numpy as np
    >>> def f(x): return x[0]**2 * x[1]**3
    >>> hessian_forward(f, np.array([2.0, -2.0]))
    array([[-16.00001242,  47.99984347],
           [ 47.99984347, -47.99972236]])
    """
    fx = fx if fx is not None else f(x)
    sclx = np.abs(sclx) if sclx is not None else scalex(x)

    epsf = max(epsf, eps) if epsf is not None else eps
    h0 = math.cbrt(epsf)

    N = x.size
    H = np.empty((N, N))
    fh = np.empty(N)
    h = np.empty(N)
    xh = x.copy()

    for i in range(N):
        h[i] = h0 * max(abs(x[i]), 1 / sclx[i])
        xtemp1 = xh[i]
        xh[i] += h[i]
        h[i] = xh[i] - xtemp1
        fh[i] = f(xh)
        xh[i] = xtemp1

    for i in range(N):
        xtemp1 = xh[i]
        xh[i] += 2 * h[i]
        H[i, i] = ((fx - fh[i]) + (f(xh) - fh[i])) / (h[i] ** 2)
        xh[i] = xtemp1 + h[i]
        for j in range(i + 1, N):
            xtemp2 = xh[j]
            xh[j] += h[j]
            H[i, j] = ((fx - fh[i]) + (f(xh) - fh[j])) / (h[i] * h[j])
            H[j, i] = H[i, j]
            xh[j] = xtemp2
        xh[i] = xtemp1

    return H