Source code for pympcc._stationarity

"""
MPCC stationarity classification using IPOPT Lagrange multipliers.

Stationarity hierarchy (weakest → strongest, Ye 2000):

    W-stationary ⊂ C-stationary ⊂ M-stationary ⊂ S-stationary

For biactive pairs I_00 = {i : G_i ≤ tol AND H_i ≤ tol}, let μ_G_i and μ_H_i
be the MPCC multipliers for the G_i ≥ 0 and H_i ≥ 0 constraints respectively:

* **W-stationary**: KKT of the relaxed NLP is satisfied.  Always true when
  IPOPT converged successfully.
* **C-stationary**: W + μ_G_i · μ_H_i ≥ 0 for each biactive pair.
  Captures "same-sign" multipliers; "both negative" satisfies C but not M.
* **M-stationary**: C + NOT(both strictly negative) for each biactive pair.
  Formally: (μ_G_i > 0 AND μ_H_i > 0) OR (μ_G_i · μ_H_i = 0).
* **S-stationary**: μ_G_i ≥ 0 AND μ_H_i ≥ 0 for every biactive pair.
  The strongest KKT-based condition; implies MPCC-LICQ holds at the solution.

Sign convention note
--------------------
IPOPT solves min f(x) s.t. cl ≤ c(x) ≤ cu with KKT ∇f + J(x)ᵀ λ = 0.
For a constraint at its *lower* bound (e.g. G_i ≥ 0 with G_i = 0), IPOPT
returns λ_i ≤ 0.  The literature convention is μ_G_i = −λ_i ≥ 0.
All functions in this module negate the raw ``mult_g`` slices before testing.

Practical limitation: interior-point bias toward S-stationarity
---------------------------------------------------------------
IPOPT uses a primal-dual interior-point method where the barrier parameter
drives *all* multipliers for active lower-bound constraints (G ≥ 0, H ≥ 0)
toward non-negative values at convergence.  Concretely, for a biactive pair
(G_i ≈ 0, H_i ≈ 0), both μ_G_i = −λ_G_i ≥ 0 and μ_H_i = −λ_H_i ≥ 0 hold
by NLP complementarity, regardless of the true MPCC stationarity type.
As a result, ``classify_stationarity`` will almost always report
``"S-stationary"`` for fully converged IPOPT solutions.

The classification is most informative when:

* Comparing partially converged iterates (e.g., intermediate ``ε`` values in
  Scholtes / smoothing), where the barrier has not fully enforced positivity.
* Checking the sign of multipliers at a *prescribed* candidate solution (not
  one produced by IPOPT).
* Using a large IPOPT tolerance (``tol ≥ 1e-4``) that leaves multipliers
  further from their limit, making sign variations observable.

For a definitive stationarity check, use ``compute_kkt_residual`` to verify
that the optimality residual is small, and inspect ``result.comp_residual``
to confirm complementarity feasibility.  The stationarity label supplements
these measures; it does not replace them.
"""
from __future__ import annotations

from typing import Optional

import numpy as np

from ._constants import BIACTIVE_TOL as _BIACTIVE_TOL
from .problem import MPCCProblem
from .result import MPCCResult

__all__ = ["classify_stationarity", "compute_kkt_residual", "verify_b_stationarity"]



[docs]
def classify_stationarity(
    result: MPCCResult,
    problem: MPCCProblem,
    tol: float = _BIACTIVE_TOL,
) -> str:
    """
    Classify the MPCC stationarity type of a solved result.

    Returns the strongest level satisfied from the hierarchy
    ``W-stationary ⊂ C-stationary ⊂ M-stationary ⊂ S-stationary``
    (W is weakest, S is strongest).

    Parameters
    ----------
    result : MPCCResult
        A solved MPCC result (from :func:`pympcc.solve`).
    problem : MPCCProblem
        The problem instance used to produce the result.
    tol : float, optional
        Numerical tolerance (default ``1e-6``).  Controls two things:

        * **Biactive detection**: pair *i* is biactive when
          ``G_i ≤ tol AND H_i ≤ tol``.
        * **Sign comparison**: multipliers within ``[-tol, tol]`` are
          treated as zero.

    Returns
    -------
    str
        One of: ``"S-stationary"``, ``"M-stationary"``, ``"C-stationary"``,
        ``"W-stationary"``, ``"unknown"``, or ``"not stationary"``.

        * ``"unknown"`` — multipliers were not stored (``result.mult_g`` is
          ``None``).
        * ``"not stationary"`` — the solve did not converge
          (``result.success`` is ``False``).

    Notes
    -----
    The classification uses the Lagrange multipliers for the G(x) ≥ 0 and
    H(x) ≥ 0 constraints, located at indices
    ``[n_ineq + n_eq : n_ineq + n_eq + 2*n_comp]`` of ``result.mult_g``.
    The product/smoothing constraint multipliers (the last ``n_comp`` entries)
    are intentionally excluded — they are an artifact of the NLP reformulation.

    Practical interpretation:
        * **C-stationary**: sufficient for many engineering applications.
        * **S-stationary**: implies MPCC-LICQ holds; equivalent to
          B-stationary under MPCC-LICQ (Scheel & Scholtes 2000).
        * **W/M-stationary**: solution may be a saddle point; consider
          multi-start or a different strategy.

    References
    ----------
    Ye, J.J. (2000). Necessary optimality conditions for MPECs with
    equilibrium constraints. *Mathematics of Operations Research*, 25(4).

    Scheel, H. and Scholtes, S. (2000). Mathematical programs with
    complementarity constraints: stationarity, optimality, and sensitivity.
    *Mathematics of Operations Research*, 25(1), 1–22.
    """
    if not result.success:
        return "not stationary"
    if result.mult_g is None:
        return "unknown"

    offset = problem.n_ineq + problem.n_eq
    # Convert IPOPT sign convention to literature convention (negate).
    # In IPOPT: KKT ∇f + Jᵀλ = 0 → λ ≤ 0 at active lower bound.
    # In literature: μ = −λ ≥ 0 at S-stationary points.
    mu_G = -result.mult_g[offset           : offset + problem.n_comp]
    mu_H = -result.mult_g[offset + problem.n_comp : offset + 2 * problem.n_comp]

    # Biactive set I_00 = {i : G_i ≤ tol AND H_i ≤ tol}
    mask = (result.G <= tol) & (result.H <= tol)

    # Vacuously S-stationary when there are no degenerate pairs.
    if not np.any(mask):
        return "S-stationary"

    mu_G_ba = mu_G[mask]
    mu_H_ba = mu_H[mask]

    # S-stationary: both multipliers ≥ 0 for every biactive pair.
    # Allow small negative values (within tol) as numerical noise.
    if np.all(mu_G_ba >= -tol) and np.all(mu_H_ba >= -tol):
        return "S-stationary"

    # M-stationary: for each biactive pair, either
    #   (a) both strictly positive, or
    #   (b) at least one is ≈ 0 (product ≈ 0)
    m_cond = (
        ((mu_G_ba > tol) & (mu_H_ba > tol))
        | ((np.abs(mu_G_ba) <= tol) | (np.abs(mu_H_ba) <= tol))
    )
    if np.all(m_cond):
        return "M-stationary"

    # C-stationary: product ≥ 0 for every biactive pair.
    # Handles the "both negative" case (product > 0) which fails M.
    if np.all(mu_G_ba * mu_H_ba >= -(tol ** 2)):
        return "C-stationary"

    # W-stationary: KKT satisfied (always true when IPOPT converged).
    return "W-stationary"



def _dense_jac(
    values: np.ndarray,
    sparsity: Optional[tuple],
    n_rows: int,
    n_cols: int,
) -> np.ndarray:
    """Return a dense ``(n_rows, n_cols)`` Jacobian from sparse or dense input."""
    vals = np.asarray(values, dtype=float)
    if vals.ndim == 2:
        return vals
    if sparsity is not None:
        dense = np.zeros((n_rows, n_cols))
        dense[np.asarray(sparsity[0]), np.asarray(sparsity[1])] = vals
        return dense
    return vals.reshape(n_rows, n_cols)


def _jac_T_vec(
    values: np.ndarray,
    sparsity: Optional[tuple],
    n_rows: int,
    n_cols: int,
    vec: np.ndarray,
) -> np.ndarray:
    """Compute ``J.T @ vec`` without materialising the dense J.

    For sparse COO Jacobians ``J.T @ vec`` via ``_dense_jac`` requires an
    ``(n_rows × n_cols)`` dense allocation.  At N=128 with n_rows=81920 and
    n_cols=99328 that is ~65 GB.  Instead, scatter-add directly::

        out[cols[k]] += vals[k] * vec[rows[k]]   for each nonzero k

    which is O(nnz) in time and O(n_cols) in memory.
    """
    vals = np.asarray(values, dtype=float)
    vec  = np.asarray(vec, dtype=float)
    if vals.ndim == 2:
        return vals.T @ vec
    if sparsity is None:
        return vals.reshape(n_rows, n_cols).T @ vec
    rows = np.asarray(sparsity[0])
    cols = np.asarray(sparsity[1])
    return np.bincount(cols, weights=vals * vec[rows], minlength=n_cols)



[docs]
def compute_kkt_residual(
    result: MPCCResult,
    problem: MPCCProblem,
    *,
    mpcc_mult_G: Optional[np.ndarray] = None,
    mpcc_mult_H: Optional[np.ndarray] = None,
    mult_x_L: Optional[np.ndarray] = None,
    mult_x_U: Optional[np.ndarray] = None,
) -> Optional[float]:
    """
    MPCC stationarity residual:
    ``||∇f + Jg^T λ_g + Jh^T λ_h + JG^T μ_G + JH^T μ_H - z_L + z_U||_∞``.

    Parameters
    ----------
    result : MPCCResult
        A solved MPCC result.
    problem : MPCCProblem
        The problem instance used to produce the result.
    mpcc_mult_G : ndarray, shape (n_comp,)
        MPCC multipliers for the ``G(x) ≥ 0`` constraints.  For Scholtes /
        direct these are ``λ_G + H ⊙ λ_GH``; for lin-fukushima additionally
        ``+ λ_GPH``; for smoothing ``+ (1 − G/r) ⊙ λ_φ``; for augmented
        Lagrangian ``λ_G + μ_AL ⊙ H``; for slack ``λ_{G−s_G}`` directly.
    mpcc_mult_H : ndarray, shape (n_comp,)
        MPCC multipliers for the ``H(x) ≥ 0`` constraints (analogous).
    mult_x_L : ndarray, shape ≥ n, optional
        Variable lower-bound multipliers ``z_L ≥ 0`` from IPOPT.
        Only the first ``n`` entries are used.  ``None`` is treated as zero.
    mult_x_U : ndarray, shape ≥ n, optional
        Variable upper-bound multipliers ``z_U ≥ 0`` from IPOPT.
        Only the first ``n`` entries are used.  ``None`` is treated as zero.

    Returns
    -------
    float or None
        Infinity-norm of the stationarity residual, or ``None`` if
        ``mpcc_mult_G`` / ``mpcc_mult_H`` were not provided.
    """
    if result.mult_g is None or mpcc_mult_G is None or mpcc_mult_H is None:
        return None

    p = problem
    x = result.x
    n_g, n_h = p.n_ineq, p.n_eq

    r = np.asarray(p.gradient(x), dtype=float).copy()  # type: ignore[misc, operator]
    if mult_x_L is not None:
        r -= mult_x_L[:p.n]
    if mult_x_U is not None:
        r += mult_x_U[:p.n]

    if n_g:
        r += _jac_T_vec(p.ineq_jacobian(x), p.ineq_jacobian_sparsity, n_g, p.n, result.mult_g[:n_g])  # type: ignore[misc, operator]
    if n_h:
        r += _jac_T_vec(p.eq_jacobian(x), p.eq_jacobian_sparsity, n_h, p.n, result.mult_g[n_g : n_g + n_h])  # type: ignore[misc, operator]

    r += _jac_T_vec(p.comp_G_jacobian(x), p.comp_G_jacobian_sparsity, p.n_comp, p.n, mpcc_mult_G)  # type: ignore[misc, operator]
    r += _jac_T_vec(p.comp_H_jacobian(x), p.comp_H_jacobian_sparsity, p.n_comp, p.n, mpcc_mult_H)  # type: ignore[misc, operator]

    return float(np.max(np.abs(r)))




[docs]
def verify_b_stationarity(
    result: MPCCResult,
    problem: MPCCProblem,
    *,
    tol: float = _BIACTIVE_TOL,
    max_biactive: int = 10,
) -> dict:
    """
    Test B-stationarity of an MPCC solution by enumerating linearised-MPCC
    tangent-cone branches and solving an LP per branch.

    For each biactive pair ``i ∈ I_00 = {i : G_i(x*) ≤ tol AND H_i(x*) ≤ tol}``
    the linearised tangent cone splits into two branches:

      * **G branch** (G stays active): ``∇G_iᵀ d = 0,  ∇H_iᵀ d ≥ 0``
      * **H branch** (H stays active): ``∇H_iᵀ d = 0,  ∇G_iᵀ d ≥ 0``

    For every assignment of branches to the ``|I_00|`` pairs
    (``2^|I_00|`` LPs), solve::

        min  ∇f(x*)ᵀ d
        s.t. ∇g_jᵀ d ≤ 0          j active inequality (g_j(x*) ≥ -tol)
             ∇hᵀ d   = 0
             ∇G_iᵀ d = 0          i ∈ I_0+ (G_i = 0, H_i > 0 — H>0 locally
                                              forces G_i to stay zero)
             ∇H_iᵀ d = 0          i ∈ I_+0 (G_i > 0, H_i = 0 — G>0 locally
                                              forces H_i to stay zero)
             branch-specific G/H linearisations for i ∈ I_00
             d_j ≥ 0  if x*_j is at xl_j;  d_j ≤ 0 if at xu_j
             ‖d‖_∞ ≤ 1            (normalisation; LP would be unbounded)

    ``x*`` is **B-stationary** iff every branch's LP minimum is ≥ 0
    (no feasible first-order descent direction in the linearised cone).

    Parameters
    ----------
    result : MPCCResult
        A solved MPCC result (``result.x``, ``result.G``, ``result.H``,
        ``result.success`` are read).
    problem : MPCCProblem
        The problem instance used to produce the result.
    tol : float, optional
        Tolerance for the active-set partition (default ``1e-6``).
        Pair *i* is biactive when ``G_i ≤ tol AND H_i ≤ tol``;
        a constraint ``g_j`` is active when ``g_j ≥ -tol``;
        a variable bound is active when ``|x_j − xl_j| ≤ tol`` (resp. ``xu``).
    max_biactive : int, optional
        Skip when ``|I_00| > max_biactive`` (default ``10`` → up to 1024 LPs).

    Returns
    -------
    dict
        Result dict with keys:

        - ``status`` — one of ``"B-stationary"`` (every branch min ≥ -tol),
          ``"not B-stationary"`` (at least one branch admits descent),
          ``"intractable"`` (``|I_00| > max_biactive``; LP enumeration
          skipped), or ``"unknown"`` (``result.success is False``).
        - ``n_biactive`` — ``|I_00|``.
        - ``n_branches_checked`` — LPs actually solved (0 on early exit).
        - ``min_descent`` — minimum ``∇fᵀd`` over branches (``None`` on
          early exit).
        - ``witness_branch`` — tuple of ``'G'`` / ``'H'`` chars indexed by
          the biactive pairs in I_00 order; ``None`` if B-stationary.
        - ``witness_d`` — ndarray, shape ``(n,)`` — the descent direction;
          ``None`` if B-stationary.

    Notes
    -----
    Uses dense LPs (``scipy.optimize.linprog`` with method ``"highs"``).
    Intended as a diagnostic for small/medium MPCCs; cost is
    ``O(2^|I_00| · LP)`` and the per-LP build is dense in ``n``.

    The strongest stationarity level reachable from KKT multipliers alone
    (``classify_stationarity``) is S, which under MPCC-LICQ is equivalent
    to B-stationarity. This routine *directly* certifies B-stationarity
    without needing MPCC-LICQ to hold.
    """
    p = problem
    n = p.n

    if not result.success:
        return {
            "status": "unknown",
            "n_biactive": 0,
            "n_branches_checked": 0,
            "min_descent": None,
            "witness_branch": None,
            "witness_d": None,
        }

    G = np.asarray(result.G, dtype=float)
    H = np.asarray(result.H, dtype=float)

    I_00 = np.where((G <= tol) & (H <= tol))[0]
    I_0p = np.where((G <= tol) & (H >  tol))[0]
    I_p0 = np.where((G >  tol) & (H <= tol))[0]
    n_bi = int(I_00.size)

    if n_bi > max_biactive:
        return {
            "status": "intractable",
            "n_biactive": n_bi,
            "n_branches_checked": 0,
            "min_descent": None,
            "witness_branch": None,
            "witness_d": None,
        }

    x = np.asarray(result.x, dtype=float)
    grad = np.asarray(p.gradient(x), dtype=float)  # type: ignore[misc, operator]

    JG = _dense_jac(p.comp_G_jacobian(x), p.comp_G_jacobian_sparsity, p.n_comp, n)  # type: ignore[misc, operator]
    JH = _dense_jac(p.comp_H_jacobian(x), p.comp_H_jacobian_sparsity, p.n_comp, n)  # type: ignore[misc, operator]

    A_eq_base: list = []
    A_ub_base: list = []

    if p.n_eq:
        Jh = _dense_jac(p.eq_jacobian(x), p.eq_jacobian_sparsity, p.n_eq, n)  # type: ignore[misc, operator]
        A_eq_base.append(Jh)

    if p.n_ineq and p.ineq_constraints is not None:
        gvals = np.asarray(p.ineq_constraints(x), dtype=float)
        Jg = _dense_jac(p.ineq_jacobian(x), p.ineq_jacobian_sparsity, p.n_ineq, n)  # type: ignore[misc, operator]
        active = gvals >= -tol
        if np.any(active):
            A_ub_base.append(Jg[active])

    # I_0p (G_i = 0, H_i > 0): H>0 locally forces ∇G_i·d = 0.
    # I_p0 (G_i > 0, H_i = 0): G>0 locally forces ∇H_i·d = 0.
    if I_0p.size:
        A_eq_base.append(JG[I_0p])
    if I_p0.size:
        A_eq_base.append(JH[I_p0])

    xl = np.asarray(p.xl, dtype=float) if p.xl is not None else np.full(n, -np.inf)
    xu = np.asarray(p.xu, dtype=float) if p.xu is not None else np.full(n, np.inf)
    bounds: list = []
    for j in range(n):
        lo, hi = -1.0, 1.0
        if np.isfinite(xl[j]) and (x[j] - xl[j]) <= tol:
            lo = 0.0
        if np.isfinite(xu[j]) and (xu[j] - x[j]) <= tol:
            hi = 0.0
        if lo > hi:
            lo = hi
        bounds.append((lo, hi))

    from scipy.optimize import linprog  # noqa: PLC0415

    n_branches = 1 if n_bi == 0 else (1 << n_bi)
    min_descent = 0.0
    witness_branch = None
    witness_d = None

    for branch_idx in range(n_branches):
        A_eq_rows = list(A_eq_base)
        A_ub_rows = list(A_ub_base)
        chars: list = []
        for k, i in enumerate(I_00):
            bit = (branch_idx >> k) & 1
            if bit == 0:
                A_eq_rows.append(JG[i:i + 1])
                A_ub_rows.append(-JH[i:i + 1])
                chars.append("G")
            else:
                A_eq_rows.append(JH[i:i + 1])
                A_ub_rows.append(-JG[i:i + 1])
                chars.append("H")

        A_eq = np.vstack(A_eq_rows) if A_eq_rows else None
        b_eq = np.zeros(A_eq.shape[0]) if A_eq is not None else None
        A_ub = np.vstack(A_ub_rows) if A_ub_rows else None
        b_ub = np.zeros(A_ub.shape[0]) if A_ub is not None else None

        lp = linprog(grad, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                     bounds=bounds, method="highs")
        if not lp.success:
            continue

        if lp.fun < min_descent:
            min_descent = float(lp.fun)
            if lp.fun < -tol:
                witness_branch = tuple(chars)
                witness_d = np.asarray(lp.x, dtype=float)

    status = "not B-stationary" if witness_d is not None else "B-stationary"
    return {
        "status": status,
        "n_biactive": n_bi,
        "n_branches_checked": n_branches,
        "min_descent": float(min_descent),
        "witness_branch": witness_branch,
        "witness_d": witness_d,
    }