Trust Region

Trust-Region (KL over States)

Wrapper that enforces a trust-region constraint, measured as a (symmetric) KL proxy over state branches. If the KL bound is exceeded, it performs a backtracking line search along the update direction until the bound is satisfied.
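The backtracking idea described above can be sketched as follows. This is a minimal illustration and not the library's implementation: the function name backtrack_to_kl_bound, the kl_of callable, and the choice to fall back to the old parameters when no scale is accepted are all assumptions. Only the parameter names delta_kl, backtrack, and max_backtracks are taken from the class signature.

```python
import numpy as np


def backtrack_to_kl_bound(old_params, proposed_params, kl_of,
                          delta_kl=0.02, backtrack=0.7, max_backtracks=8):
    """Shrink the step old -> proposed until kl_of(candidate) <= delta_kl.

    kl_of is a hypothetical callable mapping candidate params to a KL value
    (or proxy) measured against the old params.
    """
    scale = 1.0
    for _ in range(max_backtracks + 1):  # the full step plus max_backtracks retries
        candidate = {
            k: old_params[k] + scale * (proposed_params[k] - old_params[k])
            for k in old_params
        }
        if kl_of(candidate) <= delta_kl:
            return candidate, scale
        scale *= backtrack  # shrink along the same update direction
    # Assumption: if no scale satisfies the bound, reject the step entirely.
    return dict(old_params), 0.0


old = {"w": np.zeros(2)}
proposed = {"w": np.array([1.0, 0.0])}


def toy_kl(candidate):
    # Toy quadratic proxy standing in for a real KL estimate.
    return float(0.5 * np.sum((candidate["w"] - old["w"]) ** 2))


accepted, scale = backtrack_to_kl_bound(old, proposed, toy_kl)
```

With this toy proxy the bound 0.5 * scale**2 <= 0.02 requires scale <= 0.2, so the full step is shrunk five times and accepted at scale 0.7**5 (about 0.168).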

Interface:
  • initialize(params) -> state

  • step_params(model, params, context) -> (new_params, state)

Requires:
  • base optimizer exposing step_params(…)

  • context["kl_fn"](old_info, new_info) -> float (KL or proxy)

  • context["info"]: current state's info dict (with 'branches' if possible)

  • context[“refresh_info”](model, params, context) -> info (callable)
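One way to assemble a context that meets the requirements listed above is sketched here. The layout of info["branches"] is not specified in this page, so the example assumes it is an array of shape (n_branches, n_outcomes) whose rows are probability distributions. The names symmetric_kl and toy_model are hypothetical, and the model is assumed to be a plain callable from params to branch probabilities.

```python
import numpy as np


def symmetric_kl(old_info, new_info, eps=1e-12):
    """Symmetrised KL, 0.5 * (KL(p||q) + KL(q||p)), averaged over branches."""
    p = np.clip(np.asarray(old_info["branches"], dtype=float), eps, None)
    q = np.clip(np.asarray(new_info["branches"], dtype=float), eps, None)
    p = p / p.sum(axis=-1, keepdims=True)  # renormalise after clipping
    q = q / q.sum(axis=-1, keepdims=True)
    kl_pq = np.sum(p * np.log(p / q), axis=-1)
    kl_qp = np.sum(q * np.log(q / p), axis=-1)
    return float(np.mean(0.5 * (kl_pq + kl_qp)))


def refresh_info(model, params, context):
    # Assumption: calling the model on params yields the branch distributions.
    return {"branches": model(params)}


def toy_model(params):
    # Hypothetical model: a row-wise softmax over params["logits"].
    logits = params["logits"]
    z = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)


params = {"logits": np.zeros((3, 4))}
context = {"kl_fn": symmetric_kl, "refresh_info": refresh_info}
context["info"] = refresh_info(toy_model, params, context)
```

Averaging the two KL directions makes the proxy independent of argument order, which matches the "(symmetric)" wording; a one-sided KL would also fit the documented kl_fn signature.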

class qmlhc.optim.numpy_optim.trust_region.HCTrustRegion(base_opt, delta_kl=0.02, backtrack=0.7, max_backtracks=8)

Bases: object

Trust-region wrapper with KL constraint over state-space.

initialize(params)
Return type: Dict[str, Any]

step_params(model, params, context)
Return type: Tuple[Dict[str, Any], Dict[str, Any]]
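The calling convention implied by these two methods can be sketched with a stand-in optimizer. Everything below is illustrative: PlainGradientStep, the run helper, and the "grad_fn" context key are hypothetical and are not part of qmlhc. Whether the caller or the wrapper is responsible for updating context["info"] after each step is not stated in this page, so the loop refreshes it explicitly as an assumption.

```python
from typing import Any, Dict, Tuple

import numpy as np

Params = Dict[str, Any]


class PlainGradientStep:
    """Hypothetical base optimizer exposing initialize and step_params."""

    def __init__(self, lr: float = 0.5) -> None:
        self.lr = lr

    def initialize(self, params: Params) -> Dict[str, Any]:
        return {"lr": self.lr}

    def step_params(self, model, params: Params,
                    context: Dict[str, Any]) -> Tuple[Params, Dict[str, Any]]:
        # "grad_fn" is an assumed context key used only by this toy optimizer.
        grads = context["grad_fn"](model, params)
        new_params = {k: params[k] - self.lr * grads[k] for k in params}
        return new_params, {"lr": self.lr}


def run(opt, model, params: Params, context: Dict[str, Any], n_steps: int):
    """Drive any optimizer that follows the documented two-method interface."""
    state = opt.initialize(params)
    for _ in range(n_steps):
        params, state = opt.step_params(model, params, context)
        # Assumption: the caller keeps context["info"] in sync with the params.
        context["info"] = context["refresh_info"](model, params, context)
    return params, state


target = np.array([1.0, -2.0])
context = {
    # Gradient of 0.5 * ||w - target||^2; the model argument is unused here.
    "grad_fn": lambda model, p: {"w": p["w"] - target},
    "refresh_info": lambda model, p, ctx: {"branches": p["w"].copy()},
    # Unused by the stand-in; listed because the wrapper requires it.
    "kl_fn": lambda old, new: float(
        0.5 * np.sum((new["branches"] - old["branches"]) ** 2)
    ),
}
params = {"w": np.zeros(2)}
context["info"] = context["refresh_info"](None, params, context)
final_params, final_state = run(PlainGradientStep(lr=0.5), None, params,
                                context, n_steps=20)
```

Because the wrapper documents the same initialize and step_params methods as the base optimizer it wraps, an HCTrustRegion(base_opt) instance would presumably take the place of the stand-in in the run call without changes to the loop.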