Trust Region

Trust-Region (KL over States)

Wrapper around a base optimizer that enforces a trust-region constraint, measured as a (symmetric) KL proxy over state branches. If the proposed update exceeds the KL bound, the wrapper performs a backtracking line search along the update direction until the bound is satisfied.
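The backtracking step described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name backtrack_to_trust_region, the scalar-valued params, the kl_of callable, and the fallback of keeping the old parameters when every backtrack fails are all assumptions. Only the default values delta_kl=0.02, backtrack=0.7, and max_backtracks=8 come from the HCTrustRegion signature.

```python
from typing import Callable, Dict, Tuple

Params = Dict[str, float]  # assumption: scalar parameters, for brevity


def backtrack_to_trust_region(
    old_params: Params,
    proposed_params: Params,
    kl_of: Callable[[Params], float],  # hypothetical: KL (or proxy) of a candidate vs. the old state
    delta_kl: float = 0.02,
    backtrack: float = 0.7,
    max_backtracks: int = 8,
) -> Tuple[Params, float]:
    """Shrink the step old -> proposed until kl_of(candidate) <= delta_kl."""
    scale = 1.0
    # One try at full step size, then up to max_backtracks shrunken tries.
    for _ in range(max_backtracks + 1):
        candidate = {
            k: old_params[k] + scale * (proposed_params[k] - old_params[k])
            for k in old_params
        }
        if kl_of(candidate) <= delta_kl:
            return candidate, scale
        scale *= backtrack
    # Assumption: if no candidate fits the bound, reject the step entirely.
    return dict(old_params), 0.0


old = {"w": 0.0}
proposed = {"w": 1.0}


def kl_of(candidate: Params) -> float:
    # Toy quadratic proxy standing in for a real KL measurement.
    return 0.5 * sum((candidate[k] - old[k]) ** 2 for k in old)


candidate, scale = backtrack_to_trust_region(old, proposed, kl_of)
```

With this toy proxy the full step gives a KL of 0.5, so the step is shrunk by the factor 0.7 five times before the candidate falls inside the bound of 0.02.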

Interface:

initialize(params) -> state
step_params(model, params, context) -> (new_params, state)
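A base optimizer that satisfies this interface could look like the sketch below. ToySGD, its lr argument, the "step" counter in the state, and the context["grad_fn"] entry are hypothetical names introduced for illustration; the source only specifies the two method signatures.

```python
from typing import Any, Dict, Tuple


class ToySGD:
    """Hypothetical gradient-descent optimizer exposing initialize/step_params."""

    def __init__(self, lr: float = 0.1) -> None:
        self.lr = lr
        self._state: Dict[str, Any] = {"step": 0}

    def initialize(self, params: Dict[str, Any]) -> Dict[str, Any]:
        # Reset the internal state; params is unused by plain SGD.
        self._state = {"step": 0}
        return dict(self._state)

    def step_params(
        self, model: Any, params: Dict[str, Any], context: Dict[str, Any]
    ) -> Tuple[Dict[str, Any], Dict[str, Any]]:
        # Assumption: the caller supplies gradients through context["grad_fn"].
        grads = context["grad_fn"](model, params)
        new_params = {k: params[k] - self.lr * grads[k] for k in params}
        self._state["step"] += 1
        return new_params, dict(self._state)


opt = ToySGD(lr=0.5)
state = opt.initialize({"w": 2.0})
# Gradient of w**2 is 2*w, so one step from w=2.0 with lr=0.5 lands on 0.0.
new_params, state = opt.step_params(
    None, {"w": 2.0}, {"grad_fn": lambda model, p: {"w": 2.0 * p["w"]}}
)
```

Because step_params does not take the state as an argument, this sketch keeps the state on the optimizer instance and returns a copy on each call.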

Requires:

base optimizer exposing step_params(…)
context["kl_fn"](old_info, new_info) -> float (KL or proxy)
context["info"]: current state’s info dict (with 'branches' if possible)
context["refresh_info"](model, params, context) -> info (callable)
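One way to assemble a context that meets these requirements is sketched below. It assumes that info["branches"] is a list of non-negative branch weights and that the model is a plain callable mapping params to such weights; neither is confirmed by the source. The helpers symmetric_kl, toy_model, and refresh_info are hypothetical.

```python
import math
from typing import Any, Callable, Dict, List


def _normalize(weights: List[float]) -> List[float]:
    total = sum(weights)
    return [w / total for w in weights]


def symmetric_kl(old_info: Dict[str, Any], new_info: Dict[str, Any], eps: float = 1e-12) -> float:
    """Average of KL(p||q) and KL(q||p) over normalized branch weights."""
    p = _normalize(old_info["branches"])
    q = _normalize(new_info["branches"])
    kl_pq = sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
    kl_qp = sum(qi * math.log((qi + eps) / (pi + eps)) for pi, qi in zip(p, q))
    return 0.5 * (kl_pq + kl_qp)


def toy_model(params: Dict[str, Any]) -> List[float]:
    # Hypothetical model: softmax over logits yields branch probabilities.
    logits = params["logits"]
    top = max(logits)
    return _normalize([math.exp(x - top) for x in logits])


def refresh_info(model: Callable[..., List[float]], params: Dict[str, Any], context: Dict[str, Any]) -> Dict[str, Any]:
    # Recompute the info dict for a given set of parameters.
    return {"branches": model(params)}


params = {"logits": [0.0, 1.0, -1.0]}
context: Dict[str, Any] = {"kl_fn": symmetric_kl, "refresh_info": refresh_info}
context["info"] = refresh_info(toy_model, params, context)

# Compare the current state with the state after a small parameter change.
shifted = refresh_info(toy_model, {"logits": [0.1, 1.0, -1.0]}, context)
kl = context["kl_fn"](context["info"], shifted)
```

The symmetrized form gives the same value regardless of argument order, which matches the "(symmetric) KL proxy" wording in the description; a one-sided KL would also fit the documented kl_fn signature.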

class qmlhc.optim.numpy_optim.trust_region.HCTrustRegion(base_opt, delta_kl=0.02, backtrack=0.7, max_backtracks=8)

Bases: object

Trust-region wrapper with a KL constraint over state space.

initialize(params)

Return type: Dict[str, Any]

step_params(model, params, context)

Return type: Tuple[Dict[str, Any], Dict[str, Any]]
