Dynamics of Optimization: From Faster Iterations to Faster Settling
An early note on continuous optimization, circuit equivalents, and why settling time became the real design problem.
An early research note on continuous-time optimization — why accelerated gradient methods no longer felt complete as discrete updates for autonomous systems, and why the real problem became designing the dynamics themselves for fast settling and possible circuit realization.
The research framing behind this note is not only how to accelerate gradient descent, but how to recast the optimizer itself as a continuous dynamical system with tunable transient response and a plausible circuit embodiment.
Why rewriting Heavy-Ball, Nesterov, and Triple-Momentum as ODEs changed the design language from iteration counts to damping, poles, and settling time — and why that shift naturally led to equivalent circuits and primal-dual constrained optimization.
A PhD research note from the stage when continuous-time optimization, transient response shaping, circuit interpretations, and constrained primal-dual formulations were being developed as a coherent research direction.
One of the first reasons accelerated gradient descent stopped feeling complete to me was simple: the systems I cared about were not born discrete. Autonomous systems evolve in continuous time. Their stability arguments are continuous. Their safety conditions are continuous too. So even when a digital update law worked, it still felt slightly indirect. I was optimizing something that would later be sampled and approximated again, and that gap started to bother me.
At the time, the basic story of accelerated optimization was already familiar. Start with gradient descent. Add momentum. Heavy-Ball gives you inertia. Nesterov gives you a look-ahead state. Triple-Momentum pushes that idea further by tuning several momentum variables at once. All of that is elegant, but the object being designed is still an iteration. You count steps, tune hyperparameters around steps, and measure performance in steps. For autonomous systems, that framing started to feel narrower than the system I was actually trying to control.
The real change happened when I stopped treating those update laws as final objects and started treating them as approximations of something underneath. If the finite differences are read as derivatives and the scaling is handled carefully, the algorithm stops looking like a list of instructions and starts looking like a dynamical system. Nesterov’s method turns into a second-order ODE. From there, the family expands cleanly: continuous Heavy-Ball, continuous Nesterov, and continuous Triple-Momentum, with the gradient evaluated either at the current state or at a momentum-shifted state, and with an optional tuned output layered on top. That was the point where the problem changed shape for me. The continuous model was not just a limit argument. It became the thing to design.
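To make the "optimizer as ODE" reading concrete, here is a minimal sketch. It assumes a constant-damping heavy-ball form, x'' + a x' + grad f(x) = 0, on a small diagonal quadratic. The exact ODEs, gradient evaluation points, and output layer used in the paper are not reproduced here. The names (CURVATURES, grad_f, integrate_heavy_ball) and the test problem are illustrative choices of mine.

```python
# Hypothetical test problem: f(x) = 0.5 * sum(CURVATURES[i] * x[i]**2).
CURVATURES = [1.0, 25.0]


def grad_f(x):
    """Gradient of the diagonal quadratic above."""
    return [h * xi for h, xi in zip(CURVATURES, x)]


def rhs(state, damping):
    """First-order form of x'' + damping * x' + grad_f(x) = 0.

    state holds the positions followed by the velocities.
    """
    n = len(state) // 2
    x, v = state[:n], state[n:]
    g = grad_f(x)
    accel = [-damping * v[i] - g[i] for i in range(n)]
    return v + accel  # list concatenation: [dx/dt..., dv/dt...]


def rk4_step(state, dt, damping):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, k, scale):
        return [b + scale * ki for b, ki in zip(base, k)]

    k1 = rhs(state, damping)
    k2 = rhs(shifted(state, k1, dt / 2), damping)
    k3 = rhs(shifted(state, k2, dt / 2), damping)
    k4 = rhs(shifted(state, k3, dt), damping)
    return [
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    ]


def integrate_heavy_ball(x0, damping, t_end, dt=0.01):
    """Integrate from rest at x0 and return the final position."""
    state = list(x0) + [0.0] * len(x0)
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, damping)
    return state[: len(x0)]


if __name__ == "__main__":
    # damping = 2 * sqrt(smallest curvature) critically damps the slow mode.
    print(integrate_heavy_ball([1.0, 1.0], damping=2.0, t_end=20.0))
```

The point of the sketch is that the object being tuned is the damping of a flow. The integrator step dt is only a simulation detail here, not a design parameter.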
Once I started seeing the optimizer that way, the design language changed with it. In discrete time, one talks about update rules, iteration complexity, and spectral behavior across steps. In continuous time, the language becomes damping, poles, transient response, and settling time. That shift mattered because it turned parameter selection into a system-design problem instead of only an algorithm-tuning problem. In the paper, the parameter beta becomes especially revealing in that regard: larger values drive faster convergence, but they also push the implementation closer to physical limits. So speed stops being only a mathematical question. It becomes an engineering one.
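The change of vocabulary can be illustrated on a single mode. Assuming the constant-damping form x'' + a x' + h x = 0 for a scalar quadratic with curvature h (my simplification, not the paper's parameterization, and I do not reproduce how beta enters there), the poles, the damping ratio, and a rough settling-time estimate follow from textbook second-order system formulas. The function names are hypothetical.

```python
import cmath
import math


def poles(damping, curvature):
    """Roots of s**2 + damping * s + curvature = 0."""
    disc = cmath.sqrt(damping ** 2 - 4 * curvature)
    return ((-damping + disc) / 2, (-damping - disc) / 2)


def damping_ratio(damping, curvature):
    """zeta = a / (2 * sqrt(h)); zeta = 1 is critical damping."""
    return damping / (2 * math.sqrt(curvature))


def settling_time_estimate(damping, curvature, tol=0.02):
    """Envelope estimate: time for exp(Re(slowest pole) * t) to reach tol.

    This is only a rough guide. With repeated poles the true response
    settles somewhat later than the pure exponential envelope suggests.
    """
    slowest = max(p.real for p in poles(damping, curvature))
    return math.log(1 / tol) / (-slowest)


if __name__ == "__main__":
    for a in (0.5, 2.0, 4.0):
        print(a, damping_ratio(a, 1.0), settling_time_estimate(a, 1.0))
```

Under these assumptions both too little and too much damping slow the response, and the estimate is smallest at critical damping. That is the sense in which parameter selection becomes pole placement rather than step-size tuning.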
That is also where the work stops looking like a conventional optimization paper. Once the optimizer is written as a second-order system, it becomes natural to ask whether the dynamics can be realized directly instead of being repeatedly simulated on a processor. That is what leads to the equivalent-circuit side of the paper. Op-amp integrators, multiplier blocks, and summing amplifiers can be arranged so that the circuit itself mirrors the continuous optimization law. At that point, the project is no longer only about accelerating gradient descent. It is about whether optimization-based controllers can exist in a more direct continuous form, with the low latency and parallelism that analog implementations promise.
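A block-level sketch of that idea, under heavy assumptions: two ideal inverting integrators, each obeying Vout = -(1/RC) * integral of Vin, closed through an ideal summing stage. For a scalar quadratic the gradient is just a gain, so no multiplier block is modeled. The topology, the gains a and h, and the component values are hypothetical and are not taken from the paper's circuit.

```python
def settle_time_of_integrator_loop(rc, a=2.0, h=1.0, x0=1.0, tol=0.02):
    """Simulate an idealized two-integrator loop in physical time.

    y2 plays the role of the optimization variable x, and y1 equals
    -RC * dx/dt. The summing stage output u1 = a * y1 - h * y2 feeds the
    first integrator, so the loop realizes
        (RC)**2 * x'' + a * RC * x' + h * x = 0.
    Returns the last time at which |x| exceeds tol * |x0|.
    """
    dt = rc / 1000.0           # simulation step, far below the time constant
    y1, y2 = 0.0, x0           # start at rest, away from the minimizer x = 0
    t, last_outside = 0.0, 0.0
    for _ in range(20000):     # simulate 20 time constants
        u1 = a * y1 - h * y2   # ideal summing stage
        dy1 = -u1 / rc         # ideal inverting integrator 1
        dy2 = -y1 / rc         # ideal inverting integrator 2
        y1 += dt * dy1
        y2 += dt * dy2
        t += dt
        if abs(y2) > tol * abs(x0):
            last_outside = t
    return last_outside


if __name__ == "__main__":
    # Hypothetical values: R = 10 kOhm, C = 10 nF gives RC = 1e-4 s.
    print(settle_time_of_integrator_loop(rc=1e-4))
```

In this idealized model the settling time scales directly with RC, which is where the low-latency promise comes from. Saturation, finite bandwidth, and noise are deliberately left out.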
The constrained case made that idea more serious. Quadratic programs show up everywhere once constraints are part of control, so the paper moves from unconstrained dynamics to a primal-dual formulation for equality-constrained QPs and then builds an equivalent circuit view for that system as well. I found that step important because it connected the story back to actual control architectures rather than leaving it inside a clean unconstrained example. The CBF-CLF motivation made this especially clear. Those safety and stability conditions are stated in continuous time, yet they are usually enforced through fast digital loops. A continuous implementation therefore did not feel like novelty for its own sake. It felt aligned with the logic of the controller.
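For the equality-constrained QP, minimize 0.5 x'Qx + c'x subject to Ax = b, one standard primal-dual flow is dx/dt = -(Qx + c + A'lam), dlam/dt = Ax - b. The sketch below integrates that flow with forward Euler on a toy problem. It is a generic saddle-point flow with positive definite Q, not necessarily the exact formulation or tuning used in the paper, and the function name and example data are mine.

```python
def primal_dual_flow(Q, c, A, b, t_end=20.0, dt=0.01):
    """Forward-Euler integration of a primal-dual gradient flow.

    Primal variable descends the Lagrangian, dual variable ascends it.
    Q, A are lists of rows; c, b are lists. Returns (x, lam).
    """
    n, m = len(c), len(b)
    x = [0.0] * n
    lam = [0.0] * m
    for _ in range(int(round(t_end / dt))):
        # Stationarity residual: Q x + c + A^T lam
        grad_lagrangian = [
            sum(Q[i][j] * x[j] for j in range(n))
            + c[i]
            + sum(A[k][i] * lam[k] for k in range(m))
            for i in range(n)
        ]
        # Feasibility residual: A x - b
        constraint_residual = [
            sum(A[k][j] * x[j] for j in range(n)) - b[k] for k in range(m)
        ]
        x = [x[i] - dt * grad_lagrangian[i] for i in range(n)]
        lam = [lam[k] + dt * constraint_residual[k] for k in range(m)]
    return x, lam


if __name__ == "__main__":
    # Toy problem: minimize x1**2 + x2**2 subject to x1 + x2 = 1.
    Q = [[2.0, 0.0], [0.0, 2.0]]
    c = [0.0, 0.0]
    A = [[1.0, 1.0]]
    b = [1.0]
    print(primal_dual_flow(Q, c, A, b))
```

For this toy problem the KKT conditions give x = (0.5, 0.5) with multiplier -1, and the flow settles there. The transient is again governed by poles, here of the combined primal-dual system.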
The simulations support that intuition, but they also keep it honest. The continuous methods converge rapidly, the circuit simulations track the ODE results reasonably well, and parameter tuning matters a great deal. But the paper is not really a fantasy about perfect analog computation. As beta grows, voltages have to be scaled, and then op-amp sensitivity, line effects, and component limits begin to show up as oscillations and noise near small error levels. That detail matters because it changes the meaning of “fastest.” The theoretically fastest design is not always the practically best one once the model is built out of physical parts.
Looking back, that is what makes this piece feel like a research-origin note rather than only a technical contribution. The lasting idea was not merely that discrete methods can be rewritten in continuous form. It was that accelerated optimization could be treated as a design problem over dynamics themselves, and that those dynamics might be built into continuous controllers rather than only emulated by digital updates. That line of thought is what ties together the ODE view, the circuit view, and the primal-dual control motivation. It also explains why this work still feels like the beginning of a larger direction rather than just a one-off derivation.
A fast update rule is not yet a continuous controller.
The central discomfort in this note was not whether accelerated gradient descent converged, but whether a discrete update law was the right computational object for systems that were modeled, constrained, and justified in continuous time. That gap is what made transient response and settling time, rather than iteration formulas alone, the real design concern.
Design the dynamics, not just the iteration.
The turning point was to treat the continuous model as more than a limit argument. Once the optimizer was written as a tunable dynamical system, parameter design, circuit realization, and constrained primal-dual extensions all became part of the same problem.