Neural dynamical systems are a class of temporal predictors that model continuous-time dynamics through many fine-grained state updates, which makes them powerful for forecasting signals that evolve smoothly in time. They also map terribly onto the hardware the AI industry has standardized on. GPUs and tensor accelerators are built for dense, batched matrix multiplication; a model whose essence is a long chain of small sequential state updates fights that architecture at every step. A June 2026 arXiv preprint, "Neural dynamical systems on ferroelectric compute-in-memory for real-time forecasting," by Keshava Katti, Adithya Selvakumar, Pratik Chaudhari and Deep Jariwala, argues the fix is not a better digital accelerator but a different substrate entirely.

The mismatch is the whole motivation. The authors observe that the sequential, continuous-time structure of these models maps poorly onto digital hardware optimized for dense matrix operations, and that analog neuromorphic computing — which has continuous-time dynamics natively, in physics rather than in code — is the natural resolution. Instead of discretizing continuous dynamics into thousands of matrix steps, you let an analog device's intrinsic behavior be the dynamics.

"this sequential structure maps poorly onto digital hardware optimized for dense matrix operations, a mismatch that analog neuromorphic computing, with its native continuous-time dynamics, can resolve." (arXiv:2606.16896, Katti et al.)

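To make the mismatch concrete, here is a minimal sketch of what a digital machine does with even the simplest continuous-time state: a leaky accumulator obeying dx/dt = (u(t) - x) / tau, stepped forward with forward Euler. The equation, the function name euler_leaky_state and every parameter value are illustrative assumptions, not taken from the preprint. The point is only the shape of the loop: each update needs the result of the previous one, so the steps cannot be folded into one large batched matrix multiply.

```python
def euler_leaky_state(u, tau, dt, t_end, x0=0.0):
    """Forward-Euler integration of dx/dt = (u(t) - x) / tau.

    Hypothetical toy model (an assumption for illustration, not the
    preprint's model). Returns the final state and the number of
    sequential updates that were needed to reach it.
    """
    steps = int(round(t_end / dt))
    x = x0
    for k in range(steps):
        t = k * dt
        # Each update reads the x written by the previous iteration,
        # so the loop is inherently sequential.
        x += dt * (u(t) - x) / tau
    return x, steps


# Illustrative values only: constant input, 50 ms time constant,
# 0.1 ms step, half a second of simulated time.
final_x, n_steps = euler_leaky_state(lambda t: 1.0, tau=0.05, dt=1e-4, t_end=0.5)
print(f"{n_steps} dependent updates, final state {final_x:.5f}")
```

With a 0.1-millisecond step, half a second of signal costs 5,000 dependent updates for a single state variable. An analog integrator, by contrast, simply evolves.
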
Two primitives and a ferrodiode

The system the authors introduce, FerroNDS, is built from just two analog building blocks: an integrator that performs temporal accumulation, and an oscillator that performs frequency-selective filtering. That minimalism is deliberate. Integration and frequency-selective filtering are the two operations a continuous-time forecaster fundamentally needs — accumulate history, and pick out the frequencies that matter — and both have clean analog realizations. The novelty is the device the authors map these primitives onto: multi-bit ferrodiodes, a ferroelectric compute-in-memory element. Compute-in-memory, or CiM, performs computation where the data is stored rather than shuttling it to a separate arithmetic unit, which attacks the data-movement energy that dominates conventional designs. Doing it with a ferroelectric device adds non-volatility and multi-level storage, so each cell holds more than a single bit and retains state without power.
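
In software, the two primitives can be imitated in a few lines. The sketch below is a numerical stand-in only: it assumes a leaky integrator and a damped, driven second-order resonator as textbook forms of "integrator" and "oscillator", and it says nothing about how FerroNDS realizes either one in ferrodiode circuitry. All function names and constants are hypothetical.

```python
import math


def integrate(signal, dt, leak=0.0):
    """Running integral of the samples; leak > 0 makes it forget old input."""
    acc, out = 0.0, []
    for s in signal:
        acc += dt * (s - leak * acc)
        out.append(acc)
    return out


def resonator_peak(signal, dt, f0, damping=20.0):
    """Drive x'' + damping * x' + w0^2 * x = w0^2 * u with the samples u.

    Uses semi-implicit Euler steps and returns the largest |x| seen in the
    second half of the run, after the start-up transient has mostly decayed.
    """
    w0 = 2.0 * math.pi * f0
    x = v = 0.0
    peak = 0.0
    n = len(signal)
    for i, u in enumerate(signal):
        v += dt * (w0 * w0 * (u - x) - damping * v)
        x += dt * v
        if i >= n // 2:
            peak = max(peak, abs(x))
    return peak


def sine(freq_hz, duration_s, dt):
    """Unit-amplitude sine wave sampled every dt seconds."""
    n = int(round(duration_s / dt))
    return [math.sin(2.0 * math.pi * freq_hz * k * dt) for k in range(n)]


DT = 1e-4  # illustrative time step, in seconds

# Temporal accumulation: integrating a constant 1.0 for one second.
accumulated = integrate([1.0] * 1000, dt=1e-3)[-1]

# Frequency selectivity: a resonator tuned to 10 Hz, driven on and off tune.
on_tune = resonator_peak(sine(10.0, 1.0, DT), DT, f0=10.0)
off_tune = resonator_peak(sine(40.0, 1.0, DT), DT, f0=10.0)
print(accumulated, on_tune, off_tune)
```

The resonator tuned to 10 Hz responds many times more strongly to a 10 Hz drive than to a 40 Hz one, which is all "frequency-selective filtering" means at this level. A bank of such resonators at different tunings gives a crude spectrum.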

The authors state plainly that this is, to their knowledge, the first end-to-end integration of a ferrodiode into a neuromorphic computational framework, and that it establishes ferroelectric compute-in-memory as a practical substrate for analog neural dynamical systems. "End-to-end" is the operative claim: ferroelectric devices have been studied as memory and as synaptic elements for years, but wiring them into a complete, functioning neural dynamical system that actually produces forecasts is a different and harder demonstration than a single-device characterization.

The numbers, and what they imply

The reported results come from a 128-neuron instance of FerroNDS that computes a short-time Fourier transform and forecasts a 500-millisecond horizon for periodic, quasi-periodic and chaotic signals. The choice of test signals is worth noting — chaotic signals are the genuinely hard case for any forecaster, so including them alongside the easy periodic ones is a meaningful stress test rather than a cherry-picked demo.
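
For readers who have not met it, a short-time Fourier transform slides a window along the signal and takes a spectrum of each window, giving frequency content as a function of time. The sketch below is a plain digital version for orientation. The sample rate, window length, hop and Hann taper are assumptions chosen for readability, and nothing here reflects how the 128-neuron array computes its transform.

```python
import cmath
import math


def stft_magnitudes(samples, window_len, hop):
    """Naive STFT: one list of DFT magnitudes (bins 0..window_len//2) per frame."""
    # Periodic Hann taper to reduce leakage at the window edges.
    taper = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / window_len)
             for n in range(window_len)]
    frames = []
    for start in range(0, len(samples) - window_len + 1, hop):
        seg = [samples[start + n] * taper[n] for n in range(window_len)]
        mags = []
        for k in range(window_len // 2 + 1):
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / window_len)
                      for n in range(window_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Illustrative test signal: a 50 Hz tone sampled at 1 kHz for half a second.
FS = 1000
tone = [math.sin(2.0 * math.pi * 50.0 * n / FS) for n in range(500)]

frames = stft_magnitudes(tone, window_len=100, hop=50)
# Bin spacing is FS / window_len = 10 Hz, so a 50 Hz tone belongs in bin 5.
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in frames]
print(len(frames), peak_bins)
```

For a steady tone the peak stays in the same bin from frame to frame. For quasi-periodic or chaotic input the energy moves between bins over time, and that time-varying picture is what a forecaster built on such a representation has to extrapolate.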

The efficiency figures are where the analog substrate earns its keep. The system is reported to run at sub-watt power in real time, with per-neuron, per-inference energy of 1.64 microjoules at a 200 Hz operating point and 0.29 microjoules at 10 kHz. The direction of that scaling is instructive: per-inference energy falls by a factor of about 5.7 as the operating frequency rises fiftyfold, which would be consistent with fixed overheads being amortized across faster operation. On area, the authors claim a 25- to 40-fold reduction over SRAM-based digital systems, the kind of multiple that comes from collapsing separate memory and compute into one non-volatile multi-bit device rather than from incremental layout tuning. Latency is reported at 3.18 milliseconds per layer at 200 Hz and 63.87 microseconds at 10 kHz, comfortably inside the real-time budget for the 500-millisecond forecasting task.
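
The reported figures can be cross-checked against one another with a little arithmetic. The sketch below rests on an assumption the abstract does not confirm: that one inference occupies one per-layer latency interval, so average power per neuron is roughly energy divided by latency. The dictionary layout and function name are hypothetical. Only the energy, latency, frequency and neuron-count values come from the reported results.

```python
NEURONS = 128  # size of the reported FerroNDS instance

# Reported operating points: per-neuron, per-inference energy and latency.
OPERATING_POINTS = {
    "200 Hz": {"freq_hz": 200.0, "energy_j": 1.64e-6, "latency_s": 3.18e-3},
    "10 kHz": {"freq_hz": 10_000.0, "energy_j": 0.29e-6, "latency_s": 63.87e-6},
}


def derived_figures(point):
    """Back-of-envelope quantities implied by one operating point.

    ASSUMPTION (not from the source): one inference lasts one latency
    interval, so average power per neuron is energy / latency.
    """
    per_neuron_w = point["energy_j"] / point["latency_s"]
    return {
        "per_neuron_mw": per_neuron_w * 1e3,
        "array_w": per_neuron_w * NEURONS,
        # Latency expressed in periods of the operating frequency.
        "latency_in_periods": point["latency_s"] * point["freq_hz"],
    }


results = {name: derived_figures(p) for name, p in OPERATING_POINTS.items()}
for name, r in results.items():
    print(f"{name}: {r['per_neuron_mw']:.3f} mW per neuron, "
          f"{r['array_w']:.3f} W for {NEURONS} neurons, "
          f"{r['latency_in_periods']:.3f} periods of latency")
```

Under that assumption the 128-neuron array would draw roughly 66 milliwatts at 200 Hz and about 0.58 watts at 10 kHz, both consistent with the sub-watt claim. Latency works out to about 0.64 of an operating period at either setting, so it appears to scale inversely with frequency. The implied power rises at the faster operating point even as energy per inference falls, a distinction that matters for any power-capped edge deployment.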

Where it fits, and the open questions

For the emerging-device side of the semiconductor world, FerroNDS is a concrete data point in the long-running argument that ferroelectric materials — the same family driving interest in FeFET memory — have a role beyond storage. The pitch is that for a specific, important workload class, an analog ferroelectric substrate is not merely competitive but categorically better suited than digital CMOS, because the physics does the math for free. The 25-to-40x area claim, if it survives independent silicon validation, is the sort of advantage that justifies the considerable difficulty of bringing a non-standard device into a manufacturable flow.

The abstract leaves the usual analog-hardware questions unanswered, and they are the right ones to ask. It does not quantify forecasting accuracy against a digital baseline, so the energy and area wins cannot yet be weighed against any quality cost. It does not address device-to-device variability, retention, or endurance of the multi-bit ferrodiodes — the practical reliability issues that have historically kept analog compute-in-memory in the lab. Nor does it specify the fabrication maturity of the ferrodiode array or whether the 128-neuron instance is measured silicon or a modeled projection. Those details separate a promising substrate from a deployable one.

Even with those caveats, the framing is the contribution worth tracking. The semiconductor industry's default answer to every workload is a faster digital matrix engine, and FerroNDS is a reminder that some workloads — continuous-time, sequential, dynamical — are a poor fit for that default and may be served far better by letting an emerging device's native physics carry the computation. Establishing ferroelectric compute-in-memory as a practical substrate for an entire model class, rather than a curiosity, is exactly the kind of claim that, if it holds, reshapes where the next generation of edge forecasting and signal-processing silicon gets built.