Takens' embedding theorem (1981) guarantees that a scalar time series contains enough information to reconstruct the full attractor of the underlying dynamical system via delay embedding. The reservoir effectively computes a nonlinear generalization of this: each neuron integrates the input history with a different effective time constant and nonlinearity, producing a set of overlapping delay-like projections. Jaeger & Haas (2004, Science) and, more precisely, Miao, Narayanan & Li (2023, IEEE Transactions on Neural Networks and Learning Systems) formalize this: training a Reservoir Computing Network (RCN) is equivalent to learning a map from a window of historical data to the future, a map whose existence Takens' theorem guarantees for generic dynamical systems. Recent work by Hart (2025, Chaos) strengthens this further, proving that a generic reservoir map produces an isometric embedding of the input attractor, not just a topological one, so the reservoir represents the system without metric distortion.
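To make delay embedding concrete, here is a minimal sketch in Python/NumPy. It reconstructs a three-dimensional image of the Lorenz attractor from its scalar x-component alone, which is exactly the kind of window-of-history map whose faithfulness Takens' theorem guarantees. The integration scheme, embedding dimension, and delay are illustrative assumptions, not values taken from the cited papers.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Takens-style delay embedding: row t is [x(t), x(t - tau), ..., x(t - (dim-1)*tau)]."""
    n = len(x) - (dim - 1) * tau
    return np.column_stack(
        [x[(dim - 1 - k) * tau : (dim - 1 - k) * tau + n] for k in range(dim)]
    )

def lorenz_x(n_steps, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Scalar observable: the x-component of a Lorenz trajectory (simple Euler integration)."""
    s = np.array([1.0, 1.0, 1.0])
    out = np.empty(n_steps)
    for i in range(n_steps):
        dx = sigma * (s[1] - s[0])
        dy = s[0] * (rho - s[2]) - s[1]
        dz = s[0] * s[1] - beta * s[2]
        s = s + dt * np.array([dx, dy, dz])
        out[i] = s[0]
    return out

x = lorenz_x(5000)
embedded = delay_embed(x, dim=3, tau=10)  # 3-D delay reconstruction of the attractor
print(embedded.shape)                     # (4980, 3)
```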
Cover's theorem (1965) states that a classification problem cast into a sufficiently high-dimensional space via a nonlinear mapping is more likely to be linearly separable than in the original low-dimensional space. That is precisely what the reservoir does: it maps a scalar time series into ℝᴺ, and the linear classifier exploits the resulting separability. Gauthier et al. (2021,
Nature Communications, Next Generation Reservoir Computing) make this explicit: traditional RC exploits Cover's theorem via the high-dimensional reservoir state, while their "next-generation" variant achieves the same end using polynomial features of time-shifted data — exploiting Takens' theorem directly without a recurrent network. Both approaches work for the same deep reason.
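To make the "polynomial features of time-shifted data plus linear readout" idea tangible, here is a minimal sketch in the spirit of next-generation reservoir computing. It is not the authors' implementation: the feature set (constant, linear, and quadratic terms of a short delay window), the toy signal, and the ridge parameter are assumptions chosen for illustration.

```python
import numpy as np

def ngrc_features(x, k=2, tau=1):
    """Constant, linear, and quadratic features of k time-shifted copies of a scalar series."""
    n = len(x) - (k - 1) * tau
    lin = np.column_stack(
        [x[(k - 1 - j) * tau : (k - 1 - j) * tau + n] for j in range(k)]
    )
    quad = np.column_stack(
        [lin[:, i] * lin[:, j] for i in range(k) for j in range(i, k)]
    )
    return np.column_stack([np.ones(n), lin, quad])

# Toy task: one-step-ahead prediction of a noisy scalar series with a linear (ridge) readout.
rng = np.random.default_rng(0)
t = np.arange(3000) * 0.05
x = np.sin(t) + 0.5 * np.sin(2.1 * t) + 0.01 * rng.standard_normal(t.size)

k, tau = 4, 1
Phi = ngrc_features(x, k=k, tau=tau)   # feature vector available from time (k-1)*tau onward
y = x[(k - 1) * tau + 1:]              # target: the next value of the series
Phi = Phi[:-1]                         # align features with their one-step-ahead targets

ridge = 1e-6
W = np.linalg.solve(Phi.T @ Phi + ridge * np.eye(Phi.shape[1]), Phi.T @ y)
pred = Phi @ W
print("train RMSE:", np.sqrt(np.mean((pred - y) ** 2)))
```

The only trained object is the linear readout W; the nonlinearity lives entirely in the fixed polynomial expansion of the delay window.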
The unification: reservoir computing combines both theorems in one mechanism. Takens says the underlying state is recoverable from a finite window of the scalar stream; Cover says a high-dimensional nonlinear projection makes it linearly separable. The reservoir does both simultaneously: no explicit delay construction, no kernel design, no training of the recurrent weights. The extension to feed-forward "time-delay neural networks" (TDNNs) as reservoirs follows naturally: a window of past inputs with nonlinear features is a finite-dimensional Takens embedding, and its dimensionality provides the Cover-style expansion that enables a linear readout.
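The same readout-only training shows up in the recurrent case. The sketch below is a bare-bones echo state network: the input and recurrent weights are random and never trained, and only the linear readout is fit by ridge regression on the reservoir states. Reservoir size, spectral radius, leak rate, washout length, and the toy task are all illustrative assumptions, not values from the cited papers.

```python
import numpy as np

rng = np.random.default_rng(1)

# Fixed, untrained reservoir (echo state network style).
N = 200                                    # reservoir size
W_in = rng.uniform(-0.5, 0.5, size=N)      # input weights for a scalar input
W = rng.standard_normal((N, N)) / np.sqrt(N)
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # rescale spectral radius below 1
leak = 0.3                                 # leaky-integration rate

def run_reservoir(u):
    """Drive the reservoir with the scalar series u and collect the state trajectory."""
    r = np.zeros(N)
    states = np.empty((len(u), N))
    for t, ut in enumerate(u):
        r = (1 - leak) * r + leak * np.tanh(W @ r + W_in * ut)
        states[t] = r
    return states

# Toy task: predict the next input value from the current reservoir state.
t = np.arange(4000) * 0.05
u = np.sin(t) * np.cos(0.3 * t)
target = np.roll(u, -1)

X = run_reservoir(u)[:-1]                  # state after seeing u[t]
y = target[:-1]                            # paired with u[t+1]
washout = 100                              # discard the initial transient
X, y = X[washout:], y[washout:]

ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(N), X.T @ y)   # linear readout only
print("train RMSE:", np.sqrt(np.mean((X @ W_out - y) ** 2)))
```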
References
- Takens, F. (1981). Detecting strange attractors in turbulence. In D. Rand & L.-S. Young (Eds.), Dynamical Systems and Turbulence, Warwick 1980 (Lecture Notes in Mathematics, Vol. 898, pp. 366–381). Springer. https://doi.org/10.1007/BFb0091924
- Cover, T. M. (1965). Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Transactions on Electronic Computers, EC-14(3), 326–334. https://doi.org/10.1109/PGEC.1965.264137
- Jaeger, H., & Haas, H. (2004). Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication. Science, 304(5667), 78–80. https://doi.org/10.1126/science.1091277
- Miao, W., Narayanan, V., & Li, J.-S. (2023). Interpretable design of reservoir computing networks using realization theory. IEEE Transactions on Neural Networks and Learning Systems, 34(9), 6379–6389. https://doi.org/10.1109/TNNLS.2021.3136495
- Gauthier, D. J., Bollt, E., Griffith, A., & Barbosa, W. A. S. (2021). Next generation reservoir computing. Nature Communications, 12, 5564. https://doi.org/10.1038/s41467-021-25801-2
- Hart, A. G. (2025). Generic and isometric embeddings in reservoir computers. Chaos: An Interdisciplinary Journal of Nonlinear Science, 35(11), 111103. https://doi.org/10.1063/5.0301957