Mode-division multiplexing (MDM) technology has been regarded as a promising way to meet the demand for transmission capacity, which has grown exponentially for decades1-3. When multiple modes propagate in a fiber, coupling between them inevitably occurs, producing crosstalk that seriously degrades signal quality. To address this issue, multi-input multi-output (MIMO) digital signal processing was introduced to compensate for the crosstalk between signals
4-6, but it suffers from high computational complexity and high power consumption, especially as the number of channels increases
7-9. Moreover, the processing speed is limited by the digital hardware, which introduces a large delay. With the maturing of large-scale optical integration technology, light has emerged as a potential alternative that can overcome the speed and power limitations of conventional electronic hardware, both in optical neural networks and in optical signal processing
10-13. Numerous on-chip optical MIMO schemes have been proposed in recent years. The basic principle is to implement the inverse of the transmission coupling matrix optically, using integrated structures such as Mach-Zehnder interferometer (MZI) meshes
14-18 and multi-plane light conversion
19-21. As the light propagates through the chip, the mixed signals are recovered at the speed of light, and only a little energy is needed to hold the chip in the desired state. However, all existing optical MIMO methods are restricted to the spatial dimension and neglect the temporal dimension. Because the modes propagate at different velocities in the fiber, modal dispersion (MD) gives rise to modal group delay (GD), which further degrades signal quality when combined with spatial coupling. Therefore, optical MIMO should take both the spatial and the temporal dimensions into account in order to compensate for the crosstalk as fully as possible.
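
The limitation of purely spatial compensation can be illustrated with a minimal numerical sketch. It is a toy two-mode model and not the scheme proposed in this work. It assumes that mode coupling can be represented by frequency-independent rotation matrices and that the group delay acts as a frequency-dependent phase on one mode; the coupling angles, the 20 ps differential group delay, and the ±10 GHz frequency offsets are hypothetical values chosen only for illustration. A single inverse matrix fitted at the band center removes the crosstalk there but leaves residual crosstalk at the other frequencies, whereas a frequency-dependent inverse removes it everywhere.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(m):
    """Invert a 2x2 complex matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def rotation(theta):
    """Frequency-independent mode coupling, modelled as a rotation (assumption)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def channel(freq_hz, theta1=0.6, theta2=0.9, dgd_s=20e-12):
    """Toy channel: coupling, then differential group delay, then coupling.

    theta1, theta2 and dgd_s are hypothetical illustration values.
    """
    delay = [[1, 0], [0, cmath.exp(-2j * math.pi * freq_hz * dgd_s)]]
    return matmul(rotation(theta2), matmul(delay, rotation(theta1)))


def crosstalk(m):
    """Sum of squared off-diagonal magnitudes of the compensated channel."""
    return abs(m[0][1]) ** 2 + abs(m[1][0]) ** 2


# Frequency offsets from the band center (hypothetical values).
freqs = [-10e9, 0.0, 10e9]

# Spatial-only compensation: one fixed inverse matrix fitted at the band center.
w_spatial = inverse(channel(0.0))
spatial_only = [crosstalk(matmul(w_spatial, channel(f))) for f in freqs]

# Spatio-temporal compensation: a separate inverse at every frequency.
spatio_temporal = [crosstalk(matmul(inverse(channel(f)), channel(f)))
                   for f in freqs]

for f, xs, xst in zip(freqs, spatial_only, spatio_temporal):
    print(f"{f / 1e9:+6.1f} GHz  spatial-only: {xs:.3e}  spatio-temporal: {xst:.3e}")
```

In this sketch the residual crosstalk of the fixed matrix grows with the product of frequency offset and differential group delay, which is why a compensator acting on the spatial dimension alone cannot fully recover signals that have experienced both coupling and modal dispersion.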