Most of the power consumed by current GPGPU-based systems goes to data transfer. This stems from their reliance on an asynchronous, best-effort, bus-master, many-core architecture, which has remained unchanged for the past 30 years. To tackle this problem, we have developed Lenzo Core, based on a completely new architecture known as CGLA (Coarse Grained Linear Array).
The key features of CGLA:
- A dataflow-oriented architecture that incorporates synchronous QoS control, a bus-slave design, and a coarse-grained reconfigurable structure, effectively minimizing data transfers between compute cores and memory.
- A computing platform that addresses the challenges of large-scale SIMD and wide-memory-bus architectures (including GPGPUs), which are difficult to partition into chiplets. In place of these, CGLA uses a multi-level pipeline and a tightly integrated configuration of narrow memory buses.
- An architecture that overcomes the limitations of traditional Coarse Grained Reconfigurable Arrays (CGRAs), which are expected to be highly efficient non-von Neumann architectures but struggle with programming complexity and slow compilation.
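
To make the data-transfer argument concrete, the toy model below counts memory accesses for a chain of element-wise operations executed two ways: with every intermediate result written back to memory, and with results forwarded from one stage to the next in a linear pipeline. This is only an illustrative sketch, not Lenzo's implementation. The function names, the one-load-one-store cost model, and the three example stages are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

# A stage is a hypothetical element-wise operation (assumption for illustration).
Stage = Callable[[float], float]


def run_via_memory(stages: Sequence[Stage], data: Sequence[float]) -> Tuple[List[float], int]:
    """Run each stage over the whole buffer, loading and storing every element."""
    transfers = 0
    buf = list(data)
    for stage in stages:
        out = []
        for x in buf:
            transfers += 1          # load the operand from memory
            out.append(stage(x))
            transfers += 1          # store the intermediate result to memory
        buf = out
    return buf, transfers


def run_as_linear_pipeline(stages: Sequence[Stage], data: Sequence[float]) -> Tuple[List[float], int]:
    """Stream each element through all stages; only the ends touch memory."""
    transfers = 0
    out = []
    for x in data:
        transfers += 1              # single load at the head of the pipeline
        for stage in stages:
            x = stage(x)            # result is forwarded directly to the next stage
        out.append(x)
        transfers += 1              # single store at the tail of the pipeline
    return out, transfers


# Example chain: (2x + 1)^2, split into three stages.
STAGES: List[Stage] = [lambda x: x * 2.0, lambda x: x + 1.0, lambda x: x * x]
DATA = [0.0, 1.0, 2.0, 3.0]

via_memory, memory_transfers = run_via_memory(STAGES, DATA)
pipelined, pipeline_transfers = run_as_linear_pipeline(STAGES, DATA)

print(via_memory == pipelined)                 # both strategies compute the same values
print(memory_transfers, pipeline_transfers)    # transfer counts under this cost model
```

Under this simplified cost model, the number of transfers in the pipelined version does not grow with the number of stages, whereas the write-back version grows linearly with it. Real hardware behavior depends on many factors the sketch ignores, such as caches, bus width, and scheduling.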