LENZO’s latest research on CGLA-based LLM inference has received the Best Paper Award at the Sixth International Conference on Intelligent Systems and Networks (ICISN 2026) in Hanoi, Vietnam.

The paper, “Q-Snap: Quantization-Aware Dynamic Chunking for LLM Execution on a CGLA,” was presented by Takuto Ando, Ayumu Takeuchi, Yu Eto, Yoshifumi Munakata, and Yasuhiko Nakashima.
It addresses the fundamental inefficiency of LLM inference on conventional GPU architectures by introducing a hardware-aware scheduling approach optimized for the CGLA.
Using an FPGA prototype and 28 nm ASIC projections, the method demonstrates both improved performance and higher energy efficiency compared to modern GPU systems.
This work reinforces LENZO’s focus on architecture-driven compute, where efficiency gains come from design and execution rather than from process scaling alone.
