In recent years, transformer-based LLMs and applications built on them, such as AI agents, have surged in popularity. Compute is in high demand as models soar in parameter count, with the largest LLMs reaching hundreds of billions or even trillions of parameters. Luckily, researchers