Architecting intelligent systems. Building intelligent systems at the intersection of LLMs and Large Scale Infrastructure

Recent Observations

Latest thoughts on AI and Engineering.

Efficiently Serving LLMs
genai · May 29, 2024

Exploring techniques such as vectorization, KV caching, continuous batching, and LoRA

5 min read