Harshavardhanan Deekeswar

Building at the intersection of distributed systems and AI

Distributed Architect · Independent Researcher · Ex-Verizon · 15 years

Currently working on Ratatoskr, a system for composing agent tool chains from natural language intent, backed by a self-extending schema-compatibility graph.

I build distributed systems. Fifteen years of it, most recently as Principal Architect at Verizon, working on network graph intelligence and integrating LLMs into operational tooling. Before that, architecture and delivery roles at Cognizant, TCS, and HCL across telecom, manufacturing, and insurance.

My focus now is production AI and token economics. In production AI, I work on the parts that matter after the MVP: reliability, latency budgets, evaluation, and failure modes. In token economics, I work on where tokens actually go, what they cost, and which optimizations hold up under real load. Fifteen years of distributed systems habits carry over directly.

I published ONTO earlier this year (arXiv:2604.17512). It's a columnar serialization format built for LLM input. Existing formats were designed for services exchanging documents. That assumption breaks when the consumer is a language model reading thousands of records under a token budget. ONTO treats LLM input as its own problem and reduces token usage by 46 to 51 percent versus JSON, with no measurable loss in accuracy.
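The core idea behind columnar serialization is easy to see in miniature: row-oriented JSON repeats every field name in every record, while a columnar layout states the field names once and then emits compact value rows. The sketch below is illustrative only, using a hypothetical pipe-delimited rendering rather than the actual ONTO wire format, and character counts as a rough proxy for tokens:

```python
import json

# Three toy records; the repeated keys are what make row-oriented JSON expensive.
records = [
    {"id": 1, "name": "alpha", "status": "up"},
    {"id": 2, "name": "beta", "status": "down"},
    {"id": 3, "name": "gamma", "status": "up"},
]

# Row-oriented JSON: every record repeats every field name.
as_json = json.dumps(records)

# A columnar rendering: field names appear once, then one line per record.
# (Hypothetical format for illustration -- not the ONTO spec.)
header = "|".join(records[0].keys())
rows = "\n".join("|".join(str(v) for v in r.values()) for r in records)
as_columnar = header + "\n" + rows

# The columnar form is shorter, and the gap widens with more records,
# since the header cost is paid once rather than per record.
print(len(as_json), len(as_columnar))
```

With thousands of homogeneous records under a fixed token budget, amortizing the schema once is where savings in the reported range become plausible; the real format additionally has to handle tokenizer behavior and heterogeneous data, which this toy example does not.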

I also write a research series on token economics in production. One recent piece documents a previously undocumented behavior in OpenAI's API: prefix caches shared across model generations. At scale, this changes what you pay.

And I built an open-source AI Engineering Bootcamp. Ten weeks, production-focused, covering RAG, agents, LLMOps, observability, evaluation, and multi-agent patterns. I built it because it's the course I would have wanted when I started focusing on production AI.

Open to senior and staff roles. Remote or hybrid.

Selected Work

2026

ONTO

Columnar notation for LLM input. Published research on token efficiency.

2025

AI Engineering Bootcamp

10-week curriculum covering RAG, agents, LLMOps, and production patterns.

2025 — ongoing

Token Economics Research

Empirical series measuring where tokens go in production LLM systems.

Beyond Code

Work is serious. Life doesn't have to be.