330 technical articles  ·  18 peer-reviewed publications  ·  Samsung Research  ·  IIT Madras

---

Greatest Hits

Hand-picked articles across speech tech, AI agents, and ML systems.

Paperclip: the org chart your AI agents are missing

AI Agents

Multi-agent coordination fails without governance. Paperclip models agent systems as companies with org charts, budgets, and audit trails.

TraceR1: planning before moving

AI Agents

Adobe's RL framework trains agents to forecast the full trajectory before taking the first action -- 8-40% gains over reactive baselines on long-horizon tasks.

Prompt injection is a structural attack

AI Security

Prompt injection isn't a content moderation problem. It's a structural consequence of how LLMs process tokens. Here's what actually works.

Gemma 4: three architectural decisions that changed what a small model can do

ML System Design

Hybrid attention, native multimodality, and a 26B MoE variant at 12% compute cost -- the decisions behind Gemma 4's architecture.

VibeVoice: how ultra-low frame rate tokenization solved 90-minute TTS

Speech Tech

7.5 Hz tokenization, 80x compression over Encodec, four distinct voices, 90 minutes. Why frame rate is the core constraint in long-form speech synthesis.

---

Building something with AI agents or speech tech?

I help startups and teams ship production AI systems -- from architecture reviews and advisory engagements to hands-on fractional CTO work. I also help businesses go AI-native with agentic workflows and agent orchestration.

Discuss Your Project

---

Research Publications

Peer-reviewed work at international speech technology venues.

Google Scholar profile →

Multilingual ASR with Improved Language Identification for Indic Languages

INTERSPEECH 2024, Kos Island — 19.1% absolute WER improvement with unified LID framework

Speaker Personalization for ASR using Weight-Decomposed Low-Rank Adaptation

INTERSPEECH 2024, Kos Island — 20% relative WER improvement with DoRA on conformer transducer

Deep Learning for Phonetic Segmentation in Indian Language TTS

INTERSPEECH 2017, Stockholm — Premier international speech technology conference

Robust Speaker Personalisation Using Generalized Low-Rank Adaptation for ASR

ICASSP 2024, Seoul — IEEE International Conference on Acoustics, Speech, and Signal Processing

All Publications →

---

Books

Long-form writing on building production AI systems.

AI Agent Orchestration

Architecture, patterns, and production lessons for multi-agent systems — beyond single-agent demos.

All Books →

---

Thoughts

Short reflections on experience, perspective, and learning.

Engineering Mental Health: Evidence Over Intuition

July 10, 2025

Engineering Longevity: What the Science Actually Says

July 07, 2025

Engineering Hormones and Metabolism: What Your Lab Results Are Not Telling You

July 04, 2025

All Thoughts →
