Deep-Dives by Topic
AI Agents
85 articles
Autonomous systems, multi-agent orchestration, tool use, memory, and production deployment.
AI Security
36 articles
Red teaming, prompt injection, agent attack surfaces, adversarial audio, and governance.
Speech Tech
70 articles
ASR, TTS, speech separation, voice enhancement, and real-time audio processing.
ML System Design
79 articles
Production ML architecture, serving, feature stores, evaluation, and inference patterns.
DSA
60 articles
Data structures and algorithms with detailed solutions and complexity analysis.
Thoughts
26 posts
Short reflections on health, longevity, focus, and the science of living well.
Greatest Hits
Hand-picked articles across speech tech, AI agents, and ML systems.
Paperclip: the org chart your AI agents are missing
AI Agents
Multi-agent coordination fails without governance. Paperclip models agent systems as companies with org charts, budgets, and audit trails.
TraceR1: planning before moving
AI Agents
Adobe's RL framework trains agents to forecast the full trajectory before taking the first action -- 8-40% gains over reactive baselines on long-horizon tasks.
Prompt injection is a structural attack
AI Security
Prompt injection isn't a content moderation problem. It's a structural consequence of how LLMs process tokens. Here's what actually works.
Gemma 4: three architectural decisions that changed what a small model can do
ML System Design
Hybrid attention, native multimodality, and a 26B MoE variant at 12% compute cost -- the decisions behind Gemma 4's architecture.
VibeVoice: how ultra-low frame rate tokenization solved 90-minute TTS
Speech Tech
7.5 Hz tokenization, 80x compression over Encodec, four distinct voices, 90 minutes. Why frame rate is the core constraint in long-form speech synthesis.
Building something with AI agents or speech tech?
I help startups and teams ship production AI systems -- from architecture reviews and advisory engagements to hands-on fractional CTO work. I also help businesses go AI-native with agentic workflows and agent orchestration.
Discuss Your Project
Research Publications
Peer-reviewed work at international speech technology venues.
Multilingual ASR with Improved Language Identification for Indic Languages
INTERSPEECH 2024, Kos Island — 19.1% absolute WER improvement with unified LID framework
Speaker Personalization for ASR using Weight-Decomposed Low-Rank Adaptation
INTERSPEECH 2024, Kos Island — 20% relative WER improvement with DoRA on conformer transducer
Deep Learning for Phonetic Segmentation in Indian Language TTS
INTERSPEECH 2017, Stockholm — Premier international speech technology conference
Robust Speaker Personalisation Using Generalized Low-Rank Adaptation for ASR
ICASSP 2024, Seoul — IEEE International Conference on Acoustics, Speech, and Signal Processing
Books
Long-form writing on building production AI systems.
AI Agent Orchestration
Architecture, patterns, and production lessons for multi-agent systems — beyond single-agent demos.
Thoughts
Short reflections on experience, perspective, and learning.
Engineering Mental Health: Evidence Over Intuition
July 10, 2025
Engineering Longevity: What the Science Actually Says
July 07, 2025