Tag

#llmops

7 posts

AI & Engineering Jun 9, 2026 · 6 min

Hallucination is not a model problem — it's a system design problem

Ground the model in retrieved evidence, constrain its output, verify its claims, and measure everything. A layered defense against LLM hallucination.

Tech Opinions Jun 1, 2026 · 2 min

Making AI production-ready isn't about the model

Shipping production AI isn't about model benchmarks. It's about the retries, fallbacks, observability, and cost discipline that keep LLM systems reliable.

AI & Engineering May 7, 2026 · 7 min

Agentic Workflows That Don't Spiral Out of Control

Reliable agents come from control, not capability. Cap turns and time, push predictable steps into a state machine, and keep a human on the irreversible ones.

AI & Engineering Apr 9, 2026 · 8 min

POC to Production: What Breaks When RAG Meets Real Users

A RAG demo proves the happy path exists. Production is everything else — tracing, drift, evals, and learning to say 'I don't have that part.'

AI & Engineering Mar 30, 2026 · 7 min

The Hallucination Problem: What Reduced Ours, What Didn't

Most LLM hallucination is a retrieval failure in disguise. Fix the context first, force citations, and give the model a sanctioned way to say 'I don't know.'

AI & Engineering Mar 18, 2026 · 7 min

Cutting LLM Inference Cost ~40% Without Losing Quality

Inference cost optimization is a measurement problem in disguise. Fix the quality metric first, then trim context, route models, and cache the stable prefix.

AI & Engineering Mar 5, 2026 · 6 min

RAG vs Fine-Tuning vs Long Context: My Decision Framework

'Should we fine-tune?' is usually the wrong first question. Ask instead: is the gap knowledge or behavior, and how often does the answer change?