AI & Engineering

16 posts

AI & Engineering Jun 18, 2026 · 7 min

A GenAI Project That Delivered Measurable Business Impact

A GenAI project succeeds or fails on whether you can name the business metric it moved — and prove it. The architecture is the price of entry, not the win.

AI & Engineering Jun 12, 2026 · 7 min

Data Governance for GenAI in Regulated Industries

In regulated, multi-tenant GenAI, governance isn't a constraint on the build — it's what decides whether you can ship. Design it into the first diagram.

AI & Engineering Jun 9, 2026 · 4 min

I gave AI sudo access to my legacy Node.js app while I was at lunch. It fixed 12 bugs.

A self-healing feedback loop: tail the logs, feed errors to Claude Code, let it fix, build, and restart — unattended. Here's what it actually fixed.

AI & Engineering Jun 9, 2026 · 6 min

Hallucination is not a model problem — it's a system design problem

Ground the model in retrieved evidence, constrain its output, verify its claims, and measure everything. A layered defense against LLM hallucination.

AI & Engineering Jun 4, 2026 · 7 min

How We Actually Measure RAG Quality

RAG quality by vibes doesn't survive a second engineer. Decompose by stage, build the eval set from real failures, calibrate the LLM judge, and gate CI.

AI & Engineering Jun 3, 2026 · 7 min

7 prompting techniques that actually work (tested & ranked)

A research-backed breakdown of the prompting methods — from zero-shot to ReAct — that reliably produce better, smarter, more useful AI output.

AI & Engineering Jun 1, 2026 · 7 min

How to structure your agentic workflows (Claude Code, Cursor, or any AI tool)

Five layers — context file, plan gate, atomic tasks, git checkpoints, verification loop — that make AI agents fast and trustworthy instead of chaotic.

AI & Engineering May 20, 2026 · 1 min

Episodic memory for agents with Graphiti

Episodic, bi-temporal memory with Graphiti lets an AI agent answer not just what's true now, but what was true when — without re-indexing the whole world.

AI & Engineering May 16, 2026 · 7 min

Multilingual RAG, Where the Languages Are Code

Code is many languages plus an identifier dialect that tokenizers shred. Use code-specialized embeddings, identifier-aware retrieval, and structure-aware chunking.

AI & Engineering May 7, 2026 · 7 min

Agentic Workflows That Don't Spiral Out of Control

Reliable agents come from control, not capability. Cap turns and time, push predictable steps into a state machine, and keep a human on the irreversible ones.

AI & Engineering May 2, 2026 · 2 min

Why knowledge graphs beat vector-only RAG

Vector-only RAG returns flat, context-poor chunks ranked by similarity. Knowledge graphs model entities and relationships, so retrieval can traverse how and why things connect.

AI & Engineering Apr 24, 2026 · 6 min

Choosing a Vector Database: Benchmarks from My Own Workloads

Every vector DB benchmark was run on someone else's data. Benchmark your own vector count, dimensions, and — above all — your real filtering pattern.

AI & Engineering Apr 9, 2026 · 8 min

POC to Production: What Breaks When RAG Meets Real Users

A RAG demo proves the happy path exists. Production is everything else — tracing, drift, evals, and learning to say 'I don't have that part.'

AI & Engineering Mar 30, 2026 · 7 min

The Hallucination Problem: What Reduced Ours, What Didn't

Most LLM hallucination is a retrieval failure in disguise. Fix the context first, force citations, and give the model a sanctioned way to say 'I don't know.'

AI & Engineering Mar 18, 2026 · 7 min

Cutting LLM Inference Cost ~40% Without Losing Quality

Inference cost optimization is a measurement problem in disguise. Fix the quality metric first, then trim context, route models, and cache the stable prefix.

AI & Engineering Mar 5, 2026 · 6 min

RAG vs Fine-Tuning vs Long Context: My Decision Framework

'Should we fine-tune?' is usually the wrong first question. Ask instead: is the gap knowledge or behavior, and how often does the answer change?