Viquar Khan Proposes Real-Time RAG Architecture

April 11, 2026 | By LDS Team

Relevance Score: 5.7

Viquar Khan, a Senior Architect at AWS, diagnoses production failures in agentic systems as "context rot" caused by stale retrieval stores, not model regressions. He argues embeddings are materialized views over transactional data and that traditional batch pipelines create silent drift that breaks long-running agents. Khan demonstrates a pragmatic architecture using Spark and Iceberg to implement low-latency change data capture, transactional upserts, and controlled write amplification so embedding stores reflect live state. The result: retrieval-augmented generation (RAG) systems that maintain accuracy across long sessions and continuous agent loops by keeping vector indices consistent with source-of-truth updates.

What happened

Viquar Khan, Senior Architect and GenAI specialist at AWS, describes a production outage in which an agent made incorrect operational decisions because its retrieval layer had drifted from live transactional data. The failure mode, which he names "context rot," emerges when long-running agentic RAG loops rely on stale embeddings and vector stores. Khan cites empirical failure points, including performance collapse near the 32,000-token threshold and studies showing that accuracy drops when relevant facts sit in the middle of a long context.

Technical details

Khan reframes embeddings as a materialized view of transactional data and shows why legacy batch ETL fails for high-frequency updates. He advocates using Spark with table formats like Iceberg to provide ACID semantics, incremental compaction, and efficient upserts that keep the embedding store current. Key implementation patterns he highlights include:

• streaming change data capture (CDC) into a transactional table to capture inserts, updates, and deletes
• incremental embedding recomputation for changed records and selective reindexing rather than full rebuilds
• write amplification controls and compaction strategies to avoid exploding storage and recompute costs

These elements let you maintain a near-real-time vector index that aligns with business state while bounding cost.
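
The first two patterns above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Khan's implementation: the `ChangeEvent` type, the `toy_embed` function, and the in-memory `index` dict are hypothetical stand-ins for a CDC stream, a real embedding model, and a vector store backed by a transactional table. The point it shows is that only records whose text actually changed are re-embedded, and deletes are propagated rather than left behind as stale vectors.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical CDC record: op is 'insert', 'update', or 'delete'."""
    op: str
    key: str
    text: Optional[str] = None  # deletes carry no text


def toy_embed(text: str, dim: int = 4) -> List[float]:
    """Stand-in for a real embedding model: a deterministic hash-based vector."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


def apply_changes(index: Dict[str, dict], events: List[ChangeEvent]) -> int:
    """Apply CDC events in order; return how many embeddings were recomputed."""
    recomputed = 0
    for ev in events:
        if ev.op == "delete":
            # Propagate deletes so the index never serves removed records.
            index.pop(ev.key, None)
            continue
        if ev.op not in ("insert", "update") or ev.text is None:
            raise ValueError(f"malformed change event: {ev!r}")
        current = index.get(ev.key)
        if current is not None and current["text"] == ev.text:
            # Text unchanged: skip the recompute to limit write amplification.
            continue
        index[ev.key] = {"text": ev.text, "vector": toy_embed(ev.text)}
        recomputed += 1
    return recomputed


index: Dict[str, dict] = {}
events = [
    ChangeEvent("insert", "order-1", "status: pending"),
    ChangeEvent("insert", "order-2", "status: pending"),
    ChangeEvent("update", "order-1", "status: pending"),  # no-op update
    ChangeEvent("update", "order-2", "status: shipped"),
    ChangeEvent("delete", "order-1"),
]
recomputed = apply_changes(index, events)
```

In a Spark and Iceberg deployment the same logic would typically be expressed as an upsert into a transactional table rather than a dict mutation; the dict here only keeps the sketch self-contained.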

Context and significance

This is an operationally focused intervention, not a new model or training technique. Its significance is practical: as agents move from stateless prompts to continuous execution, data systems become the primary failure surface. The article shifts responsibility from the LLM to the data engineering layer and makes clear that retrieval consistency, not bigger models, is the limiting factor for reliable agent behavior. Vendor vector DBs that lack transactional connectivity, or pipelines that recompute embeddings in large batches, are fragile for agentic workloads. Adopting transactional table formats and event-driven embedding refreshes bridges that gap.

What to watch

Look for tighter CDC-to-vector pipelines, native connectors between Iceberg-like formats and vector search tools, and vendor features that treat embeddings as first-class incremental materialized views. Teams building agentic systems should instrument retrieval freshness and budget for incremental recompute, compaction, and TTL policies to avoid silent degradation.
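
One way to instrument retrieval freshness is to compare, per record, when the source row last changed against when its embedding was last computed. The sketch below is an assumption-laden illustration: `EmbeddingRecord`, `stale_keys`, `max_lag`, and the 24-hour TTL are hypothetical names and values chosen for the example, not anything specified in the article.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass(frozen=True)
class EmbeddingRecord:
    """Hypothetical metadata kept alongside each vector."""
    key: str
    source_updated_at: datetime  # last change to the source row
    embedded_at: datetime        # when the current embedding was computed


def stale_keys(records: List[EmbeddingRecord], now: datetime,
               ttl: timedelta) -> List[str]:
    """Keys whose embedding predates the source change or is older than ttl."""
    out = []
    for r in records:
        behind_source = r.embedded_at < r.source_updated_at
        expired = now - r.embedded_at > ttl
        if behind_source or expired:
            out.append(r.key)
    return out


def max_lag(records: List[EmbeddingRecord], now: datetime) -> timedelta:
    """Longest time any source change has been waiting to be re-embedded."""
    lags = [now - r.source_updated_at
            for r in records if r.embedded_at < r.source_updated_at]
    return max(lags, default=timedelta(0))


utc = timezone.utc
now = datetime(2026, 4, 11, 12, 0, tzinfo=utc)
records = [
    # Embedded after the last source change and within the TTL: fresh.
    EmbeddingRecord("order-1", datetime(2026, 4, 11, 11, 0, tzinfo=utc),
                    datetime(2026, 4, 11, 11, 5, tzinfo=utc)),
    # Source changed after the embedding was computed: behind the source.
    EmbeddingRecord("order-2", datetime(2026, 4, 11, 11, 30, tzinfo=utc),
                    datetime(2026, 4, 11, 10, 0, tzinfo=utc)),
    # Up to date with the source but older than the TTL: expired.
    EmbeddingRecord("order-3", datetime(2026, 4, 9, 9, 0, tzinfo=utc),
                    datetime(2026, 4, 9, 10, 0, tzinfo=utc)),
]
stale = stale_keys(records, now, ttl=timedelta(hours=24))
lag = max_lag(records, now)
```

Exporting `lag` as a metric and alerting when it exceeds a budget turns silent drift into a visible signal; the keys in `stale` are the candidates for incremental recompute.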

Key Points

1. Context rot occurs when embeddings diverge from live transactional data, causing agent hallucinations and bad operational decisions.
2. Treat embeddings as materialized views and use CDC plus transactional tables to keep vector indexes synchronized in near real time.
3. Combining Spark and Iceberg with incremental recompute and compaction controls reduces write amplification and preserves retrieval accuracy.

Scoring Rationale

The piece delivers a high-value operational pattern for production RAG and agentic systems, directly relevant to ML engineering. It is practical rather than research-first, so significance is notable but not paradigm-shifting. The story is older than three days, so its score is reduced for recency.

Sources

Public references used for this report.

01 hackernoon.com: Real-Time Agentic RAG: Eradicating Context Rot With Spark & Iceberg
