<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>search on Abhishek Murthy</title><link>https://abhishekmurthy.com/tags/search/</link><description>Recent content in search on Abhishek Murthy</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Mon, 24 Nov 2025 16:05:24 -0500</lastBuildDate><atom:link href="https://abhishekmurthy.com/tags/search/index.xml" rel="self" type="application/rss+xml"/><item><title>Building a search engine that fits in your L3 cache</title><link>https://abhishekmurthy.com/posts/search-engine-fits-in-l3-cache/</link><pubDate>Mon, 24 Nov 2025 16:05:24 -0500</pubDate><guid>https://abhishekmurthy.com/posts/search-engine-fits-in-l3-cache/</guid><description>The first version of my search engine was slower after I added an index.
That sounds backwards, but it is a real failure mode. A bad index can turn a simple sequential scan into a pile of cache misses, hash lookups, tiny heap allocations, branchy score calculations, and random memory walks. The CPU stops doing search and starts waiting for memory.
The target I set was intentionally unreasonable: a local search engine for a few hundred thousand short technical records that could answer ranked queries within an interactive UI latency budget, while keeping the hot path small enough to stay friendly to an L3 cache.</description></item><item><title>RAG is not provenance</title><link>https://abhishekmurthy.com/posts/rag-is-not-provenance/</link><pubDate>Fri, 20 Jun 2025 16:05:24 -0400</pubDate><guid>https://abhishekmurthy.com/posts/rag-is-not-provenance/</guid><description>The easiest way to make a document extraction system look good in a demo is to hide the evidence.
Ask a model to read a few PDFs, retrieve the top chunks, and fill a spreadsheet. If the answer looks plausible, ship the JSON. The failure usually arrives later, when somebody asks a very simple question:
Where did this number come from?
That question breaks a surprising number of systems. The value might be correct, but the cited source points at a row label.</description></item></channel></rss>