rust on Abhishek Murthy

https://abhishekmurthy.com/tags/rust/
Recent content in rust on Abhishek Murthy
Sat, 23 May 2026 21:22:16 -0400

Durable OCR pipelines with Restate, Rust, and agent workers
https://abhishekmurthy.com/posts/building-pdf-extraction-pipelines-that-survive-real-documents/
Sat, 23 May 2026 21:22:16 -0400

Most document extraction systems start life as a three-line demo:

    text = pdf.extract_text()
    result = model.extract(schema, text)
    save(result)

That demo is useful because it proves the shape of the product. It is also where the architecture usually starts lying to you. The real system is not “PDF in, JSON out”. It is a distributed rendering, OCR, indexing, retrieval, agent execution, validation, and evaluation pipeline with unreliable inputs at every layer. The failure modes are not just “the model got the answer wrong”.

Building a search engine that fits in your L3 cache
https://abhishekmurthy.com/posts/search-engine-fits-in-l3-cache/
Mon, 24 Nov 2025 16:05:24 -0500

The first version of my search engine was slower after I added an index. That sounds backwards, but it is a real failure mode. A bad index can turn a simple sequential scan into a pile of cache misses, hash lookups, tiny heap allocations, branchy score calculations, and random memory walks. The CPU stops doing search and starts waiting for memory. The target I wanted was intentionally unreasonable: a local search engine for a few hundred thousand short technical records that could answer ranked queries inside an interactive UI budget, while keeping the hot path small enough to stay friendly to an L3 cache.
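
To make the cache argument in the summary above concrete, here is a minimal sketch of one layout that avoids per-term heap allocations: every posting list is stored in a single contiguous buffer, and each term owns a range into it. This is an illustration only, not the implementation from the post. The names `FlatIndex`, `build`, and `postings` are hypothetical, and the whitespace tokenizer is an assumption chosen to keep the example short.

```rust
/// Hypothetical flat inverted index (not the post's actual code).
/// All doc ids sit in one contiguous buffer, so reading a posting list
/// is a linear walk over adjacent memory instead of a pointer chase.
pub struct FlatIndex {
    terms: Vec<String>, // sorted, unique, lowercased terms
    offsets: Vec<u32>,  // postings for terms[i] live at offsets[i]..offsets[i + 1]
    postings: Vec<u32>, // doc ids grouped by term, ascending within each term
}

impl FlatIndex {
    /// Builds the index from a slice of documents; a doc id is the
    /// document's position in the slice.
    pub fn build(docs: &[&str]) -> Self {
        // Collect (term, doc_id) pairs, then sort so equal terms are adjacent.
        let mut pairs: Vec<(String, u32)> = Vec::new();
        for (doc_id, text) in docs.iter().enumerate() {
            for word in text.split_whitespace() {
                pairs.push((word.to_lowercase(), doc_id as u32));
            }
        }
        pairs.sort();
        pairs.dedup(); // a term repeated inside one document counts once

        let mut terms: Vec<String> = Vec::new();
        let mut offsets: Vec<u32> = Vec::new();
        let mut postings: Vec<u32> = Vec::new();
        for (term, doc_id) in pairs {
            // A new term starts a new range in the shared buffer.
            if terms.last() != Some(&term) {
                offsets.push(postings.len() as u32);
                terms.push(term);
            }
            postings.push(doc_id);
        }
        // Sentinel offset so the last term also has an end bound.
        offsets.push(postings.len() as u32);

        FlatIndex { terms, offsets, postings }
    }

    /// Returns the doc ids containing `term` (expected in lowercase),
    /// or an empty slice if the term is unknown.
    pub fn postings(&self, term: &str) -> &[u32] {
        match self.terms.binary_search_by(|t| t.as_str().cmp(term)) {
            Ok(i) => {
                let start = self.offsets[i] as usize;
                let end = self.offsets[i + 1] as usize;
                &self.postings[start..end]
            }
            Err(_) => &[],
        }
    }
}
```

For example, building the index over `["rust search engine", "cache friendly search"]` and calling `postings("search")` returns a slice borrowed directly from the shared buffer, with no allocation on the query path. The term dictionary here is still a `Vec<String>`, which allocates per term; a tighter variant could pack term bytes into one buffer as well.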