#046 Building a Search Database From First Principles

13/03/2025 53 min

Listen "#046 Building a Search Database From First Principles"

Descargar episodio Ver en sitio original

Episode Synopsis

Modern search is broken. There are too many pieces that are glued together.Vector databases for semantic searchText engines for keywordsRerankers to fix the resultsLLMs to understand queriesMetadata filters for precisionEach piece works well alone.Together, they often become a mess.When you glue these systems together, you create:Data Consistency Gaps Your vector store knows about documents your text engine doesn't. Which is right?Timing Mismatches New content appears in one system before another. Users see different results depending on which path their query takes.Complexity Explosion Every new component doubles your integration points. Three components means three connections. Five means ten.Performance Bottlenecks Each hop between systems adds latency. A 200ms search becomes 800ms after passing through four components.Brittle Chains When one system fails, your entire search breaks. More pieces mean more breaking points.I recently built a system where we had query specific post-filters but the requirement to deliver a fixed number of results to the user.A lot of times, the query had to be run multiple times to achieve the desired amount.So we had an unpredictable latency. A high load on the backend, where some queries hammered the database 10+ times. A relevance cliff, where results 1-6 look great, but the later ones were poor matches.Today on How AI Is Built, we are talking to Marek Galovic from TopK.We talk about how they built a new search database with modern components. "How would search work if we built it today?”Cloud storage is cheap. Compute is fast. Memory is plentiful.One system that handles vectors, text, and filters together - not three systems duct-taped into one.One pass handles everything:Vector search + Text search + Filters → Single sorted result
Built with hand-optimized Rust kernels for both x86 and ARM, the system scales to 100M documents with 200ms P99 latency.The goal is to do search in 5 lines of code.Marek Galovic:LinkedInWebsiteTopK WebsiteTopK DocsNicolay Gerold:⁠LinkedIn⁠⁠X (Twitter)00:00 Introduction to TopK and Snowflake Comparison00:35 Architectural Patterns and Custom Formats01:30 Query Execution Engine Explained02:56 Distributed Systems and Rust04:12 Query Execution Process06:56 Custom File Formats for Search11:45 Handling Distributed Queries16:28 Consistency Models and Use Cases26:47 Exploring Database Versioning and Snapshots27:27 Performance Benchmarks: Rust vs. C/C++29:02 Scaling and Latency in Large Datasets29:39 GPU Acceleration and Use Cases31:04 Optimizing Search Relevance and Hybrid Search34:39 Advanced Search Features and Custom Scoring38:43 Future Directions and Research in AI47:11 Takeaways for Building AI Applications

More episodes of the podcast How AI Is Built

#056 Building Solo: How One Engineer Uses AI Agents to Ship Production Code 11/09/2025

#055 Embedding Intelligence: AI's Move to the Edge 13/08/2025

#054 Building Frankenstein Models with Model Merging and the Future of AI 29/07/2025

#053 AI in the Terminal: Enhancing Coding with Warp 23/07/2025

#052 Don't Build Models, Build Systems That Build Models 01/07/2025

#051 Build systems that can be debugged at 4am by tired humans with no context 17/06/2025

#050 Bringing LLMs to Production: Delete Frameworks, Avoid Finetuning, Ship Faster 27/05/2025

#050 TAKEAWAYS Bringing LLMs to Production: Delete Frameworks, Avoid Finetuning, Ship Faster 27/05/2025

#049 BAML: The Programming Language That Turns LLMs into Predictable Functions 20/05/2025

#049 TAKEAWAYS BAML: The Programming Language That Turns LLMs into Predictable Functions 20/05/2025

Ver todos los episodios

ZARZA We are Zarza, the prestigious firm behind major projects in information technology.

#046 Building a Search Database From First Principles

Listen "#046 Building a Search Database From First Principles"

Episode Synopsis

More episodes of the podcast How AI Is Built

Digital Natives: Children of today, Technologists of Tomorrow

Subdomains, a glance with the experts!

Bandwidth: Broadband or Narrowband?

Personnel recruitment via Web

Deep web or Invisible Internet

Subdomains, a glance with the experts!

Free Internet, a prediction in Nostradamus style

Educational Technology: From traditional to digital

Localhost, there’s no place like 127.0.0.1

Googling with breathtaking tricks you ignore

Gray Hat Hacking, those with ambiguous ethics…

Internet Predators on the prowl

Dot COM: The Internet’s dominant TLD