23: Searching for Neighbors with Voyager
Episode Synopsis
How do you get a machine to find a song that’s similar to another song? What properties of the song should it look for? And then does it just compare each track to every other track, one by one, until it finds the closest match? When you have a catalog of 100 million different music tracks, like we do at Spotify, that would take a long time. So, for these kinds of problems, we use a technique known as nearest neighbor search (NNS). This past summer at Spotify, we built a new library for nearest neighbor search: It’s called Voyager — and we open sourced it.
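To make the "compare each track to every other track" idea concrete, here is a minimal sketch of exhaustive nearest neighbor search in plain Python. This is not how Voyager works internally. The track names, the two-number feature vectors, and the helper functions are all hypothetical, chosen only to illustrate the brute-force baseline that approximate libraries are designed to avoid.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbors(query, catalog, k=1):
    """Return the names of the k catalog entries closest to `query`.

    This scans every entry, so the cost grows linearly with catalog size.
    """
    ranked = sorted(catalog.items(), key=lambda item: euclidean(query, item[1]))
    return [name for name, _ in ranked[:k]]

# Hypothetical catalog: each track is described by two made-up features.
catalog = {
    "track_a": [0.9, 0.8],
    "track_b": [0.2, 0.1],
    "track_c": [0.85, 0.75],
    "track_d": [0.5, 0.4],
}

print(nearest_neighbors([0.88, 0.79], catalog, k=2))
```

With four tracks this is instant. With a catalog of 100 million tracks and vectors of many dimensions, every query would have to touch every vector, which is why an approximate index is used instead.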
Host and principal engineer Dave Zolotusky talks with Peter Sobot and Mark Koh, two of the machine learning engineers who developed Voyager. They discuss using nearest neighbor search for recommendations and personalization, how to go from searching for vectors in a 2D space to searching for them in a space with thousands of dimensions, the relative funkiness and danceability of Mozart and Bach, how to find a place on a map when you don’t have the exact coordinates, tricky acronyms (Annoy: “Approximate Nearest Neighbor Oh Yeah”) and initialisms (HNSW: “Hierarchical Navigable Small World”), why we stopped using our old NNS library, why we open sourced the new one, how it works for use cases beyond music (like LLMs), and looking for ducks in grass.
Learn more about Spotify Voyager:
About Voyager
Voyager on GitHub
Voyager documentation for Python
Voyager documentation for Java
Read what else we’re nerding out about on the Spotify Engineering Blog: engineering.atspotify.com
You should follow us on Twitter @SpotifyEng and on LinkedIn!
More episodes of the podcast NerdOut@Spotify
29: Deploying Our New Typeface: Spotify Mix
03/04/2025
28: The CNCF Turns 10
06/03/2025
27: Measuring Developer Productivity
18/04/2024
26: A Trillion Events
08/02/2024
25: Voice Translation *Release Notes*
11/01/2024
24: Tesla *Release Notes*
09/11/2023
22: Declarative Infra and Beyond
24/08/2023