How Spotify Uses GenAI and ML to Annotate a Hundred Million Tracks
Disclaimer: The details in this post have been derived from the articles shared online by the Spotify Engineering Team. All credit for the technical details goes to the Spotify Engineering Team. The links to the original articles and sources are present in the references section at the end of the post. We've attempted to analyze the details and provide our input about them. If you find any inaccuracies or omissions, please leave a comment, and we will do our best to fix them.

Spotify applies machine learning across its catalog to support key features. One set of models assigns tracks and albums to the correct artist pages, handling cases where metadata is missing, inconsistent, or duplicated. Another set analyzes podcasts to detect platform policy violations; these models review audio, video, and metadata to flag restricted content before it reaches listeners.

All of these activities depend on large volumes of high-quality annotations. Annotations act as the ground truth for model training and evaluation. Without them, model accuracy drops, feedback loops fail, and feature development slows down.

As the number of use cases increased, the existing annotation workflows at Spotify became a bottleneck. Each team built isolated tools, managed its own reviewers, and shipped data through manual processes that didn't scale or integrate with machine learning pipelines. The problem was structural: annotation was treated as an isolated task instead of a core part of the machine learning workflow. There was no shared tooling, no centralized workforce model, and no infrastructure to automate annotation at scale.

This article explains how Spotify addressed these challenges by building an annotation platform designed to scale with its machine learning needs.
Moving from Manual Workflow to Scalable Annotation

The starting point was a straightforward machine learning (ML) classification task. The team needed annotations to evaluate model predictions and improve training quality, so they built a minimal pipeline to collect them.

They began by sampling model outputs and serving them to human annotators through simple scripts. Each annotation was reviewed, captured, and passed back into the system. The annotated data was then integrated directly into model training and evaluation workflows. There was no full-fledged platform yet, just a focused attempt to connect annotations to something real and measurable. Even with this basic setup, the results were significant.
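To make the shape of that early pipeline concrete, here is a minimal Python sketch of how such a loop might be wired together. This is an illustration under assumptions, not Spotify's actual code: the names (AnnotationTask, sample_predictions, collect_annotations) and the callback-based annotator interface are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class AnnotationTask:
    """One model output queued for human review."""
    item_id: str
    model_prediction: str
    human_label: Optional[str] = None


def sample_predictions(predictions: Dict[str, str], k: int) -> List[AnnotationTask]:
    """Sample k model outputs to send to annotators."""
    sampled = random.sample(list(predictions.items()), min(k, len(predictions)))
    return [AnnotationTask(item_id=i, model_prediction=p) for i, p in sampled]


def collect_annotations(
    tasks: List[AnnotationTask],
    annotate: Callable[[AnnotationTask], str],
) -> List[AnnotationTask]:
    """Serve each task to an annotator (modeled here as a callback) and record the label."""
    for task in tasks:
        task.human_label = annotate(task)
    return tasks


def to_examples(tasks: List[AnnotationTask]) -> List[Tuple[str, str]]:
    """Convert reviewed tasks into (item_id, label) pairs for training or evaluation."""
    return [(t.item_id, t.human_label) for t in tasks if t.human_label is not None]
```

In a real system the annotator callback would be replaced by an annotation tool's UI, and the resulting pairs would flow into the training and evaluation datasets described above.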
This early success wasn't just about volume. It showed that when annotation is directly tied into the model lifecycle, feedback loops become more useful and productivity improves. The outcome was enough to justify further investment. From here, the focus shifted from running isolated tasks to building a dedicated platform that could generalize the workflow and support many ML use cases in parallel.

Platform Architecture

The overall platform architecture consists of three pillars: scaling human expertise, building annotation tooling for complex tasks, and foundational infrastructure and integration. See the diagram below for reference. Let's look at each pillar in more detail.

1 - Scaling Human Expertise

To support large-scale annotation work, Spotify first focused on organizing its human annotation resources. Instead of treating annotators as a generic pool, the team defined clear roles with distinct responsibilities and escalation paths. The annotation workforce was structured into three levels.
In parallel with the human effort, Spotify also developed a configurable system powered by large language models. This system operates in conjunction with human annotators and is designed to generate high-quality labels for cases that follow predictable patterns. It is not a full replacement but a complement that handles clear-cut examples, allowing humans to focus on harder problems. See the diagram below.

This hybrid model significantly increased annotation throughput. By assigning the right cases to the right annotator (human or machine), Spotify was able to expand its dataset coverage at a lower cost and with higher consistency.
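The article doesn't describe how cases are routed between the LLM-based system and human experts, but one plausible mechanism is confidence-based routing. The sketch below assumes a hypothetical llm_annotate function that returns a label plus a confidence score, and an arbitrary threshold; both are illustrative assumptions.

```python
from typing import Callable, Tuple

# Assumed cutoff: labels the LLM produces with high confidence are accepted
# automatically; everything else is escalated to a human annotator.
CONFIDENCE_THRESHOLD = 0.9


def hybrid_annotate(
    item: str,
    llm_annotate: Callable[[str], Tuple[str, float]],
    human_annotate: Callable[[str], str],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Tuple[str, str]:
    """Label one item and report whether the label came from 'llm' or 'human'."""
    label, confidence = llm_annotate(item)
    if confidence >= threshold:
        return label, "llm"  # clear-cut, predictable case
    return human_annotate(item), "human"  # ambiguous case needs expert judgment
```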
2 - Building Annotation Tooling for Complex Tasks

As annotation needs grew beyond simple classification, Spotify expanded its tooling to support a wide range of complex, multimodal tasks. Early projects focused on basic question-answer formats, but new use cases required more flexible and interactive workflows. To support these varied requirements, the team invested in several core areas of tooling.
See the diagram below for annotation tooling capabilities.

In cases where annotation tasks involved subjective interpretation or fine-grained distinctions, such as identifying background music layered into a podcast, different experts could produce conflicting results. To handle this, the system computed an agreement score across annotators. Items with low agreement were automatically escalated to the quality analysts for resolution.

This structure allowed multiple annotation projects to run in parallel with consistent oversight, predictable output quality, and tight feedback loops between engineers, annotators, and reviewers. It turned what was once a manual process into a managed and observable workflow.
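The post doesn't name the agreement metric used for the escalation step above, so the sketch below uses simple pairwise percent agreement (metrics such as Cohen's kappa or Krippendorff's alpha would be common alternatives). Items whose score falls below a chosen threshold are flagged for quality analysts; the function names and threshold are illustrative.

```python
from itertools import combinations
from typing import Dict, List


def pairwise_agreement(labels: List[str]) -> float:
    """Fraction of annotator pairs that gave the same label (percent agreement)."""
    if len(labels) < 2:
        return 1.0
    pairs = list(combinations(labels, 2))
    return sum(a == b for a, b in pairs) / len(pairs)


def escalate_low_agreement(
    items: Dict[str, List[str]], threshold: float = 0.7
) -> List[str]:
    """Return ids of items whose agreement falls below the threshold,
    so they can be routed to quality analysts for resolution."""
    return [
        item_id
        for item_id, labels in items.items()
        if pairwise_agreement(labels) < threshold
    ]


# Example: three annotators disagree on whether an episode contains
# layered background music.
votes = {
    "episode_a": ["music", "music", "no_music"],
    "episode_b": ["music", "music", "music"],
}
print(escalate_low_agreement(votes))  # -> ['episode_a']
```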
3 - Foundational Infrastructure and Integration

To support annotation at Spotify scale, the platform infrastructure was designed to be flexible and tool-agnostic. No single tool can serve all annotation needs, so the team focused on building the right abstractions. These abstractions make it possible to integrate a variety of tools depending on the task, while keeping the core system consistent and maintainable. See the diagram below.

Integration was built across two levels of ML development: research experimentation and production pipelines.
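Spotify hasn't published the actual interfaces, but a tool-agnostic layer of this kind often amounts to a small adapter interface that every annotation tool implements, plus a common task description. The names below (TaskSpec, AnnotationBackend, run_annotation_job) are illustrative assumptions, not the platform's real API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TaskSpec:
    """Tool-agnostic description of an annotation task."""
    name: str
    media_type: str           # e.g. "audio", "video", "text"
    label_schema: List[str]   # allowed labels for the task


class AnnotationBackend(ABC):
    """Interface each concrete annotation tool adapter implements."""

    @abstractmethod
    def submit(self, spec: TaskSpec, items: List[Dict]) -> str:
        """Create a job in the underlying tool and return a job id."""

    @abstractmethod
    def fetch_results(self, job_id: str) -> List[Dict]:
        """Return completed annotations in a common, tool-independent format."""


def run_annotation_job(
    backend: AnnotationBackend, spec: TaskSpec, items: List[Dict]
) -> List[Dict]:
    """Single entry point, whether called ad hoc from a notebook or from a pipeline."""
    job_id = backend.submit(spec, items)
    return backend.fetch_results(job_id)
```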
The result is a system that supports both fast iteration in research environments and stable operation in production pipelines. Engineers can move between these modes without changing how they define tasks or access results. The infrastructure sits behind the tooling, but it is what allows the annotation platform to scale efficiently across diverse use cases.

Impact on Annotation Velocity

The shift from manual, fragmented workflows to a unified annotation platform resulted in a sharp increase in annotation throughput. Internal metrics showed a sustained acceleration in annotation volume over time, driven by both improved tooling and more efficient workforce coordination. See the figure below, which shows the rate of annotations over time.
This increase in velocity directly reduced the time required to develop and iterate on machine learning models. Teams were able to move faster across several dimensions.
As a result, ML teams could test hypotheses, refine models, and ship features faster. The annotation platform became a core enabler for iterative, data-driven development at scale.

Conclusion

Spotify's annotation platform is built on a clear principle: scaling machine learning requires more than just more data or larger models. It depends on structured, high-quality annotations delivered through systems that are efficient, adaptable, and integrated into the full model development lifecycle.

Relying entirely on human labor creates bottlenecks; full automation without oversight leads to quality drift. Real leverage comes from combining both, with humans providing context and judgment and automation handling volume and repeatability.

By moving from isolated workflows to a unified platform, Spotify turned annotation into a shared capability rather than a one-time cost. Standardized roles, modular tools, and consistent infrastructure allowed ML teams to build and iterate faster without rebuilding pipelines from scratch. This approach supports fast experimentation and scaling across a wide range of use cases.

References:
by "ByteByteGo" <bytebytego@substack.com> - 11:36 - 1 Jul 2025