How Facebook Live Scaled to a Billion Users
Disclaimer: The details in this post have been derived from the articles/videos shared online by the Facebook/Meta engineering team. All credit for the technical details goes to the Facebook/Meta Engineering Team. The links to the original articles and videos are present in the references section at the end of the post. We've attempted to analyze the details and provide our input about them. If you find any inaccuracies or omissions, please leave a comment, and we will do our best to fix them.

Facebook didn't set out to dominate live video overnight. The platform's live streaming capability began as a hackathon project with the modest goal of seeing how fast the team could push video through a prototype backend. It gave them a way to measure end-to-end latency under real conditions, and that test shaped everything that followed.

Facebook Live moved fast by necessity. From that rooftop prototype, it took just four months to launch an MVP through the Mentions app, aimed at public figures like Dwayne Johnson. Within eight months, the platform rolled out to an entire user base of more than a billion users.

The video infrastructure team at Facebook owns the end-to-end path of every video: uploads from mobile phones, distributed encoding in data centers, and real-time playback across the globe. They build for scale by default, not because it sounds good in a deck, but because scale is a constraint. When 1.2 billion users might press play, bad architecture quickly turns into visible failures.

The infrastructure rested on a few foundational principles: composable systems, predictable patterns, and sharp handling of chaos. Every stream, whether it came from a celebrity or a teenager's backyard, needed the same guarantees: low latency, high availability, and smooth playback. And every bug, outage, and unexpected spike forced the team to build smarter, not bigger.

In this article, we'll look at how Facebook Live was built and the kinds of challenges the team faced along the way.

Core Components Behind Facebook Video

At the heart of Facebook's video strategy lies a sprawling infrastructure. Each component serves a specific role in making sure video content flows smoothly from creators to viewers, no matter where they are or what device they're using. The diagram below shows a high-level view of this infrastructure.

Fast, Fail-Tolerant Uploads

The upload pipeline is where the video journey begins. It handles everything from a celebrity's studio-grade stream to a shaky phone video in a moving car. Uploads must be fast, but more importantly, they must be resilient. Network drops, flaky connections, or device quirks shouldn't stall the system.
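The article doesn't show Facebook's client code, but the resilience idea is roughly this: split the file into chunks and retry each chunk independently, so a network drop costs one chunk rather than the whole upload. A minimal sketch, assuming a hypothetical chunk-upload endpoint:

```python
import time

import requests

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB per chunk (illustrative value)
# Hypothetical endpoint; the real upload API is not shown in the article.
UPLOAD_URL = "https://upload.example.com/videos/{video_id}/chunks/{index}"


def upload_video(path: str, video_id: str, max_retries: int = 5) -> None:
    """Upload a file chunk by chunk so a dropped connection only costs one chunk."""
    with open(path, "rb") as f:
        index = 0
        while chunk := f.read(CHUNK_SIZE):
            for attempt in range(max_retries):
                try:
                    resp = requests.put(
                        UPLOAD_URL.format(video_id=video_id, index=index),
                        data=chunk,
                        timeout=10,
                    )
                    resp.raise_for_status()
                    break  # this chunk is safely stored; move on to the next one
                except requests.RequestException:
                    time.sleep(2 ** attempt)  # back off and retry just this chunk
            else:
                raise RuntimeError(f"chunk {index} failed after {max_retries} attempts")
            index += 1
```

A production pipeline would also ask the server which chunks it already has, so an interrupted upload can resume where it left off instead of starting over.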
Beyond reliability, the system clusters similar videos. This feeds the recommendation engines that suggest related content to users. The grouping happens based on visual and audio similarity, not just titles or tags, which helps surface videos that feel naturally connected even when their metadata disagrees.

Encoding at Scale

Encoding is a computationally heavy bottleneck if done naively. Facebook splits incoming videos into chunks, encodes the chunks in parallel, and stitches them back together. This massively reduces latency and allows the system to scale horizontally.
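As a rough sketch of that chunk-and-stitch pattern (file names, chunk length, and codec settings here are illustrative, not Facebook's actual pipeline), the fan-out might look like this:

```python
import subprocess
from concurrent.futures import ProcessPoolExecutor

CHUNK_SECONDS = 10  # encode the source in 10-second slices (illustrative)


def encode_chunk(src: str, start: int, out: str) -> str:
    """Encode one time slice of the source independently of all the others."""
    subprocess.run(
        ["ffmpeg", "-y", "-ss", str(start), "-t", str(CHUNK_SECONDS),
         "-i", src, "-c:v", "libx264", "-preset", "veryfast", "-c:a", "aac", out],
        check=True,
    )
    return out


def encode_parallel(src: str, duration: int) -> list[str]:
    """Fan the slices out across CPU cores, returning them in playback order."""
    starts = range(0, duration, CHUNK_SECONDS)
    with ProcessPoolExecutor() as pool:
        futures = [
            pool.submit(encode_chunk, src, start, f"chunk_{i:04d}.mp4")
            for i, start in enumerate(starts)
        ]
        return [f.result() for f in futures]  # stitched back together afterwards
```

The encoded chunks can then be concatenated (for example with ffmpeg's concat demuxer) and packaged for delivery.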
This platform prepares content for consumption across every device class and network condition. Mobile users in rural zones, desktop viewers on fiber: everyone gets a version that fits their bandwidth and screen.

Live Video as a First-Class Citizen

Live streams add a layer of complexity. Unlike uploaded videos, live content arrives raw, gets processed on the fly, and must reach viewers with minimal delay. The architecture must absorb the chaos of real-time creation while keeping delivery tight and stable.
This is like a two-way street: comments, reactions, and viewer engagement flow back to the broadcaster, making live content deeply interactive. Building that loop demands real-time coordination across networks, services, and user devices.

Scalability Requirements

Scaling Facebook Live is about building for a reality where "peak traffic" is the norm. With over 1.23 billion people logging in daily, the infrastructure must assume high load as the baseline, not the exception. Some of the key scaling requirements were as follows.

Scale Is the Starting Point

This wasn't a typical SaaS model growing linearly. When a product like Facebook Live goes global, it lands in every timezone, device, and network condition simultaneously. The system must perform across the globe in varying conditions, from rural to urban, and every day it gets pushed by new users, new behaviors, and new demands. Almost 1.23 billion daily active users formed the base load, and traffic patterns spike around cultural, regional, and global events.

Distributed Presence: POPs and DCs

To keep latency low and reliability high, Facebook uses a combination of Points of Presence (POPs) and Data Centers (DCs).
This architecture allows for regional isolation and graceful degradation: if one POP goes down, others can pick up the slack without a central failure.

Scaling Challenges That Break Things

Some of the key scaling challenges Facebook faced, most notably the thundering-herd problem when a stream suddenly goes viral, are covered in the sections that follow.
Live Video Architecture

Streaming video live is about managing flow across an unpredictable, global network. Every live session kicks off a chain reaction across infrastructure components built to handle speed, scale, and chaos. Facebook Live's architecture reflects this need for real-time resilience.

Live streams originate from a broad set of sources, from phones running the Facebook app to professional broadcast setups.
These clients create RTMPS (Real-Time Messaging Protocol Secure) streams. RTMPS carries the video payload with low latency and encryption, making it viable for both casual streamers and production-level events.

Points of Presence (POPs)

POPs act as the first entry point into Facebook's video pipeline. They're regional clusters of servers optimized for terminating client connections close to the broadcaster and forwarding streams onward to the data centers.
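For a concrete feel of the contribution side, this is roughly how a broadcaster tool can publish an RTMPS stream; the ingest URL and stream key below are made up, not Facebook's real endpoints:

```python
import subprocess

# Hypothetical ingest endpoint and stream key; real values come from the
# platform's "Go Live" flow or its live-streaming API.
INGEST_URL = "rtmps://live-ingest.example.com:443/rtmp/STREAM-KEY"

# Read a local file at its native frame rate (-re) as if it were a live feed,
# encode to H.264/AAC, and wrap it in FLV, the container RTMP(S) carries.
subprocess.run(
    ["ffmpeg", "-re", "-i", "input.mp4",
     "-c:v", "libx264", "-preset", "veryfast",
     "-c:a", "aac", "-b:a", "128k",
     "-f", "flv", INGEST_URL],
    check=True,
)
```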
Each POP is tuned to handle a high volume of simultaneous connections, and it quickly routes streams to encoding hosts using consistent hashing to distribute load evenly. See the diagram below.

Data Centers

Once a POP forwards a stream, the heavy lifting happens in a Facebook data center. This is where the encoding hosts transcode the incoming stream into multiple bitrates, package it into segments and manifests, and hand it off for caching and delivery.
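Returning to the routing step at the POP: consistent hashing keeps the stream-to-host mapping stable even as encoding hosts are added or removed. A generic hash-ring sketch (not Facebook's implementation) looks like this:

```python
import bisect
import hashlib


class HashRing:
    """A basic consistent-hash ring with virtual nodes for smoother load spread."""

    def __init__(self, hosts: list[str], vnodes: int = 100):
        points = []
        for host in hosts:
            for i in range(vnodes):
                points.append((self._hash(f"{host}#{i}"), host))
        points.sort()
        self._keys = [k for k, _ in points]
        self._hosts = [h for _, h in points]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def host_for(self, stream_id: str) -> str:
        """Walk clockwise from the stream's hash to the first host on the ring."""
        idx = bisect.bisect(self._keys, self._hash(stream_id)) % len(self._keys)
        return self._hosts[idx]


ring = HashRing(["encoder-1", "encoder-2", "encoder-3"])
print(ring.host_for("live-stream-8675309"))  # the same stream always maps to the same encoder
```

Because a given stream ID always lands on the same encoder, all of its segments are produced in one place, and removing a host only remaps the streams that hashed to it.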
Each data center operates like a mini CDN node, tailored to Facebook's specific needs and traffic patterns.

Caching and Distribution

Live video puts pressure on distribution in ways that on-demand video doesn't. With pre-recorded content, everything is cacheable ahead of time. In a live stream, the content is being created while it's being consumed, which shifts the burden from storage to coordination. Facebook's answer was a caching strategy designed for exactly this: a two-tier model, with caches in the POPs close to viewers backed by caches in the data centers that sit in front of the encoding hosts.
This separation allows independent scaling and regional flexibility. As more viewers connect from a region, the corresponding POP scales up, caching hot content locally and shielding central systems.

Managing the Thundering Herd

The first time a stream goes viral, hundreds or thousands of clients might request the same manifest or segment at once. If all those requests hit the data center directly, the system gets into trouble. To prevent that, Facebook uses cache-blocking timeouts: on a cache miss, only the first request is forwarded to the origin, while the requests behind it wait for the cache to be filled or for a timeout to expire.
The balance is tricky: set the timeout too long and viewers wait on slow fetches; set it too short and the waiting requests spill over to the origin anyway, recreating the herd the mechanism was meant to prevent.
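The article names the mechanism but not its code, so here is a generic request-coalescing sketch under those assumptions: the first miss for a key fetches from the origin while later requests for the same key wait, up to a timeout.

```python
import threading


class CoalescingCache:
    """Cache where concurrent misses for the same key wait for one origin fetch."""

    def __init__(self, fetch_from_origin, block_timeout: float = 0.5):
        self._fetch = fetch_from_origin          # callable: key -> value
        self._timeout = block_timeout            # how long followers wait on the leader
        self._values = {}
        self._inflight: dict[str, threading.Event] = {}
        self._lock = threading.Lock()

    def get(self, key: str):
        with self._lock:
            if key in self._values:
                return self._values[key]          # cache hit
            event = self._inflight.get(key)
            if event is None:                     # first miss: become the leader
                event = threading.Event()
                self._inflight[key] = event
                leader = True
            else:
                leader = False                    # someone else is already fetching

        if leader:
            value = self._fetch(key)              # the single request that reaches the origin
            with self._lock:
                self._values[key] = value
                self._inflight.pop(key, None)
            event.set()                           # wake up the waiters
            return value

        # Followers block briefly; if the leader is too slow they fall through
        # to the origin themselves -- the trade-off described above.
        if event.wait(self._timeout):
            with self._lock:
                if key in self._values:
                    return self._values[key]
        return self._fetch(key)
```

The timeout passed to `wait()` is exactly the knob the trade-off above is about.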
Keeping Manifests Fresh

Live streams rely on manifests: a table of contents that lists the available segments. Keeping these up to date is crucial for smooth playback. Facebook uses two techniques: a TTL on the cached manifest, so edge caches re-fetch it once the TTL expires, and HTTP push, where updated manifests are pushed out to the caches as soon as new segments are ready.
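A minimal sketch of the TTL side (the numbers are illustrative, and this is not Facebook's edge code); the push variant would instead have the origin write the updated manifest into the cache the moment a new segment lands:

```python
import time

MANIFEST_TTL = 2.0  # seconds an edge cache may serve a manifest without re-fetching (illustrative)


class ManifestCache:
    """Serve a cached manifest until its TTL expires, then re-fetch from origin."""

    def __init__(self, fetch_manifest):
        self._fetch = fetch_manifest   # callable: stream_id -> manifest text
        self._cache = {}               # stream_id -> (manifest, fetched_at)

    def get(self, stream_id: str) -> str:
        entry = self._cache.get(stream_id)
        if entry and time.monotonic() - entry[1] < MANIFEST_TTL:
            return entry[0]                          # still fresh enough to serve
        manifest = self._fetch(stream_id)            # TTL expired: go back to origin
        self._cache[stream_id] = (manifest, time.monotonic())
        return manifest
```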
HTTP Push is preferable when tight latency matters, especially for streams with high interaction or fast-paced content. TTL is simpler but comes with trade-offs in freshness and efficiency.

Live Video Playback

Live playback is about consistency, speed, and adaptability across networks that don't care about user experience. Facebook's live playback pipeline turns a firehose of real-time video into a sequence of reliable HTTP requests, and DASH is the backbone that makes that work.

DASH (Dynamic Adaptive Streaming over HTTP)

DASH breaks live video into two components: a manifest that describes the available segments, and the media segments themselves, each holding a few seconds of video.
The manifest evolves as the stream continues. New entries are appended, old ones fall off, and clients keep polling to see what’s next. This creates a rolling window, typically a few minutes long, that defines what’s currently watchable.
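A toy model of that rolling window; segment naming, duration, and window size are invented here, and a real DASH manifest is an XML MPD document rather than a Python object:

```python
from collections import deque

SEGMENT_SECONDS = 2      # each media segment covers ~2 seconds (illustrative)
WINDOW_SEGMENTS = 90     # keep ~3 minutes of segments listed (illustrative)


class LiveManifest:
    """Rolling list of the segments a player is currently allowed to request."""

    def __init__(self):
        self._segments: deque[str] = deque(maxlen=WINDOW_SEGMENTS)
        self._next_index = 0

    def append_segment(self) -> None:
        # The packager calls this each time a new segment is finished; the
        # deque automatically drops the oldest entry once the window is full.
        self._segments.append(f"segment_{self._next_index}.m4s")
        self._next_index += 1

    def render(self) -> str:
        # Players poll this repeatedly to learn what is watchable right now.
        return "\n".join(self._segments)
```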
This model works because it's simple, stateless, and cache-friendly. And when done right, it delivers video with sub-second delay and high reliability.

Where POPs Come In

Playback clients don't talk to data centers directly. Instead, they go through POPs: edge servers deployed around the world.
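Putting the tiers together, the playback path can be pictured roughly like this (hypothetical names; the real system layers the cache-blocking behavior shown earlier on top of both tiers):

```python
def serve_segment(pop_cache: dict, dc_cache: dict, fetch_from_encoder, url: str) -> bytes:
    """Resolve a playback request: POP cache first, then the data center, then origin."""
    if url in pop_cache:
        return pop_cache[url]               # served at the edge, closest to the viewer

    if url in dc_cache:
        data = dc_cache[url]                # served by the data-center tier
    else:
        data = fetch_from_encoder(url)      # true miss: only now touch the encoding host
        dc_cache[url] = data

    pop_cache[url] = data                   # warm the edge for the next viewer in this region
    return data
```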
This two-tier caching model (POPs and DCs) keeps things fast and scalable: the POPs absorb most viewer requests close to where they originate, while the data-center caches protect the encoding and storage systems from whatever slips through.
Conclusion

Facebook Live didn't reach a billion users by accident. It got there through deliberate, pragmatic engineering, with an architecture designed to survive chaos in production. The story begins with a clock stream on a rooftop, but it quickly shifts to decisions made under pressure: picking RTMP because it worked, chunking uploads to survive flaky networks, and caching manifests to sidestep thundering herds.

A few lessons cut through all the technical layers: build composable systems with predictable patterns, treat peak traffic as the baseline rather than the exception, and shield the origin from the crowd whenever a stream suddenly becomes popular.
References: