How Uber Eats Handles Billions of Daily Search Queries
Disclaimer: The details in this post have been derived from the articles/videos shared online by the Uber Eats engineering team. All credit for the technical details goes to the Uber Eats Engineering Team. The links to the original articles and videos are present in the references section at the end of the post. We’ve attempted to analyze the details and provide our input about them. If you find any inaccuracies or omissions, please leave a comment, and we will do our best to fix them.

Uber Eats set out to increase the number of merchants available to users by a significant multiple, which the team referred to as nX growth. This wasn’t a simple matter of onboarding more restaurants. It meant expanding into new business lines like groceries, retail, and package delivery, each with its own scale and technical demands. To accommodate this, the search functionality needed to support this growth across all discovery surfaces:
The challenge wasn’t just to show more merchants. It was to do so without increasing latency, without compromising ranking quality, and without introducing inconsistency across surfaces. A few core problems made this difficult:
To support nX merchant growth, the team had to rethink multiple layers of the search stack, from ingestion and indexing to sharding, ranking, and query execution. In this article, we break down how Uber Eats rebuilt its search platform to handle this scale without degrading performance or relevance.

Search Architecture Overview

Uber Eats search is structured as a multi-stage pipeline, built to balance large-scale retrieval with precise, context-aware ranking. Each stage in the architecture has a specific focus, from document ingestion to final ranking. Scaling the system for millions of merchants and items means optimizing each layer without introducing bottlenecks downstream.

Ingestion and Indexing

The system ingests documents from multiple verticals (restaurants, groceries, retail) and turns them into searchable entities. There are two primary ingestion paths:
Retrieval Layer

The retrieval layer acts as the front line of the search experience. Its job is to fetch a broad set of relevant candidates for downstream rankers to evaluate.
First-Pass Ranking

Once the initial candidate set is retrieved, a lightweight ranking phase begins.
Hydration Layer

Before documents reach the second-pass ranker, they go through a hydration phase. Each document is populated with additional context: delivery ETAs, promotional offers, loyalty membership info, and store images. This ensures downstream components have all the information needed for ranking and display.

Second-Pass Ranking

This is where the heavier computation happens, evaluating business signals and user behavior.
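To make the shape of this multi-stage pipeline concrete, here is a minimal, illustrative sketch of how the stages could be chained together. The function names, signatures, and cutoffs are assumptions for illustration, not Uber Eats’ actual interfaces.

```python
# Illustrative end-to-end flow of a multi-stage search pipeline:
# retrieval -> first-pass ranking -> hydration -> second-pass ranking.
# All names, signatures, and cutoffs below are hypothetical.
from typing import Callable, Iterable

def search_pipeline(
    query: str,
    retrieve: Callable[[str], Iterable[dict]],      # broad candidate fetch from the index
    first_pass_score: Callable[[dict], float],      # cheap relevance score
    hydrate: Callable[[dict], dict],                # attach ETAs, offers, loyalty info, images
    second_pass_score: Callable[[dict], float],     # heavier business/behavior model
    first_pass_cutoff: int = 500,
    limit: int = 50,
) -> list:
    # 1. Retrieval: cast a wide net over the index.
    candidates = list(retrieve(query))

    # 2. First-pass ranking: lightweight scoring trims the candidate set.
    candidates.sort(key=first_pass_score, reverse=True)
    candidates = candidates[:first_pass_cutoff]

    # 3. Hydration: enrich the survivors with the context rankers and UI need.
    hydrated = [hydrate(doc) for doc in candidates]

    # 4. Second-pass ranking: heavier computation over the enriched documents.
    hydrated.sort(key=second_pass_score, reverse=True)
    return hydrated[:limit]
```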
The Query Scaling Challenge

Scaling search isn't just about fetching more documents. It's about knowing how far to push the system before performance breaks. At Uber Eats, the first attempt to increase selection by doubling the number of matched candidates from 200 to 400 seemed like a low-risk change. In practice, it triggered a 4X spike in P50 query latency and exposed deeper architectural flaws. The idea was straightforward: expand the candidate pool so that downstream rankers have more choices. More stores mean better recall. However, the cost wasn’t linear, for the following reasons:
Geo-sharding Mismatches

The geo-sharding strategy, built around delivery zones using hexagons, wasn't prepared for expanded retrieval scopes. As delivery radii increased, queries began touching more distant shards, many of which were optimized for different traffic patterns or data distributions. This led to inconsistent latencies and underutilized shards in low-traffic areas.

Pipeline Coordination Gaps

Ingestion and query layers weren’t fully aligned. The ingestion service categorized stores as “nearby” or “far” based on upstream heuristics, but these classifications didn’t carry over cleanly into the retrieval logic. As a result, rankers treated distant and local stores the same, skewing relevance scoring and increasing CPU time.

Geospatial Indexing with H3 and the Sharding Problem

Uber Eats search is inherently geospatial. Every query is grounded in a delivery address, and every result must answer a core question: Can this store deliver to this user quickly and reliably? To handle this, the system uses H3, Uber’s open-source hexagonal spatial index, to model the delivery world.

H3-Based Delivery Mapping

Each merchant’s delivery area is mapped using H3 hexagons:
This structure makes location-based lookups efficient. Given a user’s location, the system finds their H3 hexagon and retrieves all matching stores with minimal fanout.
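As a rough illustration of this lookup pattern, here is a minimal sketch using the open-source h3-py library (v3 API). The index layout, resolution, and ring size are assumptions for illustration, not Uber Eats’ actual configuration.

```python
# Minimal sketch of H3-based store lookup (illustrative only).
# Assumes the h3-py v3 API (geo_to_h3, k_ring); the resolution and the
# in-memory index are hypothetical stand-ins for the real search index.
import h3

RESOLUTION = 8  # assumed H3 resolution; finer resolutions mean smaller hexagons

# Hypothetical index: H3 cell -> store IDs whose delivery area covers that cell.
cell_to_stores = {}

def index_store(store_id, coverage_latlngs):
    """Map points from a store's delivery coverage onto H3 cells."""
    for lat, lng in coverage_latlngs:
        cell = h3.geo_to_h3(lat, lng, RESOLUTION)
        cell_to_stores.setdefault(cell, set()).add(store_id)

def stores_for_user(lat, lng, ring_size=1):
    """Find candidate stores with minimal fanout: look up the user's cell
    plus a small ring of neighboring cells."""
    user_cell = h3.geo_to_h3(lat, lng, RESOLUTION)
    candidates = set()
    for cell in h3.k_ring(user_cell, ring_size):
        candidates |= cell_to_stores.get(cell, set())
    return candidates
```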
Where Ingestion Fell Short

The problem wasn’t the mapping but the metadata. Upstream services were responsible for labeling stores as “close” or “far” at ingestion time. This binary categorization was passed downstream without actual delivery time (ETA) information. Once ingested, the ranking layer saw both close and far stores as equivalent. That broke relevance scoring in subtle but important ways. Consider this:
That lack of granularity meant distant but high-converting stores would often outrank nearby ones. Users saw popular chains from across the city instead of the closer, faster options they expected.

Sharding Techniques

Sharding determines how the system splits the global index across machines. A good sharding strategy keeps queries fast, data well-balanced, and hotspots under control. A bad one leads to overloaded nodes, inconsistent performance, and painful debugging sessions. Uber Eats search uses two primary sharding strategies: latitude sharding and hex sharding. Each has trade-offs depending on geography, query patterns, and document distribution.

Latitude Sharding

Latitude sharding divides the world into horizontal bands. Each band corresponds to a range of latitudes, and each range maps to a shard. The idea is simple: group nearby regions based on their vertical position on the globe. Shard assignment is computed offline using Spark. The process involves two steps:
To avoid boundary misses, buffer zones are added. Any store that falls near the edge of a shard is indexed in both neighboring shards. The buffer width is based on the maximum expected search radius, converted from kilometers into degrees of latitude (a rough sketch of this assignment appears below). The benefits of this approach are as follows:
The downsides are as follows:
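As referenced above, here is a minimal, illustrative sketch of latitude-band shard assignment with buffer zones. The band boundaries, buffer math, and names are assumptions; the production pipeline computes boundaries offline with Spark.

```python
# Illustrative latitude-band shard assignment with buffer zones.
# Band boundaries and helper names are hypothetical.

KM_PER_DEGREE_LAT = 111.0  # approximate kilometers per degree of latitude

def assign_latitude_shards(store_lat, band_boundaries, max_radius_km):
    """Return every shard a store should be indexed into.

    band_boundaries: sorted latitude cut points, e.g. [-60, -30, 0, 30, 60].
    A store within the buffer of a boundary is indexed into both neighbors,
    so queries near shard edges do not miss valid stores.
    """
    buffer_deg = max_radius_km / KM_PER_DEGREE_LAT
    shards = set()
    for shard_id in range(len(band_boundaries) + 1):
        lo = band_boundaries[shard_id - 1] if shard_id > 0 else -90.0
        hi = band_boundaries[shard_id] if shard_id < len(band_boundaries) else 90.0
        # Index into this shard if the store falls inside the (buffered) band.
        if lo - buffer_deg <= store_lat <= hi + buffer_deg:
            shards.add(shard_id)
    return shards
```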
Hex Sharding

To address the limitations of latitude sharding, Uber Eats also uses hex sharding, built directly on top of the H3 spatial index. Here’s how it works:
Buffer zones are handled similarly, but instead of latitude bands, buffer regions are defined as neighboring hexagons at a lower resolution. Any store near a hex boundary is indexed into multiple shards to avoid cutting off valid results; a rough sketch of this scheme appears below. The benefits are as follows:
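The following sketch illustrates hex-based shard assignment under the scheme described above, again assuming the h3-py v3 API. The resolutions, shard count, and hashing choice are assumptions for illustration only.

```python
# Illustrative hex-based shard assignment with neighbor buffers (h3-py v3 API).
# Resolutions, shard count, and hashing are hypothetical choices.
import h3

DELIVERY_RES = 8   # assumed fine resolution for delivery-area cells
SHARD_RES = 4      # assumed coarse resolution used to group cells into shards
NUM_SHARDS = 64    # assumed shard count

def shards_for_store(delivery_cells):
    """Assign a store to shards based on the coarse hexes its delivery area
    touches; the immediate neighbors of each coarse hex act as buffer zones,
    so stores near a hex boundary land in multiple shards."""
    shard_ids = set()
    for cell in delivery_cells:
        coarse = h3.h3_to_parent(cell, SHARD_RES)
        for hex_id in h3.k_ring(coarse, 1):  # coarse hex plus its neighbors
            # Stable hash: H3 indexes are hexadecimal strings in v3.
            shard_ids.add(int(hex_id, 16) % NUM_SHARDS)
    return shard_ids
```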
As a takeaway, latitude sharding works well when shard traffic needs to be spread across time zones, but it breaks down in high-density regions. Hex sharding offers more control and better balance, and it aligns naturally with the geospatial nature of delivery. Uber Eats uses both, but hex sharding has become the more scalable default, especially as selection grows and delivery radii expand.

Index Layout Optimizations

When search systems slow down, it’s tempting to look at algorithms, infrastructure, or sharding. But often, the bottleneck hides in a quieter place: how the documents are laid out in the index itself. At Uber Eats scale, index layout plays a critical role in both latency and system efficiency. The team optimized layouts differently for the restaurant (Eats) and grocery verticals based on query patterns, item density, and retrieval behavior.

Eats Index Layout

Restaurant queries typically involve users looking for either a known brand or a food type within a city, for example, “McDonald’s,” “pizza,” or “Thai near me.” The document layout reflects that intent. Documents are sorted as:
This works for the following reasons:
Grocery Index Layout

Grocery stores behave differently. A single store may list hundreds or thousands of items, and queries often target a specific product (“chicken,” “milk,” “pasta”) rather than a store. Here, the layout is:
This matters for the following reasons:
Performance Impact

The improvements from these indexing strategies were significant:
ETA-Aware Range Indexing

Delivery time matters. When users search on Uber Eats, they expect nearby options to show up first, not restaurants 30 minutes away that happen to rank higher for other reasons. But for a long time, the ranking layer couldn’t make that distinction. It knew which stores delivered to a given area, but not how long delivery would take, because the ingestion pipeline didn’t include ETA (Estimated Time of Delivery) information linking stores and hexagons. That meant:
This undermined both user expectations and conversion rates. A store that looks promising but takes too long to deliver creates a broken experience.

The Solution: Range-Based ETA Bucketing

To fix this, Uber Eats introduced ETA-aware range indexing. Instead of treating delivery zones as flat lists, the system:
This approach works for the following reasons:
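Here is a minimal, illustrative sketch of range-based ETA bucketing within a delivery hexagon. The bucket boundaries, data layout, and names are assumptions for illustration, not Uber Eats’ actual schema.

```python
# Illustrative ETA-aware range bucketing within a hexagon.
# Bucket boundaries, layout, and names are hypothetical.

ETA_BUCKETS_MIN = [(0, 10), (10, 20), (20, 30), (30, 45)]  # assumed ETA ranges in minutes

# Hypothetical index: H3 cell -> ETA bucket -> store IDs.
cell_eta_index = {}

def index_store_eta(cell, store_id, eta_minutes):
    """Place a store into the ETA bucket for a given delivery hexagon."""
    buckets = cell_eta_index.setdefault(cell, {b: set() for b in ETA_BUCKETS_MIN})
    for lo, hi in ETA_BUCKETS_MIN:
        if lo <= eta_minutes < hi:
            buckets[(lo, hi)].add(store_id)
            break

def candidates_by_eta(cell, limit):
    """Retrieve candidates nearest-first: walk buckets in increasing ETA order
    and stop once enough candidates have been collected."""
    results = []
    for bucket in ETA_BUCKETS_MIN:
        for store_id in cell_eta_index.get(cell, {}).get(bucket, ()):
            results.append((store_id, bucket))
            if len(results) >= limit:
                return results
    return results
```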
Conclusion

Scaling Uber Eats search to support nX merchant growth wasn’t a single optimization. It was a system-wide redesign. Latency issues, ranking mismatches, and capacity bottlenecks surfaced not because one layer failed, but because assumptions across indexing, sharding, and retrieval stopped holding under pressure. This effort highlighted a few enduring lessons that apply to any high-scale search or recommendation system:
References:
by "ByteByteGo" <bytebytego@substack.com> - 11:36 - 27 May 2025