Automated Bug Fixing at Facebook Scale

If there’s one thing that a majority of developers truly hate, it’s debugging. 

While debugging small programs isn’t fun, it can get incredibly irritating when you have to debug millions of lines of code on a Friday evening to find that elusive bug.

To make things worse, bugs (software or otherwise) are tenacious. 

You get rid of one and two more show up. Just when you think you’ve finally fixed the issue and started testing things out, you realize that the patch you just made is causing another crash some other place within those million lines.

Before you know it, you are trudging your way through another bug hunt.

This is where SapFix comes in, positioned as a game-changing tool for automated bug fixing.

It’s a new AI hybrid tool created by Facebook with the goal of reducing the time engineers spend on debugging.

SapFix makes debugging easy by automatically generating fixes for specific issues and proposing those fixes to engineers for approval and deployment to production.

The diagram below shows the SapFix workflow at a high level. A later section walks through the entire process in more detail.

It would be an understatement to say that SapFix has shown promise. Here are some facts worth considering:

  • SapFix has been used to suggest fixes for six key Android apps in the Facebook App family.

  • The apps are Facebook, Messenger, Instagram, FBLite, Workplace and Workchat.

  • Together, these apps consist of tens of millions of lines of code and are used daily by hundreds of millions of users worldwide.

If you think about it, those are six multi-million-line codebases, and SapFix is still in early development!

At this point, you might wonder how SapFix is able to generate fixes for so many diverse apps with wildly different uses ranging from communication to social media to building communities.

The Role of Sapienz and Infer

The secret sauce of SapFix is the adoption of automated program repair techniques.

These techniques are based on algorithms to identify, analyze and patch known software bugs without human intervention. One of the widely used approaches relies on software testing to direct the repair process. 

This is where Facebook leverages its automated test case design system known as Sapienz.

Sapienz uses Search-based Software Engineering (SBSE) to automatically design system-level test cases for mobile apps. Executing those test cases allows Sapienz to find hundreds of crashes per month, even before Facebook’s internal human testers can discover them.

Think of SBSE as a tireless helper that treats test design as a search problem: it tries many different sequences of user actions, keeps the ones that explore more of the app or trigger a crash, and refines them further. It's a lot like trying different pieces of a puzzle until they fit just right.
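
To make the search idea concrete, here is a minimal sketch of search-based test design. Everything in it is hypothetical: the toy app, the event names, the fitness function, and the simple hill-climbing loop are illustrative assumptions, and Sapienz's real search is considerably more sophisticated.

```python
import random

# Hypothetical UI events for a toy app (not real Sapienz inputs).
EVENTS = ["tap_login", "type_text", "rotate", "back", "open_menu"]


def run_app(sequence):
    """Toy app under test: returns (covered behaviours, crashed?)."""
    covered = set()
    menu_open = False
    for event in sequence:
        covered.add(event)
        if event == "open_menu":
            menu_open = True
        elif event == "back":
            menu_open = False
        elif event == "rotate" and menu_open:
            # Seeded bug: rotating while the menu is open crashes the app.
            return covered, True
    return covered, False


def fitness(sequence):
    """Prefer crashing tests, then higher coverage, then shorter sequences."""
    covered, crashed = run_app(sequence)
    return (crashed, len(covered), -len(sequence))


def mutate(sequence, rng):
    """Add, remove, or replace one event at a random position."""
    seq = list(sequence)
    action = rng.choice(["add", "remove", "replace"])
    if action == "add" or not seq:
        seq.insert(rng.randrange(len(seq) + 1), rng.choice(EVENTS))
    elif action == "remove":
        seq.pop(rng.randrange(len(seq)))
    else:
        seq[rng.randrange(len(seq))] = rng.choice(EVENTS)
    return seq


def search(generations=2000, seed=0):
    """Hill climbing: keep a candidate only if it is strictly fitter."""
    rng = random.Random(seed)
    best = [rng.choice(EVENTS)]
    for _ in range(generations):
        candidate = mutate(best, rng)
        if fitness(candidate) > fitness(best):
            best = candidate
    return best


best_test = search()
print(best_test, run_app(best_test)[1])
```

The search never looks at the app's source code. It only observes what each event sequence does and steers toward sequences that cover more behaviour or crash the app.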

As an estimate, Facebook’s engineers have been able to fix 75% of crashes reported by Sapienz. This indicates a very high signal-to-noise ratio for bug reports generated by Sapienz.

However, to improve this figure even further, Facebook also uses Infer.

Infer is an open-source tool that helps with localization and static analysis of the fixes proposed. Like Sapienz, Infer is also deployed directly onto Facebook’s internal continuous integration system and has access to the majority of Facebook’s code base. 

Sapienz and Infer collaborate with each other to provide information to developers about potential bugs such as:

  • Localization of the likely root cause

  • The failing test scenario that helped identify the bug

However, Sapienz and Infer only provide information; they do not save the developer the time it takes to actually fix the issue. Their collaboration helps identify bugs and locate them within the code, but most of the work of fixing those bugs still falls to a developer.

This is where SapFix comes along and combines three important components to provide an end-to-end automated repair system:

  1. A mutation-based technique supported by patterns generated from previous human fixes

  2. The automated test design of Sapienz

  3. Infer’s static analysis and localization infrastructure

From picking the test cases that detect the crash to fixing the issue and re-testing, SapFix takes care of the entire process as part of Facebook’s continuous integration and deployment system.


The SapFix Workflow

How does SapFix actually work?

There are four types of fixes that are performed by SapFix:

  • Template Fix

  • Mutation Fix

  • Diff Revert

  • Partial Diff Revert

Below is a diagram that shows the entire workflow of how SapFix handles the process of fixing an issue based on these types.

At its core, the process is extremely simple to understand.

The fix creation process receives the following input:

  • The buggy revision and the blamed file that contains the crash location

  • The blamed line where the crash is believed to occur

  • Stack trace of the crash

  • The unique id of the crash 

  • Author of the buggy revision (the developer who made the Diff)

  • Buggy expressions provided by Infer (this is null when Infer data isn’t available)
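
The input listed above could be modelled as a single record. The sketch below is only an illustration: the class name, field names, and sample values are assumptions made for this example, not SapFix's actual data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FixRequest:
    """Hypothetical container for the input to fix creation."""
    buggy_revision: str          # revision that introduced the crash
    blamed_file: str             # file containing the crash location
    blamed_line: int             # line where the crash is believed to occur
    stack_trace: List[str]       # stack trace of the crash
    crash_id: str                # unique id of the crash
    author: str                  # developer who made the Diff
    # None when Infer data isn't available.
    buggy_expressions: Optional[List[str]] = None

    def has_infer_data(self) -> bool:
        return self.buggy_expressions is not None


# Made-up sample values, purely for illustration.
request = FixRequest(
    buggy_revision="rev-123",
    blamed_file="ProfileView.java",
    blamed_line=42,
    stack_trace=["java.lang.NullPointerException", "at ProfileView.show"],
    crash_id="crash-001",
    author="alice",
)
print(request.has_infer_data())
```

Keeping the Infer field optional mirrors the note above: fix creation has to work whether or not Infer supplied buggy expressions.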

Based on this input, SapFix goes ahead and generates a list of revisions that can fix the crash. This list is created after SapFix has tested those revisions thoroughly.

From the input to output, there are several steps involved:

  1. Developers submit changes (called ‘Diffs’) to be reviewed using Phabricator (Facebook’s continuous integration system)

  2. SapFix uses Sapienz to select a few test cases to execute on each Diff submitted for review.

  3. When Sapienz traces a specific crash to a given Diff, SapFix establishes the priority of fix types (template, mutation, revert, etc.).

  4. SapFix proceeds to try and generate multiple potential fixes per bug and then evaluates their quality.

  5. To do so, it runs existing, developer-written tests along with tests created by Sapienz on the patched builds. This validation process is autonomous and isolated from the larger codebase.

  6. In essence, SapFix debugs the codebase much as developers do today, in the trial-and-error spirit of the puzzle analogy from earlier. Unlike developers, however, SapFix cannot deploy a fix to production on its own.

  7. Once the patches are tested, SapFix selects one of the candidate patches and requests a human reviewer to review the change through the Phabricator Code Review system. The reviewer is chosen to be the software engineer who actually submitted the Diff that SapFix attempted to fix.

  8. This is the engineer who most likely has the best technical context to evaluate the patch. However, other relevant engineers are also subscribed to each Diff based on Facebook’s code review standards. What this means is that all Diffs proposed by SapFix are guaranteed to have at least one qualified human reviewer.
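
The steps above can be sketched as a small pipeline. This is a simplified illustration under stated assumptions: the names `Patch`, `ReviewRequest`, and `propose_fix` are hypothetical, and "take the first passing patch from the highest-priority strategy" is an assumed selection rule, since the article only says that SapFix selects one candidate.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Patch:
    strategy: str   # e.g. "template", "mutation", "revert"
    summary: str


@dataclass(frozen=True)
class ReviewRequest:
    patch: Patch
    reviewer: str   # the engineer who submitted the crashing Diff


# A strategy is a name plus a function that generates candidate patches.
Strategy = Tuple[str, Callable[[], List[Patch]]]


def propose_fix(
    diff_author: str,
    strategies: Iterable[Strategy],
    passes_all_tests: Callable[[Patch], bool],
) -> Optional[ReviewRequest]:
    """Try strategies in priority order; return the first validated patch."""
    for _name, generate in strategies:
        passing = [p for p in generate() if passes_all_tests(p)]
        if passing:
            # Nothing is deployed here: the patch goes to a human reviewer.
            return ReviewRequest(patch=passing[0], reviewer=diff_author)
    return None  # no candidate survived validation


strategies = [
    ("template", lambda: [Patch("template", "apply a known fix pattern")]),
    ("mutation", lambda: [Patch("mutation", "guard the blamed statement")]),
    ("revert", lambda: [Patch("revert", "revert the whole Diff")]),
]

# Pretend the template patch fails a test, so the mutation patch is chosen.
request = propose_fix("alice", strategies, lambda p: p.strategy != "template")
print(request)
```

The function returns a review request rather than applying anything, which reflects the point made in the steps: a human reviewer remains the gatekeeper for every proposed patch.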

The above flow may appear simple, but there are some additional nuances to it, and understanding those makes things clearer.

Template Fix and Mutation Fix

As the names suggest, the template fix and mutation fix strategies produce candidate patches from fix templates and from code mutations, respectively.

Template-based fixes are favored when all other parameters are equal.

But where do these templates come from?

Template fixes come from another tool known as Getafix that generates patches similar to the ones human developers produced in the past. From the perspective of SapFix, Getafix is a black box that contains a bunch of template fix patterns harvested from previous successful fixes.

As for the mutation fix strategy, SapFix currently supports only Null Pointer Exception (NPE) crashes. Facebook plans to cover more mutation strategies, but focusing on NPEs alone has already delivered a good amount of success.
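
As an illustration of what a mutation-style NPE fix might look like, the sketch below wraps a blamed Java statement in a null guard. The article does not spell out which mutations SapFix applies, so the null guard, the function name `guard_null`, and the sample Java code are all assumptions made for this example.

```python
from typing import List


def guard_null(source_lines: List[str], blamed_line: int, expression: str) -> List[str]:
    """Wrap the statement on `blamed_line` (1-indexed) in a null check.

    `expression` is the value suspected of being null, for example a
    buggy expression reported by a static analyzer.
    """
    idx = blamed_line - 1
    statement = source_lines[idx]
    indent = statement[: len(statement) - len(statement.lstrip())]
    patched = list(source_lines)
    # Replace the single blamed line with a guarded three-line block.
    patched[idx : idx + 1] = [
        f"{indent}if ({expression} != null) {{",
        f"{indent}    {statement.strip()}",
        f"{indent}}}",
    ]
    return patched


# Hypothetical Java method that crashes when `user` is null.
java_source = [
    "void show(User user) {",
    "    label.setText(user.getName());",
    "}",
]

for line in guard_null(java_source, blamed_line=2, expression="user"):
    print(line)
```

A patch like this stops the crash but may hide the real reason `user` was null, so it still has to pass the test suite and a human review before it can land.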

High Firing Crashes

If neither template-based nor mutation-based strategies produce a patch that passes all tests, SapFix attempts to revert Diffs that result in high-firing crashes.

A high-firing crash is one that occurs frequently or affects a large number of users.

There are a couple of reasons for reverting the diff instead of trying to patch:

  • High-firing crashes can block Sapienz and other testing technologies. Therefore, it’s important to delete or revert them from the master build as soon as possible.

  • High-firing bugs have a higher potential impact on the stability and reliability of the application.

The revert strategies (full and partial) undo the change made in the Diff. In practice, reverting can mean deleting, adding, or replacing code in the current version of the system, depending on what the Diff changed.

Between the two types of revert strategies, SapFix generally prefers full diff revert because partial diff revert has a higher probability of knock-on adverse effects. 

However, new Diffs are generated every few seconds and full diff reverts can also fail due to merge conflicts with other revisions. In those cases, SapFix attempts to go for partial diff revert since the changes produced are smaller and less prone to merge conflicts.
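
The preference for a full revert, with a partial revert as the fallback, could be sketched as follows. This rests on assumptions: a Diff is modelled as a list of per-file hunks, and a partial revert is taken to mean reverting only the hunks in the blamed file, which the article does not specify. The names `choose_revert` and `conflicts` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Hunk = Dict[str, str]  # e.g. {"file": "Feed.java", "change": "..."}


def choose_revert(
    hunks: List[Hunk],
    blamed_file: str,
    conflicts: Callable[[Hunk], bool],
) -> Optional[Tuple[str, List[Hunk]]]:
    """Prefer a full revert; fall back to a partial one on merge conflicts."""
    # A full revert is only possible when no hunk conflicts with newer revisions.
    if not any(conflicts(h) for h in hunks):
        return ("full", list(hunks))
    # Assumed meaning of "partial": revert only conflict-free hunks
    # that touch the blamed file.
    partial = [h for h in hunks if h["file"] == blamed_file and not conflicts(h)]
    if partial:
        return ("partial", partial)
    return None  # neither kind of revert can be applied cleanly


diff = [
    {"file": "Feed.java", "change": "added crashing call"},
    {"file": "Util.java", "change": "renamed helper"},
]

# Util.java was changed again by a later revision, so a full revert conflicts.
kind, reverted = choose_revert(diff, "Feed.java", lambda h: h["file"] == "Util.java")
print(kind, [h["file"] for h in reverted])
```

Because the partial revert touches fewer hunks, it is less likely to collide with newer revisions, which matches the reasoning given above.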

SapFix Adoption Results

Over a three-month period after its adoption, SapFix tackled 57 crashes related to Null Pointer Exceptions (NPEs).

To handle these crashes, 165 patches were created (roughly half from template and half from mutation-based repair). Out of these 165 patches, 131 were correctly built and passed all tests. Finally, 55 were reported to the developers.
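
To put those figures in proportion, here is a small calculation of the patch funnel. The four counts come from the numbers quoted above; the ratios are simply derived from them, and the variable names are chosen for this example.

```python
crashes_tackled = 57
patches_created = 165
patches_passing = 131    # built correctly and passed all tests
patches_reported = 55    # reported to the developers


def pct(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


patches_per_crash = patches_created / crashes_tackled
pass_rate = pct(patches_passing, patches_created)
report_rate = pct(patches_reported, patches_passing)
overall_rate = pct(patches_reported, patches_created)

print(f"patches per crash: {patches_per_crash:.1f}")
print(f"built and passed tests: {pass_rate:.1f}%")
print(f"reported, of those passing: {report_rate:.1f}%")
print(f"reported, of all created: {overall_rate:.1f}%")
```

In other words, SapFix produced nearly three candidate patches per crash, about 79% of them built and passed all tests, and roughly one in three of all generated patches was reported to a developer.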

Also, initial reactions from the developers were quite positive. When going through the very first SapFix-proposed patch, the developers had the feeling of “living in the future”.

However, the time taken to generate a fix presented a slightly different issue. 

The median time from fault detection to publishing a fix to the developer came out to be 69 minutes. The worst case was approximately 1.5 hours and the fastest one was 37 minutes after the crash was first detected.

As you can see, the range of observed values is fairly wide.

The main reason for this is the computational complexity of fixing an issue and the variation in workloads on the CI/CD system. 

Since SapFix is deployed in a highly parallel, asynchronous environment, the time from detection to publication is influenced by the current demand on the system and the availability of computing resources.

Lessons Learned from SapFix

Facebook’s main philosophy behind SapFix was to focus on the industrial deployment of an automated repair system rather than academic research. Therefore, most of the decisions were focused on this goal.

Though much remains to be done, Facebook also learned a lot of lessons from SapFix that they have shared.

Here are a few important ones:

  • End-to-end automated repair can work at an industrial scale

  • The role of developers as the final gatekeeper is critical to the success of SapFix. There is still a lot of work required to have automated oracles.

  • Reverting of diffs is useful for high-firing crashes in the master build of the system

  • SapFix works best with newly arising failures or crashes. With pre-existing crashes, the relevance is reduced because the developer reviewing the patch may not have sufficient context on the code.

  • Developer sociology is important to take into account. Despite SapFix providing ready-to-use patches, the developers may still prefer to clone and own the changes instead of simply landing the patches suggested by SapFix.

  • Developers showed a good amount of interest in interacting with the SapFix bot.

  • SapFix focuses more on removing the symptom than on addressing the root cause. More work is needed to identify the root causes of failures and fix them.

by "ByteByteGo" <bytebytego@substack.com> - 11:40 - 5 Mar 2024