Using Chrome’s accessibility APIs to find security bugs
						Posted by Adrian Taylor, Security Engineer, Chrome
    
      
Chrome’s user interface (UI) code is complex, and sometimes has bugs.
Are those bugs security bugs? Specifically, if a user’s clicks and actions result in memory corruption, is that something that an attacker can exploit to harm that user?
Our security severity guidelines say “yes, sometimes.” For example, an attacker could very likely convince a user to click an autofill prompt, but it will be much harder to convince the user to step through a whole flow of different dialogs.
Even if these bugs aren’t the most easily exploitable, it takes a great deal of time for our security shepherds to make these determinations. User interface bugs are often flakey (that is, not reliably reproducible). Also, even if these bugs aren’t necessarily deemed to be exploitable, they may still be annoying crashes which bother the user.
It would be great if we could find these bugs automatically.
If only the whole tree of Chrome UI controls were exposed, somehow, such that we could enumerate and interact with each UI control automatically.
Aha! Chrome exposes all the UI controls to assistive technology. Chrome goes to great lengths to ensure its entire UI is exposed to screen readers, braille devices and other such assistive tech. This tree of controls includes all the toolbars, menus, and the structure of the page itself. This structural definition of the browser user interface is already sometimes used in other contexts, for example by some password managers, demonstrating that investing in accessibility has benefits for all users. We’re now taking that investment and leveraging it to find security bugs, too.
Specifically, we’re now “fuzzing” that accessibility tree - that is, interacting with the different UI controls semi-randomly to see if we can make things crash. This technique has a long pedigree.
Screen reader technology is a bit different on each platform, but on Linux the tree can be explored using Accerciser.
Screenshot of Accerciser showing the tree of UI controls in Chrome
All we have to do is explore the same tree of controls with a fuzzer. How hard can it be?
“We do this not because it is easy, but because we thought it would be easy” - Anon.
Actually we never thought this would be easy, and a few different bits of tech have had to fall into place to make this possible. Specifically,
There are lots of combinations of ways to interact with Chrome. Truly randomly clicking on UI controls probably won’t find bugs - we would like to leverage coverage-guided fuzzing to help the fuzzer select combinations of controls that seem to reach into new code within Chrome.
We need any such bugs to be genuine. We therefore need to fuzz the actual Chrome UI, or something very similar, rather than exercising parts of the code in an unrealistic unit-test-like context. That’s where our InProcessFuzzer framework comes into play - it runs fuzz cases within a Chrome browser_test; essentially a real version of Chrome.
But such browser_tests have a high startup cost. We need to amortize that cost over thousands of test cases by running a batch of them within each browser invocation. Centipede is designed to do that.
But each test case won’t be idempotent. Within a given invocation of the browser, the UI state may be successively modified by each test case. We intend to add concatenation to centipede to resolve this.
Chrome is a noisy environment with lots of timers, which may well confuse coverage-guided fuzzers. Gathering coverage for such a large binary is slow in itself. So, we don’t know if coverage-guided fuzzing will successfully explore the UI paths here.
All of these concerns are common to the other fuzzers which run in the browser_test context, most notably our new IPC fuzzer (blog posts to follow). But the UI fuzzer presented some specific challenges.
Finding UI bugs is only useful if they’re actionable. Ideally, that means:
Our fuzzing infrastructure gives a thorough set of diagnostics.
It can bisect to find when the bug was introduced and when it was fixed.
It can minimize complex test cases into the smallest possible reproducer.
The test case is descriptive and says which UI controls were used, so a human may be able to reproduce it.
These requirements together mean that the test cases should be stable across each Chrome version - if a given test case reproduces a bug with Chrome 125, hopefully it will do so in Chrome 124 and Chrome 126 (assuming the bug is present in both). Yet this is tricky, since Chrome UI controls are deeply nested and often anonymous.
Initially, the fuzzer picked controls simply based on their ordinal at each level of the tree (for instance “control 3 nested in control 5 nested in control 0”) but such test cases are unlikely to be stable as the Chrome UI evolves. Instead, we settled on an approach where the controls are named, when possible, and otherwise identified by a combination of role and ordinal. This yields test cases like this:
action {
path_to_control {
named {
name: "Test - Chromium"
}
}
path_to_control {
anonymous {
role: "panel"
}
}
path_to_control {
anonymous {
role: "panel"
}
}
path_to_control {
anonymous {
role: "panel"
}
}
path_to_control {
named {
name: "Bookmarks"
}
}
take_action {
action_id: 12
}
}
  
Fuzzers are unlikely to stumble across these control names by chance, even with the instrumentation applied to string comparisons. In fact, this by-name approach turned out to be only 20% as effective as picking controls by ordinal. To resolve this we added a custom mutator which is smart enough to put in place control names and roles which are known to exist. We randomly use this mutator or the standard libprotobuf-mutator in order to get the best of both worlds. This approach has proven to be about 80% as quick as the original ordinal-based mutator, while providing stable test cases.
Chart of code coverage achieved by minutes fuzzing with different strategies
So, does any of this work?
We don’t know yet! - and you can follow along as we find out. The fuzzer found a couple of potential bugs (currently access restricted) in the accessibility code itself but hasn’t yet explored far enough to discover bugs in Chrome’s fundamental UI. But, at the time of writing, this has only been running on our ClusterFuzz infrastructure for a few hours, and isn’t yet working on our coverage dashboard. If you’d like to follow along, keep an eye on our coverage dashboard as it expands to cover UI code.
					
											
					
			Chrome’s user interface (UI) code is complex, and sometimes has bugs.
Are those bugs security bugs? Specifically, if a user’s clicks and actions result in memory corruption, is that something that an attacker can exploit to harm that user?
Our security severity guidelines say “yes, sometimes.” For example, an attacker could very likely convince a user to click an autofill prompt, but it will be much harder to convince the user to step through a whole flow of different dialogs.
Even if these bugs aren’t the most easily exploitable, it takes a great deal of time for our security shepherds to make these determinations. User interface bugs are often flakey (that is, not reliably reproducible). Also, even if these bugs aren’t necessarily deemed to be exploitable, they may still be annoying crashes which bother the user.
It would be great if we could find these bugs automatically.
If only the whole tree of Chrome UI controls were exposed, somehow, such that we could enumerate and interact with each UI control automatically.
Aha! Chrome exposes all the UI controls to assistive technology. Chrome goes to great lengths to ensure its entire UI is exposed to screen readers, braille devices and other such assistive tech. This tree of controls includes all the toolbars, menus, and the structure of the page itself. This structural definition of the browser user interface is already sometimes used in other contexts, for example by some password managers, demonstrating that investing in accessibility has benefits for all users. We’re now taking that investment and leveraging it to find security bugs, too.
Specifically, we’re now “fuzzing” that accessibility tree - that is, interacting with the different UI controls semi-randomly to see if we can make things crash. This technique has a long pedigree.
Screen reader technology is a bit different on each platform, but on Linux the tree can be explored using Accerciser.
Screenshot of Accerciser showing the tree of UI controls in Chrome
All we have to do is explore the same tree of controls with a fuzzer. How hard can it be?
“We do this not because it is easy, but because we thought it would be easy” - Anon.
Actually we never thought this would be easy, and a few different bits of tech have had to fall into place to make this possible. Specifically,
There are lots of combinations of ways to interact with Chrome. Truly randomly clicking on UI controls probably won’t find bugs - we would like to leverage coverage-guided fuzzing to help the fuzzer select combinations of controls that seem to reach into new code within Chrome.
We need any such bugs to be genuine. We therefore need to fuzz the actual Chrome UI, or something very similar, rather than exercising parts of the code in an unrealistic unit-test-like context. That’s where our InProcessFuzzer framework comes into play - it runs fuzz cases within a Chrome browser_test; essentially a real version of Chrome.
But such browser_tests have a high startup cost. We need to amortize that cost over thousands of test cases by running a batch of them within each browser invocation. Centipede is designed to do that.
But each test case won’t be idempotent. Within a given invocation of the browser, the UI state may be successively modified by each test case. We intend to add concatenation to centipede to resolve this.
Chrome is a noisy environment with lots of timers, which may well confuse coverage-guided fuzzers. Gathering coverage for such a large binary is slow in itself. So, we don’t know if coverage-guided fuzzing will successfully explore the UI paths here.
All of these concerns are common to the other fuzzers which run in the browser_test context, most notably our new IPC fuzzer (blog posts to follow). But the UI fuzzer presented some specific challenges.
Finding UI bugs is only useful if they’re actionable. Ideally, that means:
Our fuzzing infrastructure gives a thorough set of diagnostics.
It can bisect to find when the bug was introduced and when it was fixed.
It can minimize complex test cases into the smallest possible reproducer.
The test case is descriptive and says which UI controls were used, so a human may be able to reproduce it.
These requirements together mean that the test cases should be stable across each Chrome version - if a given test case reproduces a bug with Chrome 125, hopefully it will do so in Chrome 124 and Chrome 126 (assuming the bug is present in both). Yet this is tricky, since Chrome UI controls are deeply nested and often anonymous.
Initially, the fuzzer picked controls simply based on their ordinal at each level of the tree (for instance “control 3 nested in control 5 nested in control 0”) but such test cases are unlikely to be stable as the Chrome UI evolves. Instead, we settled on an approach where the controls are named, when possible, and otherwise identified by a combination of role and ordinal. This yields test cases like this:
action {
path_to_control {
named {
name: "Test - Chromium"
}
}
path_to_control {
anonymous {
role: "panel"
}
}
path_to_control {
anonymous {
role: "panel"
}
}
path_to_control {
anonymous {
role: "panel"
}
}
path_to_control {
named {
name: "Bookmarks"
}
}
take_action {
action_id: 12
}
}
Fuzzers are unlikely to stumble across these control names by chance, even with the instrumentation applied to string comparisons. In fact, this by-name approach turned out to be only 20% as effective as picking controls by ordinal. To resolve this we added a custom mutator which is smart enough to put in place control names and roles which are known to exist. We randomly use this mutator or the standard libprotobuf-mutator in order to get the best of both worlds. This approach has proven to be about 80% as quick as the original ordinal-based mutator, while providing stable test cases.
Chart of code coverage achieved by minutes fuzzing with different strategies
So, does any of this work?
We don’t know yet! - and you can follow along as we find out. The fuzzer found a couple of potential bugs (currently access restricted) in the accessibility code itself but hasn’t yet explored far enough to discover bugs in Chrome’s fundamental UI. But, at the time of writing, this has only been running on our ClusterFuzz infrastructure for a few hours, and isn’t yet working on our coverage dashboard. If you’d like to follow along, keep an eye on our coverage dashboard as it expands to cover UI code.
							10
				Oct
						2024
		
		
		