Sonic Sleuthing

The sound of keystrokes captured during phone or video calls can be decoded by machine learning to reveal secrets with up to 95% accuracy.

Nick Bild
9 months ago • Security

Side channel attacks are a unique and sophisticated type of security threat that exploit unintended information leaks from a system during its regular operation. Unlike traditional attacks that directly target software or hardware vulnerabilities, side channel attacks exploit the observable behaviors of a system to infer sensitive information. This can include details about cryptographic keys, passwords, or other confidential data. These attacks work by analyzing seemingly harmless side channel information such as power consumption, electromagnetic emissions, and heat signatures.

These attacks are particularly concerning when it comes to user privacy because they can expose highly sensitive information without directly breaking encryption or authentication mechanisms. For example, an attacker could monitor the power consumption of a device while it's performing cryptographic operations and deduce the secret encryption key being used. This poses a significant threat to data confidentiality and privacy, as sensitive information that was thought to be well-protected could suddenly become vulnerable to exposure.

However, attacks that measure power consumption, the heat signature of the keys on a keyboard, and many similar signals require substantial access to the environment the targeted system is in, if not to the targeted system itself. For those trying to stay safe from malicious attackers, that is good news, because it makes it much easier to keep systems secure. Recent developments, though, may cast new doubts on the security of systems that were once considered to be beyond the reach of attackers.

A trio of engineers led by a researcher at Durham University in England has developed a method that makes it practical to determine what is being typed on a keyboard simply by listening to the sound that it makes. The audio can be acquired by the microphone of a smartphone near the target system. More concerning, their methods still work with a high degree of accuracy when that audio is captured via a phone call or Zoom video call, so no direct physical access to the location of the targeted system is required.

The exploit works by using a CoAtNet deep convolutional neural network to analyze spectrograms of audio recorded as keys are pressed on a keyboard. The model classifies these key presses to give a prediction as to which key was pressed to make that sound. The model was trained to recognize 36 keys (A-Z, 0-9) by capturing audio of them being pressed 25 times each. The presses were performed with varying pressure, and by different fingers, to help account for different cases that are likely to be encountered in real-world scenarios.
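To make the spectrogram step more concrete, here is a minimal sketch in plain Python. It is not the researchers' code, and it leaves out the CoAtNet classifier entirely. The synthetic "click", the sample rate, and the frame and hop lengths are all illustrative assumptions; only the 36 key labels and the 25 presses per key come from the description above.

```python
import cmath
import math
import string

# The 36 classes named in the article: A-Z followed by 0-9.
KEY_LABELS = list(string.ascii_uppercase + string.digits)
PRESSES_PER_KEY = 25  # training presses per key, as stated in the article


def hann(n):
    """Hann window of length n, used to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(samples, frame_len=64, hop=32):
    """Magnitude spectrogram via a naive DFT (frame sizes are assumptions).

    Returns a list of time frames; each frame holds the magnitudes of
    frequency bins 0 .. frame_len // 2.
    """
    window = hann(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = [samples[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def synthetic_click(num_samples=512, sample_rate=8000, freq=1000.0):
    """Hypothetical stand-in for a recorded key press: a decaying sine burst."""
    return [
        math.exp(-t / 100.0) * math.sin(2 * math.pi * freq * t / sample_rate)
        for t in range(num_samples)
    ]


# One time-frequency "image" per key press is what a classifier would consume.
spec = spectrogram(synthetic_click())
```

In a real pipeline, each isolated key press would yield one such time-frequency image paired with one of the `KEY_LABELS`, and an image classifier would be trained on those pairs. A practical implementation would use an FFT routine rather than the naive transform shown here, which is written this way only to keep the sketch dependency-free.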

After preparing the model, the researchers ran a series of experiments on an off-the-shelf MacBook Pro 16-inch laptop. In these trials, an individual typed on the keyboard during both voice calls on a smartphone and a Zoom video call. This audio was analyzed using the new technique, and it was found that keystrokes could be accurately identified 95% of the time on average during phone calls. The accuracy only dropped slightly, to 93%, when capturing audio from Zoom calls.

These results are highly impressive. As it currently stands, however, the model must first be trained on audio samples from the specific keyboard that is being targeted. But before you allow yourself to get too comfortable, that may change in the future. With a much larger training set, that requirement could disappear: a model trained on such a dataset may be able to recognize keystrokes on virtually any keyboard.

In the near term, touch-typing and intentionally varying one's typing style, at least when entering sensitive data, can be sufficient to defeat the attack. Looking further ahead, we may have to be more careful about typing when microphones are nearby. Perhaps a device that mutes microphones when typing, or one that makes random key press sounds, will emerge to defeat the attack.
