Alexa, Siri, Google Assistant vulnerable to malicious commands, study reveals

Don’t miss OpenAI, Chevron, Nvidia, Kaiser Permanente, and Capital One leaders only at VentureBeat Transform 2024. Gain essential insights about GenAI and expand your network at this exclusive three day event. Learn More


A new study from researchers at Amazon Web Services has exposed significant security flaws in large language models that can understand and respond to speech. The paper, titled “SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models,” details how these AI systems can be manipulated to produce harmful or unethical responses using carefully designed audio attacks.

As speech interfaces become ubiquitous, from smart speakers to AI assistants, ensuring the safety and robustness of the underlying technology is crucial. However, the AWS researchers found that despite built-in safety checks, speech language models (SLMs) are highly vulnerable to “adversarial attacks” — slight perturbations to the audio input that are imperceptible to humans but can completely alter the model’s behavior.

A diagram from the AWS research paper illustrates how a spoken question-answering AI system can be manipulated into providing unethical instructions on how to rob a bank when subjected to an adversarial attack. The researchers propose a pre-processing defense to mitigate such vulnerabilities in speech-based language models. (image credit: arxiv.org)

Jailbreaking SLMs with adversarial audio

“Our experiments on jailbreaking demonstrate the vulnerability of SLMs to adversarial perturbations and transfer attacks, with average attack success rates of 90% and 10% respectively when evaluated on a dataset of carefully designed harmful questions,” the authors wrote. “This raises serious concerns about the potential for bad actors to exploit these systems at scale.”

Using a technique called projected gradient descent, the researchers were able to generate adversarial examples that consistently caused the SLMs to produce toxic outputs across 12 different categories, from explicit violence to hate speech. Shockingly, with full access to the model, they achieved a 90% success rate in compromising its safety barriers.


Countdown to VB Transform 2024

Join enterprise leaders in San Francisco from July 9 to 11 for our flagship AI event. Connect with peers, explore the opportunities and challenges of Generative AI, and learn how to integrate AI applications into your industry. Register Now


The research demonstrates how adversarial attacks can be carried out across different spoken question-answering AI models, using techniques like cross-model and cross-prompt attacks to elicit unintended responses, underscoring the need for robust, transferable defenses. (image credit: arxiv.org)

Black-box attacks: A real-world threat

Even more alarming, the study showed that audio attacks crafted on one SLM often transferred to other models, even without direct access — a realistic scenario given that most commercial providers only allow limited API access. While the success rate dropped to 10% in this “black box” setting, it still represents a serious vulnerability.

“The transferability of these attacks across different model architectures suggests that this is not just an issue with a specific implementation, but a deeper flaw in how we currently train these systems to be safe and aligned,” said lead author Raghuveer Peri.

The implications are far-reaching as businesses increasingly rely on speech AI for customer service, data analysis, and other core functions. Beyond reputational damage from an AI going rogue, adversarial attacks could be used for fraud, espionage, or even physical harm if connected to automated systems.

Countermeasures and the road ahead

Fortunately, the researchers also propose several countermeasures, such as adding random noise to the audio input — a technique known as randomized smoothing. In experiments, this significantly reduced the attack success rate. However, the authors caution it is not a complete solution.

“Defending against adversarial attacks is an ongoing arms race,” said Peri. “As the capabilities of these models grow, so does the potential for misuse. It’s critical that we continue to invest in making them safe and robust against active threats.”

The SLMs in the study were trained on dialog data to achieve state-of-the-art performance on spoken question-answering tasks, reaching over 80% on both safety and helpfulness benchmarks prior to the attacks. This highlights the difficult balance between capability and safety as the technology advances.

With major tech companies racing to develop and deploy ever more powerful speech AI, this research serves as a timely warning that security must be a top priority, not an afterthought. Regulators and industry groups will need to work together to establish rigorous standards and testing protocols.

As co-author Katrin Kirchhoff put it, “We’re at an inflection point with this technology. It has immense potential to benefit society, but also to cause harm if not developed responsibly. This study is a step towards ensuring we can reap the benefits of speech AI while mitigating the risks.”