A report from Belgian public broadcaster VRT NWS has revealed how contractors paid to transcribe audio clips collected by Google’s AI assistant can end up listening to sensitive information about users, including names, addresses, and details about their personal lives.
It’s the latest story showing how our interactions with AI assistants are not as private as we may like to believe. Earlier this year, a report from Bloomberg revealed similar details about Amazon’s Alexa, explaining how audio clips recorded by Echo devices are sent without users’ knowledge to human contractors, who transcribe what’s being said in order to improve the company’s AI systems.
Worse, these audio clips are often recorded entirely by accident. AI assistants like Alexa and Google Assistant are supposed to start recording audio only when they hear their wake word (e.g., “Okay Google”), but these reports show the devices frequently begin recording by mistake.
In the story by VRT NWS, which focuses on Dutch- and Flemish-speaking Google Assistant users, the broadcaster reviewed a thousand or so recordings, 153 of which had been captured accidentally. A contractor told the publication that he transcribes around 1,000 audio clips from Google Assistant every week. In one of the clips he reviewed, he heard a female voice in distress and said he felt that “physical violence” had been involved. “And then it becomes real people you’re listening to, not just voices,” said the contractor.
Tech companies say that sending audio clips to humans to be transcribed is an essential process for improving their speech recognition technology. They also stress that only a small percentage of recordings are shared in this way. A spokesperson for Google told Wired that just 0.2 percent of all recordings are transcribed by humans, and that these audio clips are never presented with identifying information about the user.
These obfuscations could cause legal trouble for the company, says Michael Veale, a technology privacy researcher at the Alan Turing Institute in London. He told Wired that this level of disclosure might not meet the standards set by the EU’s GDPR. “You have to be very specific on what you’re implementing and how,” said Veale. “I think Google hasn’t done that because it would look creepy.”
We’ve reached out to Google for comment and will update this story if we hear more.