If you’re thinking of buying a “smart” TV for the holidays, you ought to know that your new device is constantly capturing snapshots of what’s on screen and sending them back to the manufacturer — even if you are using the device as a computer monitor and not watching TV at all. The findings come from a recent study by computer scientists at the University of California, Davis; University College London; and Universidad Carlos III de Madrid, published in Proceedings of the 2024 ACM on Internet Measurement Conference.
Smart TVs can do everything “dumb,” or linear, televisions do, and they also provide a wide range of internet services and additional channels offered by the manufacturers. You can also show content from other devices, such as laptops and phones, either by casting wirelessly or over an HDMI connection.
Tech companies collect information about our internet use and use it to sell targeted advertising. Zubair Shafiq, associate professor of computer science at UC Davis, and graduate student Yash Vekaria wanted to know how advertisers track you between devices. For example, you might be watching television in the living room, then go into another room to work on your laptop and see ads related to the shows you were just watching come up on your feeds.
“To understand cross-device tracking, we wanted to first understand how these TV manufacturers or advertisers know what the user might be interested in. When we started looking into it, we came across technology referred to as ACR, which stands for automatic content recognition. And we found that that’s the main component which is responsible for generating audience segments on a smart TV,” Vekaria said.
ACR can recognize content based on small snippets, similar to the Shazam app, which can identify a song and its artist from a short clip of music. This ACR code is built into the operating system of smart TVs.
The researchers looked at TVs made by two leading manufacturers, Samsung and LG. Vekaria set up an intermediary server to capture traffic coming from a TV set in the lab and discover where it might be going on the internet. He also dug into the code running on the devices.
They found that the smart TVs captured snapshots of audio or video as often as every 10 milliseconds, batched them and used an algorithm to generate a “fingerprint” representing all the content over a time interval, such as the past minute. This fingerprint was sent to a company server and matched against a database of all the content available through the TV service.
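To make the fingerprint-and-match idea concrete, here is a minimal sketch in Python. It is not the algorithm Samsung or LG use, which is not described here. It assumes each snapshot is a short list of pixel brightness values, reduces a batch of snapshots to one integer fingerprint, and looks for the closest entry in a small reference database. The function names, the sample data and the distance threshold are all hypothetical.

```python
from typing import Dict, List, Optional


def snapshot_bits(pixels: List[int]) -> List[int]:
    """Reduce one snapshot to bits: 1 where a pixel is brighter than the snapshot's mean."""
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def fingerprint(batch: List[List[int]]) -> int:
    """Combine a batch of snapshots into one integer using a per-position majority vote."""
    bit_rows = [snapshot_bits(s) for s in batch]
    value = 0
    for column in zip(*bit_rows):
        majority = 1 if sum(column) * 2 > len(column) else 0
        value = (value << 1) | majority
    return value


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def match(fp: int, database: Dict[str, int], max_distance: int = 2) -> Optional[str]:
    """Return the closest title within max_distance bits, or None if nothing is close enough."""
    best_title, best_dist = None, max_distance + 1
    for title, ref in database.items():
        d = hamming(fp, ref)
        if d < best_dist:
            best_title, best_dist = title, d
    return best_title


# Hypothetical reference content: three snapshots of four pixels each.
news = [[10, 200, 30, 220], [12, 190, 35, 210], [11, 205, 28, 215]]
game = [[200, 20, 210, 15], [190, 25, 220, 10], [205, 18, 215, 12]]
database = {"evening news": fingerprint(news), "football game": fingerprint(game)}

# What the TV "sees": the news broadcast with slightly different pixel values.
noisy = [[14, 198, 33, 225], [9, 202, 31, 212], [13, 195, 29, 218]]
print(match(fingerprint(noisy), database))  # prints: evening news
```

The point of the sketch is that the fingerprint is compact and tolerates small differences in the picture, so only a short summary of the screen content, not the raw video, would need to leave the device for a match to be made.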
“They do a match against the database to figure out what exact piece of content that user is streaming at this point in time. When they do this over a period of time, they can infer that, say, this person watches NFL from 9 to 12 p.m., but they generally watch news in the afternoon,” Vekaria said.
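The kind of inference Vekaria describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the matched content is assumed to arrive as (hour of day, genre) pairs, and the three dayparts, their hour boundaries and the sample data are hypothetical rather than taken from the study.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def daypart(hour: int) -> str:
    """Map an hour (0-23) to a coarse part of the day (assumed boundaries)."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    return "evening"  # also covers the late-night hours 0-4


def viewing_profile(matches: List[Tuple[int, str]]) -> Dict[str, str]:
    """Return the most frequently matched genre for each part of the day."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for hour, genre in matches:
        counts[daypart(hour)][genre] += 1
    return {part: c.most_common(1)[0][0] for part, c in counts.items()}


# Hypothetical matches accumulated over time: (hour of day, genre).
matches = [(13, "news"), (14, "news"), (15, "sports"),
           (21, "sports"), (22, "sports"), (23, "sports")]
profile = viewing_profile(matches)
print(profile)  # prints: {'afternoon': 'news', 'evening': 'sports'}
```

Even this crude tally yields the sort of statement quoted above: news in the afternoon, sports at night.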
The TV companies can then use this information to sell targeted advertising on their platform.
“These smart TV ad platforms have this vast profile of every single TV customer, and the ads on every TV get personalized based on that profile,” Shafiq said.
Capturing content types
The researchers looked at five types of content: “linear” TV, a single TV channel broadcast by antenna; FAST (Free Ad-Supported TV), essentially a broadcast TV channel delivered over the internet; OTT (over the top), streaming apps such as Netflix or Prime, delivered over the internet; content from a laptop or gaming console connected by HDMI cable; and screencast content mirrored from a nearby laptop or phone.
They found that ACR on TVs sold in the United States was capturing linear TV, FAST channels and content shared over HDMI connections, but not screencasts or OTT content. The OTT exclusion is likely due to agreements with the streaming companies, which collect their own data on users.
The researchers also compared ACR on the same brands of smart TVs in the United Kingdom and the United States. They found that information was being collected and shared at about the same rate in the British models, but with somewhat more restrictions on ACR data collection for FAST content, probably due to differences in agreements between manufacturers and copyright owners.
The documentation that comes with these TV sets does mention data collection and ACR, but the descriptions are vague and high-level, Shafiq said. By default, you have to opt in to ACR to set up your new TV. It is possible to go back in and turn off these settings, but it is not straightforward.
“As a user, you might not know all the possible options that exist that you need to turn off,” Shafiq said. “It’s very easy to opt in, but to opt out, it’s an extensive process.”
Smart TVs are only the start, Shafiq said. Companies are working on technology such as Microsoft Recall, which records everything you do on the screen of your device and analyzes it with AI.
“Basically it’s ACR supercharged by AI that is going to be in all of our other devices in the near future. So things are trending in the wrong direction, it seems, from a privacy perspective,” Shafiq said.
As for this holiday shopping season, Shafiq said he’s looking for a dumb TV.
“Let me know if you can find one,” he said.
Additional authors on the paper are Gianluca Anselmi and Anna Maria Mandalari, University College London, and Patricia Callejo, Universidad Carlos III de Madrid. The work was supported by grants from the National Science Foundation, the Engineering and Physical Sciences Research Council (U.K.) and the European Union.
Media Resources
Watching TV with the Second-Party: A First Look at Automatic Content Recognition Tracking in Smart TVs (ACM Digital Library)
Media Contacts
- Zubair Shafiq, Computer Science, zshafiq@ucdavis.edu
- Yash Vekaria, Computer Science, yvekaria@ucdavis.edu
- Andy Fell, News and Media Relations, 530-304-8888, ahfell@ucdavis.edu