A researcher has exposed how Bright Data embeds surveillance code into consumer applications, transforming devices, including smart TVs, into unwilling proxies for web scraping operations. The company operates the world's largest residential proxy network and markets its services to AI companies seeking training data.
The iOS SDK that Bright Data distributes through free apps converts user devices into exit nodes. These nodes relay web-scraping traffic without explicit user consent or clear disclosure. Always-on smart TVs become particularly valuable targets because they remain connected and active continuously, making them ideal for sustained data harvesting operations.
Bright Data, the successor to Luminati, positions itself as a legitimate data collection vendor. The company supplies residential IP addresses and bandwidth to clients conducting large-scale web scraping. AI companies use this infrastructure to harvest training data from websites that often prohibit automated access. The residential proxy approach masks scraping activity behind real home internet connections, bypassing technical and legal defenses that websites deploy.
The researcher's reverse-engineering of the iOS SDK revealed how silently the collection mechanism operates. Users downloading ostensibly free apps grant permission for data collection, but the extent and nature of the proxy activity remain obscured. Most users lack visibility into how their devices participate in scraping operations or which websites their connections target.
This model creates material risks for device owners. Their internet connections become the apparent source of web scraping activity that violates website terms of service. Copyright infringement claims, legal liability, and bandwidth consumption fall on unsuspecting users whose devices power the operation. ISPs may throttle or terminate accounts that generate suspicious traffic patterns.
The practice also enables circumvention of geofencing and access controls. Bright Data's network allows clients to appear as if they originate from residential addresses worldwide, defeating location-based protections. Organizations cannot effectively block scraping when requests originate from legitimate home networks.
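To illustrate why address-based blocking struggles here, the following is a minimal sketch, not a description of any real site's defenses. It assumes a hypothetical blocklist of datacenter ranges and uses reserved documentation prefixes (198.51.100.0/24 and 203.0.113.0/24) as stand-ins for a cloud provider and a home ISP. The function name `is_blocked` and the sample addresses are assumptions made for illustration.

```python
import ipaddress

# Hypothetical blocklist: ranges a site operator believes belong to datacenters.
# 198.51.100.0/24 is a reserved documentation prefix used here as a stand-in.
DATACENTER_RANGES = [ipaddress.ip_network("198.51.100.0/24")]


def is_blocked(client_ip: str) -> bool:
    """Return True if the client address falls inside a listed datacenter range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in DATACENTER_RANGES)


# The same scraper, seen from two different exit points (illustrative addresses).
samples = {
    "198.51.100.17": "scraper running on a cloud server",
    "203.0.113.42": "same scraper relayed through a home device",
}

for ip, description in samples.items():
    verdict = "blocked" if is_blocked(ip) else "allowed"
    print(f"{ip} ({description}): {verdict}")
```

In this sketch the request relayed through the stand-in home address is not caught, because nothing about the address distinguishes it from an ordinary subscriber. Adding residential ranges to the blocklist would also block legitimate visitors on those networks.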
Bright Data operates in a gray legal zone. While
