A flaw in Google Cloud's Vertex AI SDK for Python enabled attackers without project access to intercept machine learning model uploads and execute arbitrary code on Google's infrastructure. Palo Alto Networks Unit 42 discovered the vulnerability and reported it through Google's bug bounty program.
Unit 42 calls the technique "Pickle in the Middle." During a model upload, the SDK reads model files from a Google Cloud Storage bucket. An attacker can create a bucket whose name closely resembles the victim's legitimate bucket, then place malicious serialized Python objects (pickles) in it. When the SDK deserializes these files, the attacker's code runs inside Google's model serving infrastructure.
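Why deserialization alone is enough to run code can be shown with a small, harmless sketch. It is not the Unit 42 proof of concept: the `Payload` class is hypothetical, and the callable is a benign `int` where a real attacker would substitute something like `os.system`.

```python
import pickle


class Payload:
    """Hypothetical stand-in for a malicious 'model' object."""

    def __reduce__(self):
        # __reduce__ tells pickle how to rebuild the object: "call this
        # callable with these arguments". The callable is attacker-chosen;
        # int is used here only because it is harmless.
        return (int, ("42",))


# The attacker serializes the object ahead of time...
blob = pickle.dumps(Payload())

# ...and the victim merely loads it. The callable runs during loads(),
# and its return value replaces the original object.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

The loader never asked to call `int`; the serialized data told it to. This is why the Python documentation warns against unpickling data from untrusted sources.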
The vulnerability stems from two factors: insufficient validation of bucket names and the inherent risk of deserializing untrusted Python pickle objects. Together they enable what the researchers term "bucket squatting": an attacker creates a bucket whose name closely matches the victim's bucket, then stages malicious payloads there to intercept the upload process.
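One general defense against the deserialization half of the problem is the restricted unpickler pattern described in the Python `pickle` documentation. The sketch below is not Google's fix. It assumes an allowlist (`SAFE_BUILTINS`, a name chosen for illustration) small enough to audit; real model artifacts usually reference many library classes, so in practice the allowlist is hard to keep both useful and safe.

```python
import builtins
import io
import pickle

# Assumption: only these builtins may be referenced by a pickle stream.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks for a global (class or function).
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads(), but refuses globals outside the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain data structures still load normally.
safe = restricted_loads(pickle.dumps([1, 2, range(3)]))

# A hand-written stream that asks for os.system is rejected before any call.
malicious = b"cos\nsystem\n(S'echo pwned'\ntR."
try:
    restricted_loads(malicious)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

The check happens in `find_class`, before the callable is ever resolved, so the malicious stream fails without executing anything.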
The flaw affects any organization that uses Vertex AI's Python SDK to upload machine learning models to Google Cloud. Successful exploitation grants attackers code execution in Google's serving environment, potentially allowing them to steal model intellectual property, inject backdoors into models, or compromise downstream systems that consume those models. The attack requires no authentication to the victim's Google Cloud project.
Unit 42 confirmed no active exploitation in the wild at the time of discovery. Google has patched the vulnerability. Organizations running Vertex AI should update their Python SDK immediately. Additional mitigations include implementing strict bucket naming conventions, enabling bucket versioning to detect unauthorized uploads, and using Cloud IAM policies to restrict who can modify storage buckets containing model files.
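A strict naming convention is easiest to enforce when it is checked in code before any artifact is read. The following is a minimal sketch under stated assumptions: the bucket name `acme-ml-models-prod` and the helper `require_trusted_bucket` are hypothetical, and the check uses exact matching against an allowlist rather than any Vertex AI SDK feature.

```python
from urllib.parse import urlparse

# Hypothetical allowlist of buckets the team actually owns.
ALLOWED_BUCKETS = frozenset({"acme-ml-models-prod"})


def require_trusted_bucket(uri: str, allowed=ALLOWED_BUCKETS) -> str:
    """Return the bucket name if the gs:// URI points at an allowed bucket."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs":
        raise ValueError(f"not a Cloud Storage URI: {uri!r}")
    bucket = parsed.netloc
    # Exact match only: prefix or substring checks would accept
    # look-alike names such as 'acme-ml-models-prod-us'.
    if bucket not in allowed:
        raise ValueError(f"untrusted bucket: {bucket!r}")
    return bucket


trusted = require_trusted_bucket("gs://acme-ml-models-prod/model/model.pkl")

try:
    require_trusted_bucket("gs://acme-ml-models-prod-us/model/model.pkl")
    rejected = False
except ValueError:
    rejected = True
```

Exact matching is the design choice that matters here: a squatted bucket differs from the real one by only a few characters, so any fuzzy or prefix-based comparison defeats the purpose.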
The discovery highlights risks inherent in cloud machine learning platforms where serialized objects move between services. Machine learning teams should treat model artifacts with the same
