Hugging Face API Token Leaks Expose Major Organizations to Security Risks
Major Organizations at Risk Due to Hugging Face API Token Leaks
The cybersecurity firm Lasso recently discovered and warned about the exposure of more than 1,600 Hugging Face Application Programming Interface (API) tokens. The leaked tokens pose a security risk to hundreds of major organizations, including tech giants such as Google, Meta, and Microsoft.
The exposure leaves these organizations vulnerable to anyone who obtains the tokens. With a leaked token, an attacker could manipulate AI models, steal private models, or poison training data, severely jeopardizing sensitive company assets and infrastructure.
Investigation and Actions Taken
Lasso's researchers began investigating the leaked tokens in November 2023, and their findings were alarming: 1,681 of the tokens were valid, and 655 carried write permissions spanning 77 organizations, effectively granting full control over those organizations' repositories to anyone holding the tokens.
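In principle, a token's validity and scope can be checked with the Hugging Face Hub's Python client, as in the minimal sketch below. This is an illustration only, not Lasso's tooling, and the exact fields returned by whoami() may vary between Hub API versions.

```python
# Minimal sketch: check whether a Hugging Face token is still valid and
# what it can access, using the official huggingface_hub client.
# Illustrative only; response fields may differ across Hub API versions.
from huggingface_hub import HfApi
from huggingface_hub.utils import HfHubHTTPError

def check_token(token: str) -> None:
    api = HfApi(token=token)
    try:
        info = api.whoami()  # queries the Hub's whoami endpoint with this token
    except HfHubHTTPError:
        print("Token is invalid or has been revoked.")
        return

    print("Valid token for:", info.get("name"))
    print("Member organizations:", [org.get("name") for org in info.get("orgs", [])])
    # On current Hub versions the token's read/write role is reported under 'auth'.
    print("Token role:", info.get("auth", {}).get("accessToken", {}).get("role"))
```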
In response, Hugging Face deprecated the org_api tokens and blocked their use in its Python library to prevent further breaches. Lasso also promptly alerted the organizations directly affected by the leak, which have since revoked the exposed tokens and removed the token-bearing code from public view.
Despite the swift response from all parties involved, the incident raises serious questions about how well organizations safeguard their digital assets and underscores the need for greater diligence and stronger information security practices.
Security Firm's Warnings and Findings
Lasso researchers shared their concerns about the implications of the detected vulnerabilities. The leaked tokens open a significant security gap that could allow malicious actors to poison training data, a technique that compromises the integrity of these organizations' machine learning (ML) models and undermines the quality and reliability of the AI systems built on them.
Lasso also highlighted the potential scale of the exploit. An attacker who gained control of an organization's repository with millions of downloads could manipulate its ML models at will, and such manipulation could cause widespread harm to the millions of users worldwide who depend on the services these organizations provide.
Although Hugging Face moved quickly to block the tokens in its Python library, that step alone did not fully mitigate the risk: it did not revoke read permissions. While malicious actors may no longer be able to alter data or models, they can still read sensitive data and gain valuable insight into an organization's AI models and systems. As a result, the organizations tied to the leaked API tokens remain exposed to significant threats until a fuller set of countermeasures is in place.
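To illustrate why lingering read access matters, the sketch below shows how a still-valid token could be used to copy a private repository in full. The repository name and token value are hypothetical placeholders, not data from the leak.

```python
# Minimal sketch: with read access, a leaked token is enough to clone an
# entire private model repository. Repo ID and token are hypothetical.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="acme-corp/internal-model",   # hypothetical private repository
    token="hf_xxxxxxxxxxxxxxxxxxxxxxxx",  # leaked token with read permission
)
print("Private model files downloaded to:", local_path)
```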
Additional Related News
Hugging Face Informed About the Findings
Hugging Face, the AI startup at the center of the token leaks, was promptly notified of the findings. This enabled it to take initial steps to rectify the situation and put measures in place to guard against similar occurrences in the future.
Improvements in GitHub's Security Scanning Feature
To bolster its security posture, GitHub has upgraded its secret scanning feature to include additional token validity checks. The enhancement helps detect exposed credentials promptly, before they pose serious risks to user repositories.
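As a rough illustration of what a validity check involves (this is not GitHub's implementation), the sketch below scans local files for strings matching the familiar hf_ token prefix and asks the Hugging Face Hub whether each match still works; the token pattern and endpoint usage are assumptions.

```python
# Rough illustration of secret scanning with a validity check.
# Not GitHub's implementation; the hf_ token pattern is an assumption.
import re
import sys

import requests

TOKEN_RE = re.compile(r"hf_[A-Za-z0-9]{20,}")  # assumed Hugging Face token shape

def scan_file(path: str) -> None:
    with open(path, "r", errors="ignore") as fh:
        for lineno, line in enumerate(fh, start=1):
            for candidate in TOKEN_RE.findall(line):
                # Validity check: ask the Hub whether the candidate token still works.
                resp = requests.get(
                    "https://huggingface.co/api/whoami-v2",
                    headers={"Authorization": f"Bearer {candidate}"},
                    timeout=10,
                )
                status = "ACTIVE" if resp.status_code == 200 else "inactive or revoked"
                print(f"{path}:{lineno}: possible Hugging Face token ({status})")

if __name__ == "__main__":
    for file_path in sys.argv[1:]:
        scan_file(file_path)
```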
Warnings of Stolen OAuth Tokens
Security firms have also issued new warnings about stolen OAuth tokens being used to download private repositories on GitHub. The stolen tokens grant unauthorized access to sensitive project files and information stored in private repositories, posing a severe threat to organizational security.