Cybersecurity researchers have revealed a GitHub design flaw that allows access to deleted and private repository data. Learn how the issue, dubbed Cross Fork Object Reference (CFOR), puts sensitive information at risk and what it means for open-source security.
Cybersecurity researchers at Truffle Security, an open-source security software company, have found that anyone can access deleted and private repository data on GitHub, and that this data remains available indefinitely.
This issue, known as Cross Fork Object Reference (CFOR), allows users to directly access commit data from another fork, including data from private and deleted forks.
According to Truffle Security, this is not a bug, but rather an intentional design feature of GitHub’s repository architecture. This means that any code committed to a public repository may be accessible forever, even if the original repository is deleted, as long as there is at least one fork of that repository.
The researchers demonstrated three scenarios where this issue can be exploited:
- Accessing Deleted Fork Data: When a user forks a public repository, commits code to it, and then deletes the fork, the committed code is still accessible forever. This is because GitHub stores repositories and forks in a repository network, and deleting a fork does not delete the committed data.
- Accessing Deleted Repo Data: When a public repository is deleted, GitHub reassigns the root node role to one of the downstream forks. This means that all commits from the original repository still exist and are accessible via any fork, even if the fork never synced with the original repository.
- Accessing Private Repo Data: When an organization keeps a private internal fork of a private repository and later makes the upstream repository public, any code committed to the fork between the time it was created and the time the upstream was made public is accessible through the public repository. This is because, up to that point, the private fork and the now-public repository were part of the same repository network.
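The common thread in these three scenarios can be illustrated with a toy model. The sketch below is not GitHub’s implementation; the `RepoNetwork` class, its methods, the short hashes, and the secret value are all hypothetical. It only assumes what the researchers describe: commits live in storage shared by the whole repository network, and deleting a fork removes that fork but not the commits it contributed.

```python
class RepoNetwork:
    """Toy model: one commit store shared by every repository in the network."""

    def __init__(self):
        self.objects = {}  # commit hash -> content, shared network-wide
        self.repos = {}    # repo name -> set of commit hashes it references

    def create_repo(self, name, fork_of=None):
        # A fork starts out referencing everything its parent references.
        self.repos[name] = set(self.repos[fork_of]) if fork_of else set()

    def commit(self, repo, commit_hash, content):
        self.objects[commit_hash] = content
        self.repos[repo].add(commit_hash)

    def delete_repo(self, name):
        # Only the repository entry goes away; the shared store is untouched.
        del self.repos[name]

    def fetch(self, repo, commit_hash):
        # The lookup goes through any surviving repo but reads the shared store.
        if repo not in self.repos:
            raise KeyError(f"no such repository: {repo}")
        return self.objects.get(commit_hash)


network = RepoNetwork()
network.create_repo("upstream")
network.commit("upstream", "a1b2c3", "initial code")

network.create_repo("fork", fork_of="upstream")
network.commit("fork", "d4e5f6", "API_KEY=hypothetical-secret")
network.delete_repo("fork")

# The fork is gone, yet its commit is still served through the upstream repo.
leaked = network.fetch("upstream", "d4e5f6")
```

In this model the only thing standing between an outsider and the deleted fork’s commit is knowledge of its hash, which is the situation the researchers describe.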
To access the data, an attacker only needs to know the commit hash and can then request the commit directly at https://github.com/<user/org>/<repo>/commit/<commit_hash>. The hash can be brute-forced through GitHub’s UI or obtained through the public events API endpoint. This means that confidential data and secrets may be inadvertently exposed on an organization’s public GitHub repositories.
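For defenders who want to check whether a commit from a deleted fork or repository is still being served, the URL pattern above can be wrapped in a small helper. This is a minimal sketch: the function names `commit_url` and `is_reachable` are hypothetical, and the accepted hash length of 4 to 40 hexadecimal characters is an assumption based on git’s usual handling of abbreviated hashes, not something stated by GitHub.

```python
import re
import urllib.error
import urllib.request

# Assumption: abbreviated hashes of 4 to 40 hex characters are worth trying.
HEX_HASH = re.compile(r"[0-9a-f]{4,40}")


def commit_url(owner, repo, commit_hash):
    """Build the commit page URL for a repository in the network."""
    commit_hash = commit_hash.lower()
    if not HEX_HASH.fullmatch(commit_hash):
        raise ValueError("expected 4 to 40 hexadecimal characters")
    return f"https://github.com/{owner}/{repo}/commit/{commit_hash}"


def is_reachable(url, timeout=10):
    """Return True if the URL answers with HTTP 200 (performs a network call)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # 404 and other HTTP errors are treated as "not reachable".
        return False
```

A maintainer could call `is_reachable(commit_url("<user/org>", "<repo>", "<commit_hash>"))` against their own surviving repository with a hash they know came from a deleted fork; a `True` result would suggest the commit is still exposed.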
Truffle Security submitted its report, titled “Anyone can Access Deleted and Private Repository Data on GitHub,” to GitHub through GitHub’s Vulnerability Disclosure Program, and GitHub responded that its architecture is designed to work this way. GitHub documents this architecture openly, but the average user may not understand its implications: deleting a repository or fork does not necessarily delete the data it contained.
Truffle Security recommends key rotation as the way to securely remediate keys leaked on public GitHub repositories, since deleting the commit, fork, or repository does not remove access to them. The researchers also note that secret scanning tools will need to be updated to handle these scenarios and that users should be aware that deleted data may still be accessible.
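One way a scanning tool might start covering these scenarios is to collect commit hashes from push events, so that commits no longer on any branch can still be inspected. The sketch below works on already-fetched event data and makes no network calls. The field names `type`, `payload`, `before`, `head`, and `commits[].sha` reflect the push event shape as commonly returned by the events API, but they are an assumption here and may differ between API versions; the function name and the sample events are invented for illustration.

```python
def push_event_hashes(events):
    """Collect commit hashes mentioned in push events (assumed payload shape)."""
    hashes = set()
    for event in events:
        if event.get("type") != "PushEvent":
            continue
        payload = event.get("payload", {})
        for key in ("before", "head"):
            value = payload.get(key)
            # An all-zero hash marks "no previous commit" and is skipped.
            if value and set(value) != {"0"}:
                hashes.add(value)
        for commit in payload.get("commits", []):
            if commit.get("sha"):
                hashes.add(commit["sha"])
    return hashes


# Made-up sample data, shaped like the assumed payload.
sample_events = [
    {"type": "PushEvent", "payload": {"before": "0" * 40, "head": "a" * 40}},
    {"type": "WatchEvent", "payload": {}},
    {"type": "PushEvent", "payload": {"before": "a" * 40, "head": "b" * 40,
                                      "commits": [{"sha": "b" * 40}]}},
]

found = push_event_hashes(sample_events)
```

Each collected hash could then be passed to a secret scanner, and any key found this way should be rotated rather than merely deleted.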
This issue is not unique to GitHub, and other version control system products may also be affected. As the use of open-source software continues to grow, users must understand the security implications of how these systems work.