Protect your information assets: Verifying the exposure of private code repositories

Larbi OUIYZME
3 min read · Sep 24, 2024


Photo by Brecht Corbeel on Unsplash

With the rise of cyberattacks, protecting information assets, especially private code repositories, has become a strategic priority for companies. Whether on GitHub or other platforms, your source code must be properly secured to prevent unauthorized exposure, which could reveal vulnerabilities exploitable by malicious actors.

Why should you protect your private repositories?

Private repositories often contain sensitive information such as:

  • API keys
  • System configuration details
  • Authentication secrets
  • Proprietary algorithms or company data

If this information is accidentally made public, it can compromise not only the security of your applications but also your infrastructure.

GitLab and GitHub (the latter since its acquisition by Microsoft) have strengthened their built-in security, including with AI-assisted features, even though security is not their primary focus. Both are working on detecting and revoking leaked secrets. Imagine a company with 100 developers and thousands of lines of code: mistakes are inevitable, whether it’s an API key leak or the accidental exposure of source code. Because human error cannot be eliminated, it’s crucial to minimize the risk of exposing the company’s intellectual assets.

There are commercial solutions such as GitGuardian or SpectralOps, which offer trial versions, as well as open-source tools and custom scripts developed internally. It’s worth using the trial period to evaluate these tools: scan the entire codebase at no cost, detect secrets, and export the results in CSV format.
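To illustrate the "custom script" option, here is a minimal sketch of a regex-based scanner that reports its findings as CSV. The two patterns (an AWS-style access key ID and a generic api_key assignment) and the function names scan_text and to_csv are illustrative assumptions, not taken from any of the tools named above; dedicated scanners rely on far larger rule sets.

```python
import csv
import io
import re

# Illustrative patterns only (assumption): real scanners ship many more rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_api_key": re.compile(
        r"(?i)\bapi[_-]?key\b\s*[:=]\s*['\"]([^'\"]{8,})['\"]"
    ),
}


def scan_text(text, source="<memory>"):
    """Return one finding per (line, rule) match as a list of dicts."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append({"source": source, "line": line_no, "rule": rule})
    return findings


def to_csv(findings):
    """Serialize findings to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["source", "line", "rule"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(findings)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical file content used only for demonstration
    sample = 'debug = True\napi_key = "abcd1234efgh5678"\n'
    print(to_csv(scan_text(sample, source="settings.py")), end="")
```

The report deliberately records only the file, line number, and rule name, so the CSV itself does not become a second copy of the leaked secret.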

Tips to protect your repositories:

  1. Limit Access: Only essential project members should have access to the source code repositories.
  2. Use Two-Factor Authentication (2FA): This adds an extra layer of protection for accessing your GitHub accounts.
  3. Avoid Storing Sensitive Information in Code: Use tools like Vault, or .env files that are excluded from version control (for example via .gitignore), to store secrets outside of the source code.
  4. Automate Security Audits: Integrate GitGuardian or similar tools to monitor repositories and detect exposed secrets.
  5. Use Private Repositories: Keep confidential projects in private repositories, and ensure they are not accidentally made public.
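Tip 3 can be sketched as follows: the code reads a secret from an environment variable instead of a hard-coded literal. The variable name API_KEY and the helper get_secret are hypothetical and chosen only for this example; how the variable gets set (shell, .env loader, Vault agent) is left open.

```python
import os


def get_secret(name, environ=None):
    """Return the secret stored in the environment variable `name`.

    Raises RuntimeError rather than falling back to a hard-coded default,
    so a missing secret is noticed immediately.
    """
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        raise RuntimeError(f"Missing required secret: {name}")
    return value


if __name__ == "__main__":
    # API_KEY is a hypothetical variable name (assumption).
    try:
        api_key = get_secret("API_KEY")
        # Print only the length, never the secret itself
        print("API key loaded, length:", len(api_key))
    except RuntimeError as exc:
        print(exc)
```

Failing loudly when the variable is absent avoids the temptation to keep a "temporary" default value in the source code.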

Automatically verify the exposure of private repositories

A simple way to check if a GitHub repository is exposed is by automating the verification using a Python script. This script checks the repository’s URL and determines whether it’s public or private based on the HTTP response code.

Here is an example of a Python script, check_repo_accessibility.py, available in my GitHub repository under the MIT License, that tests the accessibility of repositories from a list of URLs stored in a text file.

# Version: 1.0
# Author: Larbi OUIYZME
# License: MIT

import requests

# Function to check if a repository is accessible
def check_repo_accessibility(repo_url):
    try:
        # The timeout keeps the script from hanging on an unresponsive host
        response = requests.get(repo_url, timeout=10)
    except requests.RequestException as exc:
        # Network failures (DNS, connection, timeout) are reported, not raised
        print(f"[ERROR] Unable to check {repo_url}. Request failed: {exc}")
        return

    # If the repository returns a 404, it's either private or doesn't exist
    if response.status_code == 404:
        print(f"[PRIVATE or NON-EXISTENT] The repository {repo_url} is not accessible (404).")
    # If the repository returns a 200, it's public
    elif response.status_code == 200:
        print(f"[PUBLIC] The repository {repo_url} is public.")
    # For any other response code, print the error
    else:
        print(f"[ERROR] Unable to check {repo_url}. Response code: {response.status_code}")

# Load repository URLs from the repos.txt file
def load_repos_from_file(filename):
    with open(filename, 'r', encoding='utf-8') as file:
        # Read each line, strip surrounding whitespace, and skip empty lines
        repos = [line.strip() for line in file if line.strip()]
    return repos

if __name__ == "__main__":
    # Load the URLs from the 'repos.txt' file
    repo_file = 'repos.txt'
    repos = load_repos_from_file(repo_file)

    # Check the accessibility of each repository
    for repo_url in repos:
        check_repo_accessibility(repo_url)

Script explanation:

  • The repos.txt file contains a list of repository URLs that you want to check.
  • The script uses the requests library to send an unauthenticated HTTP GET request to each repository URL.
  • If the repository returns a 404 response, it means it’s either private or doesn’t exist.
  • If the repository returns a 200 response, it means it’s public, since the request carried no credentials.
  • Any other response code is reported as an error so that it can be investigated.
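The mapping described in the bullets above can be isolated in a small pure function, which makes it easy to test without network access. This is a sketch rather than part of the original script; the names classify_status and summarize are hypothetical.

```python
def classify_status(status_code):
    """Map an HTTP status code to the label used in the script's output."""
    if status_code == 200:
        return "PUBLIC"
    if status_code == 404:
        return "PRIVATE or NON-EXISTENT"
    return "ERROR"


def summarize(results):
    """Count labels in a {url: status_code} mapping."""
    counts = {}
    for status_code in results.values():
        label = classify_status(status_code)
        counts[label] = counts.get(label, 0) + 1
    return counts


if __name__ == "__main__":
    # Hypothetical status codes, as if collected from earlier requests
    sample = {
        "https://github.com/username/repo1": 200,
        "https://github.com/username/repo2": 404,
    }
    print(summarize(sample))
```

Keeping the decision separate from the HTTP call also makes it straightforward to add new cases later, such as treating rate-limit responses differently from other errors.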

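How to use the script:

  1. Create a repos.txt file containing one repository URL per line, for example:
https://github.com/username/repo1
https://github.com/username/repo2

  2. Install the requests library if needed (pip install requests), then run the Python script from the directory that contains repos.txt (python check_repo_accessibility.py) to check the accessibility of each repository.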
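Because a 404 cannot distinguish a private repository from a mistyped URL, one possible complement is to query the GitHub REST API with a token and read the private field of the response. The sketch below is not part of the original script and rests on assumptions: the helper names api_url_for and fetch_visibility are hypothetical, and the token is assumed to be a personal access token allowed to read the repositories listed in repos.txt.

```python
import json
import urllib.request
from urllib.parse import urlparse


def api_url_for(repo_url):
    """Translate https://github.com/<owner>/<repo> into its REST API URL."""
    parsed = urlparse(repo_url.strip())
    parts = [p for p in parsed.path.split("/") if p]
    if parsed.netloc.lower() != "github.com" or len(parts) < 2:
        raise ValueError(f"Not a GitHub repository URL: {repo_url!r}")
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[:-4]
    return f"https://api.github.com/repos/{owner}/{repo}"


def fetch_visibility(repo_url, token):
    """Return 'private' or 'public' based on an authenticated API request.

    urllib raises urllib.error.HTTPError when the API answers 404, which
    here means the repository does not exist or the token cannot see it.
    """
    request = urllib.request.Request(
        api_url_for(repo_url),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return "private" if payload["private"] else "public"


if __name__ == "__main__":
    # Only the URL translation is shown; no network request is made here.
    print(api_url_for("https://github.com/username/repo1"))
```

The token should itself be read from the environment rather than written into the script, in line with the tips above.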

Conclusion:

By securing your source code repositories and implementing automated audit measures, you significantly reduce the risk of exposing sensitive information. This type of script enables you to proactively monitor the status of your repositories and avoid mistakes that could lead to critical data leaks.


Larbi OUIYZME

I'm Larbi, from Morocco. As an IT trainer and Chief Information Security Officer (CISO), I'm committed to sharing knowledge. I'm also a ham radio operator (CN8FF), passionate about RF.