What happens if a dependency repository is deleted on GitHub?

github, open source, version control

  • I own a GitHub repository, A.
  • Repository B is another open-source project, which is owned by someone else.
  • Repository A depends on repository B (repository B is a submodule of A).

If the owner of repository B decides to delete that repository, users will not be able to successfully clone/checkout/build my repository anymore.

Should I preemptively fork B to use as a backup in case the owner decides to delete it? Is this considered a dangerous situation to be in, or how is it usually handled for open-sourced projects?

Best Answer

> If the owner of repository B decides to delete that repository, users will not be able to successfully clone/checkout/build my repository anymore.

If the dependency, repo B, vanishes:

  • All users will still be able to clone your repo itself; only fetching the submodule (the "git submodule update --init" step) will fail.
  • Existing users will probably have a copy of repo B locally and will keep building just fine. Clones are not deleted when the source is deleted, unless a user has gone out of their way to set things up that way. Git is a DVCS, so every clone is a full copy; it is designed to safeguard against this sort of thing.
  • New users will not be able to build your repo until they can get a copy of repo B from somewhere. You would be in this boat since you do not store a backup.

> Should I preemptively fork B to use as a backup in case the owner decides to delete it?

Yes.
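
A fork only helps new users once the submodule entry in repository A points at it. Here is a minimal sketch of that switch, assuming Git 2.25 or newer (the release that added "git submodule set-url"). The submodule path libs/B and the fork URL are hypothetical placeholders, and wrapping the commands in Python is just one way to script them:

```python
import subprocess


def repoint_submodule_commands(path: str, new_url: str) -> list[list[str]]:
    """Build the git commands that switch a submodule to a new remote URL."""
    return [
        # Rewrites the URL in .gitmodules and syncs it into the local config.
        ["git", "submodule", "set-url", "--", path, new_url],
        # Stage the changed .gitmodules so the switch can be committed.
        ["git", "add", ".gitmodules"],
    ]


def repoint_submodule(repo_dir: str, path: str, new_url: str) -> None:
    """Run the commands from the top level of the superproject (repo A)."""
    for cmd in repoint_submodule_commands(path, new_url):
        subprocess.run(cmd, cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Hypothetical values: adjust to your own layout and fork.
    repoint_submodule(".", "libs/B", "https://github.com/your-account/B.git")
```

The commit that A records for the submodule has to exist in the fork, so the fork needs to be at least as up to date as the commit A pins.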

> Is this considered a dangerous situation to be in or how is it usually handled for open-sourced projects?

Yes, this can be a dangerous situation, depending on how popular and widely distributed or mirrored the dependency is and how important your repo is to you. If it is important to others, they (hopefully) already have a backup of both your repo and the dependency.

Note that you can fork it on GitHub to your account without cloning it locally, so it takes up no space on your SSD. Keep in mind that this backup depends on nothing happening to GitHub's servers and on your account not being compromised; only you can determine what degree of redundancy is adequate.
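
If you want a copy that does not depend on GitHub at all, a bare mirror on a disk you control is one option. This is a minimal sketch, assuming git is on your PATH. The URL and destination directory are placeholders I made up, and the Python wrapper is only there so it can be run from a scheduled job:

```python
import subprocess
from pathlib import Path


def mirror_commands(url: str, dest: Path) -> list[list[str]]:
    """Return the git command that creates or refreshes a bare mirror."""
    if dest.exists():
        # Refresh an existing mirror. No --prune is passed, so by default
        # refs that vanish upstream stay in the mirror.
        return [["git", "-C", str(dest), "remote", "update"]]
    # First run: bare clone that copies every ref (branches, tags, notes).
    return [["git", "clone", "--mirror", url, str(dest)]]


def backup(url: str, dest: Path) -> None:
    for cmd in mirror_commands(url, dest):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical upstream URL and backup location.
    backup("https://github.com/someone/B.git", Path("backups/B.git"))
```

If upstream disappears, the refresh simply fails and the mirror keeps whatever it fetched last; you can then clone from the mirror directory like any other remote.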

Consider how much code you are relying on, how popular it is, how hard it would be to reproduce, and what it costs to store reliably. Once you have weighed those risks, back it up accordingly.


Since cost seems to be a factor in your situation, given that you don't want to pay for a larger SSD, here is a list of cheap backup options:

  1. Fork it on GitHub, which is completely free. GitHub will use deduplication, so the cost to them is minimal.
  2. Locally (free): old spinning hard drives or USB flash drives. You may also already have cloud backup included in what you pay your ISP or cell provider.
  3. Remotely (free): one of the many free cloud backup options, or ask a friend.
  4. Remotely ($): purchase a per-GB Usenet plan and upload it to Usenet (~25 GB for $10 USD).
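
For the options that store plain files (flash drive, cloud storage, Usenet), it can be handy to pack the whole repository into a single file first. A rough sketch using "git bundle", assuming you already have a local clone or mirror of B; the paths are hypothetical:

```python
import subprocess
from pathlib import Path


def bundle_commands(repo: Path, bundle: Path) -> list[list[str]]:
    """Build the git commands that pack a repo into one file and check it."""
    # Absolute path, because -C makes git change directory before running.
    bundle_path = str(bundle.resolve())
    return [
        # Write every ref and all reachable history into a single file.
        ["git", "-C", str(repo), "bundle", "create", bundle_path, "--all"],
        # Check that the resulting file is a valid bundle.
        ["git", "-C", str(repo), "bundle", "verify", bundle_path],
    ]


def make_bundle(repo: Path, bundle: Path) -> None:
    for cmd in bundle_commands(repo, bundle):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical locations of the local copy and the output file.
    make_bundle(Path("backups/B.git"), Path("B.bundle"))
```

To restore, "git clone B.bundle B" treats the file like a remote, so the bundle can sit on whatever storage is cheapest.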