How can I block all crawlers from accessing anything on GitLab?
There should be a robots.txt or something similar to tell crawlers not to crawl. That would be a good first step.
But more importantly, how can I tell GitLab that only authenticated access is allowed?
e.g.
https://gitlab.yourdomain.com/ is publicly accessible,
and so is
https://gitlab.yourdomain.com/explore
If both URLs were protected behind authentication, no crawler could fetch anything anymore. But how do I configure this with GitLab CE?
To be even clearer: nothing except the login dialog should be publicly visible. How can I manage this with GitLab CE?
Best Answer
There is a robots.txt in the repository: https://gitlab.com/gitlab-org/gitlab-foss/blob/master/public/robots.txt
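If you want to be stricter than the bundled file, a robots.txt that asks every well-behaved crawler to stay away from the whole site is only a few lines. Where to put it depends on your setup; on an Omnibus install the bundled copy usually lives under the Rails public/ directory (the path in the comment below is an assumption), or you can serve your own copy from the reverse proxy in front of GitLab:

```
# robots.txt -- disallow everything for all crawlers
# (would replace e.g. /opt/gitlab/embedded/service/gitlab-rails/public/robots.txt -- path assumed)
User-agent: *
Disallow: /
```

Keep in mind robots.txt is only advisory; misbehaving crawlers will ignore it, which is why restricting anonymous access matters more.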
Also, if you set a project's visibility to private, you won't be able to view the project at the URLs in your example.
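To get the behaviour you describe (only the sign-in page visible to anonymous visitors), the usual approach is to restrict the public visibility level instance-wide, e.g. under Admin Area → Settings (Visibility and access controls). As a minimal sketch for an Omnibus install, assuming the setting name below (taken from the gitlab.rb template) matches your GitLab version, you could set it in /etc/gitlab/gitlab.rb and run gitlab-ctl reconfigure:

```ruby
# /etc/gitlab/gitlab.rb -- sketch; key name assumed from the Omnibus template
# Disallow the "public" visibility level for projects, groups and snippets,
# so anonymous visitors can no longer browse anything on / or /explore.
gitlab_rails['gitlab_restricted_visibility_levels'] = ["public"]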