Archive.org – Why URL is Not Available


If I try to save the URL https://www.uni-ms.de/de/ using the Wayback Machine (archive.org), I get the following response:

[Screenshot: Wayback Machine error]

However, the site is perfectly fine, and I haven’t been able to find anything strange in the HTTPS certificates or HTTP response headers, either: https://www.uni-ms.de/de/. Saving the HTTP version http://www.uni-ms.de/de/ works without issues, too. What’s the problem here?

Best Answer

There's something about the way this server uses SNI that is incompatible with Java's libraries:

https://www.ssllabs.com/ssltest/analyze.html?d=www.uni-ms.de

So no, our crawler can't talk to it over https. I haven't seen this particular problem before.
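If you want to check from your own machine whether a Java client trips over the same thing, something like the following is a minimal sketch. It is not the crawler's code: the class name `SniProbe` and both helper methods are made up for illustration, and the assumption is that the failure shows up as an `SSLException` whose message mentions the TLS `unrecognized_name` alert. Whether that alert is treated as fatal depends on the JDK version, so a clean handshake here does not prove the crawler would succeed.

```java
import javax.net.ssl.SSLException;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;
import java.io.IOException;

public class SniProbe {

    /** True if any exception in the cause chain mentions the "unrecognized_name" alert. */
    public static boolean isUnrecognizedName(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            String msg = c.getMessage();
            if (msg != null && msg.contains("unrecognized_name")) {
                return true;
            }
        }
        return false;
    }

    /** Attempts a TLS handshake with the default JSSE settings and describes the outcome. */
    public static String probe(String host, int port) {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        // Creating the socket from a host name lets JSSE send that name via SNI.
        try (SSLSocket socket = (SSLSocket) factory.createSocket(host, port)) {
            socket.startHandshake();
            return "handshake ok: " + socket.getSession().getProtocol();
        } catch (SSLException e) {
            return isUnrecognizedName(e)
                    ? "server sent unrecognized_name alert"
                    : "TLS failure: " + e.getMessage();
        } catch (IOException e) {
            return "I/O failure: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Host defaults to the one from the question; pass another as the first argument.
        String host = args.length > 0 ? args[0] : "www.uni-ms.de";
        System.out.println(probe(host, 443));
    }
}
```

Run it as `java SniProbe.java www.uni-ms.de` on a JDK that supports single-file launch. A commonly cited client-side workaround for this class of error is starting the JVM with `-Djsse.enableSNIExtension=false`, but that turns SNI off for every connection, so it is not much use for a crawler that has to talk to name-based virtual hosts.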

This one works: http://web.archive.org/web/*/https://www.uni-muenster.de/de/ (you can see the save I just made).

From surfing the tubes of the Internet, it looks like there is a change the server operator could make so that it no longer confuses Java:

https://stackoverflow.com/questions/7615645/ssl-handshake-alert-unrecognized-name-error-since-upgrade-to-java-1-7-0/8058839#8058839

[Earlier I claimed I had saved it; I wasn't paying attention and was actually looking at an earlier snapshot.]