I know it is possible to search by domain, as mentioned at "How to search Internet Archive for all pages from a particular domain", but this is not always feasible for huge domains like Twitter, where there are far too many results for a given date.
For example I spent a long time looking for: https://web.archive.org/web/20191005025317if_/https://twitter.com/dmorey/status/1180312072027947008
I knew the user's URL, https://twitter.com/dmorey, and the approximate date based on news reports, but not the Tweet URL, since the Tweet had been deleted. What I would like is to search for any URL under https://twitter.com/dmorey
I tried to add an asterisk, https://twitter.com/dmorey/*, as in: http://web.archive.org/web/20190615000000*/https://twitter.com/dmorey/* but that has no results.
How I ended up finding the page: http://archive.is has the asterisk feature, which pointed me to the desired exact URL.
Asked them on Twitter at: https://twitter.com/cirosantilli/status/1312690953875083265
Best Answer
I found a way. It is not amazing, but it would have worked. You can construct a query URL of the form:
http://web.archive.org/cdx/search/cdx?url=twitter.com/dmorey/status&matchType=prefix&limit=1000&from=20191001&to=20191007
Here matchType=prefix makes it search by URL prefix, and from and to limit the date range.
This gives a text list of what I presume are archived URLs, one of which was the desired one.
I found this by Googling, which led me to https://github.com/internetarchive/wayback/blob/master/wayback-cdx-server/README.md which documents those search parameters.
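
For repeated searches, the query above can be scripted. The following is only a rough Python sketch, not an official client: the function names (build_cdx_query, snapshot_urls, find_snapshots) are made up for illustration, and it assumes the server returns its default plain-text output, where each line is space-separated and the second and third fields are the capture timestamp and the original URL, as described in the README linked above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_query(url_prefix, date_from, date_to, limit=1000):
    """Build a CDX prefix-search URL like the one shown in the answer."""
    params = {
        "url": url_prefix,
        "matchType": "prefix",  # match every URL starting with url_prefix
        "limit": limit,
        "from": date_from,      # e.g. "20191001"
        "to": date_to,          # e.g. "20191007"
    }
    # safe="/" keeps the slashes in the url parameter readable.
    return CDX_ENDPOINT + "?" + urlencode(params, safe="/")


def snapshot_urls(cdx_text):
    """Turn CDX text output into Wayback Machine snapshot URLs.

    Assumes the default field order, in which the timestamp is the
    second field and the original URL is the third.
    """
    urls = []
    for line in cdx_text.splitlines():
        fields = line.split(" ")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        timestamp, original = fields[1], fields[2]
        urls.append(f"http://web.archive.org/web/{timestamp}/{original}")
    return urls


def find_snapshots(url_prefix, date_from, date_to, limit=1000):
    """Run the query over the network and return candidate snapshot URLs."""
    query = build_cdx_query(url_prefix, date_from, date_to, limit)
    with urlopen(query) as response:
        return snapshot_urls(response.read().decode("utf-8"))
```

Calling find_snapshots("twitter.com/dmorey/status", "20191001", "20191007") sends the same request as the URL in the answer and returns one candidate snapshot link per archived capture, which can then be opened one by one.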