Yahoo Groups – How to Download Entire Archive of Past Messages

archiving, bulk-download, download, yahoo, yahoo-groups

I am an owner of a now-defunct Yahoo Group. Apparently, it's been announced that Yahoo Groups is to remove all content on December 14, 2019, and I would like to save the archive of messages to that group. For posterity, or maybe for vanity, never mind.

Now, on the group page, I do have access to the message archive, by month, and then by message title. But what I want is to get all messages, at once. I'm not very picky about the exact format (e.g. separate files, one file per month, one single file), as long as there's no junk in it (ads, loads and loads of Yahoo boilerplate HTML).

Is there a way – other than crawling all the message pages myself – to download all those messages?

Best Answer

There's an option in Yahoo Groups to download Groups Data: https://groups.yahoo.com/neo/getmydata. I submitted a request but haven't heard back yet, so I can't verify whether it solves our problem.

In the meantime, I like this script: https://github.com/IgnoredAmbience/yahoo-group-archiver (Thanks @tripleee in the comments).

This script downloads the group's messages as well as its files, photos, and more.

You'll need two Cookie values. I describe how to find them in Chrome below.

To use this script, I had to:

  1. Clone the repo locally
  2. cd into the repo
  3. Install its two dependencies: pip install -r requirements.txt (best practice is to use a virtualenv)
  4. Find the cookie values (described below)
  5. Using the cookie values and group name, construct the CLI input: ./yahoo.py -ct "<T_cookie>" -cy "<Y_cookie>" "<groupid>".

The <groupid> is found in the URL: https://groups.yahoo.com/neo/groups/GROUPID.
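If you would rather not pick the ID out of the address bar by hand, here is a minimal Python sketch of the same idea. The helper name `group_id_from_url` is my own, and the trailing `/info` in the sample URL is only an assumption about what a group page address might look like; the sketch simply returns whatever path segment follows `groups`.

```python
from urllib.parse import urlparse


def group_id_from_url(url: str) -> str:
    """Return the path segment that follows 'groups' in a Yahoo Groups URL."""
    # Split the URL path into its non-empty segments, e.g. ['neo', 'groups', 'GROUPID'].
    segments = [s for s in urlparse(url).path.split("/") if s]
    try:
        return segments[segments.index("groups") + 1]
    except (ValueError, IndexError):
        raise ValueError(f"no group ID found in {url!r}") from None


# Hypothetical URL for illustration only.
print(group_id_from_url("https://groups.yahoo.com/neo/groups/My_Awesome_Group/info"))
```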

My final input looked like this:

./yahoo.py -ct "z=R.mrdBRSOwdBEZbt..VFnXFMzUxMwY2Tzc2MzM3MzZPM040Mz&a=QAE&sk=DAA1.RYcKZA1nr&ks=EAAdKqReOqwn_mFtpt577DhvA--~G&kt=EAADFxdOWYNIRQFzbAFOREkyTkFFeE9EQXhORFF3TkRFNE5Ea3pORGMwTnctLQFE3MTI5MTMmcHM9akYxdEN4b1U2WG9NazR0dUlHQnNBUS0t" -cy "v=1&n=0upf9jdnj00000000&r=intl=us" "My_Awesome_Group"
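Because the cookie values contain characters such as `&` and `=`, quoting matters when the command is typed into a shell. As a rough sketch, the same invocation can be assembled in Python as an argument list, which sidesteps shell quoting entirely. The function name `build_archiver_command` is hypothetical, and only the `-ct` and `-cy` flags shown above are used.

```python
import shlex


def build_archiver_command(t_cookie: str, y_cookie: str, group_id: str) -> list:
    """Assemble the yahoo.py argument list from the two cookie values and the group ID."""
    return ["./yahoo.py", "-ct", t_cookie, "-cy", y_cookie, group_id]


# Placeholder values; substitute your own T and Y cookie contents.
cmd = build_archiver_command("<T_cookie>", "<Y_cookie>", "My_Awesome_Group")

# shlex.join produces a shell-safe string you can paste into a terminal.
print(shlex.join(cmd))
```

From inside the cloned repo, the list could also be passed straight to `subprocess.run(cmd, check=True)`, in which case no quoting is needed at all.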

Finding the cookie values wasn't obvious at first. Using Chrome, this is how I got the values:

  1. Open Chrome settings
  2. Scroll to the bottom and expand "Advanced"
  3. Open "Site Settings"
  4. Open "Cookies and site data"
  5. Click "See all cookies and site data"
  6. Search for "Yahoo" in the search box at the top right
  7. Expand the "yahoo.com" entry
  8. Go into T and Y one at a time and copy their "Content" values to use in the CLI input above.
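As an alternative to clicking through the settings pages, and assuming you have instead copied a raw `Cookie` request header from the browser's developer tools (not something I did above), a small sketch like the following could pull out the two values. The helper `extract_cookies` and the sample header are made up for illustration; the key point is that each pair is split only at its first `=`, since the T and Y values contain `=` themselves.

```python
def extract_cookies(cookie_header: str, names=("T", "Y")) -> dict:
    """Pick the named cookies out of a 'name=value; name=value' header string."""
    found = {}
    for pair in cookie_header.split(";"):
        # partition splits at the first '=' only, so '=' inside the value survives.
        name, sep, value = pair.strip().partition("=")
        if sep and name in names:
            found[name] = value
    return found


# Made-up header for illustration; real values are much longer.
header = "B=abc123; T=z=R.mrd&a=QAE; Y=v=1&n=0upf9&r=intl=us"
cookies = extract_cookies(header)
print(cookies["T"])
print(cookies["Y"])
```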

In case you're interested, one of the Yahoo Groups that I follow is considering paying Groups.io to transfer the group to the Groups.io site. They were quoted $220.
https://groups.io/static/transfer
