Python – Persisting data in Google Colaboratory

python, google-colaboratory

Has anyone figured out a way to keep files persisted across sessions in Google's newly open-sourced Colaboratory?

Using the sample notebooks, I'm successfully authenticating and transferring CSV files from my Google Drive instance, and I've stashed them in /tmp, my ~, and ~/datalab. Pandas can read them from disk just fine. But once the session times out, it looks like the whole filesystem is wiped and a new VM is spun up, without the downloaded files.

I guess this isn't surprising given Google's Colaboratory FAQ:

Q: Where is my code executed? What happens to my execution state if I close the browser window?

A: Code is executed in a virtual machine dedicated to your account. Virtual machines are recycled when idle for a while, and have a maximum lifetime enforced by the system.

Given that, maybe this is a feature (i.e. "go use Google Cloud Storage, which works fine in Colaboratory")? When I first used the tool, I was hoping that any .csv files in the My Drive/Colab Notebooks folder of Google Drive would also be loaded onto the VM instance the notebook was running on :/
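Update: one workaround I've since seen, assuming the `google.colab` helper module is available in your runtime, is to mount Drive into the VM's filesystem explicitly (the mount point below is just the conventional choice, and the CSV path is a placeholder):

```python
from google.colab import drive

# Prompts for an auth code, then exposes your Drive under /content/drive.
drive.mount('/content/drive')

# pandas can then read files directly from the mounted folder, e.g.:
# import pandas as pd
# df = pd.read_csv('/content/drive/My Drive/Colab Notebooks/data.csv')
```

The mount disappears when the VM is recycled, but re-running the cell in a fresh session restores access to the same Drive files.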

Best Answer

Put this line before your code, so the file is always downloaded before your code runs:

!wget -q http://www.yoursite.com/file.csv
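A pure-Python equivalent of the `wget` one-liner, sketched here with a placeholder URL and filename; it additionally skips the download when the file is already on the VM's disk, so re-running cells within one session doesn't re-fetch it, while a recycled VM (empty disk) triggers a fresh download:

```python
import os
import urllib.request

def fetch_if_missing(url, path):
    # Only hit the network when the file is not on the local disk.
    # After the VM is recycled the disk is empty, so the next call
    # downloads a fresh copy.
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    return path

# Placeholder URL/filename; substitute your own public file.
fetch_if_missing('http://www.yoursite.com/file.csv', 'file.csv')
```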