![]() |
Complete backups of Tumblr blogs (guide)
Saw this posted at Reddit, thought it might help out some folks here. I don't have any Tumblr blogs myself, but I know a lot of you do.
http://www.reddit.com/r/DataHoarder/..._tumblr_blogs/ Pasted: What is Tumblr? Tumblr, is a microblogging platform and social networking website founded by David Karp and owned by Yahoo! Inc. The service allows users to post multimedia and other content to a short-form blog. Users can follow other users' blogs, as well as make their blogs private. Read more here. While a high percentage of blogs on tumblr are just nothing but teenagers caught in a circle jerk of reposts and sex gifs, there are also some blogs worth archiving for future reference, you can browse high quality content blogs here in what tumblr calls 'Spotlight', these are often sponsored or company blogs but also personal high traffic, original content blogs. In the following examples I'm going to be using one of my favourite blogs hosted at tumblr, Luis Henrique's dotcore.tumblr.com Let's Get Started First off we're going to need the tools to make the backups, the scripts I'm using here are python based, maybe a few bash scripts later on depending on how you want to do mass/multiple backups, while it is possible to run these python scripts in a Windows environment I'm going to be detailing how to do this in a *nix environment. The scripts... tumblr_backup.py xmltramp.py Save both of these files to your working directory and do $chmod +x *.py Now we have the tools let's explore our backup options, personally I just like to backup the entire blog, this basically makes an offline copy organised by date, an example can be seen here, but there are many options, here are the flags that can be used with tumblr_backup.py -h, --help show this help message and exit -q, --quiet suppress progress messages -i, --incremental incremental backup mode -x, --xml save the original XML source -b, --blosxom save the posts in blosxom format -r, --reverse-month reverse the post order in the monthly archives -R, --reverse-index reverse the index file order -a HOUR, --auto=HOUR do a full backup at HOUR hours, otherwise do an incremental backup (useful for cron jobs) -n COUNT, --count=COUNT save only COUNT posts -s SKIP, --skip=SKIP skip the first SKIP posts -p PERIOD, --period=PERIOD limit the backup to PERIOD: 'y': the current year 'm': the current month 'd': the current day (i.e. today ;-) YYYY: the given year YYYY-MM: the given month YYYY-MM-DD: the given day -P PASSWORD, --private=PASSWORD password to a private tumblr I like to first make a full backup of the blog including original xml source, to run this initial pull do $python tumblr_backup.py -x dotcore Once you've done a backup the generated directory structure will look like this... ./ - the current directory <blog-name>/ - your blog backup index.html - table of contents with links to the monthly pages backup.css - the default backup style sheet custom.css - the user's style sheet (optional) archive/ <yyyy-mm>.html - the monthly pages ? posts/ <id>.html - the single post pages ? images/ <image.ext> - the image files ? xml/ <id>.xml - the original XML posts ? theme/ avatar.<ext> - the blog?s avatar style.css - the blog?s style sheet You can now browse to /dotcore/index.html to view the blog offline. After you've done the initial download you can do incremental backups as often as you'd like, depending on how active a blog is you may wish to do this every hour or just once a day, keep in mind with the current post limitations of tumblr if a blog is using up it's daily post limit you could do the incremental backup every hour and have return 5 posts each time. To run an incremental backup do $python tumblr_backup.py -x -i dotcore That's it you're done, you can run this manually if you want, but I have 400+ blogs backed up so I wrote a quick and super simple bash script to run the incremental backups with some output, here's mine stripped down to just the example, save it as backup.sh give it permission to run with chmod then run it! Using while true; do and a sleep timer this script runs every 2 hours. #!/bin/bash while true; do clear echo ____________Tumblr Backup Script_____________ echo -------Backing Up The Following Blogs-------- echo dotcore echo _____________________________________________ python tumblr_backup.py -x -i dotcore echo Size of dotcore && du -ch dotcore/ | grep total echo _____________________________________________ echo Free Space On Drive df -h $PWD | awk '/[0-9]%/{print $(NF-2)}' echo _____________________________________________ echo Last Run date sleep $[120 * 60] done Happy Hoarding! Note; this was written at $Thu Dec 12 04:22:39AM GMT with my eyes barely open, I'm going to sleep now and will edit in the morning, point out any mistakes in the comments :D |
Nice share but how do you restore a tumblr blog after a deletion? I dont see anyone asking that question even on reddit
|
Hobosexual! lol
|
Quote:
|
Quote:
|
If python scripts are a little bit out of your league, here's instructions for using a common website ripper to create a complete backup of any tumblr on your local hard drive:
How To Back Up Your Tumblr |
Quote:
|
Quote:
|
All times are GMT -7. The time now is 04:10 AM. |
Powered by vBulletin® Version 3.8.8
Copyright ©2000 - 2025, vBulletin Solutions, Inc.
©2000-, AI Media Network Inc123