Complete backups of Tumblr blogs (guide)

  • Socks
    Confirmed User
    • May 2002
    • 8475

    #1

    Saw this posted at Reddit, thought it might help out some folks here. I don't have any Tumblr blogs myself, but I know a lot of you do.

    http://www.reddit.com/r/DataHoarder/..._tumblr_blogs/

    Pasted:

    What is Tumblr?
    Tumblr is a microblogging platform and social networking website founded by David Karp and owned by Yahoo! Inc. The service allows users to post multimedia and other content to a short-form blog. Users can follow other users' blogs, as well as make their blogs private.
    Read more here.
    While a high percentage of blogs on Tumblr are nothing but teenagers caught in a circle jerk of reposts and sex gifs, there are also some blogs worth archiving for future reference. You can browse high-quality blogs in what Tumblr calls 'Spotlight'; these are often sponsored or company blogs, but also include personal, high-traffic blogs with original content.
    In the following examples I'm going to be using one of my favourite blogs hosted at Tumblr, Luis Henrique's dotcore.tumblr.com.
    Let's Get Started
    First off, we need the tools to make the backups. The scripts I'm using here are Python based, with maybe a few bash scripts later on depending on how you want to do mass/multiple backups. While it is possible to run these Python scripts in a Windows environment, I'm going to detail how to do this in a *nix environment.
    The scripts...
    tumblr_backup.py
    xmltramp.py
    Save both of these files to your working directory and make them executable:
    $ chmod +x *.py
    Now that we have the tools, let's explore our backup options. Personally I just like to back up the entire blog, which basically makes an offline copy organised by date (an example can be seen here), but there are many options. Here are the flags that can be used with tumblr_backup.py:
    -h, --help                  show this help message and exit
    -q, --quiet                 suppress progress messages
    -i, --incremental           incremental backup mode
    -x, --xml                   save the original XML source
    -b, --blosxom               save the posts in blosxom format
    -r, --reverse-month         reverse the post order in the monthly archives
    -R, --reverse-index         reverse the index file order
    -a HOUR, --auto=HOUR        do a full backup at HOUR hours, otherwise do an
                                incremental backup (useful for cron jobs)
    -n COUNT, --count=COUNT     save only COUNT posts
    -s SKIP, --skip=SKIP        skip the first SKIP posts
    -p PERIOD, --period=PERIOD  limit the backup to PERIOD:
                                  'y': the current year
                                  'm': the current month
                                  'd': the current day (i.e. today ;-)
                                  YYYY: the given year
                                  YYYY-MM: the given month
                                  YYYY-MM-DD: the given day
    -P PASSWORD, --private=PASSWORD
                                password to a private tumblr
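To make the flag list concrete, here is a minimal Python sketch that assembles a tumblr_backup.py command line from a few of the options above. The helper name build_command and its keyword arguments are hypothetical (they are not part of tumblr_backup.py), and the PERIOD check only mirrors the formats listed in the help text; it does not check that the date is a real calendar date.

```python
import re

# PERIOD formats taken from the help text: y, m, d, YYYY, YYYY-MM, YYYY-MM-DD.
_PERIOD_RE = re.compile(r"[ymd]|\d{4}(?:-\d{2}){0,2}")


def build_command(blog, incremental=False, save_xml=False,
                  count=None, skip=None, period=None):
    """Return the argument list for one tumblr_backup.py run (hypothetical helper)."""
    cmd = ["python", "tumblr_backup.py"]
    if save_xml:
        cmd.append("-x")            # save the original XML source
    if incremental:
        cmd.append("-i")            # incremental backup mode
    if count is not None:
        cmd += ["-n", str(count)]   # save only COUNT posts
    if skip is not None:
        cmd += ["-s", str(skip)]    # skip the first SKIP posts
    if period is not None:
        if not _PERIOD_RE.fullmatch(period):
            raise ValueError("unsupported PERIOD: %r" % period)
        cmd += ["-p", period]       # limit the backup to PERIOD
    cmd.append(blog)                # blog name goes last, as in the guide's examples
    return cmd


if __name__ == "__main__":
    print(" ".join(build_command("dotcore", save_xml=True)))
    # prints: python tumblr_backup.py -x dotcore
```

The returned list can be handed to subprocess.call as-is, which avoids any shell quoting problems with blog names.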
    I like to first make a full backup of the blog, including the original XML source. To run this initial pull, do
    $ python tumblr_backup.py -x dotcore
    Once you've done a backup, the generated directory structure will look like this:
    ./                        - the current directory
      <blog-name>/            - your blog backup
        index.html            - table of contents with links to the monthly pages
        backup.css            - the default backup style sheet
        custom.css            - the user's style sheet (optional)
        archive/
          <yyyy-mm>.html      - the monthly pages
          ...
        posts/
          <id>.html           - the single post pages
          ...
        images/
          <image.ext>         - the image files
          ...
        xml/
          <id>.xml            - the original XML posts
          ...
        theme/
          avatar.<ext>        - the blog's avatar
          style.css           - the blog's style sheet
    You can now open dotcore/index.html in a browser to view the blog offline.
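As a quick sanity check after a run, the layout above can be inspected with a few lines of Python. This is only a sketch: summarize_backup is a hypothetical helper, and it assumes the subdirectory names shown in the listing (archive, posts, images, xml) sit directly inside the blog's backup directory.

```python
import os


def summarize_backup(blog_dir):
    """Count the entries in each subdirectory of a backup (hypothetical helper)."""
    summary = {}
    for sub in ("archive", "posts", "images", "xml"):
        path = os.path.join(blog_dir, sub)
        # A missing directory simply counts as zero entries,
        # e.g. xml/ when the backup was made without -x.
        summary[sub] = len(os.listdir(path)) if os.path.isdir(path) else 0
    # index.html is the table of contents, so its absence suggests an unfinished run.
    summary["has_index"] = os.path.isfile(os.path.join(blog_dir, "index.html"))
    return summary
```

Calling summarize_backup("dotcore") from the working directory returns a dict such as the number of saved posts and images, which is handy for spotting a backup that came back empty.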
    After the initial download you can do incremental backups as often as you'd like. Depending on how active a blog is, you may wish to do this every hour or just once a day. Keep in mind that, with the current post limitations of Tumblr, if a blog is using up its daily post limit you could run the incremental backup every hour and have it return 5 posts each time. To run an incremental backup, do
    $ python tumblr_backup.py -x -i dotcore
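The full-then-incremental routine can be sketched in Python so that one call does the right thing on every run. The function names are hypothetical, and the rule "index.html exists, so a full backup has already finished" is an assumption of this sketch, not something tumblr_backup.py documents.

```python
import os
import subprocess


def backup_args(blog, root="."):
    """Choose the flags: full backup on the first run, incremental afterwards."""
    args = ["python", "tumblr_backup.py", "-x"]
    # Assumption: a completed earlier run left <root>/<blog>/index.html behind.
    if os.path.isfile(os.path.join(root, blog, "index.html")):
        args.append("-i")
    args.append(blog)
    return args


def run_backup(blog, root="."):
    """Run the chosen command inside root and return the script's exit status."""
    return subprocess.call(backup_args(blog, root), cwd=root)
```

With this, run_backup("dotcore") performs the initial pull the first time and incremental updates on later calls.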
    That's it, you're done. You can run this manually if you want, but I have 400+ blogs backed up, so I wrote a quick and super simple bash script to run the incremental backups with some output. Here's mine stripped down to just the example blog: save it as backup.sh, give it permission to run with chmod, then run it. Using while true; do and a sleep timer, this script runs every 2 hours.
    #!/bin/bash
    # Run an incremental backup of dotcore every 2 hours and report disk usage.
    while true; do
        clear
        echo "____________Tumblr Backup Script_____________"
        echo "-------Backing Up The Following Blogs--------"
        echo "dotcore"
        echo "_____________________________________________"
        python tumblr_backup.py -x -i dotcore
        # Total size of the backup directory
        echo "Size of dotcore" && du -sh dotcore/
        echo "_____________________________________________"
        echo "Free Space On Drive"
        # Print the "Avail" column for the filesystem holding the current directory
        df -h "$PWD" | awk '/[0-9]%/{print $(NF-2)}'
        echo "_____________________________________________"
        echo "Last Run"
        date
        sleep $((120 * 60))   # 120 minutes, in seconds
    done
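For more than one blog, the same loop body can be sketched in Python. Everything here is illustrative: the function names are hypothetical, the blog list is whatever you pass in, and dir_size adds up file sizes, so its figure can differ slightly from what du reports (du counts disk blocks).

```python
import os
import shutil
import subprocess


def dir_size(path):
    """Sum the sizes of all regular files below path, in bytes."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def human(nbytes):
    """Format a byte count roughly the way du -h and df -h do."""
    size = float(nbytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return "%.1f%s" % (size, unit)
        size /= 1024


def backup_all(blogs, runner=subprocess.call):
    """Run one incremental backup per blog and return {blog: exit status}."""
    results = {}
    for blog in blogs:
        results[blog] = runner(["python", "tumblr_backup.py", "-x", "-i", blog])
    return results


def report(blogs):
    """Print the size of each backup directory and the free space on the drive."""
    for blog in blogs:
        print("Size of %s: %s" % (blog, human(dir_size(blog))))
    print("Free space on drive: %s" % human(shutil.disk_usage(".").free))
```

Calling backup_all(["dotcore"]) followed by report(["dotcore"]) inside a loop with time.sleep(120 * 60) reproduces the two-hour cycle; the runner parameter exists only so the command can be swapped out when trying the sketch without network access.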
    Happy Hoarding!
    Note: this was written at Thu Dec 12 04:22:39AM GMT with my eyes barely open. I'm going to sleep now and will edit in the morning; point out any mistakes in the comments :D
  • HoboSexual
    Registered User
    • Dec 2013
    • 94

    #2
    Nice share, but how do you restore a Tumblr blog after a deletion? I don't see anyone asking that question, even on Reddit.


    • Evil Chris
      OG
      • Dec 2001
      • 13248

      #3
      Hobosexual! lol


      It PAYZE to post on GFY

      chris at payze.com | Skype chriswrp


      • HoboSexual
        Registered User
        • Dec 2013
        • 94

        #4
        Originally posted by Evil Chris
        Hobosexual! lol
        Be surprised how many woman to make sex with unshowered man


        • mopek1
          Confirmed User
          • Jun 2004
          • 3191

          #5
          Originally posted by HoboSexual
          Nice share, but how do you restore a Tumblr blog after a deletion? I don't see anyone asking that question, even on Reddit.
          I'd like to know this too.


          • Forkbeard
            Confirmed User
            • Feb 2002
            • 2236

            #6
            If Python scripts are a little bit out of your league, here are instructions for using a common website ripper to create a complete backup of any Tumblr blog on your local hard drive:

            How To Back Up Your Tumblr
            Offering sponsored blog posts and custom writing services.


            • TumblrPRO
              So Fucking Banned
              • Aug 2013
              • 296

              #7
              Originally posted by HoboSexual
              Nice share, but how do you restore a Tumblr blog after a deletion? I don't see anyone asking that question, even on Reddit.
              Just prevent being deleted... There are many ways to prevent this.


              • mopek1
                Confirmed User
                • Jun 2004
                • 3191

                #8
                Originally posted by TumblrPRO
                Just prevent being deleted... There are many ways to prevent this.
                Like what?
