Hadoop HDFS NameNode Backup

To do: understand the properties that control the retention of the 2 fsimages, and find the property that sets the time interval between 2 fsimage creations.

HDFS cli

OIV

The Offline Image Viewer (OIV) can dump the content of HDFS fsimages.

Use hdfs oiv, which can run offline:

hdfs oiv -p FileDistribution -i /var/hdfs/name/current/fsimage_0000000000000023497 -o test_fd
hdfs oiv -p Ls -i /var/hdfs/name/current/fsimage_0000000000000023497 -o test_ls
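The transaction id in the fsimage file name changes with every checkpoint, so a script cannot hard-code it. The sketch below picks the newest image of a directory before calling hdfs oiv. The function names `latest_fsimage` and `dump_latest_fsimage` are hypothetical helpers, not part of Hadoop, and the sketch assumes the zero-padded naming seen in the commands above.

```shell
# Print the path of the most recent fsimage in a NameNode "current" directory.
# File names look like fsimage_<zero-padded txid>, so a lexical sort
# is also a numeric sort. The .md5 checksum files are skipped.
latest_fsimage() {
  for f in "$1"/fsimage_*; do
    case "$f" in
      *.md5) continue ;;
    esac
    [ -f "$f" ] && printf '%s\n' "$f"
  done | sort | tail -n 1
}

# Run the FileDistribution processor on the newest image of directory $1
# and write the report to file $2.
dump_latest_fsimage() {
  image="$(latest_fsimage "$1")"
  [ -n "$image" ] || { echo "no fsimage found in $1" >&2; return 1; }
  hdfs oiv -p FileDistribution -i "$image" -o "$2"
}
```

For example, `dump_latest_fsimage /var/hdfs/name/current test_fd` would be equivalent to the first command above once the newest image is fsimage_0000000000000023497.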

OEV

The Offline Edits Viewer (OEV) parses HDFS edit log files and can convert them to XML and back to the binary format.
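The Offline Edits Viewer reads edit log segments and, like OIV, works without a running cluster. A minimal sketch of converting one segment to XML follows. The helper name `edits_to_xml` and the choice of naming the output after the input file are assumptions made for illustration.

```shell
# Convert the edit log segment $1 to XML inside directory $2
# and print the path of the generated file.
edits_to_xml() {
  out="$2/$(basename "$1").xml"
  hdfs oev -p xml -i "$1" -o "$out" && printf '%s\n' "$out"
}
```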

Curl

Use curl to download the image and the edit logs:

https://<namenode>:9871/getimage?getimage=1&txid=latest
https://<namenode>:9871/getimage?getedit=1&startTxId=X&endTxId=Y
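These URLs contain `?` and `&`, which the shell interprets unless the whole URL is quoted. The sketch below builds the two URLs from their parameters and passes them to curl safely. The helper names are hypothetical, and the sketch assumes an unsecured endpoint: a Kerberos-protected cluster would likely need `--negotiate -u :`, and a private certificate authority would need `--cacert`.

```shell
# Build the image URL for NameNode host $1.
image_url() {
  printf 'https://%s:9871/getimage?getimage=1&txid=latest\n' "$1"
}

# Build the edit log URL for host $1, from transaction $2 to transaction $3.
edits_url() {
  printf 'https://%s:9871/getimage?getedit=1&startTxId=%s&endTxId=%s\n' "$1" "$2" "$3"
}

# Download the latest image of host $1 into file $2.
# --fail makes curl return an error code on HTTP errors
# instead of saving the error page as if it were an image.
download_image() {
  curl --fail --silent --show-error -o "$2" "$(image_url "$1")"
}
```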

dfsadmin -fetchImage
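The hdfs dfsadmin -fetchImage command downloads the most recent fsimage from the NameNode into a local directory. A minimal sketch that stores each download in a dated sub-directory is shown here. The helper name `fetch_image_to` and the dated layout are assumptions, not something this module does.

```shell
# Fetch the latest fsimage into $1/<YYYY-MM-DD>/.
fetch_image_to() {
  target="$1/$(date +%Y-%m-%d)"
  mkdir -p "$target" && hdfs dfsadmin -fetchImage "$target"
}
```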

Node-backmeup

Local Backup

module.exports = header: 'HDFS NN Backup', handler: ->

  @tools.backup
    header: 'HDFS LS output'
    name: 'ls'
    cmd: 'hdfs dfs -ls -R / '
    target: "/var/backups/nn_#{options.fqdn}/"
    interval: month: 1
    retention: count: 2

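  # dfs.namenode.name.dir may list several comma-separated directories holding
  # the same data: use the first one
  any_dfs_name_dir = options.hdfs_site['dfs.namenode.name.dir'].split(',')[0]
  # Strip the 'file://' scheme (7 characters) to obtain a local path
  any_dfs_name_dir = any_dfs_name_dir.substr(7) if any_dfs_name_dir.indexOf('file://') is 0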
  @tools.backup
    header: 'FSimages & edits'
    name: 'fs'
    source: path.join any_dfs_name_dir, 'current'
    filter: ['fsimage_*','edits_0*']
    target: "/var/backups/nn_#{options.fqdn}/"
    interval: month: 1
    retention: count: 2
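To make the 'FSimages & edits' settings concrete, here is a rough shell equivalent of what such a backup amounts to. It is a sketch under stated assumptions, not the implementation of the backup tool: it assumes a gzip tarball named after the backup date (as the restoration procedure suggests), an absolute target path, and at least one file matching each filter. The function name `backup_fs` is hypothetical.

```shell
# backup_fs SOURCE TARGET KEEP
# Archive the fsimage_* and edits_0* files of SOURCE into
# TARGET/<date>.tar.gz, then keep only the KEEP newest archives.
# The edits_0* filter matches finalized segments only: the
# edits_inprogress_* file is left out.
backup_fs() {
  src="$1"; target="$2"; keep="$3"
  archive="$target/$(date +%Y-%m-%dT%H%M%S).tar.gz"
  mkdir -p "$target" || return 1
  # Subshell: the caller's working directory is left unchanged
  ( cd "$src" && tar -czf "$archive" fsimage_* edits_0* ) || return 1
  # Date-based names sort chronologically: remove all but the newest $keep
  ls "$target"/*.tar.gz | sort -r | tail -n +"$((keep + 1))" |
  while IFS= read -r old; do
    rm -f "$old"
  done
}
```

With the values used above, a monthly scheduler would call `backup_fs /var/hdfs/name/current "/var/backups/nn_$(hostname -f)" 2`.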

Restoration procedure

To restore the fsimage as it was at the backup date, run the following shell commands (the paths assume the default configuration values):

cd /var/hdfs/name/current/ && rm -rf ./*
tar -xzf /var/backups/nn_$HOSTNAME/<backup_date>.tar.gz

See man tar for more information if you have changed the default options.
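Deleting the current directory before extracting leaves no way back if the archive is missing or incomplete. A more cautious variant, sketched below, moves the existing directory aside and extracts into a fresh one. The function name `restore_fs` and the `.bak` suffix are hypothetical, and the sketch assumes the NameNode is stopped and the archive stores paths relative to the current directory.

```shell
# restore_fs ARCHIVE NAME_DIR
# Extract ARCHIVE into NAME_DIR/current, keeping the previous
# content in NAME_DIR/current.<timestamp>.bak.
restore_fs() {
  archive="$1"; current="$2/current"
  [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }
  # Keep the old state aside instead of deleting it
  if [ -d "$current" ]; then
    mv "$current" "$current.$(date +%Y%m%d%H%M%S).bak" || return 1
  fi
  mkdir -p "$current" && tar -xzf "$archive" -C "$current"
}
```

A call such as `restore_fs /var/backups/nn_$HOSTNAME/<backup_date>.tar.gz /var/hdfs/name` mirrors the commands above. Note that the backup is written to a directory named after the FQDN, so `$HOSTNAME` only matches when it holds the fully qualified name.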

Dependencies

path = require 'path'