Hadoop HDFS DataNode Install
A DataNode manages the storage attached to the node it runs on. There is usually one DataNode per node in the cluster. HDFS exposes a file system namespace and allows user data to be stored in files. Internally, a file is split into one or more blocks, and these blocks are stored in a set of DataNodes. The DataNodes also perform block creation, deletion, and replication upon instruction from the NameNode.
In a High Availability (HA) environment, in order to provide a fast failover, the Standby node must have up-to-date information regarding the location of blocks in the cluster. To achieve this, the DataNodes are configured with the location of both NameNodes, and send block location information and heartbeats to both.
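For illustration, here is a minimal sketch of the HA-related properties a DataNode typically receives in its "hdfs-site.xml" so it can reach both NameNodes; the property names are standard Hadoop HA settings, while the nameservice and host names are hypothetical:
# Illustrative only: hypothetical nameservice "mycluster" and hosts
hdfs_site_ha_example =
  'dfs.nameservices': 'mycluster'
  'dfs.ha.namenodes.mycluster': 'nn1,nn2'
  'dfs.namenode.rpc-address.mycluster.nn1': 'master1.example.com:8020'
  'dfs.namenode.rpc-address.mycluster.nn2': 'master2.example.com:8020'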
module.exports = header: 'HDFS DN Install', handler: ({options}) ->
Register
Register the custom "hconfigure" and "hdp_select" actions used by this module.
@registry.register 'hconfigure', 'ryba/lib/hconfigure'
@registry.register 'hdp_select', 'ryba/lib/hdp_select'
IPTables
| Service  | Port       | Proto     | Parameter                   |
|----------|------------|-----------|-----------------------------|
| datanode | 50010/1004 | tcp/http  | dfs.datanode.address        |
| datanode | 9864/1006  | tcp/http  | dfs.datanode.http.address   |
| datanode | 9865       | tcp/https | dfs.datanode.https.address  |
| datanode | 50020      | tcp       | dfs.datanode.ipc.address    |
The "dfs.datanode.address" default to "50010" in non-secured mode. In non-secured mode, it must be set to a value below "1024" and default to "1004".
IPTables rules are only inserted if the parameter "iptables.action" is set to "start" (default value).
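For illustration, a standalone sketch (with a hypothetical property value) of how the port component is extracted from a "host:port" address, as done in the code below:
# Illustrative only: splitting "host:port" and keeping the port
site = 'dfs.datanode.address': '0.0.0.0:50010'
[_, port] = site['dfs.datanode.address'].split ':'
# port is now "50010"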
# Extract the port component from each "host:port" address
[_, dn_address] = options.hdfs_site['dfs.datanode.address'].split ':'
[_, dn_http_address] = options.hdfs_site['dfs.datanode.http.address'].split ':'
[_, dn_https_address] = options.hdfs_site['dfs.datanode.https.address'].split ':'
[_, dn_ipc_address] = options.hdfs_site['dfs.datanode.ipc.address'].split ':'
@tools.iptables
  header: 'IPTables'
  rules: [
    { chain: 'INPUT', jump: 'ACCEPT', dport: dn_address, protocol: 'tcp', state: 'NEW', comment: "HDFS DN Data" }
    { chain: 'INPUT', jump: 'ACCEPT', dport: dn_http_address, protocol: 'tcp', state: 'NEW', comment: "HDFS DN HTTP" }
    { chain: 'INPUT', jump: 'ACCEPT', dport: dn_https_address, protocol: 'tcp', state: 'NEW', comment: "HDFS DN HTTPS" }
    { chain: 'INPUT', jump: 'ACCEPT', dport: dn_ipc_address, protocol: 'tcp', state: 'NEW', comment: "HDFS DN Meta" }
  ]
  if: options.iptables
Packages
Install the "hadoop-hdfs-datanode" service, symlink the rc.d startup script inside "/etc/init.d" (RHEL/CentOS 6) or install a systemd unit (RHEL/CentOS 7), and activate it on startup.
@call header: 'Packages', ->
  @service
    name: 'hadoop-hdfs-datanode'
  @hdp_select
    name: 'hadoop-hdfs-client' # Not checked
  @hdp_select
    name: 'hadoop-hdfs-datanode'
  @service.init
    if_os: name: ['redhat','centos'], version: '6'
    target: '/etc/init.d/hadoop-hdfs-datanode'
    source: "#{__dirname}/../resources/hadoop-hdfs-datanode.j2"
    local: true
    context: options: options
    mode: 0o0755
  @service.init
    if_os: name: ['redhat','centos'], version: '7'
    target: '/usr/lib/systemd/system/hadoop-hdfs-datanode.service'
    source: "#{__dirname}/../resources/hadoop-hdfs-datanode-systemd.j2"
    local: true
    context: options: options
    mode: 0o0644
@call header: 'Compression', retry: 2, ->
  @service.remove 'snappy', if: options.attempt is 1
  @service name: 'snappy'
  @service name: 'snappy-devel'
  @system.link
    source: '/usr/lib64/libsnappy.so'
    target: '/usr/hdp/current/hadoop-client/lib/native/.'
  @call (_, callback) ->
    @service
      name: 'lzo-devel'
      relax: true
    , (err) ->
      @service.remove
        if: !!err
        name: 'lzo-devel'
      @next callback
  @service
    name: 'hadooplzo'
  @service
    name: 'hadooplzo-native'
Layout
Create the DataNode data and pid directories. The data directory is set by the "hdp.hdfs_site['dfs.datanode.data.dir']" property and defaults to "/var/hdfs/data". The pid directory is set by "hdfs_pid_dir" and defaults to "/var/run/hadoop-hdfs".
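The "dfs.datanode.data.dir" property is a comma-separated list that may mix "file://" URIs and plain paths. As a standalone sketch (hypothetical values), the normalisation performed below amounts to:
# Illustrative only: strip the "file://" scheme from each entry
dirs = 'file:///data/1/hdfs,/data/2/hdfs'.split ','
local_dirs = for dir in dirs
  if dir.indexOf('file://') is 0 then dir.substr 7 else dir
# local_dirs is [ '/data/1/hdfs', '/data/2/hdfs' ]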
@call header: 'Layout', ->
  # Do not restrict the parent directory; YARN complains if it is not
  # accessible by everyone
  pid_dir = options.secure_dn_pid_dir
  pid_dir = pid_dir.replace '$USER', options.user.name
  pid_dir = pid_dir.replace '$HADOOP_SECURE_DN_USER', options.user.name
  pid_dir = pid_dir.replace '$HADOOP_IDENT_STRING', options.user.name
  # TODO: in HDP 2.1, the DataNode is started as root, but in HDP 2.2 we
  # should start it as hdfs and use JAAS
  @system.mkdir
    target: "#{options.conf_dir}"
  @system.mkdir
    # Strip the "file://" scheme from each data directory
    target: for dir in options.hdfs_site['dfs.datanode.data.dir'].split ','
      if dir.indexOf('file://') is 0
        dir.substr(7)
      else if dir.indexOf('file://') is -1
        dir
      else
        dir.substr(dir.indexOf('file://')+7)
    uid: options.user.name
    gid: options.hadoop_group.name
    mode: 0o0750
    parent: true
  @system.tmpfs
    if_os: name: ['redhat','centos'], version: '7'
    mount: pid_dir
    uid: options.user.name
    gid: options.hadoop_group.name
    perm: '0750'
  @system.mkdir
    target: "#{pid_dir}"
    uid: options.user.name
    gid: options.hadoop_group.name
    mode: 0o0755
    parent: true
  @system.mkdir
    target: "#{options.log_dir}" #/#{options.user.name}
    uid: options.user.name
    gid: options.group.name
    parent: true
  @system.mkdir
    target: "#{path.dirname options.hdfs_site['dfs.domain.socket.path']}"
    uid: options.user.name
    gid: options.hadoop_group.name
    mode: 0o751
    parent: true
Core Site
Update the "core-site.xml" configuration file with properties from the "ryba.core_site" configuration.
@hconfigure
  header: 'Core Site'
  target: "#{options.conf_dir}/core-site.xml"
  source: "#{__dirname}/../../resources/core_hadoop/core-site.xml"
  local: true
  properties: options.core_site
  backup: true
HDFS Site
Update the "hdfs-site.xml" configuration file with the High Availabity properties present inside the "hdp.ha_client_config" object.
@hconfigure
  header: 'HDFS Site'
  target: "#{options.conf_dir}/hdfs-site.xml"
  source: "#{__dirname}/../../resources/core_hadoop/hdfs-site.xml"
  local: true
  properties: options.hdfs_site
  uid: options.user.name
  gid: options.hadoop_group.name
  backup: true
Environment
Maintain the "hadoop-env.sh" file present in the HDP companion files.
The location of JSVC depends on the platform. The Hortonworks documentation mentions "/usr/libexec/bigtop-utils" for RHEL/CentOS/Oracle Linux. While this is correct for RHEL, it is installed in "/usr/lib/bigtop-utils" on my CentOS.
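As a standalone sketch (hypothetical values), this is how the DataNode JVM options string is assembled below from the "options.opts" object; note that JVM flags are concatenated as the key followed directly by its value:
# Illustrative only: assembling HADOOP_DATANODE_OPTS
opts =
  base: '-server'
  java_properties: 'java.net.preferIPv4Stack': 'true'
  jvm: '-Xmx': '1024m'
dn_opts = opts.base
dn_opts += " -D#{k}=#{v}" for k, v of opts.java_properties
dn_opts += " #{k}#{v}" for k, v of opts.jvm
# dn_opts is "-server -Djava.net.preferIPv4Stack=true -Xmx1024m"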
@call header: 'Environment', ->
  HADOOP_DATANODE_OPTS = options.opts.base
  HADOOP_DATANODE_OPTS += " -D#{k}=#{v}" for k, v of options.opts.java_properties
  HADOOP_DATANODE_OPTS += " #{k}#{v}" for k, v of options.opts.jvm
  @file.render
    header: 'Environment'
    target: "#{options.conf_dir}/hadoop-env.sh"
    source: "#{__dirname}/../resources/hadoop-env.sh.j2"
    local: true
    context:
      HADOOP_ROOT_LOGGER: options.log4j.root_logger
      HADOOP_SECURITY_LOGGER: options.log4j.security_logger
      HDFS_AUDIT_LOGGER: options.log4j.audit_logger
      HADOOP_HEAPSIZE: options.hadoop_heap
      HADOOP_DATANODE_OPTS: HADOOP_DATANODE_OPTS
      HADOOP_LOG_DIR: options.log_dir
      HADOOP_PID_DIR: options.secure_dn_pid_dir
      HADOOP_OPTS: options.hadoop_opts
      HADOOP_CLIENT_OPTS: ''
      HDFS_DATANODE_SECURE_USER: options.user.name
      HADOOP_SECURE_LOG_DIR: options.log_dir
      HADOOP_SECURE_PID_DIR: options.secure_dn_pid_dir
      java_home: options.java_home
    uid: options.user.name
    gid: options.hadoop_group.name
    mode: 0o755
    backup: true
    eof: true
Log4j
@file
  header: 'Log4j'
  target: "#{options.conf_dir}/log4j.properties"
  source: "#{__dirname}/../resources/log4j.properties"
  local: true
  # Replace each property in place, appending it if absent
  write: for k, v of options.log4j.properties
    match: RegExp "#{k}=.*", 'm'
    replace: "#{k}=#{v}"
    append: true
Hadoop Metrics
Configure the "hadoop-metrics2.properties" file to connect Hadoop to a metrics collector like Ganglia or Graphite.
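For illustration only, a minimal "metrics.config" content pushing DataNode metrics to a hypothetical Graphite endpoint (the host and prefix are made up; the sink class and property names are standard Hadoop metrics2 settings):
# Illustrative only: hypothetical Graphite sink configuration
metrics_config_example =
  '*.sink.graphite.class': 'org.apache.hadoop.metrics2.sink.GraphiteSink'
  'datanode.sink.graphite.server_host': 'graphite.example.com'
  'datanode.sink.graphite.server_port': '2003'
  'datanode.sink.graphite.metrics_prefix': 'hdp'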
@file.properties
  header: 'Metrics'
  target: "#{options.conf_dir}/hadoop-metrics2.properties"
  content: options.metrics.config
  backup: true
Configure Master
According to Yahoo!: the "conf/masters" file contains the hostname of the SecondaryNameNode. This should be changed from "localhost" to the fully-qualified domain name of the node running the SecondaryNameNode service. It does not need to contain the hostname of the JobTracker/NameNode machine. Some additional information about the SecondaryNameNode is also worth reading.
# @file
#   header: 'SNN Master'
#   if: (-> @contexts('ryba/hadoop/hdfs_snn').length)
#   content: "#{@contexts('ryba/hadoop/hdfs_snn')?.config?.host}"
#   target: "#{options.conf_dir}/masters"
#   uid: options.user.name
#   gid: options.hadoop_group.name
SSL
Write the "ssl-server.xml" and "ssl-client.xml" configuration files, import the root CA certificate into the truststore of every host, and import the certificate with its private and public keys into the keystore of hosts running a server.
@call header: 'SSL', retry: 0, ->
  options.ssl_server['ssl.server.keystore.location'] = "#{options.conf_dir}/keystore"
  options.ssl_server['ssl.server.truststore.location'] = "#{options.conf_dir}/truststore"
  options.ssl_client['ssl.client.truststore.location'] = "#{options.conf_dir}/truststore"
  @hconfigure
    target: "#{options.conf_dir}/ssl-server.xml"
    properties: options.ssl_server
  @hconfigure
    target: "#{options.conf_dir}/ssl-client.xml"
    properties: options.ssl_client
  # Client: import certificate to all hosts
  @java.keystore_add
    keystore: options.ssl_client['ssl.client.truststore.location']
    storepass: options.ssl_client['ssl.client.truststore.password']
    caname: "hadoop_root_ca"
    cacert: "#{options.ssl.cacert.source}"
    local: "#{options.ssl.cacert.local}"
  # Server: import certificates, private and public keys to hosts with a server
  @java.keystore_add
    keystore: options.ssl_server['ssl.server.keystore.location']
    storepass: options.ssl_server['ssl.server.keystore.password']
    caname: "hadoop_root_ca"
    key: "#{options.ssl.key.source}"
    cert: "#{options.ssl.cert.source}"
    keypass: options.ssl_server['ssl.server.keystore.keypassword']
    name: "#{options.ssl.key.name}"
    local: "#{options.ssl.key.local}"
  @java.keystore_add
    keystore: options.ssl_server['ssl.server.keystore.location']
    storepass: options.ssl_server['ssl.server.keystore.password']
    caname: "hadoop_root_ca"
    cacert: "#{options.ssl.cacert.source}"
    local: "#{options.ssl.cacert.local}"
Kerberos
Create the DataNode service principal in the form "dn/{host}@{realm}" and place its keytab inside "/etc/security/keytabs/dn.service.keytab", with ownership set to "hdfs:hadoop" and permissions set to "0600".
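As a tiny illustration (hypothetical host and realm), the resulting principal looks like this:
# Illustrative only: hypothetical host and realm
host = 'dn1.cluster.example.com'
realm = 'EXAMPLE.COM'
principal = "dn/#{host}@#{realm}"
# principal is "dn/dn1.cluster.example.com@EXAMPLE.COM"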
@krb5.addprinc
  header: 'Kerberos'
  principal: options.krb5.principal
  randkey: true
  keytab: options.krb5.keytab
  uid: options.user.name
  gid: options.group.name
  mode: 0o0600
, options.krb5.admin
Kernel
Configure kernel parameters at runtime. No properties are set by default; here is a suggestion, sketched after this list:
- vm.swappiness = 10
- vm.overcommit_memory = 1
- vm.overcommit_ratio = 100
- net.core.somaxconn = 4096 (the default socket listen queue size is 128)
Note, we might move this middleware to Masson.
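A hedged sketch of those suggested values, expressed as the properties object consumed below (values taken from the list above):
# Illustrative only: suggested kernel parameters
sysctl_example =
  'vm.swappiness': 10
  'vm.overcommit_memory': 1
  'vm.overcommit_ratio': 100
  'net.core.somaxconn': 4096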
@tools.sysctl
  header: 'Kernel'
  properties: options.sysctl
  merge: true
  comment: true
Ulimit
Increase the ulimit for the HDFS user. The HDP package creates the following file:
cat /etc/security/limits.d/hdfs.conf
hdfs   -    nofile   32768
hdfs   -    nproc    65536
The procedure follows [Kate Ting's recommendations][kate]. Apply it if you receive the error message: 'Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread'.
Also of interest are the [Pivotal recommendations][hawq], the [Greenplum recommendations from Nixus Technologies][greenplum], the [MapR documentation][mapr] and the [Hadoop Performance via Linux presentation][hpl].
Note, a user must log in again for these changes to take effect.
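For illustration, limits matching the HDP defaults above; this assumes a plain key/value object is what "options.user.limits" carries through to "@system.limits" (an assumption, not a documented contract):
# Illustrative only: assumed shape of the limits object
limits_example =
  nofile: 32768
  nproc: 65536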
@system.limits
  header: 'Ulimit'
  user: options.user.name
, options.user.limits
This is a dirty fix for a jsvc bug. When launched with the "-user" parameter, jsvc downgrades the user via the setuid() system call, but the operating system limits (the maximum number of open files, for example) remain the same. As jsvc is used by the Bigtop scripts to run HDFS as root, we also (in fact: only) need to fix the limits of the root account, until Bigtop integrates jsvc 1.0.6.
@system.limits
  header: 'Ulimit to root'
  user: 'root'
, options.user.limits
Dependencies
misc = require '@nikitajs/core/lib/misc'
path = require 'path'