Hadoop Core Install
module.exports = header: 'Hadoop Core Install', handler: ({options}) ->
Register
@registry.register 'hconfigure', 'ryba/lib/hconfigure'
@registry.register 'hdp_select', 'ryba/lib/hdp_select'
Identities
By default, the "hadoop-client" package rely on the "hadoop", "hadoop-hdfs", "hadoop-mapreduce" and "hadoop-yarn" dependencies and create the following entries:
cat /etc/passwd | grep hadoop
hdfs:x:496:497:Hadoop HDFS:/var/lib/hadoop-hdfs:/bin/bash
yarn:x:495:495:Hadoop Yarn:/var/lib/hadoop-yarn:/bin/bash
mapred:x:494:494:Hadoop MapReduce:/var/lib/hadoop-mapreduce:/bin/bash
cat /etc/group | egrep "hdfs|yarn|mapred"
hadoop:x:498:hdfs,yarn,mapred
hdfs:x:497:
yarn:x:495:
mapred:x:494:
Note, the "hadoop" package also installs the "dbus" user and group, which are not handled here.
for group in [options.hadoop_group, options.hdfs.group, options.yarn.group, options.mapred.group, options.ats.group]
@system.group header: "Group #{group.name}", group
for user in [options.hdfs.user, options.yarn.user, options.mapred.user, options.ats.user]
@system.user header: "User #{user.name}", user
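For reference, here is a minimal sketch of the user and group options these loops consume; the values below are illustrative assumptions mirroring the "/etc/passwd" and "/etc/group" entries listed above, the real values come from the ryba configuration:
# Illustrative defaults only, mirroring the entries above.
options.hadoop_group ?= name: 'hadoop', system: true
options.hdfs ?= {}
options.hdfs.group ?= name: 'hdfs', system: true
options.hdfs.user ?=
  name: 'hdfs'
  system: true
  comment: 'Hadoop HDFS'
  home: '/var/lib/hadoop-hdfs'
  shell: '/bin/bash'
  groups: 'hadoop'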
Packages
Install the "hadoop-client" and "openssl" packages as well as their dependecies.
The environment script "hadoop-env.sh" from the HDP companion files is also uploaded when the package is first installed or upgraded. Be careful, the original file will be overwritten with and user modifications. A copy will be made available in the same directory after any modification.
@call header: 'Packages', ->
@service
name: 'openssl-devel'
@service
name: 'hadoop-client'
@hdp_select
name: 'hadoop-client'
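The upload of "hadoop-env.sh" mentioned above is handled separately; a hedged sketch of what that step could look like with the same actions, assuming the script is shipped under the module "resources" directory:
# Sketch only: upload the HDP companion "hadoop-env.sh", keeping a
# backup so the overwritten user modifications can be recovered.
@file
  header: 'Env Script'
  target: "#{options.conf_dir}/hadoop-env.sh"
  source: "#{__dirname}/../resources/hadoop-env.sh"
  local: true
  backup: true
  uid: options.hdfs.user.name
  gid: options.hadoop_group.name
  mode: 0o755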
Topology
Configure the topology script to enable rack awareness in Hadoop.
@call header: 'Topology', ->
@file
target: "#{options.conf_dir}/rack_topology.sh"
source: "#{__dirname}/../resources/rack_topology.sh"
local: true
uid: options.hdfs.user.name
gid: options.hadoop_group.name
mode: 0o755
backup: true
@file
target: "#{options.conf_dir}/rack_topology.data"
content: options.topology
.map (node) ->
"#{node.ip} #{node.rack or ''}"
.join '\n'
uid: options.hdfs.user.name
gid: options.hadoop_group.name
mode: 0o755
backup: true
eof: true
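The script is the one typically referenced by the "net.topology.script.file.name" property in "core-site.xml". For example, with a topology declared as below (illustrative addresses and racks), "rack_topology.data" would contain one "ip rack" pair per line:
# Illustrative only: two nodes spread across two racks.
options.topology = [
  { ip: '10.10.10.11', rack: '/rack1' }
  { ip: '10.10.10.12', rack: '/rack2' }
]
# Resulting "rack_topology.data":
# 10.10.10.11 /rack1
# 10.10.10.12 /rack2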
Keytab Directory
@system.mkdir
header: 'Keytabs'
target: '/etc/security/keytabs'
uid: 'root'
gid: 'root' # was hadoop_group.name
mode: 0o0755
Kerberos HDFS User
Create the HDFS user principal. This will be the super administrator for the HDFS filesystem. Note, we do not create a principal with a keytab, to allow HDFS logins from multiple sessions without breaking an active session.
@krb5.addprinc
header: 'HDFS Kerberos User'
, options.hdfs.krb5_user
, options.krb5.admin
SPNEGO
Create the SPNEGO service principal in the form "HTTP/{host}@{realm}" and place its keytab inside "/etc/security/keytabs/spnego.service.keytab", with ownership set to "root:hadoop" and permissions set to "0660". Read/write access is given to the group because the same keytab file is, for now, shared between the hdfs and yarn services.
@call header: 'SPNEGO', ->
@krb5.addprinc
principal: options.spnego.principal
randkey: true
keytab: options.spnego.keytab
uid: 'root'
gid: options.hadoop_group.name
mode: 0o660 # need rw access for hadoop and mapred users
, options.krb5.admin
@system.execute # Validate keytab access by the hdfs user
cmd: "su -l #{options.hdfs.user.name} -c \"klist -kt /etc/security/keytabs/spnego.service.keytab\""
if: -> @status -1
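For context, here is a hedged sketch of where this principal and keytab are typically wired in, using the standard "hadoop.http.authentication.*" properties of "core-site.xml"; this configuration is applied by other actions, not by this one:
# Sketch only: SPNEGO wiring in core-site.xml, managed elsewhere.
@hconfigure
  target: "#{options.conf_dir}/core-site.xml"
  properties:
    'hadoop.http.authentication.type': 'kerberos'
    'hadoop.http.authentication.kerberos.principal': options.spnego.principal
    'hadoop.http.authentication.kerberos.keytab': options.spnego.keytab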
Web UI
This action follows the "Authentication for Hadoop HTTP web-consoles" recommendations.
@system.execute
header: 'WebUI'
cmd: 'dd if=/dev/urandom of=/etc/hadoop/hadoop-http-auth-signature-secret bs=1024 count=1'
unless_exists: '/etc/hadoop/hadoop-http-auth-signature-secret'
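The generated file is the one consumed by the "hadoop.http.authentication.signature.secret.file" property; a hedged sketch of that wiring, which complements the SPNEGO properties shown above and is again normally applied by another action:
# Sketch only: point the web consoles at the generated secret file.
@hconfigure
  target: "#{options.conf_dir}/core-site.xml"
  properties:
    'hadoop.http.authentication.signature.secret.file': '/etc/hadoop/hadoop-http-auth-signature-secret'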
SSL
@call header: 'SSL', retry: 0, ->
@hconfigure
target: "#{options.conf_dir}/ssl-server.xml"
properties: options.ssl_server
@hconfigure
target: "#{options.conf_dir}/ssl-client.xml"
properties: options.ssl_client
# Client: import certificate to all hosts
@java.keystore_add
keystore: options.ssl_client['ssl.client.truststore.location']
storepass: options.ssl_client['ssl.client.truststore.password']
caname: "hadoop_root_ca"
cacert: options.ssl.cacert.source
local: options.ssl.cacert.local
# Server: import certificates, private and public keys to hosts with a server
@java.keystore_add
keystore: options.ssl_server['ssl.server.keystore.location']
storepass: options.ssl_server['ssl.server.keystore.password']
# caname: "hadoop_root_ca"
# cacert: "#{options.ssl.cacert}"
key: options.ssl.key.source
cert: options.ssl.cert.source
keypass: options.ssl_server['ssl.server.keystore.keypassword']
name: options.ssl.key.name
local: options.ssl.key.local
@java.keystore_add
keystore: options.ssl_server['ssl.server.keystore.location']
storepass: options.ssl_server['ssl.server.keystore.password']
caname: "hadoop_root_ca"
cacert: options.ssl.cacert.source
local: options.ssl.cacert.local
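For reference, a minimal sketch of the "ssl-server.xml" and "ssl-client.xml" properties this step expects; the locations and passwords below are placeholders, not defaults:
# Placeholder values only; the real entries come from the configuration.
options.ssl_server ?=
  'ssl.server.keystore.location': '/etc/security/serverKeys/keystore.jks'
  'ssl.server.keystore.password': 'CHANGE_ME'
  'ssl.server.keystore.keypassword': 'CHANGE_ME'
options.ssl_client ?=
  'ssl.client.truststore.location': '/etc/security/clientKeys/truststore.jks'
  'ssl.client.truststore.password': 'CHANGE_ME'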