Hadoop Tuning

Recommendations - May-16-2017 @ 11:00 AM
INFRA

Generally accepted oversubscription ratios are around 4:1 at the server access layer and 2:1 between the access layer and the aggregation layer or core.
Schedule migration from EXT3 to XFS for all servers
Separate the service DBs once load justifies it.

Best practice on larger clusters

Validate that HDFS disks are using RAID-0-Per-Spindle (Single Stripe) and are not bypassing the controller in JBOD mode
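
The oversubscription guideline at the top of this section is just downstream bandwidth divided by upstream bandwidth. A minimal sketch of that arithmetic follows; the port counts and link speeds are hypothetical and are not taken from this cluster.

```python
def oversubscription_ratio(downlink_gbps, uplink_gbps):
    """Return total downstream bandwidth divided by total upstream bandwidth."""
    if uplink_gbps <= 0:
        raise ValueError("uplink bandwidth must be positive")
    return downlink_gbps / uplink_gbps


# Hypothetical access switch: 48 x 10 GbE server ports, 3 x 40 GbE uplinks.
access_ratio = oversubscription_ratio(48 * 10, 3 * 40)

# Compare against the 4:1 access-layer guideline from the notes.
print(f"access layer {access_ratio:.1f}:1, within 4:1 guideline: {access_ratio <= 4.0}")
```
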
Linux Settings

net.core.somaxconn = 4096  [currently 4000; DataNode Max Threads for Transfer is 4096]
net.ipv4.tcp_fin_timeout = 10
vm.dirty_background_ratio = 20
vm.dirty_ratio = 50 
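
To see which hosts drift from the values above, something like the following could be run per node. It is a sketch only: it assumes a Linux host where each sysctl key maps to a file under /proc/sys, and the function names are made up for illustration.

```python
from pathlib import Path

# Recommended values copied from the list above.
RECOMMENDED = {
    "net.core.somaxconn": 4096,
    "net.ipv4.tcp_fin_timeout": 10,
    "vm.dirty_background_ratio": 20,
    "vm.dirty_ratio": 50,
}


def read_sysctl(key):
    """Read one integer sysctl from /proc/sys (dots in the key become path separators)."""
    return int(Path("/proc/sys", *key.split(".")).read_text().strip())


def sysctl_drift(recommended=RECOMMENDED, reader=read_sysctl):
    """Return {key: (current, recommended)} for every key that differs."""
    drift = {}
    for key, want in recommended.items():
        have = reader(key)
        if have != want:
            drift[key] = (have, want)
    return drift


# Example using made-up current values instead of the live /proc/sys tree.
fake_host = dict(RECOMMENDED, **{"net.core.somaxconn": 4000})
print(sysctl_drift(reader=fake_host.__getitem__))
```

On a real node, calling sysctl_drift() with no arguments reads the live values.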


Ambari // SmartSense

Deploy Ambari Views server to start migrating users away from HUE
Set up the next version of SmartSense's Small File Report or build a script for it

LOTS of platform issues are due to App Teams not handling small files at all! This is a MAJOR PROBLEM. 
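
If we end up building our own script, a rough starting point could be to count small files per directory from the output of `hdfs dfs -ls -R`. This is a sketch, not the SmartSense report: the 16 MB threshold and the function name are assumptions, and the parser expects the usual eight-column listing (permissions, replication, owner, group, size, date, time, path).

```python
from collections import Counter
import posixpath

SMALL_FILE_BYTES = 16 * 1024 * 1024  # hypothetical cutoff for "small"


def small_files_by_dir(ls_lines, threshold=SMALL_FILE_BYTES):
    """Count files below `threshold` bytes per parent directory."""
    counts = Counter()
    for line in ls_lines:
        fields = line.strip().split(None, 7)
        # Skip headers, malformed lines, and directory entries.
        if len(fields) < 8 or fields[0].startswith("d"):
            continue
        try:
            size = int(fields[4])
        except ValueError:
            continue
        if size < threshold:
            counts[posixpath.dirname(fields[7])] += 1
    return counts


# Made-up listing lines for illustration.
sample = [
    "drwxr-xr-x   - hive hadoop          0 2017-05-16 11:00 /apps/hive/warehouse/t",
    "-rw-r--r--   3 hive hadoop       1024 2017-05-16 11:00 /apps/hive/warehouse/t/part-0",
    "-rw-r--r--   3 hive hadoop       2048 2017-05-16 11:00 /apps/hive/warehouse/t/part-1",
    "-rw-r--r--   3 hive hadoop  268435456 2017-05-16 11:00 /apps/hive/warehouse/t/part-2",
]
for directory, count in small_files_by_dir(sample).most_common():
    print(directory, count)
```

In practice the listing would be piped in from the command line, and the worst directories handed back to the owning App Team.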


HDFS

Add an additional dfs.namenode.name.dir mount (minimum of 2 isolated mounts)

Best Practice 

dfs.namenode.checkpoint.period = 3600

The current checkpoint period is very high

Increase DataNode Heaps to 2GB
NameNode Thread Pool - suggested NN server thread count (dfs.namenode.handler.count) is 20 * ln(number of DataNodes)
Ensure the Safemode threshold (dfs.namenode.safemode.threshold-pct) != 1.
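
The NameNode thread pool rule of thumb is easy to work out by hand, but a small helper avoids mistakes when the cluster grows. This is a sketch: rounding to the nearest integer and never going below the stock default of 10 are assumptions added here, not part of the rule itself.

```python
import math

DEFAULT_HANDLER_COUNT = 10  # stock dfs.namenode.handler.count, used as a floor (assumption)


def suggested_nn_handler_count(num_datanodes):
    """Apply the 20 * ln(number of DataNodes) rule of thumb."""
    if num_datanodes < 1:
        raise ValueError("need at least one DataNode")
    return max(DEFAULT_HANDLER_COUNT, round(20 * math.log(num_datanodes)))


# Hypothetical cluster sizes, for illustration only.
for nodes in (50, 100, 200, 500):
    print(nodes, suggested_nn_handler_count(nodes))
```
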

YARN

yarn.timeline-service.generic-application-history.save-non-am-container-meta-info = false
Enable FAIRWEIGHT for DZ Queue 
Enable Preemption on the DZ Queue only

Hive

Convert DEFAULT Hive Engine to Tez

Set the Tez Session Timeout to around 10-20 seconds, with a low number of containers. We want REUSE!
tez.session.am.dag.submit.timeout.secs

hive.plan.serialization.format = kryo
Complete Hive HA Setup
Move prod apps onto a HiveServer dedicated to Prod Apps
Make ORC the default file format for all new tables
Hive Specific Properties

hive.exec.compress.intermediate = true
hive.exec.compress.output = true
hive.vectorized.execution.enabled = true
hive.vectorized.execution.reduce.enabled = true
hive.exec.parallel = true
hive.optimize.bucketmapjoin.sortedmerge = true
hive.exec.dynamic.partition.mode = nonstrict
hive.groupby.orderby.position.alias = true
hive.enforce.bucketing = true
hive.support.concurrency = true
hive.optimize.ppd = true
hive.optimize.ppd.storage = true
hive.cbo.enable = true
hive.compute.query.using.stats = true
hive.stats.fetch.column.stats = true
hive.stats.fetch.partition.stats = true
hive.tez.auto.reducer.parallelism = true
hive.tez.max.partition.factor = 20
hive.exec.reducers.bytes.per.reducer = 128000000
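
To get a feel for what hive.exec.reducers.bytes.per.reducer = 128000000 means in practice, the sketch below applies Hive's basic estimate: input bytes divided by bytes per reducer, capped by hive.exec.reducers.max. The cap of 1009 is assumed to be the default for this Hive version, and with hive.tez.auto.reducer.parallelism enabled Tez may still adjust the count at runtime, so treat the result as a starting estimate only.

```python
BYTES_PER_REDUCER = 128_000_000  # hive.exec.reducers.bytes.per.reducer from the list above
MAX_REDUCERS = 1009              # assumed value of hive.exec.reducers.max


def estimated_reducers(input_bytes, bytes_per_reducer=BYTES_PER_REDUCER,
                       max_reducers=MAX_REDUCERS):
    """Estimate the reducer count: ceil(input / bytes_per_reducer), clamped to [1, max]."""
    if input_bytes < 0 or bytes_per_reducer <= 0:
        raise ValueError("sizes must be non-negative and bytes_per_reducer positive")
    wanted = -(-input_bytes // bytes_per_reducer)  # integer ceiling division
    return max(1, min(max_reducers, wanted))


# Hypothetical input sizes: 10 GB and 1 TB (decimal units).
print(estimated_reducers(10_000_000_000))
print(estimated_reducers(1_000_000_000_000))
```
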


Tez

Increase Tez AM Size so it can compile jobs with many files 

tez.am.resource.memory.mb=4096


Spark

Reconsider enabling platform-wide dynamic allocation

Oozie

Complete Oozie HA Setup
Separate Oozie for Prod Apps and Analysts 

MAPREDUCE

mapreduce.input.fileinputformat.split.minsize = 104857600

Right now MapReduce gives each super tiny file its own mapper. This is not acceptable; we would rather use the network to aggregate input and reduce compute overhead.
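
It is worth checking what the minimum split size actually changes. The sketch below mirrors the split-size rule used by the stock FileInputFormat, max(minSize, min(maxSize, blockSize)), and counts splits per file. It is a simplification: it ignores the 10% slop on the last split, and input formats that combine files (such as CombineFileInputFormat) behave differently. Block sizes and file sizes here are hypothetical.

```python
MB = 1024 * 1024
MIN_SPLIT = 104857600  # mapreduce.input.fileinputformat.split.minsize from above (100 MB)


def compute_split_size(block_size, min_size=MIN_SPLIT, max_size=2**63 - 1):
    """Split-size rule of the stock FileInputFormat."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_sizes, split_size):
    """Each file yields at least one split; files are never merged in this model."""
    return sum(max(1, -(-size // split_size)) for size in file_sizes)


# With 64 MB blocks the 100 MB minimum raises the split size;
# with 128 MB blocks the block size already exceeds the minimum.
print(compute_split_size(64 * MB) // MB, compute_split_size(128 * MB) // MB)

# 1000 hypothetical 1 MB files still produce one split each under this model.
print(count_splits([1 * MB] * 1000, compute_split_size(128 * MB)))
```

Under these assumptions the setting helps with large files on small blocks, but tiny files still get one mapper each unless a combining input format is in use or the files are compacted upstream.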

Enable Mapper Compression

Other

Decouple user home DIRs from NAS
Use Dedicated Disks for Zookeeper
Use Dedicated Disks for QJM
Remove all QJM from NameNodes
Implement the MasterServers Topology from the Platform Guide - due to its age we may need to refresh this slightly.
Consider increasing HADOOP_CLIENT_OPTS to -Xmx4g, as we have tables with many files that cause problems when compiling jobs

This would largely be resolved by having App Teams handle their small files issues