(Extract from Dharmesh Kakadia's "Apache Mesos Essentials")

Installing Hadoop on Mesos
==========================

Hadoop on Mesos (https://github.com/mesos/hadoop) relies on extensions to
Mesos (a Mesos executor that runs the TaskTrackers and a Mesos scheduler
plugged into the Hadoop JobTracker) to run Hadoop as a Mesos framework. We
will run Hadoop 1.x on Mesos:

1. Install and run Mesos by following the instructions in Chapter 1,
Running Mesos.

2. We need to compile the Hadoop on Mesos library. Hadoop on Mesos uses
Maven to manage dependencies, so we will need to install Maven along with
Java and Git:
ubuntu@master:~ $ sudo apt-get install maven openjdk-7-jdk git
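Before cloning, it can help to confirm the build toolchain actually landed on the PATH; a minimal sketch, checking the three tools installed above:

```shell
# Check that the build prerequisites from the apt-get step are on PATH.
missing=0
for tool in mvn java git; do
    command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool"; missing=1; }
done
if [ "$missing" -eq 0 ]; then echo "toolchain OK"; fi
```

If any tool is reported missing, rerun the apt-get step before proceeding.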

3. Let's clone the Hadoop on Mesos source code from
https://github.com/mesos/hadoop and navigate into it using the following
commands:
ubuntu@master:~ $ git clone https://github.com/mesos/hadoop/
ubuntu@master:~ $ cd hadoop

4. Build the Hadoop on Mesos binaries from the code using the following
command. By default, it will build against the latest versions of Mesos and
Hadoop. If required, we can adjust the versions in the pom.xml file:
ubuntu@master:~ $ mvn package
This will build hadoop-mesos-VERSION.jar in the target folder.

5. Download a Hadoop distribution, extract it, and navigate into it. We can
use the vanilla Apache distribution, Cloudera's Distribution including
Apache Hadoop (CDH), or any other Hadoop distribution. We can download and
extract the latest CDH distribution using the following commands:
ubuntu@master:~ $ wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.5.0-cdh5.2.0.tar.gz
ubuntu@master:~ $ tar xzf hadoop-*.tar.gz

6. We need to put the Hadoop on Mesos jar that we just built in a location
where it is accessible to Hadoop via the Hadoop CLASSPATH. We will copy it
to the Hadoop lib folder, which by default is share/hadoop/common/lib
inside the Hadoop distribution:
ubuntu@master:~ $ cp hadoop/target/hadoop-mesos-*.jar hadoop-*/share/hadoop/common/lib

7. By default, the CDH distribution is configured to use MapReduce Version 2
(MRv2) with YARN. So, we need to update it to point to MRv1:

ubuntu@master:~ $ cd hadoop-*
ubuntu@master:~ $ mv bin bin-mapreduce2
ubuntu@master:~ $ ln -s bin-mapreduce1 bin
ubuntu@master:~ $ cd etc
ubuntu@master:~ $ mv hadoop hadoop-mapreduce2
ubuntu@master:~ $ ln -s hadoop-mapreduce1 hadoop
ubuntu@master:~ $ cd -

Optionally, we can also update the examples to point to the MRv1 examples:

ubuntu@master:~ $ mv examples examples-mapreduce2
ubuntu@master:~ $ ln -s examples-mapreduce1 examples

8. Now, we will configure Hadoop to recognize that it should use the Mesos
scheduler that we just built. Set the following mandatory configuration
options in etc/hadoop/mapred-site.xml by adding them between the
<configuration> and </configuration> tags:

<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
</property>
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.MesosScheduler</value>
</property>
<property>
  <name>mapred.mesos.taskScheduler</name>
  <value>org.apache.hadoop.mapred.JobQueueTaskScheduler</value>
</property>
<property>
  <name>mapred.mesos.master</name>
  <value>zk://localhost:2181/mesos</value>
</property>
<property>
  <name>mapred.mesos.executor.uri</name>
  <value>hdfs://localhost:9000/hadoop.tar.gz</value>
</property>

We tell Hadoop to use Mesos for scheduling tasks by setting the
mapred.jobtracker.taskScheduler property. The Mesos master address is
specified via mapred.mesos.master, which we have set to the local
ZooKeeper address. mapred.mesos.executor.uri points to the Hadoop
distribution archive that we will upload to HDFS, which is used for
executing tasks.
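The properties above can also be written out in one step; a sketch that generates etc/hadoop/mapred-site.xml with the single-node values used in this chapter (adjust the hostnames, ports, and ZooKeeper path for your cluster):

```shell
# Generate etc/hadoop/mapred-site.xml with the mandatory Mesos settings.
# Hostnames and ports below are the single-node values from the text.
mkdir -p etc/hadoop
cat > etc/hadoop/mapred-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.MesosScheduler</value>
  </property>
  <property>
    <name>mapred.mesos.taskScheduler</name>
    <value>org.apache.hadoop.mapred.JobQueueTaskScheduler</value>
  </property>
  <property>
    <name>mapred.mesos.master</name>
    <value>zk://localhost:2181/mesos</value>
  </property>
  <property>
    <name>mapred.mesos.executor.uri</name>
    <value>hdfs://localhost:9000/hadoop.tar.gz</value>
  </property>
</configuration>
EOF
```

Run this from the root of the extracted Hadoop distribution so the file lands next to the other configuration files.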

9. We have to ensure that Hadoop on Mesos is able to find the Mesos native
library, which by default is located at /usr/local/lib/libmesos.so.
We need to export the location of the Mesos native library by adding the
following line at the start of the bin/hadoop-daemon.sh script:
export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
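This edit can be scripted rather than done by hand; a sketch that inserts the export right after the shebang line, demonstrated here on a stand-in copy of the script so the snippet runs anywhere (point the sed command at the real bin/hadoop-daemon.sh in your distribution):

```shell
# Insert the MESOS_NATIVE_LIBRARY export right after the shebang.
# A stand-in hadoop-daemon.sh is created here so the snippet is
# self-contained; against a real distribution, run only the sed line.
mkdir -p demo/bin
printf '#!/usr/bin/env bash\necho starting daemon\n' > demo/bin/hadoop-daemon.sh
sed -i '1a export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so' demo/bin/hadoop-daemon.sh
sed -n '1,2p' demo/bin/hadoop-daemon.sh   # shebang, then the export
```

The `1a` address keeps the shebang on the first line, which the daemon scripts require.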

10. We need a location where the distribution can be accessed by Mesos
while launching Hadoop tasks. We can put it in HDFS, S3, or any other
accessible location, such as an NFS server. We will put it in HDFS, and
for this, we need to install HDFS on the cluster. We need to run the
Namenode daemon on the HDFS master node. Note that the HDFS master node is
independent of the Mesos master. Copy the Hadoop distribution to the node.
Before the first use, format the Namenode with the following command on
the HDFS master:
ubuntu@master:~$ bin/hadoop namenode -format
Then start the Namenode using the following command:
ubuntu@master:~$ bin/hadoop-daemon.sh start namenode
We need to start the Datanode daemons on each node that we want to make an
HDFS slave node (which is independent of the Mesos slave nodes). Copy the
created Hadoop distribution to all HDFS slave nodes, and start the
Datanode on each using the following command:
ubuntu@master:~$ bin/hadoop-daemon.sh start datanode

11. Hadoop is now ready to run on Mesos. Let's package it and upload it to
HDFS:
ubuntu@master:~ $ tar cfz hadoop.tar.gz hadoop-*
ubuntu@master:~ $ bin/hadoop dfs -put hadoop.tar.gz /hadoop.tar.gz
ubuntu@master:~ $ bin/hadoop dfs -chmod 777 /hadoop.tar.gz
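Before uploading, it is worth checking that the tarball really carries the hadoop-mesos jar copied in step 6, since the executor unpacks this exact archive on the slaves; a sketch on a stand-in layout (the directory and jar names here are placeholders, not the real distribution):

```shell
# Verify the packaged tarball carries the hadoop-mesos jar.
# A stand-in distribution tree is built so the snippet runs anywhere;
# with a real distribution, run only the tar commands on hadoop.tar.gz.
mkdir -p hadoop-demo/share/hadoop/common/lib
touch hadoop-demo/share/hadoop/common/lib/hadoop-mesos-0.1.0.jar  # placeholder jar
tar cfz hadoop-demo.tar.gz hadoop-demo
tar tzf hadoop-demo.tar.gz | grep hadoop-mesos
```

If the grep prints nothing for your real archive, redo the copy in step 6 and repackage before uploading.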

12. Now, we can start the JobTracker. Note that we don't need to start the
TaskTrackers manually, as they will be started by Mesos when we submit a
Hadoop job:
ubuntu@master:~ $ bin/hadoop jobtracker
Hadoop is now running, and we are ready to run Hadoop jobs.