## Tally 50 binomial draws with a minimal map/reduce job: the mapper emits
## each drawn value as a key with count 1, the reducer sums the counts per
## key. (rbinom's arguments are named here for clarity; 32 is the size.)
groups <- rbinom(n = 50, size = 32, prob = 0.4)
groupsdfs <- to.dfs(groups)
mapreduceResult <- mapreduce(
  input  = groupsdfs,
  map    = function(., v) keyval(v, 1),
  reduce = function(k, vv) keyval(k, sum(vv)))
from.dfs(mapreduceResult)

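Before chasing the cluster-side failure below, it is worth rerunning the same job with rmr2's local backend, which executes the map and reduce functions in the current R session without going through Hadoop streaming. A minimal sketch, assuming rmr2 is installed:

## Sketch: run the same tally entirely in-process to confirm the map/reduce
## logic before debugging the cluster environment.
library(rmr2)
rmr.options(backend = "local")   # no Hadoop involved
local.result <- mapreduce(
  input  = to.dfs(rbinom(n = 50, size = 32, prob = 0.4)),
  map    = function(., v) keyval(v, 1),
  reduce = function(k, vv) keyval(k, sum(vv)))
from.dfs(local.result)
rmr.options(backend = "hadoop")  # switch back for cluster runs

If the local run succeeds while the Hadoop run fails, the problem lies in the cluster environment rather than in the R code, which is what the trace below indicates.
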
14/07/24 11:22:59 INFO mapreduce.Job: map 100% reduce 58%
14/07/24 11:23:01 INFO mapreduce.Job: Task Id : attempt_1406189659246_0001_r_000016_1, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:409)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    ... 9 more
Caused by: java.lang.RuntimeException: configuration exception
    at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
    at org.apache.hadoop.streaming.PipeReducer.configure(PipeReducer.java:67)
    ... 14 more
Caused by: java.io.IOException: Cannot run program "Rscript": error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1041)
    at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
    ... 15 more
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:135)
    at java.lang.ProcessImpl.start(ProcessImpl.java:130)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1022)
    ... 16 more

14/07/24 11:23:42 INFO mapreduce.Job: Job job_1406189659246_0001 failed with state FAILED due to: Task failed task_1406189659246_0001_r_000007

14/07/24 11:23:42 INFO mapreduce.Job: Counters: 54
    File System Counters
        FILE: Number of bytes read=1631
        FILE: Number of bytes written=2036200
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=1073
        HDFS: Number of bytes written=5198
        HDFS: Number of read operations=67
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=38
    Job Counters
        Failed map tasks=2
        Failed reduce tasks=28
        Killed reduce tasks=1
        Launched map tasks=4
        Launched reduce tasks=48
        Other local map tasks=2
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=18216
        Total time spent by all reduces in occupied slots (ms)=194311
        Total time spent by all map tasks (ms)=18216
        Total time spent by all reduce tasks (ms)=194311
        Total vcore-seconds taken by all map tasks=18216
        Total vcore-seconds taken by all reduce tasks=194311
        Total megabyte-seconds taken by all map tasks=18653184
        Total megabyte-seconds taken by all reduce tasks=198974464
    Map-Reduce Framework
        Map input records=3
        Map output records=25
        Map output bytes=2196
        Map output materialized bytes=2266
        Input split bytes=214
        Combine input records=0
        Combine output records=0
        Reduce input groups=10
        Reduce shuffle bytes=1859
        Reduce input records=21
        Reduce output records=30
        Spilled Records=46
        Shuffled Maps =38
        Failed Shuffles=0
        Merged Map outputs=38
        GC time elapsed (ms)=1339
        CPU time spent (ms)=40060
        Physical memory (bytes) snapshot=5958418432
        Virtual memory (bytes) snapshot=33795457024
        Total committed heap usage (bytes)=7176978432
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=859
    File Output Format Counters
        Bytes Written=5198
    rmr
        reduce calls=10
14/07/24 11:23:42 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
  hadoop streaming failed with error code 1

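The decisive line in the trace is 'Caused by: java.io.IOException: Cannot run program "Rscript": error=2, No such file or directory': the streaming reducer tries to launch Rscript on a task node and cannot find it. The usual cause is that R and the packages rmr2 needs are installed only on the node that submits the job, not on every node that runs map and reduce tasks. The fix is to install R (including the Rscript binary) and the required packages on every cluster node, in a location on the PATH seen by the YARN containers. A quick check one might run on each node, using only base R:

## Sketch: verify Rscript is visible on this node's PATH and that rmr2 is
## installed; run on every node that executes tasks.
Sys.which("Rscript")                        # "" means Rscript is not on PATH
"rmr2" %in% rownames(installed.packages())  # streaming tasks load rmr2 too

Note that Sys.which only inspects the current session's PATH; YARN task containers may see a more restricted environment, so a standard location such as /usr/bin is the safest place for Rscript.
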
Sys.setenv(HADOOP_CMD="/usr/bin/hadoop")
Sys.setenv(HADOOP_STREAMING="/opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming.jar")
Sys.setenv(JAVA_HOME="/usr/java/jdk1.7.0_55-cloudera")
Sys.setenv(HADOOP_COMMON_LIB_NATIVE_DIR="/opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/lib/hadoop/lib/native")
## Environment variables are not expanded inside R strings, so the original
## literal "HADOOP_HOME/lib" was passed through verbatim; build the path
## explicitly instead (assumes HADOOP_HOME is set in the environment).
Sys.setenv(HADOOP_OPTS=paste0("-Djava.library.path=", Sys.getenv("HADOOP_HOME"), "/lib"))
library(rhdfs)
hdfs.init()
library(rmr2)

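Since most rmr2 startup failures come down to these paths being wrong, a short sanity check before submitting a job can save a debugging round trip. A minimal sketch using only base R:

## Sketch: confirm the configured binaries and jars actually exist; a FALSE
## here explains a failed job faster than a Java stack trace.
file.exists(Sys.getenv("HADOOP_CMD"))        # the hadoop launcher
file.exists(Sys.getenv("HADOOP_STREAMING"))  # the streaming jar
nzchar(Sys.which("Rscript"))                 # Rscript on this node's PATH
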
## Split lines on whitespace and count each word. Note the doubled backslash:
## the original '\s' is an invalid escape in an R string literal and will not
## parse; the regex engine must receive \s, written in R as '\\s'.
map <- function(k, lines) {
  words.list <- strsplit(lines, '\\s')
  words <- unlist(words.list)
  return(keyval(words, 1))
}
reduce <- function(word, counts) {
  keyval(word, sum(counts))
}
wordcount <- function(input, output = NULL) {
  mapreduce(input = input, output = output, input.format = "text",
            map = map, reduce = reduce)
}

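Because map and reduce are ordinary R functions, they can be exercised directly before any job is submitted; keyval and keys come from rmr2. A small sketch:

## Sketch: unit-test the functions in-process. map should emit one (word, 1)
## pair per token; reduce should sum the counts for one word.
library(rmr2)
kv <- map(NULL, "the quick brown fox the")
keys(kv)                 # "the" "quick" "brown" "fox" "the"
reduce("the", c(1, 1))   # keyval with key "the" and value 2
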
## variables
hdfs.root <- '/user/node'
hdfs.data <- file.path(hdfs.root, 'data')
hdfs.out <- file.path(hdfs.root, 'out')

## run mapreduce job
##out <- wordcount(hdfs.data, hdfs.out)
system.time(out <- wordcount(hdfs.data, hdfs.out))

## fetch results from HDFS
results <- from.dfs(out)
results.df <- as.data.frame(results, stringsAsFactors = F)
colnames(results.df) <- c('word', 'count')

##head(results.df)
## sorted output TOP10
head(results.df[order(-results.df$count), ], 10)

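One operational note: Hadoop refuses to start a job whose output directory already exists, so on a rerun the old hdfs.out must be removed first. A hedged sketch that shells out to the configured hadoop binary ("hadoop fs -rm -r" is the standard recursive delete in this Hadoop version):

## Sketch: clear the previous output directory before rerunning wordcount.
## Assumes HADOOP_CMD was set as above.
system(paste(Sys.getenv("HADOOP_CMD"), "fs -rm -r", hdfs.out))
out <- wordcount(hdfs.data, hdfs.out)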