spark-submit --packages com.databricks:spark-csv_2.11:1.5.0 ./adiel.py

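Although the job loads the external spark-csv package, every CSV frame in the stack traces below comes from org.apache.spark.sql.execution.datasources.csv, the reader built into Spark since 2.0, so the built-in data source is what actually parses the file. A minimal sketch of the kind of read that exercises that code path follows; the file path and reader options are assumptions, not taken from the log:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("adiel").getOrCreate()

    # Hypothetical read. With inferSchema=True a numeric column such as
    # DOWNSTREAM_SIZE is typed double, which is the cast that fails below.
    df = spark.read.csv("hdfs:///path/to/input.csv",  # hypothetical path
                        header=True,
                        inferSchema=True)
    df.cache()  # the traces show an InMemoryRelation, i.e. a cached frame
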
17/01/20 09:29:30 INFO SparkContext: Running Spark version 2.0.2
17/01/20 09:29:31 INFO SecurityManager: Changing view acls to: hadoop
17/01/20 09:29:31 INFO SecurityManager: Changing modify acls to: hadoop
17/01/20 09:29:31 INFO SecurityManager: Changing view acls groups to:
17/01/20 09:29:31 INFO SecurityManager: Changing modify acls groups to:
17/01/20 09:29:31 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(hadoop); groups with view permissions: Set(); users  with modify permissions: Set(hadoop); groups with modify permissions: Set()
17/01/20 09:29:31 INFO Utils: Successfully started service 'sparkDriver' on port 35719.
17/01/20 09:29:31 INFO SparkEnv: Registering MapOutputTracker
17/01/20 09:29:31 INFO SparkEnv: Registering BlockManagerMaster
17/01/20 09:29:31 INFO DiskBlockManager: Created local directory at /mnt/tmp/blockmgr-0badf595-8f56-45ef-bdf5-f80573d2e188
17/01/20 09:29:31 INFO MemoryStore: MemoryStore started with capacity 414.4 MB
17/01/20 09:29:31 INFO SparkEnv: Registering OutputCommitCoordinator
17/01/20 09:29:32 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
17/01/20 09:29:32 INFO Utils: Successfully started service 'SparkUI' on port 4041.
17/01/20 09:29:32 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://172.31.35.244:4041
17/01/20 09:29:32 INFO Executor: Starting executor ID driver on host localhost
17/01/20 09:29:32 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 35372.
17/01/20 09:29:32 INFO NettyBlockTransferService: Server created on 172.31.35.244:35372
17/01/20 09:29:32 INFO BlockManager: external shuffle service port = 7337
17/01/20 09:29:32 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 172.31.35.244, 35372)
17/01/20 09:29:32 INFO BlockManagerMasterEndpoint: Registering block manager 172.31.35.244:35372 with 414.4 MB RAM, BlockManagerId(driver, 172.31.35.244, 35372)
17/01/20 09:29:32 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 172.31.35.244, 35372)

** Script Started: 2017-01-20 09:29:34.115886 **
Loading file...  Done!
Adjusting data to fit our needs...  done!

** DOWNSTREAM_SIZE Statistical Measures **

17/01/20 09:36:58 ERROR Executor: Exception in task 2.0 in stage 3.0 (TID 160)
java.lang.NullPointerException
        at java.text.DecimalFormat.parse(DecimalFormat.java:1997)
        at java.text.NumberFormat.parse(NumberFormat.java:383)
        at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$$anonfun$castTo$4.apply$mcD$sp(CSVInferSchema.scala:259)
        at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$$anonfun$castTo$4.apply(CSVInferSchema.scala:259)
        at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$$anonfun$castTo$4.apply(CSVInferSchema.scala:259)
        at scala.util.Try.getOrElse(Try.scala:79)
        at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$.castTo(CSVInferSchema.scala:259)
        at org.apache.spark.sql.execution.datasources.csv.CSVRelation$$anonfun$csvParser$3.apply(CSVRelation.scala:116)
        at org.apache.spark.sql.execution.datasources.csv.CSVRelation$$anonfun$csvParser$3.apply(CSVRelation.scala:85)
        at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$apply$2.apply(CSVFileFormat.scala:128)
        at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$apply$2.apply(CSVFileFormat.scala:127)
        at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:91)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
        at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
        at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
        at org.apache.spark.sql.execution.columnar.InMemoryRelation$$anonfun$1$$anon$1.next(InMemoryRelation.scala:106)
        at org.apache.spark.sql.execution.columnar.InMemoryRelation$$anonfun$1$$anon$1.next(InMemoryRelation.scala:98)
        at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:214)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:935)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:926)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:866)
        at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:926)
        at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:670)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:330)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:281)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
        at org.apache.spark.scheduler.Task.run(Task.scala:86)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/01/20 09:36:58 ERROR TaskSetManager: Task 2 in stage 3.0 failed 1 times; aborting job
17/01/20 09:36:58 ERROR Executor: Exception in task 4.0 in stage 3.0 (TID 162)
org.apache.spark.TaskKilledException
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:264)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Traceback (most recent call last):
  File "/home/hadoop/./adiel.py", line 59, in <module>
    withColumn("Variance", pow(col("Stddev"), 2)).show(3, False)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 287, in show
  File "/usr/lib/spark/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py", line 1133, in __call__
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/usr/lib/spark/python/lib/py4j-0.10.3-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o77.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 3.0 failed 1 times, most recent failure: Lost task 2.0 in stage 3.0 (TID 160, localhost): java.lang.NullPointerException
        at java.text.DecimalFormat.parse(DecimalFormat.java:1997)
        at java.text.NumberFormat.parse(NumberFormat.java:383)
        at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$$anonfun$castTo$4.apply$mcD$sp(CSVInferSchema.scala:259)
        at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$$anonfun$castTo$4.apply(CSVInferSchema.scala:259)
        at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$$anonfun$castTo$4.apply(CSVInferSchema.scala:259)
        at scala.util.Try.getOrElse(Try.scala:79)
        at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$.castTo(CSVInferSchema.scala:259)
        at org.apache.spark.sql.execution.datasources.csv.CSVRelation$$anonfun$csvParser$3.apply(CSVRelation.scala:116)
        at org.apache.spark.sql.execution.datasources.csv.CSVRelation$$anonfun$csvParser$3.apply(CSVRelation.scala:85)
        at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$apply$2.apply(CSVFileFormat.scala:128)
        at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$apply$2.apply(CSVFileFormat.scala:127)
        at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:91)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
        at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
        at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
        at org.apache.spark.sql.execution.columnar.InMemoryRelation$$anonfun$1$$anon$1.next(InMemoryRelation.scala:106)
        at org.apache.spark.sql.execution.columnar.InMemoryRelation$$anonfun$1$$anon$1.next(InMemoryRelation.scala:98)
        at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:214)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:935)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:926)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:866)
        at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:926)
        at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:670)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:330)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:281)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
        at org.apache.spark.scheduler.Task.run(Task.scala:86)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1454)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1442)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1441)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1441)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
        at scala.Option.foreach(Option.scala:257)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1667)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1622)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1611)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1873)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1886)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1899)
        at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:347)
        at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:39)
        at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2193)
        at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
        at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2546)
        at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2192)
        at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2199)
        at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1935)
        at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1934)
        at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2576)
        at org.apache.spark.sql.Dataset.head(Dataset.scala:1934)
        at org.apache.spark.sql.Dataset.take(Dataset.scala:2149)
        at org.apache.spark.sql.Dataset.showString(Dataset.scala:239)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:280)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
        at java.text.DecimalFormat.parse(DecimalFormat.java:1997)
        at java.text.NumberFormat.parse(NumberFormat.java:383)
        at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$$anonfun$castTo$4.apply$mcD$sp(CSVInferSchema.scala:259)
        at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$$anonfun$castTo$4.apply(CSVInferSchema.scala:259)
        at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$$anonfun$castTo$4.apply(CSVInferSchema.scala:259)
        at scala.util.Try.getOrElse(Try.scala:79)
        at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$.castTo(CSVInferSchema.scala:259)
        at org.apache.spark.sql.execution.datasources.csv.CSVRelation$$anonfun$csvParser$3.apply(CSVRelation.scala:116)
        at org.apache.spark.sql.execution.datasources.csv.CSVRelation$$anonfun$csvParser$3.apply(CSVRelation.scala:85)
        at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$apply$2.apply(CSVFileFormat.scala:128)
        at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$apply$2.apply(CSVFileFormat.scala:127)
        at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:91)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
        at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
        at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
        at org.apache.spark.sql.execution.columnar.InMemoryRelation$$anonfun$1$$anon$1.next(InMemoryRelation.scala:106)
        at org.apache.spark.sql.execution.columnar.InMemoryRelation$$anonfun$1$$anon$1.next(InMemoryRelation.scala:98)
        at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:214)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:935)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:926)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:866)
        at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:926)
        at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:670)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:330)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:281)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
        at org.apache.spark.scheduler.Task.run(Task.scala:86)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        ... 1 more

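The Python traceback pins the failure to adiel.py line 59, a chained call of the form withColumn("Variance", pow(col("Stddev"), 2)).show(3, False). Because Spark evaluates lazily, show() is merely where the plan, including the cached CSV scan (the InMemoryRelation frames above), finally runs; the actual fault is the NullPointerException thrown from DecimalFormat.parse inside CSVTypeCast.castTo, meaning a null or empty field in a column typed double reached the numeric parser. A hypothetical reconstruction of the failing step follows; only the chained call is visible in the traceback, and the aggregation feeding it is assumed:

    from pyspark.sql.functions import col, pow, stddev

    # Assumed producer of the "Stddev" column; the grouping key is invented.
    stats = df.groupBy("group_key").agg(stddev("DOWNSTREAM_SIZE").alias("Stddev"))

    # The chained call shown in the traceback (adiel.py line 59):
    stats.withColumn("Variance", pow(col("Stddev"), 2)).show(3, False)
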
17/01/20 09:36:59 ERROR TaskContextImpl: Error in TaskCompletionListener
java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
        at org.apache.hadoop.hdfs.DFSInputStream.close(DFSInputStream.java:710)
        at java.io.FilterInputStream.close(FilterInputStream.java:181)
        at org.apache.hadoop.util.LineReader.close(LineReader.java:150)
        at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:231)
        at org.apache.spark.sql.execution.datasources.RecordReaderIterator.close(RecordReaderIterator.scala:66)
        at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.close(HadoopFileLinesReader.scala:54)
        at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$7$$anonfun$apply$1.apply(CSVFileFormat.scala:116)
        at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$7$$anonfun$apply$1.apply(CSVFileFormat.scala:116)
        at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:123)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:97)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:95)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:95)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/01/20 09:36:59 ERROR TaskContextImpl: Error in TaskCompletionListener
java.lang.IllegalStateException: Block broadcast_5 not found
        at org.apache.spark.storage.BlockInfoManager$$anonfun$1.apply(BlockInfoManager.scala:288)
        at org.apache.spark.storage.BlockInfoManager$$anonfun$1.apply(BlockInfoManager.scala:288)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.storage.BlockInfoManager.unlock(BlockInfoManager.scala:287)
        at org.apache.spark.storage.BlockManager.releaseLock(BlockManager.scala:630)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$releaseLock$1.apply(TorrentBroadcast.scala:210)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$releaseLock$1.apply(TorrentBroadcast.scala:210)
        at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:123)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:97)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:95)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:95)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/01/20 09:36:59 ERROR TaskContextImpl: Error in TaskCompletionListener
java.lang.IllegalStateException: Block broadcast_6 not found
        at org.apache.spark.storage.BlockInfoManager$$anonfun$1.apply(BlockInfoManager.scala:288)
        at org.apache.spark.storage.BlockInfoManager$$anonfun$1.apply(BlockInfoManager.scala:288)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.storage.BlockInfoManager.unlock(BlockInfoManager.scala:287)
        at org.apache.spark.storage.BlockManager.releaseLock(BlockManager.scala:630)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$releaseLock$1.apply(TorrentBroadcast.scala:210)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$releaseLock$1.apply(TorrentBroadcast.scala:210)
        at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:123)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:97)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:95)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:95)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/01/20 09:36:59 ERROR Executor: Exception in task 1.0 in stage 3.0 (TID 159)
java.util.NoSuchElementException: None.get
        at scala.None$.get(Option.scala:347)
        at scala.None$.get(Option.scala:345)
        at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)
        at org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:646)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:281)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/01/20 09:36:59 ERROR TaskContextImpl: Error in TaskCompletionListener
java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
        at org.apache.hadoop.hdfs.DFSInputStream.close(DFSInputStream.java:710)
        at java.io.FilterInputStream.close(FilterInputStream.java:181)
        at org.apache.hadoop.util.LineReader.close(LineReader.java:150)
        at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:231)
        at org.apache.spark.sql.execution.datasources.RecordReaderIterator.close(RecordReaderIterator.scala:66)
        at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.close(HadoopFileLinesReader.scala:54)
        at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$7$$anonfun$apply$1.apply(CSVFileFormat.scala:116)
        at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$7$$anonfun$apply$1.apply(CSVFileFormat.scala:116)
        at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:123)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:97)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:95)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:95)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/01/20 09:36:59 ERROR TaskContextImpl: Error in TaskCompletionListener
java.lang.IllegalStateException: Block broadcast_5 not found
        at org.apache.spark.storage.BlockInfoManager$$anonfun$1.apply(BlockInfoManager.scala:288)
        at org.apache.spark.storage.BlockInfoManager$$anonfun$1.apply(BlockInfoManager.scala:288)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.storage.BlockInfoManager.unlock(BlockInfoManager.scala:287)
        at org.apache.spark.storage.BlockManager.releaseLock(BlockManager.scala:630)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$releaseLock$1.apply(TorrentBroadcast.scala:210)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$releaseLock$1.apply(TorrentBroadcast.scala:210)
        at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:123)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:97)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:95)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:95)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/01/20 09:36:59 ERROR TaskContextImpl: Error in TaskCompletionListener
java.lang.IllegalStateException: Block broadcast_6 not found
        at org.apache.spark.storage.BlockInfoManager$$anonfun$1.apply(BlockInfoManager.scala:288)
        at org.apache.spark.storage.BlockInfoManager$$anonfun$1.apply(BlockInfoManager.scala:288)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.storage.BlockInfoManager.unlock(BlockInfoManager.scala:287)
        at org.apache.spark.storage.BlockManager.releaseLock(BlockManager.scala:630)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$releaseLock$1.apply(TorrentBroadcast.scala:210)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$releaseLock$1.apply(TorrentBroadcast.scala:210)
        at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:123)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:97)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:95)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:95)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/01/20 09:36:59 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 158)
java.util.NoSuchElementException: None.get
        at scala.None$.get(Option.scala:347)
        at scala.None$.get(Option.scala:345)
        at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)
        at org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:646)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:281)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/01/20 09:36:59 ERROR TaskContextImpl: Error in TaskCompletionListener
java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
        at org.apache.hadoop.hdfs.DFSInputStream.close(DFSInputStream.java:710)
        at java.io.FilterInputStream.close(FilterInputStream.java:181)
        at org.apache.hadoop.util.LineReader.close(LineReader.java:150)
        at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:231)
        at org.apache.spark.sql.execution.datasources.RecordReaderIterator.close(RecordReaderIterator.scala:66)
        at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.close(HadoopFileLinesReader.scala:54)
        at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$7$$anonfun$apply$1.apply(CSVFileFormat.scala:116)
        at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$7$$anonfun$apply$1.apply(CSVFileFormat.scala:116)
        at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:123)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:97)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:95)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:95)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/01/20 09:36:59 ERROR TaskContextImpl: Error in TaskCompletionListener
java.lang.IllegalStateException: Block broadcast_5 not found
        at org.apache.spark.storage.BlockInfoManager$$anonfun$1.apply(BlockInfoManager.scala:288)
        at org.apache.spark.storage.BlockInfoManager$$anonfun$1.apply(BlockInfoManager.scala:288)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.storage.BlockInfoManager.unlock(BlockInfoManager.scala:287)
        at org.apache.spark.storage.BlockManager.releaseLock(BlockManager.scala:630)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$releaseLock$1.apply(TorrentBroadcast.scala:210)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$releaseLock$1.apply(TorrentBroadcast.scala:210)
        at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:123)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:97)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:95)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:95)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/01/20 09:36:59 ERROR TaskContextImpl: Error in TaskCompletionListener
java.lang.IllegalStateException: Block broadcast_6 not found
        at org.apache.spark.storage.BlockInfoManager$$anonfun$1.apply(BlockInfoManager.scala:288)
        at org.apache.spark.storage.BlockInfoManager$$anonfun$1.apply(BlockInfoManager.scala:288)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.storage.BlockInfoManager.unlock(BlockInfoManager.scala:287)
        at org.apache.spark.storage.BlockManager.releaseLock(BlockManager.scala:630)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$releaseLock$1.apply(TorrentBroadcast.scala:210)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$releaseLock$1.apply(TorrentBroadcast.scala:210)
        at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:123)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:97)
        at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:95)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:95)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/01/20 09:36:59 ERROR Executor: Exception in task 3.0 in stage 3.0 (TID 161)
java.util.NoSuchElementException: None.get
        at scala.None$.get(Option.scala:347)
        at scala.None$.get(Option.scala:345)
        at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)
        at org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:646)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:281)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
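
The trailing errors (java.io.IOException: Filesystem closed, Block broadcast_5/broadcast_6 not found, java.util.NoSuchElementException: None.get) are secondary noise: once TaskSetManager aborted the job, the remaining tasks in stage 3.0 were killed mid-read and their completion listeners failed while tearing down resources that had already been released. Only the NullPointerException in CSVTypeCast.castTo needs fixing. One commonly used workaround, sketched below with assumed column names rather than the author's actual fix, is to read the affected column as a string and cast it explicitly, since cast() turns unparseable or empty values into null instead of throwing inside the CSV type caster; passing mode="DROPMALFORMED" to the reader is another way to skip bad rows:

    from pyspark.sql.functions import col

    # Read without inferSchema so every column arrives as a string.
    raw = spark.read.csv("hdfs:///path/to/input.csv", header=True)  # hypothetical path

    # cast() yields null for values that DecimalFormat.parse would reject.
    clean = raw.withColumn("DOWNSTREAM_SIZE", col("DOWNSTREAM_SIZE").cast("double"))

    # Drop (or fill) the nulls before computing the statistics.
    clean = clean.dropna(subset=["DOWNSTREAM_SIZE"])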