Untitled

a guest Jul 12th, 2019 72 Never
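import pandas as pd

# Small sample frame: one row per visitor, with 0/1 flags in columns A, B and C
data = {'visitor': ['foo', 'bar', 'jelmer'], 'A': [0, 1, 0], 'B': [1, 0, 1], 'C': [1, 0, 0]}
df = pd.DataFrame(data)
# `spark` is assumed to be the SparkSession that the pyspark shell provides
ddf = spark.createDataFrame(df)
# Write the Spark DataFrame to S3 as CSV; this is the call that raises the error
ddf.write.csv("s3://demo-atlan-lake/test_with_no_keys_with_presto_jar")
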
>>> ddf.write.csv("s3://demo-atlan-lake/test_with_no_keys_with_presto_jar")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/spark/python/pyspark/sql/readwriter.py", line 927, in csv
    self._jwrite.csv(path)
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/usr/lib/spark/python/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o84.csv.
: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.amazon.ws.emr.hadoop.fs.EmrFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2369)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2857)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2896)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2878)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:392)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
    at org.apache.spark.sql.execution.datasources.DataSource.planForWritingFileFormat(DataSource.scala:424)
    at org.apache.spark.sql.execution.datasources.DataSource.planForWriting(DataSource.scala:524)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:290)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
    at org.apache.spark.sql.DataFrameWriter.csv(DataFrameWriter.scala:664)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: Class com.amazon.ws.emr.hadoop.fs.EmrFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2273)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2367)
    ... 24 more
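
The trace fails inside Hadoop's FileSystem.getFileSystemClass, which looks up the class configured for the URI scheme (the fs.<scheme>.impl key) and then tries to load it. That suggests the s3:// scheme is mapped to the EMRFS class while the jar containing it is not on the driver's classpath. A minimal diagnostic sketch follows, not a fix. The helper names missing_classes and configured_fs_impl are hypothetical, and configured_fs_impl relies on the private _jsc handle of the SparkContext, so treat it as an assumption about the running session.

```python
import re

# Matches the message Hadoop emits when a configured class cannot be loaded.
_MISSING = re.compile(r"ClassNotFoundException: Class (\S+) not found")


def missing_classes(traceback_text):
    """Return the distinct class names reported as not found, in order of appearance."""
    seen = []
    for name in _MISSING.findall(traceback_text):
        if name not in seen:
            seen.append(name)
    return seen


def configured_fs_impl(spark, scheme="s3"):
    """Return the class name mapped to `scheme` via fs.<scheme>.impl, or None if unset.

    Assumption: uses the private `_jsc` handle to reach the Hadoop configuration.
    """
    conf = spark.sparkContext._jsc.hadoopConfiguration()
    return conf.get("fs.%s.impl" % scheme)


# Excerpt of the traceback above, used here as sample input.
sample = (
    "py4j.protocol.Py4JJavaError: An error occurred while calling o84.csv.\n"
    ": java.lang.RuntimeException: java.lang.ClassNotFoundException: "
    "Class com.amazon.ws.emr.hadoop.fs.EmrFileSystem not found\n"
    "Caused by: java.lang.ClassNotFoundException: "
    "Class com.amazon.ws.emr.hadoop.fs.EmrFileSystem not found\n"
)
print(missing_classes(sample))
```

In the same pyspark shell, calling configured_fs_impl(spark) should show which class the s3 scheme is bound to. If it matches the name returned by missing_classes, the likely options are to put the jar that provides that class on the classpath or to write through a scheme whose implementation is available in this installation.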