Prof_Carvalho

Untitled

Feb 2nd, 2022
# Install Java in Colab
!apt-get install openjdk-8-jdk-headless -qq > /dev/null
# Download Spark into Google Colab
# (wget -q hides errors; if this URL is no longer served, older releases are usually under archive.apache.org/dist/spark/)
!wget -q https://dlcdn.apache.org/spark/spark-3.2.1/spark-3.2.1-bin-hadoop2.7.tgz
# Unpack the Spark archive downloaded in the previous step
!tar -xf /content/spark-3.2.1-bin-hadoop2.7.tgz
# Install the Python package that locates Spark
!pip install findspark
# Configure Colab to use our Spark installation
import os
import findspark

os.environ['JAVA_HOME'] = '/usr/lib/jvm/java-8-openjdk-amd64'
os.environ['SPARK_HOME'] = '/content/spark-3.2.1-bin-hadoop2.7'

# Pass the absolute path so the call does not depend on the current working directory
findspark.init(os.environ['SPARK_HOME'])
# Create a SparkSession
from pyspark.sql import SparkSession
spark = SparkSession.builder.master('local[*]').getOrCreate()
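# A minimal sketch of a sanity check for the JAVA_HOME and SPARK_HOME values set above.
# The helper name missing_spark_paths is hypothetical (it is not part of findspark or PySpark).
# It only reports which variables are unset or do not point at an existing directory,
# which is one possible reason for findspark.init() or getOrCreate() to fail.

```python
import os


def missing_spark_paths(names=("JAVA_HOME", "SPARK_HOME"), environ=None):
    """Return {variable: value} for every variable that is unset
    or does not point at an existing directory."""
    environ = os.environ if environ is None else environ
    problems = {}
    for name in names:
        path = environ.get(name)
        # An unset variable comes back as None; a wrong path fails isdir().
        if not path or not os.path.isdir(path):
            problems[name] = path
    return problems


problems = missing_spark_paths()
if problems:
    print("Check these settings before calling findspark.init():", problems)
else:
    print("JAVA_HOME and SPARK_HOME both point at existing directories.")
```

# If the check reports SPARK_HOME, the download or the tar step probably did not complete.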