Advertisement
Guest User

Untitled

a guest
Mar 23rd, 2016
68
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 2.74 KB | None | 0 0
  1. %dep
  2. z.reset() // clean up
  3. z.load("/data/extraJarFiles/postgresql-9.4.1208.jar") // load a jdbc driver for postgresql
  4.  
  5. %spark
  6. // This code loads some data from a PostGreSql DB with the help of a JDBC driver.
  7. // The JDBC driver is stored on the Zeppelin server, the necessary Code is transfered to the Spark Workers and the workers build the connection with the DB.
  8. //
  9. // The connection between table and data source is "lazy". So the data will only be loaded in the case that an action need them.
  10. // With the current script means this the DB is queried twice. ==> Q: How can I keep a RDD in Mem or on disk?
  11.  
  12. import org.apache.spark.SparkContext
  13. import org.apache.spark.SparkContext._
  14. import org.apache.spark.SparkConf
  15. import org.apache.spark.rdd.JdbcRDD
  16. import java.sql.Connection
  17. import java.sql.DriverManager
  18. import java.sql.ResultSet
  19.  
  20. import org.apache.spark.sql.hive._
  21. import org.apache.spark.sql._
  22.  
  23. val url = "jdbc:postgresql://10.222.22.222:5432/myDatabase"
  24. val username = "postgres"
  25. val pw = "geheim"
  26.  
  27. Class.forName("org.postgresql.Driver").newInstance // activating the jdbc driver. The jar file was loaded inside of the %dep block
  28.  
  29.  
  30. case class RowClass(Id:Integer, Col1:String , Col2:String) // create a class with possible values
  31.  
  32. val myRDD = new JdbcRDD(sc, // SparkContext sc
  33. () => DriverManager.getConnection(url,username,pw), // scala.Function0<java.sql.Connection> getConnection
  34. "select * from tab1 where "Id">=? and "Id" <=? ", // String sql Important: we need here two '?' for the lower/upper Bounds vlaues
  35. 0, // long lowerBound = start value
  36. 10000, // long upperBound, = end value that is still included
  37. 1, // int numPartitions = the area is spitted into x sub commands.
  38. // e.g. 0,1000,2 => first cmd from 0 ... 499, second cmd from 500..1000
  39. row => RowClass(row.getInt("Id"),
  40. row.getString("Col1"),
  41. row.getString("Col2"))
  42. )
  43.  
  44. myRDD.toDF().registerTempTable("Tab1")
  45.  
  46. // --- improved methode (not working at the moment)----
  47. val prop = new java.util.Properties
  48. prop.setProperty("user",username)
  49. prop.setProperty("password",pw)
  50.  
  51. val tab1b = sqlContext.read.jdbc(url,"tab1",prop) // <-- not working
  52.  
  53. tab1b.show
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement