large-csv-to-kafka.clj

a guest
Jul 15th, 2020
#!/usr/bin/env bb
;; ^^ this tells our shell to use Babashka to run this script

;; Babashka pre-aliases these namespaces for scripts, but requiring them
;; explicitly documents where io, csv and json come from
(require '[cheshire.core :as json]
         '[clojure.data.csv :as csv]
         '[clojure.java.io :as io])

;; read the file path of our CSV and the key field from the command line args
(def csv-file-path (first *command-line-args*))
(def key-field (second *command-line-args*))

(with-open [reader (io/reader csv-file-path)]
  ;; read-csv returns a lazy sequence of rows, so everything that consumes it
  ;; has to run inside with-open, before the reader is closed
  (let [[headers & body] (csv/read-csv reader)
        ;; for each line in the body, create a map with the headers as the keys
        records (->> body
                     (map (partial zipmap headers))
                     ;; if you need to do any additional processing on each line, do it here
                     )
        ;; build one "key::json" line per record
        output-lines (map #(str (get % key-field) "::" (json/generate-string %))
                          records)]
    (doseq [output output-lines]
      (println output))))
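
;; A sketch of what a single row turns into, meant for evaluating at a REPL.
;; The sample headers and row are made up for illustration and do not come
;; from any real input file; "id" stands in for the key field argument.
;; Assumes json is an alias for cheshire.core, as it is in Babashka scripts.
(comment
  (let [record (zipmap ["id" "name"] ["1" "Ada"])]
    (str (get record "id") "::" (json/generate-string record)))
  ;; => "1::{\"id\":\"1\",\"name\":\"Ada\"}"
  )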