wagner-cipriano

Sparse data structures in python (scipy.sparse + Pandas)

Jan 20th, 2019
# -*- coding: utf-8 -*-
"""
Created on Sat Jan 19 12:37:22 2018
@author: Wagner Cipriano

Sparse data structures in python
scipy.sparse + Pandas
"""

# Imports (Python 3: StringIO now lives in the io module)
from io import StringIO
import pandas as pd
from scipy.sparse import csr_matrix

# Flight table data: ori = origin, des = destination, voo = flight (Portuguese)
TESTDATA = StringIO("""ori des voo
0 3 1
1 2 1
1 4 1
2 3 1
""")

# Read the file in chunks to bound memory use
chunksize = 2  # e.g. 1000000 for a real file
chunks = pd.read_csv(TESTDATA, sep=" ", chunksize=chunksize)
# Concatenate the chunks into one DataFrame with sparse columns.
# DataFrame.to_sparse() was removed in pandas 1.0; SparseDtype replaces it.
df = pd.concat(chunk.astype(pd.SparseDtype("int64", 0)) for chunk in chunks)
# Build the 5x5 matrix from (value, (row, col)) triples
ME = csr_matrix(
    (df['voo'].to_numpy(), (df['ori'].to_numpy(), df['des'].to_numpy())),
    shape=(5, 5),
)

# Symmetrization of scipy sparse matrices
# (assigning into a CSR matrix works but may emit a SparseEfficiencyWarning)
rows, cols = ME.nonzero()
ME[cols, rows] = ME[rows, cols]

# Results:
print(ME)
print(ME.todense())
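
The symmetrization step above writes new entries into a CSR matrix, which changes its sparsity structure and is relatively slow for large inputs. One possible alternative, sketched here rather than taken from the original script, is to take the element-wise maximum of the matrix and its transpose. The names `ori`, `des`, and `voo` mirror the columns of the flight table; `directed` and `symmetric` are hypothetical names introduced only for this illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Same directed entries as the flight table: (ori, des) pairs with value 1.
ori = np.array([0, 1, 1, 2])
des = np.array([3, 2, 4, 3])
voo = np.ones(4, dtype=np.int64)

directed = csr_matrix((voo, (ori, des)), shape=(5, 5))

# Element-wise maximum with the transpose mirrors every entry across the
# diagonal without assigning into the existing CSR structure.
symmetric = directed.maximum(directed.T).tocsr()

print(symmetric.toarray())
```

For a 0/1 matrix like this one, the maximum keeps mirrored entries at 1. Using `directed + directed.T` instead would double any entry that already exists in both directions, so the choice depends on whether the values are flags or counts.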