Untitled

a guest Aug 19th, 2019
import numpy as np
import pandas as pd
from scipy import io
from scipy.sparse import csr_matrix

# human_129 and human_152 are assumed to be pandas DataFrames loaded earlier.

# Sort both frames by column label, then by row label.
human_129.sort_index(axis=1, inplace=True)
human_129.sort_index(inplace=True)

human_152.sort_index(axis=1, inplace=True)
human_152.sort_index(inplace=True)

# Labels present in both frames (np.intersect1d returns them sorted and unique).
intersection_cols = np.intersect1d(human_129.columns, human_152.columns)
intersection_rows = np.intersect1d(human_129.index, human_152.index)

print(len(intersection_rows))
print(len(intersection_cols))

# Restrict both frames to the shared rows and columns.
human_129 = human_129.reindex(index=intersection_rows, columns=intersection_cols)
human_152 = human_152.reindex(index=intersection_rows, columns=intersection_cols)

human_129.sort_index(axis=1, inplace=True)
human_129.sort_index(inplace=True)
human_152.sort_index(axis=1, inplace=True)
human_152.sort_index(inplace=True)

print(human_129.shape)
print(human_152.shape)

# Load the GO hierarchy matrix (Matrix Market format) as a sparse matrix.
sparsematrix = io.mmread('my_gene_ontology_edit.obo.2014-01-01.hierarchy_annotations.mtx')
go_hierarchy = sparsematrix.todok()

# The same term names are used for the rows and the columns of the hierarchy.
col_names = row_names = np.genfromtxt('my_gene_ontology_edit.obo.2014-01-01.hierarchy_annotations.mtx.rownames.tsv', dtype=str)

# Keep the term names next to the matrix as an ad hoc attribute.
go_hierarchy.axes = col_names

# The columns must be renamed to match the GO term names.
human_129.columns = human_129.columns.str.replace(':', '_')
human_152.columns = human_152.columns.str.replace(':', '_')

# Find the GO terms shared by the organism and the hierarchy matrix.
intersection = np.sort(np.intersect1d(go_hierarchy.axes, human_129.columns))
human_129 = human_129.reindex(columns=intersection)
human_152 = human_152.reindex(columns=intersection)

human_129.sort_index(axis=1, inplace=True)
human_129.sort_index(inplace=True)
human_152.sort_index(axis=1, inplace=True)
human_152.sort_index(inplace=True)

# Find the positions of the usable terms in the hierarchy matrix.
# (%time is an IPython/Jupyter magic; drop it when running as a plain script.)
import numpy_indexed as npi
%time idx = npi.indices(go_hierarchy.axes, intersection)

# Select the columns. Warning: this takes 4-5 minutes with 4 cores.
%time go__ = go_hierarchy[:, idx]

# Select the rows.
%time go = go__[idx, :]

# Convert the annotation tables to sparse matrices.
human_129 = csr_matrix(human_129)
human_152 = csr_matrix(human_152)

# Propagate the annotations through the hierarchy, then add the original annotations back.
human_129_tp = (human_129.dot(go) > 0).astype('int8')
human_129_tp = human_129_tp + human_129

human_152_tp = (human_152.dot(go) > 0).astype('int8')
human_152_tp = human_152_tp + human_152

# Convert back to labelled DataFrames.
human_129 = pd.DataFrame(human_129_tp.toarray(), index=intersection_rows, columns=intersection)
human_152 = pd.DataFrame(human_152_tp.toarray(), index=intersection_rows, columns=intersection)
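
The propagation step above (annotation matrix times hierarchy matrix, thresholded, plus the original annotations) may be easier to follow on a toy example. The sketch below is illustrative only: the term names, the two genes, the `propagate` helper and the orientation of the hierarchy matrix (entry `[i, j] = 1` meaning "term i implies term j") are all assumptions, not taken from the real data files. Unlike the snippet above, it binarizes the sum again, so an entry set in both matrices stays 1 instead of becoming 2.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical toy hierarchy over three terms (GO_A, GO_B, GO_C).
# Assumption: hierarchy[i, j] = 1 means an annotation to term i implies term j.
hierarchy = csr_matrix(np.array([
    [0, 0, 0],   # GO_A implies nothing else
    [1, 0, 0],   # GO_B implies GO_A
    [1, 1, 0],   # GO_C implies GO_A and GO_B
], dtype="int8"))

# Two hypothetical genes: gene 0 is annotated with GO_C only, gene 1 with GO_A only.
annotations = csr_matrix(np.array([
    [0, 0, 1],
    [1, 0, 0],
], dtype="int8"))


def propagate(annotations, hierarchy):
    """Return a 0/1 sparse matrix of direct plus implied annotations."""
    # Terms implied by at least one direct annotation.
    implied = (annotations.dot(hierarchy) > 0).astype("int8")
    # Add the direct annotations back, then binarize again (extra step
    # compared with the snippet above, which keeps the raw sum).
    combined = implied + annotations
    return (combined > 0).astype("int8")


propagated = propagate(annotations, hierarchy).toarray()
print(propagated)
```

Gene 0 ends up annotated with all three terms, while gene 1 keeps only GO_A, since GO_A implies nothing further in this toy hierarchy.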