Untitled

a guest
Dec 8th, 2016
## The Setup:
I have MRI data from a group of patients with brain damage (due to stroke, head injury, etc.). For each patient,
I have a binary MRI image (a 3D array), where a value of 1 or 0 indicates the presence or absence
of damage at a particular voxel (3D pixel). Each image is ~870k voxels.

## The Goal:
Return a list of patients with damage at a given voxel.

Here's the stupid/inelegant way to do this:

```
voxel_id = 12345  # Flat index of the voxel I'm interested in
patient_matches = []
for patient in patient_list:
    # Get the path to the image for this patient.
    patient_img_file = '/path/to/images/{}.MRI'.format(patient)
    img = load_image(patient_img_file)  # Load the image as a NumPy ndarray.
    img_flat = img.ravel()  # Flatten to a 1D array.
    vox_val = img_flat[voxel_id]  # Index the flattened array, not the 3D one.
    if vox_val == 1:
        patient_matches.append(patient)
```
## The Original Plan

1. Load and flatten each image, so I have a 1 x 870k vector for each patient.
2. Read these vectors into a DB (`table='damage'`), with 1 row per patient and 1 column per voxel.
3. Query `select patient_id from damage where vox12345 = 1`
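
The three steps above can be sketched at toy scale with Python's built-in `sqlite3` module. This is only an illustration under assumptions: the patient IDs (`p01` etc.) and the 8-voxel vectors are made up, and the vectors stand in for already-flattened images. The table name `damage` and the `voxN` column naming follow the plan as written.

```python
import sqlite3

# Hypothetical toy data: one already-flattened binary vector per patient.
# Real vectors would come from flattening each ~870k-voxel image.
flat_images = {
    "p01": [0, 1, 0, 0, 1, 0, 0, 0],
    "p02": [0, 0, 0, 0, 1, 1, 0, 0],
    "p03": [1, 0, 0, 0, 0, 0, 0, 1],
}
n_voxels = 8  # toy size, chosen only for illustration

conn = sqlite3.connect(":memory:")

# Step 2: one row per patient, one column per voxel (vox0 ... vox7).
columns = ", ".join(f"vox{i} INTEGER" for i in range(n_voxels))
conn.execute(f"CREATE TABLE damage (patient_id TEXT PRIMARY KEY, {columns})")

placeholders = ", ".join("?" for _ in range(n_voxels + 1))
for patient_id, vector in flat_images.items():
    conn.execute(
        f"INSERT INTO damage VALUES ({placeholders})",
        [patient_id] + vector,
    )

# Step 3: query the patients with damage at voxel 4.
query = "SELECT patient_id FROM damage WHERE vox4 = 1 ORDER BY patient_id"
matches = [row[0] for row in conn.execute(query)]
print(matches)
```

With 8 columns this works fine; the sketch says nothing about how the same layout behaves at ~870k columns.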

## Other Possibilities
I quickly realized that it's not reasonable (or even possible) to have almost 1 million columns in a table, so my next thought was to have a separate table for each voxel, which would also reduce the amount of data read into memory with each query.

Another solution would be to read all the images into a single huge 2D table (n_patients x n_voxels) and query it using something like Pandas/xarray/PyTables/HDF.
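
As a rough sketch of the 2D-table idea, here is a version using plain NumPy rather than any of the libraries named above. The patient IDs, the tiny 2x2x2 images, and the helper name `patients_with_damage` are all assumptions made for illustration; real images would come from the image loader.

```python
import numpy as np

# Hypothetical patient IDs and toy 2x2x2 binary images (8 voxels each).
patient_ids = np.array(["p01", "p02", "p03"])
images = [
    np.array([0, 1, 0, 0, 1, 0, 0, 0], dtype=np.uint8).reshape(2, 2, 2),
    np.array([0, 0, 0, 0, 1, 1, 0, 0], dtype=np.uint8).reshape(2, 2, 2),
    np.array([1, 0, 0, 0, 0, 0, 0, 1], dtype=np.uint8).reshape(2, 2, 2),
]

# Flatten each image and stack the vectors into one
# (n_patients, n_voxels) boolean matrix.
damage = np.stack([img.ravel() for img in images]).astype(bool)


def patients_with_damage(voxel_id):
    """Return the IDs of patients with a 1 at the given flat voxel index."""
    return patient_ids[damage[:, voxel_id]].tolist()


matches = patients_with_damage(4)
print(matches)
```

Stored as booleans, each patient row takes about 870 KB at the real image size. If the full matrix is too large for memory, saving it with `np.save` and reopening it with `np.load(..., mmap_mode="r")` is one way to read only the column being queried, though I have not measured how that performs here.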