Advertisement
Guest User

Untitled

a guest
Mar 25th, 2019
97
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 0.48 KB | None | 0 0
  1. # Threads or processors per block
  2. threads_per_block = 512
  3. # Blocks per grid (or blocks per GPU)
  4. blocks_per_grid = 36
  5. # Send coord1 to GPU
  6. coord1_gpu = cuda.to_device(coord1)
  7. # Send coord2 to GPU
  8. coord2_gpu = cuda.to_device(coord2)
  9. # Create an output on GPU
  10. out_gpu = cuda.device_array(shape=(n,), dtype=np.int32)
  11. # Oof, okay, function[block_def,thread_def](*args)
  12. get_nearby_kernel[blocks_per_grid, threads_per_block](coord1_gpu, coord2_gpu, 1.0, out_gpu)
  13. # Get the result!
  14. out_gpu.copy_to_host()
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement