spather

Untitled

Sep 30th, 2023
commit e0cfe5cedd7b8a744a87d44d5f96b06b49655291
Author: Shyam Pather <shyam.pather@gmail.com>
Date: Thu Sep 21 15:06:39 2023 -0700

Analyze a few more

commit 926284d508e9227ab9e44316c8f8d7db9a7945cd
Author: Shyam Pather <shyam.pather@gmail.com>
Date: Thu Sep 21 14:49:20 2023 -0700

Start of analysis of long block results

1. Code to find the interesting indices
2. Augment analyze_attention_weight_results() to plot the relevant parts of wei
3. Improve plot_wei() to print the repr of tokens as labels.

commit 95be86bb6663ea427156d2c29703ebc5ed5cf17f
Author: Shyam Pather <shyam.pather@gmail.com>
Date: Thu Sep 21 09:10:00 2023 -0700

Run the attention weights experiments on all layers/heads

Got through all layers/heads on s_len=7 and through block 0/head 4 on
the s_len=256 run. Not committing the binaries for the s_len=256 run
because the file sizes are too large. Will work on a solution to split
them.

commit 67d477581f1d6eb8b20d2e70e88b74321fcbf25e
Author: Shyam Pather <shyam.pather@gmail.com>
Date: Wed Sep 20 11:04:58 2023 -0700

Reduce copypasta in analyze_attention_weight_results() and print prefixes

commit 5559b4fed09d5c5215dd9d248378c8ef280d36b4
Author: Shyam Pather <shyam.pather@gmail.com>
Date: Mon Sep 18 17:11:52 2023 -0700

Analysis code for attention weight experiment results