04adedf Add some analysis of the adjustments from SA and ffwd.
045dfb4 Add the ffwd outputs to the experiment
ab20384 Factor core logic of cluster_proj_outputs() into a more generic function
395f245 Factor core logic of get_all_proj_outputs_for_slen() into a more generic function
1d556c1 Add Cluster class
12629bb Re-ran the experiment and the file sizes changed slightly
019b28d Rename ProjMatrixExperiment to BlockInternalsExperiment
644a279 Remove index from substrings
a819aeb Rename cluster_proj_matrix_results() to cluster_proj_outputs()
0dba347 Add missing line to LogitsExperiment test
7929eeb Add proj matrix experiment and analysis
479c55e Quick look at proj weight and bias
f2f5771 Code to analyze tensors in the block and visualize them.
b322789 Add plot_wei_for_all_heads()
47e9991 Make plot_wei take axes
15e11a3 Format cell containing attention_head_details() and plot_wei()
09af049 Start examining example
36d8d52 Fix bug in plot_logit_lens: first row is the input
454fc27 Find some interesting strings and begin analyzing them
62903d4 Make stride length 96 and re-run logits experiment to get more data
27711d4 Replace all the manual calculations in DataBatcher with tensor.unfold() [see the unfold sketch after this list]
9ddc0e4 Introduce DataBatcher
caabdb4 Experiment to run a bunch of strings through the model and examine logits
015d59c Study of the V matrix
e5eb62d WIP trying to understand V
e490b48 More work on positional encodings
6e8ee21 Positional encoding investigation
6e0e43d Fix bugs in plotting positional encodings and plot them for b0h0
5e07e4a Analyze a few more
d89b6ae Start of analysis of long block results
175390f Add .gitignore for s_len256 files
c13dac8 Run the attention weights experiments on all layers/heads
56a31c9 Add code to print the top tokens in the top decile
da9ea11 Make plot_wei() take any iterable of str for the labels
0c3201e Clean up quick experiments on long contexts
9dc6238 Reduce copypasta in analyze_attention_weight_results() and print prefixes
a2548ab Update analysis with new experiment results.
edf5509 Make attention weights experiment run over all strings in the validation set
5982bda Add gitignore
3413973 Remove some unused variables from run_attention_weights_experiment()
ecc5047 Add some explanatory text to the attention weights experiments section and allow passing in the data set
5559b4f Analysis code for attention weight experiment results
44f7910 Experiment to run a bunch of sample tokens through and get the attention weights
e7cbf46 Attention head analysis showing how b1h0 copies the first row to the second.
f28800c More interpretation of outputs.
c70446a Slight improvement to attention_head_details and more analysis of output
185f107 Add walkthrough of output calculation
5a06ccc Fix math errors
bec0625 Code to display attention heads
ecfbdff Some exposition on the math behind attention [see the attention sketch after this list]
400fdba Detailed analysis and examination of how `:` enters at block 0
ad48eda Add format_topk_chars
ba6dfc1 Readability improvement
d68767e Consider sa residual in addition to ffwd output
3422f65 Fix head progression analysis to use ffwd output
f2cec93 Refactor head_progression to return the whole io_accessor
5091a61 Change ortho experiment to use uniform distribution and try variant with angles
b93edd4 WIP analysis of blocks
9e48393 Some early evidence that predicted chars correlate with cosine sims of learned embeddings [see the cosine-similarity sketch after this list]
f8b7d2c Small refactorings and code to analyze frequencies in the input text and compare to transformer.
102b1f3 Add frequency graph for corpus
a5e8f06 Add title to blocks progress plot
fb8d069 Clean up random experiments a bit
130c904 Add std dev to orthogonality plot
c57fa43 Response curve code cleanup and add kq response curves
32c2894 Add plots of all response curves
795d672 Basic response curve experiments
ad5d8f2 Add orthogonality experiment
7d893d5 Add stats about cosine similarity matrices
d2e2411 Re-run the multi-embeddings and rotations with the new code; update graphs and conclusions
a408cd1 Clean up a bunch of stuff in experiments
2f12ac7 Remove a bunch of junk from experiments
980578b Add function to disambiguate filenames based on case. Use this everywhere
de882ee Fix major bug in creation of char_to_embedding
5b51df5 Move singular vectors code into the main section and build char_to_embeddings from it
c6d39e0 WIP: a bunch of experiments related to multi-embeddings
f23a19d Add title to cosine sims plots and add plot for full embeddings
2e0cf77 Add code to create final PCA embeddings and move cosine similarity out of the random experiments section
c5cf3b8 Remove some useless experiments
de6e58b Clean up experiments in light of bug fix and add explanatory notes.
389b458 Replace manual loops to find indices with helper functions.
79a09b9 Clean up some error cells
1d883d5 WIP: Adding PCA
f8a32de Add code to plot embeddings
c49b107 Add analysis of zeroing out last bits and replacing them with random values
1a15255 Delete old rotations files
0868e53 Port and improve code to perform rotations, save, load, and plot results.
a9fcd65 Add line_profiler to dev requirements
fc57231 Remove loop in computation of x[n_embed-1] in cartesian_from_spherical; brings down execution time from 373ms to 3.28ms
2fe3005 Pre-compute cumulative product of sines; brings down execution time for cartesian_from_spherical from 373ms to 6.17ms [see the cartesian_from_spherical sketch after this list]
ac71edd Improve cartesian_from_spherical perf by caching sines and cosines
e74d50c Port over rotation functions from old notebook with better tests
43a8081 Add code to learn final embeddings and show they are not unique
6ebebad Format logit lens code
163d2f3 Implement logit lens in the new codebase [see the logit-lens sketch after this list]
1baec34 Show expanded graphs of block 1 self attention and final block output
9f3d17c Move the function that computes intermediates higher up.
e1edaf2 Replicate the heads isolation analysis in the new codebase
677414a Iterate on new functions for running the model to the point that I can duplicate the blocks progress analysis
acb69aa Start of cleaned-up analysis notebook
3638246 Rename file to scratchpad
6e489a6 Add vector rotations experiment
04dfd24 Fix learning of embeddings to include layer_norm; learned embeddings for just logits. All still WIP
89b0de2 WIP checkpoint with a whole lot of rando experiments
c82a374 Add SVD for attention heads
1ecebf7 Early experiment of projecting the singular vectors into token space
ff52cc2 Start of the logit lens experiment
6445566 Initial commit
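
Sketch for commit 27711d4 (tensor.unfold). The log names DataBatcher but not its interface, so the wrapper below, with its make_windows name and block_size/stride parameters, is a hypothetical reconstruction; only Tensor.unfold(dimension, size, step) itself is standard PyTorch, and it is the whole trick: it turns a manual sliding-window loop into a single strided view.

import torch

def make_windows(data: torch.Tensor, block_size: int, stride: int) -> torch.Tensor:
    # Equivalent to the manual version:
    #   [data[i:i + block_size] for i in range(0, len(data) - block_size + 1, stride)]
    # but returns a (num_windows, block_size) view with no Python loop.
    return data.unfold(0, block_size, stride)

tokens = torch.arange(10)
print(make_windows(tokens, block_size=4, stride=3))
# tensor([[0, 1, 2, 3],
#         [3, 4, 5, 6],
#         [6, 7, 8, 9]])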
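Sketch for commit ecfbdff (the math behind attention). This is the standard single-head causal attention-weight computation, softmax(q @ k.T / sqrt(head_size)) under a lower-triangular mask; the W_q/W_k names are generic stand-ins rather than the notebook's actual variables, and plot_wei() presumably visualizes a matrix shaped like the wei returned here.

import math
import torch
import torch.nn.functional as F

def attention_weights(x: torch.Tensor, W_q: torch.Tensor, W_k: torch.Tensor) -> torch.Tensor:
    # x: (T, n_embed); W_q, W_k: (n_embed, head_size).
    q, k = x @ W_q, x @ W_k
    wei = q @ k.transpose(-2, -1) / math.sqrt(k.shape[-1])  # (T, T) scores
    mask = torch.tril(torch.ones(x.shape[0], x.shape[0]))   # causal: no peeking ahead
    wei = wei.masked_fill(mask == 0, float('-inf'))
    return F.softmax(wei, dim=-1)                           # each row sums to 1

Row t of the result says how much position t attends to each position at or before t, which is the kind of matrix the attention-weights experiments earlier in the log appear to collect and plot per layer/head.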
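Sketch for commit 9e48393 (cosine sims of learned embeddings). Computing the pairwise cosine-similarity matrix of an embedding table is a one-liner after row normalization; emb here is a stand-in for whatever tensor the notebook's char_to_embedding maps into, not the repo's actual variable.

import torch
import torch.nn.functional as F

def cosine_sim_matrix(emb: torch.Tensor) -> torch.Tensor:
    # emb: (vocab_size, n_embed) -> (vocab_size, vocab_size), where entry
    # [i, j] is the cosine of the angle between embeddings i and j.
    normed = F.normalize(emb, dim=-1)  # unit-length rows
    return normed @ normed.T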
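Sketch for commits 2fe3005/fc57231 (speeding up cartesian_from_spherical). Assuming the standard n-sphere parameterization, each coordinate x_k is r times a running product of sines times one cosine, so the per-coordinate loop can be replaced by a single torch.cumprod over the sines. The repo's actual function isn't visible in this log, so this is a re-derivation of the technique the commit messages describe, not the author's code.

import torch

def cartesian_from_spherical(r: torch.Tensor, phi: torch.Tensor) -> torch.Tensor:
    # phi holds the n-1 angles of an n-dimensional spherical coordinate:
    #   x_1 = r * cos(phi_1)
    #   x_k = r * sin(phi_1) * ... * sin(phi_{k-1}) * cos(phi_k)
    #   x_n = r * sin(phi_1) * ... * sin(phi_{n-1})
    sines, cosines = torch.sin(phi), torch.cos(phi)
    # Running products of sines, with a leading 1 for the x_1 term.
    sin_prods = torch.cat([torch.ones(1), torch.cumprod(sines, dim=0)])
    # Pair each running product with the next cosine; the last coordinate
    # (the x[n_embed-1] that fc57231 mentions) gets no cosine factor.
    return r * sin_prods * torch.cat([cosines, torch.ones(1)])

The cumprod removes the inner loop over sine factors for each coordinate, which is consistent with the two-orders-of-magnitude speedups the commit messages report.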
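Sketch for commit 163d2f3 (logit lens). The technique takes the residual stream after each block and pushes it straight through the model's final layer norm and unembedding, as if the network ended there. The ln_f and lm_head names are conventional nanoGPT-style assumptions, not confirmed by the log; note also commit 36d8d52's caveat that the first row of such a plot is the raw input, not a block output.

import torch

@torch.no_grad()
def logit_lens(residuals: list[torch.Tensor],
               ln_f: torch.nn.Module,
               lm_head: torch.nn.Module) -> torch.Tensor:
    # residuals: per-layer residual-stream activations, each (T, n_embed).
    # Returns (n_layers, T, vocab_size): what the model would predict if it
    # stopped after each layer. If residuals[0] is the input embedding,
    # row 0 shows the input directly (cf. commit 36d8d52).
    return torch.stack([lm_head(ln_f(h)) for h in residuals])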