commit e0cfe5cedd7b8a744a87d44d5f96b06b49655291
Author: Shyam Pather <shyam.pather@gmail.com>
Date:   Thu Sep 21 15:06:39 2023 -0700

    Analyze a few more
commit 926284d508e9227ab9e44316c8f8d7db9a7945cd
Author: Shyam Pather <shyam.pather@gmail.com>
Date:   Thu Sep 21 14:49:20 2023 -0700

    Start of analysis of long block results

    1. Code to find the interesting indices
    2. Augment analyze_attention_weight_results() to plot the relevant parts of wei
    3. Improve plot_wei() to print the repr of tokens as labels.
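The implementation of plot_wei() is not shown in this log. A minimal sketch of what "print the repr of tokens as labels" could look like, assuming wei is a (T, T) attention-weight matrix plotted as a matplotlib heatmap; the function signature and parameter names here are hypothetical:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

def plot_wei(wei, tokens):
    """Plot an attention-weight matrix with token reprs as axis labels.

    wei: (T, T) array of attention weights for one head.
    tokens: list of T decoded token strings.
    (Hypothetical sketch; not the actual plot_wei() from this repo.)
    """
    # repr() makes whitespace and newline tokens visible in the labels
    labels = [repr(t) for t in tokens]
    fig, ax = plt.subplots(figsize=(6, 6))
    im = ax.imshow(wei, cmap="viridis")
    ax.set_xticks(range(len(labels)), labels=labels, rotation=90)
    ax.set_yticks(range(len(labels)), labels=labels)
    ax.set_xlabel("key position")
    ax.set_ylabel("query position")
    fig.colorbar(im, ax=ax)
    return fig

# Example: random causal (lower-triangular) weights for a 5-token sequence
T = 5
wei = np.tril(np.random.rand(T, T))
wei = wei / wei.sum(axis=1, keepdims=True)  # rows sum to 1, like softmax output
fig = plot_wei(wei, ["The", " cat", "\n", " sat", "."])
```

Using repr() rather than the raw strings keeps tokens like "\n" and leading spaces distinguishable on the axes, which matters when inspecting character- or BPE-level attention.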
commit 95be86bb6663ea427156d2c29703ebc5ed5cf17f
Author: Shyam Pather <shyam.pather@gmail.com>
Date:   Thu Sep 21 09:10:00 2023 -0700

    Run the attention weights experiments on all layers/heads

    Got through all layers/heads on s_len=7 and through block 0/head 4 on
    the s_len=256 run. Not committing the binaries for the s_len=256 run
    because the file sizes are too large. Will work on a solution to split
    them.
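The log leaves the splitting solution as future work. One hedged sketch of how oversized result arrays could be split row-wise into smaller files with NumPy; save_split() and load_split() are hypothetical names, not functions from this repo:

```python
import numpy as np

def save_split(arr, prefix, chunk_rows):
    """Save arr in row-wise chunks: prefix_000.npy, prefix_001.npy, ...

    Hypothetical helper: splits along axis 0 so each file stays under
    a repository's file-size limit.
    """
    paths = []
    for i, start in enumerate(range(0, arr.shape[0], chunk_rows)):
        path = f"{prefix}_{i:03d}.npy"
        np.save(path, arr[start:start + chunk_rows])
        paths.append(path)
    return paths

def load_split(paths):
    """Reassemble the original array from the chunk files, in order."""
    return np.concatenate([np.load(p) for p in paths], axis=0)

# Round-trip example on a small array
a = np.arange(12).reshape(6, 2)
paths = save_split(a, "wei_head0", chunk_rows=4)  # 6 rows -> 2 files
b = load_split(paths)
```

Splitting along the leading axis keeps each chunk independently loadable, so a single layer/head slice can be inspected without reading every file.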
commit 67d477581f1d6eb8b20d2e70e88b74321fcbf25e
Author: Shyam Pather <shyam.pather@gmail.com>
Date:   Wed Sep 20 11:04:58 2023 -0700

    Reduce copypasta in analyze_attention_weight_results() and print prefixes
commit 5559b4fed09d5c5215dd9d248378c8ef280d36b4
Author: Shyam Pather <shyam.pather@gmail.com>
Date:   Mon Sep 18 17:11:52 2023 -0700

    Analysis code for attention weight experiment results