File list

This special page shows all uploaded files.

Date Name Thumbnail Size User Description Versions
14:31, 19 March 2025 results on WikiText2.png (file) 144 KB K4liang   2
13:41, 19 March 2025 QwithRMSNorm.png (file) 172 KB K4liang   1
13:38, 19 March 2025 rmsnorm.png (file) 196 KB K4liang   1
19:32, 18 March 2025 echoatt.png (file) 324 KB Aelmancy https://doi.org/10.48550/arXiv.2409.14595 1
23:47, 17 March 2025 GLA Results2.png (file) 112 KB W33jiang   1
23:37, 17 March 2025 GLA Results.png (file) 131 KB W33jiang   1
23:22, 17 March 2025 RetNet Results.png (file) 171 KB W33jiang   1
20:09, 17 March 2025 invariance Theorem.png (file) 160 KB K4liang   1
12:52, 17 March 2025 sortcut.png (file) 125 KB J3bright   1
11:12, 17 March 2025 cost flashfft.png (file) 90 KB J3bright   1
23:57, 15 March 2025 BigBird Results.png (file) 480 KB W33jiang   1
23:46, 15 March 2025 BigBird Sparse Attention.png (file) 142 KB W33jiang   1
23:05, 15 March 2025 Flash Attentionv3-Results.png (file) 142 KB X56tan   1
23:04, 15 March 2025 Flash Attentionv3-PingPong.png (file) 57 KB X56tan   1
00:25, 15 March 2025 Screenshot 2025-03-15.png (file) 43 KB M2ghorba   1
21:54, 14 March 2025 KD.png (file) 115 KB P2zheng Sequence-level knowledge distillation 1
18:30, 14 March 2025 Screenshot 2025-03-14 182945.png (file) 28 KB A4ngan   1
21:07, 13 March 2025 Flash Attention V2 Attention forward + backward speed on A100 GPU.png (file) 255 KB W33jiang   1
21:32, 12 March 2025 RobustAlgorithm.jpg (file) 60 KB K4liang   2
00:50, 11 March 2025 retnet comparison.png (file) 71 KB Aelmancy comparing retnet to other models. from @misc{sun2023retentivenetworksuccessortransformer, title={Retentive Network: A Successor to Transformer for Large Language Models}, author={Yutao Sun and Li Dong and Shaohan Huang and Shuming Ma and Yuqing Xia and Jilong Xue and Jianyong Wang and Furu Wei}, year={2023}, eprint={2307.08621}, archivePrefix={arXiv}, primaryClass={cs.CL}, url={https://arxiv.org/abs/2307.08621}, } 1
23:01, 10 March 2025 RetNet dual.png (file) 64 KB M54rahma Dual form of RetNet 1
17:39, 10 March 2025 retnet impossible triangle.png (file) 65 KB Aelmancy impossible triangle from @article{sun2023retentive, title={Retentive network: A successor to transformer for large language models}, author={Sun, Yutao and Dong, Li and Huang, Shaohan and Ma, Shuming and Xia, Yuqing and Xue, Jilong and Wang, Jianyong and Wei, Furu}, journal={arXiv preprint arXiv:2307.08621}, year={2023} } 1
00:26, 8 March 2025 Overview of Sparse Sinkhorn Attention.png (file) 87 KB X56tan   1
20:18, 7 March 2025 Mamba2-VennDiag.png (file) 57 KB X56tan   1
20:17, 7 March 2025 Mamba2-SSD.png (file) 69 KB X56tan   1
20:17, 7 March 2025 Mamba2-architecture.png (file) 55 KB X56tan   1
20:05, 7 March 2025 Mamba2-Matrix Efficient Algo.png (file) 145 KB X56tan   1
14:03, 7 March 2025 memory-recall-tradeoff.png (file) 132 KB Yc24wang   1
13:10, 7 March 2025 Mamba-Architecture.png (file) 119 KB Yc24wang   1
21:33, 6 March 2025 H3 synthetic tasks eval.png (file) 75 KB W33jiang   2
21:24, 6 March 2025 Synthetic tasks.png (file) 80 KB W33jiang   1
21:21, 6 March 2025 H3 layer.png (file) 54 KB W33jiang   1
18:07, 3 March 2025 13.13.png (file) 35 KB Rtymkow   1
15:41, 2 March 2025 mode collapse.png (file) 178 KB Ksuszek   1
15:19, 2 March 2025 RNN Structure.png (file) 75 KB M54rahma RNN structure 1
21:01, 1 March 2025 GAN part d.png (file) 28 KB Ksuszek   1
21:01, 1 March 2025 GAN part c.png (file) 28 KB Ksuszek   1
21:01, 10 February 2025 10.11.png (file) 120 KB Rtymkow   1
18:48, 7 February 2025 RNN plot.png (file) 215 KB W258xu   1
15:58, 7 February 2025 ouput.png (file) 188 KB Thudon   1
15:57, 7 February 2025 output.png (file) 188 KB Thudon   1
12:40, 7 February 2025 sentence similarity matrix.png (file) 15 KB Fjean   1
10:32, 7 February 2025 Screenshot 2025-02-07 093140.png (file) 62 KB A22amiri   1
09:14, 7 February 2025 Screenshot 2025-02-07 081441.png (file) 240 KB A22amiri   1
19:52, 4 February 2025 7.10a.png (file) 119 KB Rtymkow   1
16:28, 4 February 2025 double pendulum test.png (file) 78 KB Ksuszek   1
01:46, 31 January 2025 LSTM v.s. Dense model.png (file) 271 KB Z238zhan The plot of the accuracy of the LSTM model and the dense model 1
20:15, 30 January 2025 Recurrent NN .png (file) 429 KB Z238zhan Screenshot from Lecture 8 1
08:10, 30 January 2025 Exercise8 1.png (file) 33 KB C63ng Plot for exercise 8.1 1
18:35, 28 January 2025 different pooling strategies.png (file) 257 KB Z238zhan Validation accuracy for different pooling strategies on CIFAR-10 dataset 1