File list
This special page shows all uploaded files.
Date | Name | Size | User | Description | Versions |
---|---|---|---|---|---|
07:00, 3 April 2025 | dalle 5.png | 1.19 MB | Aelmancy | | 1 |
06:41, 3 April 2025 | dalle 4.webp | 13 KB | Aelmancy | source: https://medium.com/@zaiinn440/how-openais-dall-e-works-da24ac6c12fa | 1 |
04:51, 3 April 2025 | dalle 3.webp | 20 KB | Aelmancy | source: https://medium.com/@zaiinn440/how-openais-dall-e-works-da24ac6c12fa | 1 |
04:51, 3 April 2025 | dalle 2.webp | 19 KB | Aelmancy | source: https://medium.com/@zaiinn440/how-openais-dall-e-works-da24ac6c12fa | 1 |
04:40, 3 April 2025 | dalle 1.webp | 13 KB | Aelmancy | source: https://medium.com/@zaiinn440/how-openais-dall-e-works-da24ac6c12fa | 1 |
16:57, 1 April 2025 | BLIPV2Arch.png | 404 KB | Yc24wang | | 1 |
15:43, 1 April 2025 | table2 Latency.png | 18 KB | Q9lyu | Latency | 1 |
11:07, 1 April 2025 | The elastic continued-training phase.png | 54 KB | Q9lyu | The description of the elastic continued-training phase with random sampling | 1 |
11:07, 1 April 2025 | Selection.png | 463 KB | Y2582che | | 1 |
11:04, 1 April 2025 | Elastic.png | 242 KB | Y2582che | | 1 |
01:19, 1 April 2025 | SortedNet.png | 420 KB | P2zheng | | 1 |
00:45, 1 April 2025 | BLIP.png | 362 KB | P2zheng | | 1 |
04:10, 30 March 2025 | Flextron.png | 88 KB | T54shen | Converting Trained LLM into Flextron | 1 |
03:05, 29 March 2025 | 9-1.png | 123 KB | M2ghorba | | 1 |
01:46, 29 March 2025 | MatFormer-Training.png | 95 KB | X56tan | | 1 |
13:06, 28 March 2025 | H2Q Figure1.png | 159 KB | M54rahma | accuracy-memory trade-off | 1 |
23:50, 25 March 2025 | sa-table.png | 261 KB | J3bright | | 1 |
23:31, 25 March 2025 | sa.png | 195 KB | J3bright | | 1 |
21:51, 25 March 2025 | dynamic pruning 1.png | 58 KB | Aelmancy | | 1 |
21:09, 23 March 2025 | 940 17.2 output.png | 80 KB | P59zhang | | 1 |
20:30, 22 March 2025 | ZeroQuant3.png | 106 KB | W33jiang | | 1 |
20:16, 22 March 2025 | ZeroQuant2.png | 706 KB | W33jiang | | 1 |
20:06, 22 March 2025 | ZeroQuant.png | 116 KB | W33jiang | | 1 |
13:50, 21 March 2025 | H2O eviction algo.png | 78 KB | J46lei | | 1 |
21:38, 20 March 2025 | echoatt results3.png | 60 KB | Aelmancy | | 1 |
21:25, 20 March 2025 | echoatt results2.png | 67 KB | Aelmancy | | 1 |
21:22, 20 March 2025 | echoatt results1.png | 65 KB | Aelmancy | | 1 |
18:33, 20 March 2025 | gptq.png | 88 KB | J3bright | | 1 |
14:39, 19 March 2025 | Mean zero-shot accuracy.png | 276 KB | K4liang | | 1 |
14:31, 19 March 2025 | results on WikiText2.png | 144 KB | K4liang | | 2 |
13:41, 19 March 2025 | QwithRMSNorm.png | 172 KB | K4liang | | 1 |
13:38, 19 March 2025 | rmsnorm.png | 196 KB | K4liang | | 1 |
19:32, 18 March 2025 | echoatt.png | 324 KB | Aelmancy | https://doi.org/10.48550/arXiv.2409.14595 | 1 |
23:47, 17 March 2025 | GLA Results2.png | 112 KB | W33jiang | | 1 |
23:37, 17 March 2025 | GLA Results.png | 131 KB | W33jiang | | 1 |
23:22, 17 March 2025 | RetNet Results.png | 171 KB | W33jiang | | 1 |
20:09, 17 March 2025 | invariance Theorem.png | 160 KB | K4liang | | 1 |
12:52, 17 March 2025 | sortcut.png | 125 KB | J3bright | | 1 |
11:12, 17 March 2025 | cost flashfft.png | 90 KB | J3bright | | 1 |
23:57, 15 March 2025 | BigBird Results.png | 480 KB | W33jiang | | 1 |
23:46, 15 March 2025 | BigBird Sparse Attention.png | 142 KB | W33jiang | | 1 |
23:05, 15 March 2025 | Flash Attentionv3-Results.png | 142 KB | X56tan | | 1 |
23:04, 15 March 2025 | Flash Attentionv3-PingPong.png | 57 KB | X56tan | | 1 |
00:25, 15 March 2025 | Screenshot 2025-03-15.png | 43 KB | M2ghorba | | 1 |
21:54, 14 March 2025 | KD.png | 115 KB | P2zheng | Sequence-level knowledge distillation | 1 |
18:30, 14 March 2025 | Screenshot 2025-03-14 182945.png | 28 KB | A4ngan | | 1 |
21:07, 13 March 2025 | Flash Attention V2 Attention forward + backward speed on A100 GPU.png | 255 KB | W33jiang | | 1 |
21:32, 12 March 2025 | RobustAlgorithm.jpg | 60 KB | K4liang | | 2 |
00:50, 11 March 2025 | retnet comparison.png | 71 KB | Aelmancy | Comparing RetNet to other models. From: Yutao Sun, Li Dong, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, and Furu Wei, "Retentive Network: A Successor to Transformer for Large Language Models," 2023, arXiv:2307.08621 [cs.CL], https://arxiv.org/abs/2307.08621 | 1 |
23:01, 10 March 2025 | RetNet dual.png | 64 KB | M54rahma | Dual form of RetNet | 1 |