User contributions for X56tan
6 April 2025
- 02:26, 6 April 2025 +1,682 stat946W25 →Topic 20: Diffusion Language Model
- 02:11, 6 April 2025 +419 stat946W25 →Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution
- 02:01, 6 April 2025 +385 stat946W25 →Masked-Diffusion LM: Faster and Smarter
- 01:59, 6 April 2025 +397 stat946W25 →Motivation and Limitations of Autoregressive Models
- 01:59, 6 April 2025 −397 stat946W25 →Gaussian Diffusion for Text (Tag: Manual revert)
- 01:57, 6 April 2025 +397 stat946W25 →Gaussian Diffusion for Text
29 March 2025
- 01:48, 29 March 2025 +73 stat946W25 →Slicing: Enabling Elastic Inference
- 01:46, 29 March 2025 0 N File:MatFormer-Training.png No edit summary current
- 01:45, 29 March 2025 +2,099 stat946W25 →MatFormer: Nested Transformer for Elastic Inference
22 March 2025
- 23:30, 22 March 2025 +1,691 stat946W25 →SmoothQuant
- 23:01, 22 March 2025 +2 stat946W25 →ZeroQuant
- 22:56, 22 March 2025 +1,602 stat946W25 →GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
15 March 2025
- 23:08, 15 March 2025 +145 stat946W25 →Flash Attention V3
- 23:05, 15 March 2025 0 N File:Flash Attentionv3-Results.png No edit summary current
- 23:04, 15 March 2025 0 N File:Flash Attentionv3-PingPong.png No edit summary current
- 23:02, 15 March 2025 +1,674 stat946W25 →Flash Attention V3
8 March 2025
- 00:28, 8 March 2025 −1 stat946W25 →Sparse Sinkhorn Attention
- 00:27, 8 March 2025 +109 stat946W25 →Sparse Sinkhorn Attention
- 00:26, 8 March 2025 0 N File:Overview of Sparse Sinkhorn Attention.png No edit summary current
- 00:25, 8 March 2025 +1,707 stat946W25 →Topic 8: Sparse Attention
7 March 2025
- 20:32, 7 March 2025 +1 stat946W25 →Mamba-2
- 20:27, 7 March 2025 +1 stat946W25 →Mamba-2
- 20:26, 7 March 2025 −13 stat946W25 →Mamba-2
- 20:23, 7 March 2025 +252 stat946W25 →Mamba-2
- 20:18, 7 March 2025 0 N File:Mamba2-VennDiag.png No edit summary current
- 20:17, 7 March 2025 0 N File:Mamba2-SSD.png No edit summary current
- 20:17, 7 March 2025 0 N File:Mamba2-architecture.png No edit summary current
- 20:13, 7 March 2025 +294 stat946W25 →Semiseparable Matrices
- 20:08, 7 March 2025 0 stat946W25 →Semiseparable Matrices
- 20:07, 7 March 2025 −7 stat946W25 →Semiseparable Matrices
- 20:07, 7 March 2025 +69 stat946W25 →Semiseparable Matrices
- 20:05, 7 March 2025 0 N File:Mamba2-Matrix Efficient Algo.png No edit summary current
- 20:02, 7 March 2025 +4,307 stat946W25 →Topic 12: State Space Models