User contributions for M54rahma
5 April 2025
- 16:35, 5 April 2025 +154 stat946W25 →Mathematical Formulation
- 16:29, 5 April 2025 +89 N File:forward reverse diffusion processes.png A graphical model representing the forward and reverse diffusion processes. current
- 16:27, 5 April 2025 +3,830 stat946W25 →Diffusion-LM: A Continuous Diffusion Model for Controllable Text Generation
- 15:18, 5 April 2025 −1 stat946W25 →Controllable Generation Tasks
- 15:17, 5 April 2025 −211 stat946W25 →Diffusion-LM: A Continuous Diffusion Model for Controllable Text Generation
- 14:04, 5 April 2025 +702 stat946W25 →Controllable Generation with Diffusion-LM
- 12:39, 5 April 2025 +21 stat946W25 →Exposure Bias in AR Models
29 March 2025
- 15:06, 29 March 2025 +6 stat946W25 →H_2O: Efficient KV Cache Compression for Large Language Models
- 15:05, 29 March 2025 +6,318 stat946W25 →Method
28 March 2025
- 15:58, 28 March 2025 +3,625 stat946W25 →H_2O: Efficient KV Cache Compression for Large Language Models
- 15:19, 28 March 2025 −27 stat946W25 →The Problem This Paper Tried to Address
- 15:18, 28 March 2025 +286 stat946W25 →Method
- 14:57, 28 March 2025 +1,106 stat946W25 →Key Contributions
- 13:15, 28 March 2025 −5 stat946W25 →The Problem This Paper Tried to Address
- 13:13, 28 March 2025 +191 stat946W25 →The Problem This Paper Tried to Address
- 13:08, 28 March 2025 +803 stat946W25 →The Problem This Paper Tried to Address
- 13:06, 28 March 2025 +39 N File:H2Q Figure1.png accuracy-memory trade-off current
- 12:00, 28 March 2025 +28 stat946W25 →Limitations and Future Work
- 12:00, 28 March 2025 +240 stat946W25 →Limitations and Future Work
- 00:16, 28 March 2025 −26 stat946W25 →The Problem This Paper Tried to Address
- 00:05, 28 March 2025 +181 stat946W25 →Background
21 March 2025
- 21:38, 21 March 2025 +12 stat946W25 →1- Fine-grained Hardware-friendly Quantization Scheme
- 21:33, 21 March 2025 +1,264 stat946W25 →1- Fine-grained Hardware-friendly Quantization Scheme
- 20:59, 21 March 2025 +1,446 stat946W25 →1- Fine-grained Hardware-friendly Quantization Scheme
- 20:29, 21 March 2025 +181 stat946W25 →ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
- 20:22, 21 March 2025 +121 stat946W25 →ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
- 20:21, 21 March 2025 +1,244 stat946W25 →ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
- 18:41, 21 March 2025 +1,624 stat946W25 →ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
- 00:44, 21 March 2025 +710 stat946W25 →ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
- 00:25, 21 March 2025 +503 stat946W25 →Layer-by-Layer Knowledge Distillation (LKD)
- 00:22, 21 March 2025 −4 stat946W25 →Topic 4: Quantization
- 00:21, 21 March 2025 −1,093 stat946W25 →ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
- 00:06, 21 March 2025 +2,155 stat946W25 →ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
10 March 2025
- 23:06, 10 March 2025 +76 stat946W25 →Key Approaches to Linear Attention
- 23:01, 10 March 2025 +33 N File:RetNet dual.png Dual form of RetNet current
- 22:54, 10 March 2025 +901 stat946W25 →Gated Linear Attention (GLA)
- 22:44, 10 March 2025 +874 stat946W25 →Gated Linear Attention (GLA)
- 22:00, 10 March 2025 +21 stat946W25 →Topic 10: Linear Attention
- 21:55, 10 March 2025 +1,101 stat946W25 →Key Approaches to Linear Attention
- 21:12, 10 March 2025 +1,124 stat946W25 →Topic 10: Linear Attention
- 20:34, 10 March 2025 +296 stat946W25 →Topic 10: Linear Attention
2 March 2025
- 20:44, 2 March 2025 +2,046 stat946W25 →Core concepts
- 15:57, 2 March 2025 +1,667 stat946W25 →Introduction
- 15:57, 2 March 2025 +1,711 stat946W25 →Topic 12: State Space Models
- 15:54, 2 March 2025 −3,404 main Page →State Space Models
- 15:52, 2 March 2025 +78 main Page →Core concepts
- 15:38, 2 March 2025 +1,120 main Page →Core concepts
- 15:19, 2 March 2025 +27 N File:RNN Structure.png RNN structure current
- 15:16, 2 March 2025 +469 main Page →State Space Models
- 14:24, 2 March 2025 +1,736 main Page →Statistical Learning - Classification (STAT 441/841 CM 763- Fall 2021)