User contributions for Q9lyu
8 April 2025
- 22:41, 8 April 2025 +405 stat946W25 →Method: Learning Ratios via Denoising Score Entropy
- 22:29, 8 April 2025 +117 stat946W25 →Method: Learning Ratios via Denoising Score Entropy
- 22:24, 8 April 2025 +3 stat946W25 →Method: Learning Ratios via Denoising Score Entropy
- 22:23, 8 April 2025 +526 stat946W25 →Method: Learning Ratios via Denoising Score Entropy
- 18:04, 8 April 2025 +342 stat946W25 →Method: Learning Ratios via Denoising Score Entropy
- 17:54, 8 April 2025 +181 stat946W25 →Method: Learning Ratios via Denoising Score Entropy
- 17:11, 8 April 2025 +3 stat946W25 →Method: Learning Ratios via Denoising Score Entropy
1 April 2025
- 15:44, 1 April 2025 +73 stat946W25 →Performance
- 15:43, 1 April 2025 +21 N File:table2 Latency.png Latency current
- 15:40, 1 April 2025 +275 stat946W25 →How Does FLEXTRON Work
- 15:29, 1 April 2025 +167 stat946W25 →How Does FLEXTRON Work
- 15:17, 1 April 2025 +247 stat946W25 →Step 4: Training the Routers using a Surrogate Model
- 15:06, 1 April 2025 −12 stat946W25 →Step 2: Elastic Continued-Training
- 15:03, 1 April 2025 +59 stat946W25 →How Does FLEXTRON Work
- 13:08, 1 April 2025 +12 stat946W25 →Steps
- 13:08, 1 April 2025 0 stat946W25 →Step 3: Automatic Network Selection via Routers
- 13:01, 1 April 2025 0 stat946W25 →Step 3: Automatic Network Selection via Routers
- 13:01, 1 April 2025 −40 stat946W25 →Steps
- 11:21, 1 April 2025 +52 stat946W25 →Steps
- 11:11, 1 April 2025 +76 stat946W25 →Steps
- 11:07, 1 April 2025 +90 N File:The elastic continued-training phase.png The description of the elastic continued-training phase with random sampling current
25 March 2025
- 21:31, 25 March 2025 +1,084 stat946W25 →Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
- 21:07, 25 March 2025 +316 stat946W25 →Example of Fixed-Point Multiplier
- 15:41, 25 March 2025 −16 stat946W25 →Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
- 15:40, 25 March 2025 +450 stat946W25 →Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
- 13:57, 25 March 2025 +7 stat946W25 →Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
18 March 2025
- 13:41, 18 March 2025 +650 stat946W25 →Flash Attention V1
14 March 2025
- 22:50, 14 March 2025 −6 stat946W25 →Diagonal State Space Model (DSS)
- 22:48, 14 March 2025 +102 stat946W25 →Diagonal State Space Model (DSS)
- 22:42, 14 March 2025 +3 stat946W25 →Diagonal State Space Model (DSS)
- 22:42, 14 March 2025 +15 stat946W25 →Two Variants of DSS
- 22:36, 14 March 2025 +592 stat946W25 →Diagonal State Space Model (DSS)
9 March 2025
- 22:05, 9 March 2025 +75 stat946W25 →Big Bird Sparse Attention
- 18:55, 9 March 2025 +411 stat946W25 →Big Bird Sparse Attention
6 March 2025
- 14:45, 6 March 2025 +13 m stat946W25 →Core concepts