User contributions for W55xu
2 April 2025
- 13:47, 2 April 2025 +1,057 stat946W25 →Introduction and Motivation
- 13:27, 2 April 2025 +1,449 stat946W25 →Limitations and Future Work
1 April 2025
- 15:07, 1 April 2025 +640 stat946W25 →Key Methods:
- 15:03, 1 April 2025 +467 stat946W25 →Metrics for Distilled Data
14 March 2025
- 17:56, 14 March 2025 +661 stat946W25 →Topic 5: KD / Pruning / Sharing
- 17:47, 14 March 2025 +516 stat946W25 →TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer
- 17:39, 14 March 2025 +6 stat946W25 →Hardware Optimizations
- 17:38, 14 March 2025 +436 stat946W25 →Hardware Optimizations
- 17:37, 14 March 2025 −481 stat946W25 →Chunking Strategy Tag: Manual revert
- 17:35, 14 March 2025 +481 stat946W25 →Chunking Strategy
- 17:31, 14 March 2025 +482 stat946W25 →Memory-Recall Tradeoff
7 March 2025
- 16:23, 7 March 2025 +17 stat946W25 →Mamba
- 16:21, 7 March 2025 +2 stat946W25 →Selective State Space Models in Mamba
- 16:20, 7 March 2025 +2,363 stat946W25 →Mamba