User contributions for Catai
4 April 2025
- 21:49, 4 April 2025 +1,548 stat946W25 →Masked-Diffusion LM: Faster and Smarter: Added more details and examples to Masked-Diffusion LM.
29 March 2025
- 16:12, 29 March 2025 +3,737 stat946W25 →Topic 7: Dynamic Models: Many-in-One Language Models: Added a description for SHARCS.
22 March 2025
- 22:19, 22 March 2025 +1,311 stat946W25 →Topic 4: Quantization: Added more to the beginning to provide an overview of the technologies before each one is described in more detail.
12 March 2025
- 22:04, 12 March 2025 +21 stat946W25 →Flash Attention V3: Fixed formatting for better readability.
- 22:02, 12 March 2025 +1,578 stat946W25 →Topic 11: Flash Attention: Added section for Flash Attention V3 and elaborated on the technique.
- 21:35, 12 March 2025 +178 stat946W25 →Flash Attention V1: Added the key point of Flash Attention V1 and a transition into the detailed description.
9 March 2025
- 15:28, 9 March 2025 +4 stat946W25 →SpAtten: Adjusted spacing for readability.
- 15:27, 9 March 2025 +2,322 stat946W25 →SpAtten: Elaborated on the technique with more details.
- 14:56, 9 March 2025 0 stat946W25 →Attention with Linear Biases (ALiBi): Fixed typo.
- 14:55, 9 March 2025 −71 stat946W25 →Sparse Sinkhorn Attention: Reworded the sorting section to be block-specific, given its heading.
- 14:51, 9 March 2025 +253 stat946W25 →Sparse Sinkhorn Attention: Added more description to the sorting algorithm and fixed typos.
- 12:20, 9 March 2025 +1,091 stat946W25 →Topic 12: State Space Models: Added key takeaway section with a table to summarize the main wiki points.
- 11:38, 9 March 2025 −15 stat946W25 →Diagonal State Space Model (DSS): Edited the transition sentence for better wiki flow.
- 11:37, 9 March 2025 +975 stat946W25 →Structured State Space (S4): Provided more description of the model and its shortcomings.
- 11:27, 9 March 2025 +2 stat946W25 Separated equations from description in core concepts section.
- 11:27, 9 March 2025 −1 stat946W25 Formatted spacing for core concepts.
- 11:27, 9 March 2025 +3 stat946W25 Formatted core concepts better with spacing.
- 11:26, 9 March 2025 +229 stat946W25 Added further description to "Core concepts".