File:retnet comparison.png: Revision history

Jump to navigation Jump to search

Diff selection: Mark the radio buttons of the revisions to compare and hit enter or the button at the bottom.
Legend: (cur) = difference with latest revision, (prev) = difference with preceding revision, m = minor edit.

11 March 2025

  • curprev 00:5000:50, 11 March 2025Aelmancy talk contribs 476 bytes +476 comparing retnet to other models. from @misc{sun2023retentivenetworksuccessortransformer, title={Retentive Network: A Successor to Transformer for Large Language Models}, author={Yutao Sun and Li Dong and Shaohan Huang and Shuming Ma and Yuqing Xia and Jilong Xue and Jianyong Wang and Furu Wei}, year={2023}, eprint={2307.08621}, archivePrefix={arXiv}, primaryClass={cs.CL}, url={https://arxiv.org/abs/2307.08621}, }