Juan Herrera
juampahc
AI & ML interests
None yet
Transformers alternatives
- Transformers are Multi-State RNNs
  Paper • 2401.06104 • Published • 39
- Linear Transformers with Learnable Kernel Functions are Better In-Context Models
  Paper • 2402.10644 • Published • 81
- In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss
  Paper • 2402.10790 • Published • 42
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 630
Extending Context-Length
- CLEX: Continuous Length Extrapolation for Large Language Models
  Paper • 2310.16450 • Published • 10
- E^2-LLM: Efficient and Extreme Length Extension of Large Language Models
  Paper • 2401.06951 • Published • 26
- Data Engineering for Scaling Language Models to 128K Context
  Paper • 2402.10171 • Published • 25
models 9
juampahc/gliner_multi-v2.1-openvino
Token Classification • Updated • 10
juampahc/llama-3.2-3b-openvino
Updated • 1
juampahc/gliner_multi-v2.1-onnx
Token Classification • Updated • 35
juampahc/bge-m3-m2v-758
Sentence Similarity • Updated • 1
juampahc/bge-m3-m2v-256
Sentence Similarity • Updated • 2 • 2
juampahc/bge-m3-m2v-1024
Sentence Similarity • Updated • 2
juampahc/bge-m3-baai-onnx
Sentence Similarity • Updated • 5
juampahc/bge-m3-baai-quant-opt
Sentence Similarity • Updated • 3
juampahc/bge-m3-baai-quant
Sentence Similarity • Updated • 6
datasets 0
None public yet