Search Results - Weinbach, Samuel
5
u-\(\mu\)P: The Unit-Scaled Maximal Update Parametrization
Published in arXiv.org
Article
9
Tokenizer Choice For LLM Training: Negligible or Crucial?
Published in arXiv.org
Article
10
Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs
Published in arXiv.org
Article
11
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Published in arXiv.org
Article