fix: detokenization case where first token does not start with a leading space (#1375)
* Fix tokenization edge case where llama output does not start with a space

  See this notebook: https://colab.research.google.com/drive/1Ooz11nFPk19zyJdMDx42CeesU8aWZMdI#scrollTo=oKpHw5PZ30uC

* Update _internals.py

  Fixing to compare to b' ' instead of (str)' '

---------

Co-authored-by: Andrei <abetlen@gmail.com>
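
The second bullet turns on a Python 3 detail: a `bytes` value never compares equal to a `str`, so a check against `' '` silently never matches a detokenized byte buffer. The sketch below illustrates that pitfall and a conditional strip. It is not the code from `_internals.py`; the helper name `strip_leading_space` is hypothetical and chosen only for illustration.

```python
def strip_leading_space(output: bytes) -> bytes:
    """Drop one leading space only if the buffer actually starts with one.

    Hypothetical helper for illustration; not the library's implementation.
    """
    # Slice (output[:1]) instead of indexing (output[0]): indexing bytes
    # yields an int, and slicing is also safe on an empty buffer.
    if output[:1] == b" ":
        return output[1:]
    return output


# In Python 3, bytes and str are never equal, so this check can never succeed.
never_matches = b" " == " "

with_space = strip_leading_space(b" Hello")    # leading space removed
without_space = strip_leading_space(b"Hello")  # returned unchanged
```

Stripping unconditionally would cut the first character of any output that does not begin with a space, which is the edge case the commit title describes.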
Noam Gat committed
e0d7674e62bdc5b906d2461238993ea3a022f61f
Parent: 1f56c64
Committed by GitHub <noreply@github.com>
on 5/4/2024, 2:14:59 PM