SIGN IN SIGN UP

fix: detokenization case where first token does not start with a leading space (#1375)

* Fix tokenization edge case where llama output does not start with a space

See this notebook:
https://colab.research.google.com/drive/1Ooz11nFPk19zyJdMDx42CeesU8aWZMdI#scrollTo=oKpHw5PZ30uC

* Update _internals.py

Fixing to compare to b' ' instead of (str)' '

---------

Co-authored-by: Andrei <abetlen@gmail.com>
N
Noam Gat committed
e0d7674e62bdc5b906d2461238993ea3a022f61f
Parent: 1f56c64
Committed by GitHub <noreply@github.com> on 5/4/2024, 2:14:59 PM