Tokenization
Byte Pair Encoding (BPE)
A data compression algorithm adapted for tokenization that iteratively merges the most frequent character pairs to create an optimized subword vocabulary.
← Indietro