Tokenization and Encoding
N-gram Tokenization
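An approach that splits text into contiguous sequences of n items (characters or words), typically taken with a sliding window so that adjacent n-grams overlap. It captures local context, but the vocabulary grows combinatorially: with V distinct items there are up to V^n possible n-grams.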
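A minimal sketch of the idea in Python, assuming a sliding window with stride 1 and whitespace splitting for words. The function names char_ngrams and word_ngrams are hypothetical, chosen for illustration only, and real tokenizers usually add normalization and padding that are omitted here.

```python
def char_ngrams(text, n):
    """Return every contiguous run of n characters in text, in order."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # A text shorter than n yields an empty list.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n):
    """Return every contiguous run of n whitespace-separated words, as tuples."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    words = text.split()  # assumption: plain whitespace splitting
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


print(char_ngrams("token", 2))
# ['to', 'ok', 'ke', 'en']

print(word_ngrams("the cat sat on the mat", 2))
# [('the', 'cat'), ('cat', 'sat'), ('sat', 'on'), ('on', 'the'), ('the', 'mat')]
```

On a larger corpus, counting the distinct n-grams for increasing n (for example with len(set(word_ngrams(corpus, n)))) is a simple way to observe the vocabulary growth described above.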