AI Glossary
The Complete Dictionary of Artificial Intelligence
162 categories
2,032 subcategories
23,060 terms
Count Vectorizer
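Transformation method that converts a collection of documents into a document-term matrix of token counts: one row per document, one column per vocabulary token, and each cell holding the number of times that token occurs in that document. Provides a basic frequency representation without normalization or TF-IDF weighting.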
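The counting step can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation of any particular library (scikit-learn's `CountVectorizer` is one widely used implementation with its own tokenization defaults). The function name `count_vectorize`, the lowercasing, and the `\w+` tokenizer are assumptions made for this example.

```python
import re
from collections import Counter


def count_vectorize(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts token j in document i."""
    # Assumed tokenizer: lowercase, then split on runs of word characters.
    tokenized = [re.findall(r"\w+", doc.lower()) for doc in documents]

    # The vocabulary is every distinct token, sorted for a stable column order.
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    column = {tok: j for j, tok in enumerate(vocabulary)}

    matrix = []
    for doc in tokenized:
        row = [0] * len(vocabulary)
        for tok, n in Counter(doc).items():
            row[column[tok]] = n  # raw count, no normalization or weighting
        matrix.append(row)
    return vocabulary, matrix


docs = ["the cat sat", "the cat saw the dog"]
vocabulary, matrix = count_vectorize(docs)
print(vocabulary)  # ['cat', 'dog', 'sat', 'saw', 'the']
print(matrix)      # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
```

Real implementations typically store the matrix in a sparse format, since most documents contain only a small fraction of the vocabulary.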
Subword Tokenization
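Segmentation method that divides words into smaller units (subwords) to handle unknown vocabulary and rare words. Algorithms such as BPE and WordPiece, and toolkits such as SentencePiece, learn the subword vocabulary from corpus statistics; BPE, for example, repeatedly merges the most frequent pair of adjacent symbols.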
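As an illustration of the frequency-based idea, here is a simplified BPE sketch. It assumes a toy word-frequency table, starts from single characters, omits the end-of-word marker that production tokenizers usually add, and breaks ties between equally frequent pairs by picking the lexicographically largest pair. The names `learn_bpe`, `merge_pair`, and `segment` are hypothetical helpers for this example only.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(word_counts, num_merges):
    """Learn an ordered list of merges from a {word: frequency} table."""
    # Each word starts out as a tuple of single characters.
    vocab = {tuple(word): count for word, count in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += count
        if not pair_counts:
            break  # every word is already a single symbol
        # Most frequent pair; ties go to the lexicographically largest pair.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        vocab = {merge_pair(s, best): c for s, c in vocab.items()}
    return merges


def segment(word, merges):
    """Split a word into subwords by replaying the merges in learned order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


word_counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges = learn_bpe(word_counts, num_merges=4)
print(merges)                     # [('s', 't'), ('e', 'st'), ('o', 'w'), ('l', 'ow')]
print(segment("lowest", merges))  # ['low', 'est']
```

The word "lowest" never appears in the toy table, yet it is segmented into the known subwords "low" and "est", which is how subword tokenization copes with unseen words.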