AI Glossary
The Complete Dictionary of Artificial Intelligence
FastText
Extension of Word2Vec developed by Facebook that represents each word as the sum of its character n-gram vectors, which lets it handle out-of-vocabulary words and morphologically rich languages.
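A minimal pure-Python sketch of the idea, not the real FastText library: the n-gram extraction follows the paper's `<word>` boundary-marker convention, while the per-n-gram vectors here are deterministic hash-derived stand-ins for learned embeddings (the dimension and hashing scheme are assumptions for illustration).

```python
import hashlib

DIM = 8  # toy embedding dimension (assumption)

def char_ngrams(word, n_min=3, n_max=5):
    """Extract character n-grams of '<word>' with boundary markers."""
    w = f"<{word}>"
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def ngram_vector(ngram):
    """Deterministic pseudo-random vector per n-gram, standing in
    for a learned n-gram embedding."""
    h = hashlib.md5(ngram.encode()).digest()
    return [b / 255 - 0.5 for b in h[:DIM]]

def word_vector(word):
    """Word vector = sum of its character n-gram vectors, so even an
    out-of-vocabulary word gets a representation."""
    vec = [0.0] * DIM
    for g in char_ngrams(word):
        vec = [a + b for a, b in zip(vec, ngram_vector(g))]
    return vec

print(char_ngrams("cat"))            # n-grams of "<cat>"
print(len(word_vector("unseenword")))  # an OOV word still gets DIM values
```

Because "unseenword" shares n-grams such as "see" and "word" with vocabulary words, its vector lands near related words even though it was never seen during training.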
Contextual Embeddings
Dynamic vector representations whose values change according to the usage context of the word, unlike static embeddings that assign a single fixed vector per word.
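A toy illustration of the contrast (all vectors and the blending rule are hypothetical): the context-dependent function below blends a word's base vector with the mean of its neighbors, a crude stand-in for what LSTM or Transformer encoders learn, so the same word gets different vectors in different sentences.

```python
# Hypothetical 3-d static vectors for a tiny vocabulary.
STATIC = {
    "bank":  [0.5, 0.5, 0.0],
    "river": [0.0, 1.0, 0.0],
    "money": [1.0, 0.0, 0.0],
    "the":   [0.1, 0.1, 0.1],
}

def contextual_vector(word, sentence, alpha=0.5):
    """Blend the word's static vector with the mean of its context
    vectors -- a toy proxy for a learned contextual encoder."""
    context = [w for w in sentence if w != word and w in STATIC]
    mean = [sum(STATIC[w][i] for w in context) / len(context)
            for i in range(3)]
    return [(1 - alpha) * b + alpha * m
            for b, m in zip(STATIC[word], mean)]

v1 = contextual_vector("bank", ["the", "river", "bank"])
v2 = contextual_vector("bank", ["the", "money", "bank"])
print(v1 != v2)  # True: same word, different vector per context
```

A static lookup would return `STATIC["bank"]` in both sentences; the contextual version separates the riverbank and financial senses.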
Static Embeddings
Fixed vector representations where each word has a single vector representation independent of its context, as in classic Word2Vec or GloVe.
Skip-gram
Training architecture that predicts context words from a central word; it works well for rare words and small corpora and captures fine-grained semantic relationships.
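The training pairs Skip-gram learns from can be extracted with a few lines; this sketch (a hypothetical helper, not a full trainer) shows each center word paired with every word in its window.

```python
def skipgram_pairs(tokens, window=2):
    """Yield (center, context) training pairs: the center word is used
    to predict each neighbor within the window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

print(skipgram_pairs(["the", "cat", "sat", "on", "mat"], window=1))
```

In a real trainer each pair becomes one prediction step, typically optimized with negative sampling or hierarchical softmax.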
CBOW
Continuous Bag of Words, model that predicts a central word from the average of the vectors of its context words, efficient for training on large corpora.
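A sketch of the CBOW input side (toy vectors, hypothetical names): each training example pairs a set of context words with the center word they must predict, and the model consumes the averaged context vectors.

```python
def cbow_examples(tokens, window=2):
    """Yield (context_words, center_word) training examples."""
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        if context:
            yield context, center

def average(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

EMB = {"the": [1.0, 0.0], "cat": [0.0, 1.0], "sat": [0.5, 0.5]}
for context, center in cbow_examples(["the", "cat", "sat"], window=1):
    print(center, "<-", average([EMB[w] for w in context]))
```

Averaging discards word order inside the window, which is what makes CBOW cheap and the reason it is called a "bag" of words.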
Subword Embeddings
Vector representation technique that decomposes words into smaller units (characters, morphemes) to handle open vocabulary and capture morphological information.
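One simple way to decompose a word into subword units is greedy longest-match against a unit vocabulary, in the spirit of WordPiece tokenization; the vocabulary below is hypothetical, and each resulting unit would have its own embedding.

```python
# Hypothetical subword vocabulary; real systems learn it from a corpus.
VOCAB = {"un", "break", "able", "ing", "walk"}

def subword_split(word):
    """Greedy longest-match decomposition into known subword units,
    falling back to single characters for uncovered spans."""
    units, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest piece first
            if word[i:j] in VOCAB:
                units.append(word[i:j])
                i = j
                break
        else:
            units.append(word[i])  # raw character fallback
            i += 1
    return units

print(subword_split("unbreakable"))  # → ['un', 'break', 'able']
print(subword_split("walking"))      # → ['walk', 'ing']
```

The character fallback is what keeps the vocabulary open: any string can be tokenized, so no word is ever truly out of vocabulary.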
ELMo
Embeddings from Language Models, approach that generates contextual embeddings by combining hidden states of bidirectional LSTM networks pretrained on vast corpora.
Sentence Embeddings
Vector representations that encode entire sentences into single fixed-length vectors, capturing global meaning and semantic structure at the sentence level.
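The simplest sentence embedding is mean pooling over word vectors; trained models such as Sentence-BERT learn much better encoders, but this baseline (with hypothetical 2-d toy vectors) already places paraphrases close together.

```python
WORD_VECS = {"good": [1.0, 0.0], "movie": [0.0, 1.0],
             "great": [0.9, 0.1], "film": [0.1, 0.9]}

def sentence_embedding(tokens):
    """Mean-pool the word vectors of the known tokens."""
    vecs = [WORD_VECS[t] for t in tokens if t in WORD_VECS]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(2)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

s1 = sentence_embedding(["good", "movie"])
s2 = sentence_embedding(["great", "film"])
print(cosine(s1, s2))  # near 1.0: the paraphrases land close together
```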
Doc2Vec
Extension of Word2Vec that generates embeddings for entire documents by introducing a document identifier as additional context during training.
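A sketch of how the document identifier enters training in the PV-DM variant of Doc2Vec (helper names are hypothetical): the document's ID token joins every context window, so its learned vector is trained on all of the document's windows and becomes the document embedding.

```python
def pv_dm_examples(doc_id, tokens, window=1):
    """Yield CBOW-style (context, center) examples where the document
    ID is prepended to every context window."""
    examples = []
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        # The doc ID participates in every prediction for this document,
        # which is what turns its vector into a document representation.
        examples.append(([doc_id] + context, center))
    return examples

print(pv_dm_examples("DOC_42", ["cats", "chase", "mice"]))
```

At inference time, a new document gets an embedding by freezing the word vectors and fitting only a fresh ID vector against the same objective.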
Universal Sentence Encoder
Google model that transforms texts into high-dimensional embeddings, optimized for semantic similarity and text classification tasks.
RoBERTa
Robustly Optimized BERT Pretraining Approach, improved version of BERT trained longer on more data with larger batches and dynamic masking, and without the next-sentence-prediction objective.
Embedding Layer
First layer of NLP neural networks that transforms token indices into dense vectors, learning these representations during training.
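Stripped of the training machinery, an embedding layer is just a lookup table from token index to a trainable dense row; this minimal sketch (initialization values are hypothetical) mirrors the behavior of layers like PyTorch's `nn.Embedding`.

```python
import random

class EmbeddingLayer:
    """Token index -> dense vector lookup; the rows are the parameters
    a trainer would update by gradient descent."""
    def __init__(self, vocab_size, dim, seed=0):
        rng = random.Random(seed)
        # one trainable row per vocabulary index (toy random init)
        self.weight = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                       for _ in range(vocab_size)]

    def __call__(self, indices):
        return [self.weight[i] for i in indices]  # pure table lookup

emb = EmbeddingLayer(vocab_size=100, dim=4)
out = emb([3, 17, 3])
print(len(out), len(out[0]))  # 3 4
print(out[0] == out[2])       # True: same index, same vector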
Vector Space Model
Algebraic representation where words are points in a multidimensional space, allowing mathematical operations to measure semantic similarities.
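The standard mathematical operation on such a space is cosine similarity, which measures the angle between two word points; the vectors below are toy values chosen for illustration.

```python
import math

def cosine(a, b):
    """Cosine of the angle between two vectors: 1 for the same
    direction, 0 for orthogonal, -1 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

king, queen, apple = [0.9, 0.8, 0.1], [0.85, 0.82, 0.12], [0.1, 0.2, 0.95]
print(cosine(king, queen) > cosine(king, apple))  # True: royalty terms are closer
```

Cosine is preferred over Euclidean distance here because it ignores vector length, comparing only direction, which is where the semantic information lives.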