Multilingual NER - AI Glossary

Cross-Lingual Transfer

Ability of an NER model trained on a source language to apply its knowledge to recognize entities in a target language, without requiring annotated data for the latter.

Unified Multilingual Model

NER architecture where a single model is trained simultaneously on data from multiple languages, sharing vector representations to capture universal entity recognition patterns.

Vector Space Alignment

Technique that projects the embedding spaces of different languages into a common vector space, so that a model can process and compare words or entities across languages.
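
One common way to formalize such a projection is the orthogonal Procrustes problem. The sketch below assumes a seed dictionary of translation pairs whose source and target embeddings are stacked as the rows of matrices X and Y (both n × d); the symbols X, Y, and W are illustrative and do not come from the glossary itself.

```latex
W^{*} = \operatorname*{arg\,min}_{W \in \mathbb{R}^{d \times d},\; W^{\top} W = I} \lVert XW - Y \rVert_{F}
      = U V^{\top},
\qquad \text{where } U \Sigma V^{\top} = \operatorname{SVD}\!\left(X^{\top} Y\right).
```

Constraining W to be orthogonal preserves distances and angles within the source space, which is one reason this formulation is often preferred over an unconstrained linear map.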

Multilingual Fine-Tuning

Process of adapting a language model pre-trained on vast multilingual corpora to the NER task, using annotated datasets in multiple languages.

Code-Switching NER

Challenge in multilingual NER: recognizing entities in text where speakers alternate between languages, often within the same sentence.

Translingual Entities

Named entities that maintain an identical form or reference across multiple languages, such as brand names (Google), organizations (UN), or people (Barack Obama).

Multilingual Domain Adaptation

Technique for adjusting a multilingual NER model to a specific domain (medical, legal) using unannotated or weakly annotated data in multiple languages.

Multilingual Character Embeddings

Vector representations at the character level, shared between languages, that allow the model to capture similar morphologies (e.g., Latin roots) and generalize to new words.
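
To make the idea of shared character-level units concrete, here is a minimal sketch that extracts boundary-marked character trigrams and compares two cognates. It only shows the units an embedding table could be indexed by, not the embeddings themselves; the helper names `char_ngrams` and `jaccard` are hypothetical.

```python
def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, with < and > marking word boundaries."""
    padded = f"<{word.casefold()}>"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b):
    """Jaccard overlap between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# English and Spanish cognates share most of their character trigrams,
# so a model with shared character-level parameters can reuse what it
# learned for one when it meets the other.
english = char_ngrams("national")
spanish = char_ngrams("nacional")
shared = english & spanish
overlap = jaccard(english, spanish)
```

Unrelated words, or words in different scripts, would share few or no n-grams, which is where this kind of sharing stops helping.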

Projected Annotation

Method for creating NER training data in a target language by translating annotated source-language text (or using existing parallel text) and projecting the entity labels onto the target side, typically via word alignment.
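
The projection step can be sketched as follows, assuming the translation and a word alignment (pairs of source and target token indices) are already available from external tools. The function `project_labels` and the example alignment are hypothetical and hand-written for illustration.

```python
def project_labels(src_labels, alignment, tgt_len):
    """Project BIO labels from source tokens onto aligned target tokens.

    src_labels: BIO labels for the source sentence.
    alignment:  (source_index, target_index) pairs from a word aligner.
    tgt_len:    number of target tokens.
    """
    # Step 1: copy the entity type (without the B-/I- prefix) across each link.
    tgt_types = [None] * tgt_len
    for s, t in alignment:
        label = src_labels[s]
        if label != "O":
            tgt_types[t] = label.split("-", 1)[1]

    # Step 2: rebuild BIO prefixes in target order, since word order may differ.
    # Simplification: two adjacent entities of the same type would be merged.
    tgt_labels, prev = [], None
    for typ in tgt_types:
        if typ is None:
            tgt_labels.append("O")
        elif typ == prev:
            tgt_labels.append("I-" + typ)
        else:
            tgt_labels.append("B-" + typ)
        prev = typ
    return tgt_labels


src_tokens = ["United", "Nations", "met", "in", "Geneva"]
src_labels = ["B-ORG", "I-ORG", "O", "O", "B-LOC"]
tgt_tokens = ["Les", "Nations", "Unies", "se", "sont", "réunies", "à", "Genève"]
alignment = [(0, 2), (1, 1), (2, 5), (3, 6), (4, 7)]  # assumed aligner output

projected = project_labels(src_labels, alignment, len(tgt_tokens))
```

Alignment errors propagate directly into the projected labels, so projected data is usually treated as noisy supervision.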

Low-Resource NER Models

NER systems designed to work with very limited amounts of annotated data in one or more target languages, often through transfer learning from high-resource languages.

Multilingual Entity Normalization

Task of grouping different linguistic or orthographic variants of the same entity (e.g., 'New York', 'Nueva York', 'New York City') under a single canonical identifier.
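
A minimal sketch of the lookup side of normalization, assuming an alias table has already been compiled. The identifier `LOC:new_york_city` and the helper names are hypothetical; a real system would typically map to a knowledge-base ID and add fuzzy or context-aware matching.

```python
import unicodedata


def normalize_key(mention):
    """Fold case, Unicode compatibility forms, and whitespace so variants compare equal."""
    folded = unicodedata.normalize("NFKC", mention).casefold()
    return " ".join(folded.split())


# Hypothetical alias table: surface variant -> canonical identifier.
ALIAS_TABLE = {
    normalize_key(alias): canonical
    for alias, canonical in [
        ("New York", "LOC:new_york_city"),
        ("Nueva York", "LOC:new_york_city"),
        ("New York City", "LOC:new_york_city"),
    ]
}


def canonical_id(mention):
    """Return the canonical identifier for a mention, or None if it is unknown."""
    return ALIAS_TABLE.get(normalize_key(mention))


ids = [canonical_id(m) for m in ("New York", "  nueva   YORK ", "New York City")]
```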

Multilingual Evaluation

Process of measuring the performance of an NER system on a diverse set of languages, often using standard metrics (precision, recall, F1-score) calculated per language and in aggregate.
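
The per-language and aggregate metrics can be sketched as below, assuming exact-match scoring over entity spans represented as `(sentence_id, start, end, type)` tuples. The function names and the toy data are illustrative assumptions, not a standard evaluation script.

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive, false-negative counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def evaluate(gold, pred):
    """gold, pred: dict mapping language code -> set of entity tuples."""
    per_lang = {}
    tp_all = fp_all = fn_all = 0
    for lang, gold_ents in gold.items():
        pred_ents = pred.get(lang, set())
        tp = len(gold_ents & pred_ents)   # exact span and type match
        fp = len(pred_ents - gold_ents)
        fn = len(gold_ents - pred_ents)
        per_lang[lang] = prf(tp, fp, fn)
        tp_all, fp_all, fn_all = tp_all + tp, fp_all + fp, fn_all + fn
    micro = prf(tp_all, fp_all, fn_all)   # pools counts; large languages dominate
    macro_f1 = sum(f for _, _, f in per_lang.values()) / len(per_lang)  # each language counts equally
    return per_lang, micro, macro_f1


gold = {
    "en": {(0, 0, 2, "PER"), (0, 3, 4, "LOC"), (1, 0, 1, "ORG")},
    "sv": {(0, 0, 1, "ORG"), (1, 2, 3, "LOC")},
}
pred = {
    "en": {(0, 0, 2, "PER"), (0, 3, 4, "LOC"), (1, 0, 1, "ORG")},
    "sv": {(0, 0, 1, "ORG"), (1, 4, 5, "PER")},
}
per_lang, micro, macro_f1 = evaluate(gold, pred)
```

Reporting both aggregates is useful because a micro average can hide weak results on languages with little test data, while a macro average exposes them.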

Large-Scale Multilingual Language Models (mLLM)

Foundation models like mBERT or XLM-R, pre-trained on roughly 100 languages, which serve as a basis for building high-performing multilingual NER systems through fine-tuning.

Language Detection for NER

Preliminary step in multilingual NER pipelines that rely on language-specific models: identifying the language of the input text so that the appropriate entity recognition model can be selected.
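
The routing logic can be sketched as below. The stopword-counting detector is a deliberately naive stand-in for a real language identifier, and the model callables are hypothetical stubs; only the detect-then-dispatch structure is the point.

```python
# Tiny, hand-picked stopword lists (an assumption for illustration only).
STOPWORDS = {
    "en": {"the", "and", "is", "of", "in"},
    "sv": {"och", "är", "att", "det", "i"},
    "es": {"el", "la", "y", "es", "los"},
}


def detect_language(text):
    """Return the language whose stopwords occur most often, or 'und' if none occur."""
    tokens = text.casefold().split()
    scores = {
        lang: sum(token in words for token in tokens)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "und"


def route(text, models, fallback):
    """Send the text to the model registered for its language, else to the fallback."""
    return models.get(detect_language(text), fallback)(text)


# Hypothetical stand-ins for real NER models.
models = {
    "en": lambda text: ("en-model", text),
    "sv": lambda text: ("sv-model", text),
}
fallback = lambda text: ("multilingual-model", text)

result = route("Stockholm är huvudstaden i Sverige och det är stort", models, fallback)
```

Falling back to a single multilingual model for undetected or unsupported languages keeps the pipeline from failing on short, mixed, or noisy inputs.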

Script-Independent NER

Ability of an NER model to recognize entities independently of the writing system (Latin alphabet, Cyrillic, Arabic, etc.), relying on abstract language representations.

Back-Translation for NER

Data augmentation technique where an annotated text in a source language is translated to a target language, then back-translated to the source language, producing paraphrased training examples; the entity labels must then be re-aligned to the new wording.
