Glosarium AI
Kamus lengkap Kecerdasan Buatan
ALBERT
Lightweight version of BERT significantly reducing parameters through embedding sharing and matrix factorization of layers. Maintains competitive performance while being more memory-efficient.
ELECTRA
Efficient pre-training architecture replacing masked language modeling with corrupted token replacement. Uses a discriminator that identifies replaced tokens, enabling faster and more effective training.
ERNIE
Chinese model integrating structured and hierarchical knowledge into the base Transformer architecture. Simultaneously masks words, entities, and phrases to capture multi-level semantics.
BART
Bidirectional and autoregressive Transformer architecture combining the advantages of BERT and GPT. Uses an encoder-decoder with text corruption for pre-training, excellent for generation tasks.
Funnel Transformers
Hierarchical architecture progressively reducing sequence length across layers while preserving important information. Significantly saves computational memory for long sequences.
DeBERTa
Improvement on BERT incorporating enhanced decoding with disentangled content and position attention. Uses a disentangled attention mechanism and enhanced size masking for better performance.
TinyBERT
Ultra-compact version of BERT reducing parameters up to 7.5 times while maintaining high performance. Applies bidirectional distillation and multi-level attention for compression.
CamemBERT
French version of BERT pre-trained on 138GB of French text. Maintains the original BERT architecture but is specialized for French understanding and processing.
FlauBERT
French Transformer-based language model with progressive pre-training using increasingly large corpora. Incorporates French linguistic specificities for optimal performance.
XLM-RoBERTa
Multilingual version of RoBERTa pre-trained on 100 languages using massive Common Crawl dataset. Outperforms XLM and mBERT thanks to improved pre-training and better handling of low-resource languages.
Sentence-BERT
BERT modification optimized for encoding entire sentences into semantic vectors. Uses siamese and triplet networks to produce relevant embeddings for semantic similarity.
VideoBERT
Multimodal extension of BERT learning joint video-text representations. Performs pre-training on visual and linguistic tokens for video understanding.
Controlled BERT
BERT variant allowing control of style attributes during text generation. Integrates controllers in the architecture to modulate desired linguistic characteristics.