AI Glossary
The complete glossary of AI
Document Chunking
Process of segmenting large documents into smaller, coherent fragments to optimize their processing by language models and vector search systems.
Fixed-size Chunking
Segmentation strategy that divides documents into fragments of predefined size, based on a constant number of characters, words, or tokens.
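As a minimal sketch of the idea, assuming characters as the unit of size: the helper name `fixed_size_chunks` and its `size` parameter are illustrative, not taken from any particular library.

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Split text into consecutive pieces of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Step through the text in strides of `size`; the last piece may be shorter.
    return [text[i:i + size] for i in range(0, len(text), size)]


print(fixed_size_chunks("abcdefghij", 4))  # ['abcd', 'efgh', 'ij']
```

The same slicing works on a list of words or tokens instead of a string.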
Semantic Chunking
Segmentation approach based on semantic understanding of content, creating fragments that preserve thematic and contextual coherence.
Recursive Character Splitting
Hierarchical segmentation method that divides documents according to a sequence of separators (paragraphs, sentences, words) until reaching the desired fragment size.
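A simplified sketch of the recursion, under stated assumptions: the function name, the default separator order, and the hard-cut fallback are illustrative choices. Unlike production splitters, this version drops the separators and does not merge small neighboring pieces back together.

```python
def recursive_split(text: str, max_len: int,
                    separators: tuple[str, ...] = ("\n\n", ". ", " ")) -> list[str]:
    """Split on the coarsest separator first, recursing with finer ones as needed."""
    if len(text) <= max_len:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to a hard cut at max_len characters.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    sep, finer = separators[0], separators[1:]
    chunks: list[str] = []
    for piece in text.split(sep):
        # Pieces that still exceed max_len are split again with finer separators.
        chunks.extend(recursive_split(piece, max_len, finer))
    return chunks


print(recursive_split("aa bb\n\ncc", 5))  # ['aa bb', 'cc']
```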
Token-based Chunking
Segmentation strategy that measures fragment size in tokens rather than characters or words, so that each fragment fits within the context limit of a language model such as GPT or BERT.
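A hedged sketch of packing sentences under a token budget. The `count_tokens` function below is a stand-in (an assumption): it counts words and punctuation marks, whereas a real system would use the tokenizer of the target model. The names `token_chunks` and `max_tokens` are illustrative.

```python
import re


def count_tokens(text: str) -> int:
    # Stand-in tokenizer (assumption): each word or punctuation mark is one token.
    return len(re.findall(r"\w+|[^\w\s]", text))


def token_chunks(sentences: list[str], max_tokens: int) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_tokens tokens."""
    chunks: list[str] = []
    current: list[str] = []
    used = 0
    for sentence in sentences:
        n = count_tokens(sentence)
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and used + n > max_tokens:
            chunks.append(" ".join(current))
            current, used = [], 0
        current.append(sentence)
        used += n
    if current:
        chunks.append(" ".join(current))
    return chunks


print(token_chunks(["A b c.", "D e.", "F g h i."], max_tokens=7))
```

A single sentence longer than the budget is kept whole in this sketch; a fuller version would split it further.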
Overlapping Chunks
Technique in which adjacent fragments share a span of text, so that context cut at a boundary survives in both neighbors and retrieved fragments remain coherent.
Hierarchical Chunking
Multi-level approach organizing fragments according to a hierarchical structure (chapters, sections, paragraphs) to enable contextual retrieval at different granularities.
Sliding Window Chunking
Method that moves a fixed-size window across the document in steps of a set length, producing sequential fragments; a step smaller than the window makes consecutive fragments overlap by the difference.
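A minimal sketch over a list of tokens; the names `sliding_window`, `size`, and `step` are illustrative. With `step` smaller than `size`, consecutive windows share `size - step` tokens.

```python
def sliding_window(tokens: list[str], size: int, step: int) -> list[list[str]]:
    """Return windows of up to `size` tokens, advancing `step` tokens each time."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    windows: list[list[str]] = []
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + size])
        # Stop once a window has reached the end of the document.
        if start + size >= len(tokens):
            break
    return windows


print(sliding_window(list("abcdefg"), size=4, step=2))
# [['a', 'b', 'c', 'd'], ['c', 'd', 'e', 'f'], ['e', 'f', 'g']]
```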
Markdown-aware Chunking
Segmentation strategy that respects the Markdown structure of a document, splitting only at logical boundaries such as headings, lists, and code blocks.
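A simplified sketch that splits only at headings and never inside a fenced code block. The function name is illustrative, and the handling covers ATX headings (`#` to `######`) and backtick fences only; list boundaries and other Markdown constructs are not treated here.

```python
import re

FENCE = "`" * 3  # opening/closing marker of a fenced code block


def markdown_sections(md: str) -> list[str]:
    """Split Markdown text into sections that each start at a heading."""
    sections: list[str] = []
    current: list[str] = []
    in_fence = False
    for line in md.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # A '#' line inside a code fence is code, not a heading.
        is_heading = not in_fence and re.match(r"#{1,6}\s", line) is not None
        if is_heading and current:
            sections.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current))
    return sections
```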
Context-aware Chunking
Approach that uses the overall semantic context of the document to choose breakpoints that preserve narrative coherence.
Embedding-based Chunking
Method using semantic embeddings to identify natural boundaries between thematically distinct segments in a document.
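A toy sketch of the boundary test: start a new fragment wherever the similarity between adjacent sentences drops below a threshold. The `embed` function here is a stand-in (an assumption) that uses word counts; a real system would call an embedding model. The `threshold` value is also illustrative.

```python
import math
from collections import Counter


def embed(sentence: str) -> Counter:
    # Toy stand-in for an embedding model (assumption): bag-of-words counts.
    return Counter(sentence.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def embedding_chunks(sentences: list[str], threshold: float) -> list[list[str]]:
    """Group consecutive sentences; break where adjacent similarity < threshold."""
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(embed(prev), embed(cur)) < threshold:
            chunks.append([cur])    # topic shift: start a new fragment
        else:
            chunks[-1].append(cur)  # same topic: extend the current fragment
    return chunks


sentences = ["cats chase mice", "cats chase birds", "stocks fell today"]
print(embedding_chunks(sentences, threshold=0.3))
# [['cats chase mice', 'cats chase birds'], ['stocks fell today']]
```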
Hybrid Chunking Strategy
Combination of multiple segmentation techniques, such as semantic chunking with fixed size limits, to optimize both coherence and efficiency.
Dynamic Chunk Sizing
Adaptive approach adjusting fragment size based on information density and semantic complexity of each document section.
Metadata-enriched Chunking
Technique that attaches contextual metadata (position, parent title, hierarchical level) to each fragment to improve retrieval and the reconstruction of context.
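A minimal sketch of the three metadata fields named above, assuming Markdown-style `#` headings as the source of titles and levels. The `Chunk` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    position: int      # index of the fragment within the document
    parent_title: str  # nearest preceding heading
    level: int         # depth of that heading (1 for '#', 2 for '##', ...)


def chunk_with_metadata(lines: list[str]) -> list[Chunk]:
    """Turn each non-empty text line into a Chunk tagged with its heading context."""
    chunks: list[Chunk] = []
    title, level = "", 0
    for line in lines:
        if line.startswith("#"):
            # Update the heading context; headings themselves are not emitted.
            level = len(line) - len(line.lstrip("#"))
            title = line.lstrip("#").strip()
        elif line.strip():
            chunks.append(Chunk(line.strip(), len(chunks), title, level))
    return chunks


for chunk in chunk_with_metadata(["# Intro", "Hello.", "", "## Details", "More."]):
    print(chunk)
```

Storing the parent title with each fragment lets a retriever show where a passage came from without reloading the full document.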
Cross-document Chunking
Strategy that segments a set of related documents into coherent fragments while preserving the relationships between documents, so that retrieval can draw on the collection as a whole.
Multi-level Chunking
Approach creating multiple levels of fragments (summaries, detailed sections, paragraphs) to enable flexible retrieval according to granularity needs.
Adaptive Chunking
System that selects or adjusts the segmentation strategy at run time according to document type, domain, and observed usage patterns.