
AI Glossary

A complete dictionary of Artificial Intelligence

162 categories, 2,032 subcategories, 23,060 terms

End-to-End Latency

The total time elapsed between a user sending a request and receiving the complete response, including every processing step of the QA system.
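A minimal sketch of how such a measurement might be taken, assuming the whole QA pipeline sits behind a single call. The function `answer_question` is a hypothetical stand-in, not part of any real system.

```python
import time

def answer_question(question: str) -> str:
    """Hypothetical stand-in for the full pipeline (retrieval, reranking, generation)."""
    time.sleep(0.01)  # simulate processing work
    return f"answer to: {question}"

def timed_call(question: str):
    """Return the response and the wall-clock seconds from request to complete response."""
    start = time.perf_counter()
    response = answer_question(question)
    elapsed = time.perf_counter() - start
    return response, elapsed

response, latency_s = timed_call("What is a semantic cache?")
print(f"end-to-end latency: {latency_s * 1000:.1f} ms")
```

Measuring on the client side would also include network time, which this in-process sketch leaves out.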


Semantic Cache

A cache that stores answers keyed by the meaning of a query rather than its exact text, so that a semantically similar question can be served a previously computed answer without recomputation.
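The idea can be sketched as follows. The bag-of-words `embed` function and the 0.8 threshold are assumptions made for illustration; a real semantic cache would more likely use a learned sentence encoder and a tuned threshold.

```python
import math
from collections import Counter
from typing import Optional

def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts instead of a learned encoder."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries = []  # list of (query embedding, cached answer)

    def put(self, query: str, answer: str) -> None:
        self.entries.append((embed(query), answer))

    def get(self, query: str) -> Optional[str]:
        """Return the answer of the most similar cached query, if similar enough."""
        q = embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best is not None and cosine(q, best[0]) >= self.threshold:
            return best[1]
        return None

cache = SemanticCache()
cache.put("what is the capital of france", "Paris")
hit = cache.get("what is the capital city of france")   # similar wording: served from cache
miss = cache.get("how tall is mount everest")           # unrelated: falls through
```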


Real-Time Inverted Index

A data structure that continuously updates the mapping of terms to documents, enabling instant querying of newly added or modified data.
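A minimal in-memory sketch of the term-to-document mapping with immediate updates. Whitespace tokenization and the class name `InvertedIndex` are simplifying assumptions; production indexes add persistence, ranking, and concurrency control.

```python
from collections import defaultdict

class InvertedIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.doc_terms = {}               # doc id -> its terms, needed for updates

    def upsert(self, doc_id, text):
        """Add or replace a document; the change is searchable right away."""
        self.remove(doc_id)
        terms = set(text.lower().split())
        self.doc_terms[doc_id] = terms
        for term in terms:
            self.postings[term].add(doc_id)

    def remove(self, doc_id):
        for term in self.doc_terms.pop(doc_id, set()):
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]

    def search(self, query):
        """Return ids of documents that contain every query term."""
        terms = query.lower().split()
        if not terms:
            return set()
        return set.intersection(*[self.postings.get(t, set()) for t in terms])

index = InvertedIndex()
index.upsert(1, "fast vector search")
index.upsert(2, "fast keyword search")
before = index.search("fast search")
index.upsert(1, "slow disk scan")       # document 1 is modified
after = index.search("fast search")
```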


Dense Retrieval Model

An approach using vector embeddings to represent documents and queries in a common semantic space, optimized for fast and accurate search.
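As an illustration, retrieval in a shared vector space can be sketched with cosine similarity. The three-dimensional vectors below are made up for the example; a real model would produce them with a trained encoder, and large collections would use an approximate nearest-neighbor index rather than a full scan.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical document embeddings (assumption: invented values for illustration).
doc_vectors = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax":  [0.0, 0.2, 0.9],
}

def retrieve(query_vec, k=2):
    """Rank documents by similarity to the query vector and keep the top k."""
    ranked = sorted(doc_vectors, key=lambda d: cosine(query_vec, doc_vectors[d]), reverse=True)
    return ranked[:k]

top = retrieve([1.0, 0.0, 0.0])
```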


Online Neural Reranking

The process of rescoring an initial set of search results with a deep learning model at query time, so that the most relevant answers move to the top of the list.


Asynchronous Processing Pipeline

An architecture where independent processing steps run concurrently without blocking the main flow, reducing user-perceived latency.
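A small sketch using Python's `asyncio`, assuming two independent steps that each wait on I/O. The step names `fetch_documents` and `fetch_user_profile` are hypothetical.

```python
import asyncio

async def fetch_documents(query):
    await asyncio.sleep(0.1)          # simulated I/O wait (e.g. a search backend)
    return ["doc1", "doc2"]

async def fetch_user_profile(user_id):
    await asyncio.sleep(0.1)          # simulated I/O wait (e.g. a profile service)
    return {"id": user_id}

async def handle_request(query, user_id):
    # Both steps wait concurrently, so the total is roughly the slower step,
    # not the sum of the two.
    docs, profile = await asyncio.gather(
        fetch_documents(query),
        fetch_user_profile(user_id),
    )
    return {"docs": docs, "profile": profile}

result = asyncio.run(handle_request("latency", 42))
```

Run sequentially, the two steps would take about 0.2 s; with `asyncio.gather` they overlap and the request takes about 0.1 s.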


Pre-computation of Representations

A strategy of generating and storing document embedding vectors ahead of time, so that only the query needs to be encoded when a real-time request arrives.


Knowledge Sharding

Horizontal partitioning of the knowledge base across multiple nodes to parallelize searches and increase the throughput of simultaneous queries.
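A sketch of hash-based sharding with a scatter-gather search, under the assumption that each shard is a plain dictionary standing in for a separate node. The shard count and helper names are illustrative.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 3
shards = [dict() for _ in range(NUM_SHARDS)]  # each dict stands in for one node

def shard_for(doc_id: str) -> int:
    """Stable hash so a document always lands on the same shard."""
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_SHARDS

def add_document(doc_id: str, text: str) -> None:
    shards[shard_for(doc_id)][doc_id] = text

def search_shard(shard, term):
    return [doc_id for doc_id, text in shard.items() if term in text.lower().split()]

def search(term):
    """Query every shard in parallel and merge the partial results."""
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        partials = list(pool.map(lambda s: search_shard(s, term), shards))
    return sorted(doc_id for part in partials for doc_id in part)

add_document("doc-a", "latency budget for search")
add_document("doc-b", "cache warmup notes")
add_document("doc-c", "tail latency analysis")
matches = search("latency")
```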


Low-Latency Filtering

A fast filtering layer that uses heuristics or lightweight models to discard irrelevant candidates before more complex, slower models process the rest.
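The two-stage pattern can be sketched like this. Both stages are simplified assumptions: the cheap filter is a term-overlap check, and `expensive_score` is a stand-in (Jaccard overlap) for what would be a heavy model in practice.

```python
def cheap_filter(query, docs, min_overlap=1):
    """Stage 1: keep only documents sharing at least min_overlap terms with the query."""
    q_terms = set(query.lower().split())
    return [d for d in docs if len(q_terms & set(d.lower().split())) >= min_overlap]

def expensive_score(query, doc):
    """Stage 2 stand-in (assumption): Jaccard overlap in place of a costly model."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d)

def search(query, docs):
    survivors = cheap_filter(query, docs)
    ranked = sorted(survivors, key=lambda d: expensive_score(query, d), reverse=True)
    return ranked, len(survivors)

docs = [
    "reduce query latency",
    "latency of cached query answers",
    "baking sourdough bread",
]
ranked, scored_count = search("query latency", docs)
```

Only two of the three documents reach the expensive stage here; the saving grows with the share of candidates the filter can reject.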


Response Streaming

Method of transmitting responses in successive fragments as they are generated, improving the perceived response time for long answers.
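A minimal sketch with a Python generator, assuming fixed-size text fragments. A real system would typically stream tokens from the model over HTTP chunked transfer, server-sent events, or a WebSocket.

```python
from typing import Iterator

def stream_answer(text: str, chunk_size: int = 8) -> Iterator[str]:
    """Yield the answer in successive fragments instead of one final string."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]

answer = "Streaming lets the client render text early."
fragments = []
for fragment in stream_answer(answer):
    fragments.append(fragment)  # a real client would display each fragment immediately
```

The total generation time is unchanged; what improves is the time until the first fragment is visible.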


Vector Pruning

Process of reducing the search space by eliminating less relevant vectors based on pre-calculated distance or similarity metrics.
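One possible reading of this technique is cluster-based pruning: vectors are assigned to coarse centroids offline, and a query scans only the bucket of its nearest centroid. The centroids and vectors below are invented for illustration, and the result is approximate, since the true nearest neighbor may sit in a bucket that was pruned.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical data (assumption): two coarse centroids and four stored vectors.
centroids = [(0.0, 0.0), (10.0, 10.0)]
vectors = {"v1": (0.5, 0.2), "v2": (1.0, 1.5), "v3": (9.0, 9.5), "v4": (10.5, 11.0)}

def nearest_centroid(v):
    return min(range(len(centroids)), key=lambda i: dist(v, centroids[i]))

# Offline step: pre-calculate which centroid each vector is closest to.
buckets = {}
for name, vec in vectors.items():
    buckets.setdefault(nearest_centroid(vec), []).append(name)

def pruned_search(query):
    """Scan only the bucket nearest to the query; all other vectors are pruned."""
    candidates = buckets.get(nearest_centroid(query), [])
    best = min(candidates, key=lambda n: dist(query, vectors[n]), default=None)
    return best, len(candidates)

best, scanned = pruned_search((9.2, 9.0))
```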


Batched GPU Inference

Optimization technique that groups multiple requests and processes them in a single pass on a GPU, maximizing resource utilization and throughput and lowering the average cost per request, usually in exchange for a short wait while the batch fills.
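The grouping logic can be sketched without a GPU. Here `model_forward` is a hypothetical stand-in for one forward pass over a whole batch and simply runs on the CPU; the batch size of 4 is an arbitrary assumption.

```python
def batches(items, max_batch_size):
    """Split pending requests into consecutive groups of at most max_batch_size."""
    for i in range(0, len(items), max_batch_size):
        yield items[i:i + max_batch_size]

def model_forward(batch):
    """Stand-in (assumption) for one model call that handles a whole batch at once."""
    return [len(text) for text in batch]

def run_batched(requests, max_batch_size=4):
    outputs, calls = [], 0
    for batch in batches(requests, max_batch_size):
        outputs.extend(model_forward(batch))  # one call per batch, not per request
        calls += 1
    return outputs, calls

requests = [f"request {i}" for i in range(10)]
outputs, calls = run_batched(requests, max_batch_size=4)
```

Ten requests are served with three model calls instead of ten, which is where the utilization gain comes from.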


Hybrid Search System

Architecture combining keyword-based (sparse) and semantic (dense) search to balance precision and recall while maintaining low latency.
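The glossary does not say how the two result lists are merged. One common option is reciprocal rank fusion (RRF), sketched below with made-up rankings; the constant `k=60` is a conventional default, not something taken from this page.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each document scores the sum of 1 / (k + rank) over the lists."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

sparse_results = ["a", "b", "c"]   # hypothetical keyword (sparse) ranking
dense_results = ["b", "d", "a"]    # hypothetical embedding (dense) ranking
fused = reciprocal_rank_fusion([sparse_results, dense_results])
```

Documents found by both retrievers ("a" and "b") rise above those found by only one, which is the balancing effect the definition describes.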


Persistent Connection (WebSocket)

A bidirectional connection that stays open between client and server, so either side can send messages at any time without paying connection setup overhead on each request.


Multi-Level Caching

Strategy for storing responses at multiple layers (e.g., memory, Redis, CDN) to serve requests from the fastest available cache.
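A two-level sketch of the lookup order. Both levels are plain dictionaries here; the second one stands in for a shared cache such as Redis, and the class name and `compute` callback are assumptions made for the example.

```python
class MultiLevelCache:
    def __init__(self, compute):
        self.l1 = {}            # fastest: in-process memory
        self.l2 = {}            # stand-in (assumption) for a shared cache such as Redis
        self.compute = compute  # slow path that produces the response

    def get(self, key):
        """Return (value, source), checking the fastest layer first."""
        if key in self.l1:
            return self.l1[key], "l1"
        if key in self.l2:
            self.l1[key] = self.l2[key]   # promote to the faster layer
            return self.l1[key], "l2"
        value = self.compute(key)
        self.l2[key] = value
        self.l1[key] = value
        return value, "origin"

cache = MultiLevelCache(compute=lambda q: q.upper())
first = cache.get("hello")    # computed, then stored in both layers
second = cache.get("hello")   # served from memory
cache.l1.clear()              # e.g. the process restarted
third = cache.get("hello")    # served from the shared layer
```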


Request Path Optimization

Analysis and refinement of a request's journey through the system to eliminate bottlenecks and reduce the number and cost of network hops and processing steps.
