
AI Glossary

The Complete Dictionary of Artificial Intelligence

162 categories
2,032 subcategories
23,060 terms

Multi-Modal Diffusion

Class of generative models that learn a joint probability distribution over multiple modalities (such as text, image, and audio) through a shared or coordinated diffusion process.

Unified Latent Space

Common vector space into which data from different modalities are projected, enabling their interaction and mutual transformation within a diffusion model.

Cross-Modal Conditioning

Technique where the generation process of one modality is guided by information from another modality, for example generating an image from text or audio from an image.

Multi-Modal Structured Noise

Noise addition process that preserves inter-modal correlations, jointly degrading different modalities to maintain their semantic alignment throughout the diffusion process.
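
One simple way to picture noise that preserves inter-modal correlations is to mix a shared Gaussian component into each modality's noise. The sketch below is an illustration under stated assumptions, not a method taken from the glossary: it assumes two modalities whose latents have the same dimension, a hypothetical correlation parameter `rho`, and a DDPM-style forward step with a cumulative signal level `alpha_bar`. The names `correlated_noise` and `forward_diffuse` are made up for this example.

```python
import math
import random

def correlated_noise(dim, rho, rng):
    """Draw two unit-variance Gaussian noise vectors whose per-coordinate
    correlation is rho (0 <= rho <= 1), by mixing in a shared component."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    w_shared, w_own = math.sqrt(rho), math.sqrt(1.0 - rho)
    shared = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Each modality gets the shared part plus its own independent part.
    eps_image = [w_shared * s + w_own * rng.gauss(0.0, 1.0) for s in shared]
    eps_audio = [w_shared * s + w_own * rng.gauss(0.0, 1.0) for s in shared]
    return eps_image, eps_audio

def forward_diffuse(x0, eps, alpha_bar):
    """DDPM-style forward step:
    x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)]

# Jointly degrade two toy latents with the same timestep and correlated noise.
rng = random.Random(0)
eps_image, eps_audio = correlated_noise(dim=4, rho=0.8, rng=rng)
image_t = forward_diffuse([0.5, -0.5, 1.0, 0.0], eps_image, alpha_bar=0.6)
audio_t = forward_diffuse([0.1, 0.2, 0.3, 0.4], eps_audio, alpha_bar=0.6)
```

With `rho = 0` the two modalities are degraded independently; with `rho = 1` they receive identical noise. Values in between keep part of the noise shared, which is one possible reading of "jointly degrading" the modalities.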

Coordinated Denoising

Denoising step where neural networks dedicated to each modality exchange information to coherently reconstruct data from their shared noisy version.

Multi-Modal Encoder

Neural network responsible for projecting data from different modalities into the unified latent space, capturing their essential features and relationships.

Multi-Modal Decoder

Neural network that reconstructs the data of each modality from its representation in the unified latent space after the denoising process.

Inter-Modal Alignment

Learning objective aimed at minimizing the distance between latent representations of different modalities describing the same concept, ensuring their semantic consistency.

Unified Diffusion Model

Single model architecture that simultaneously processes and generates multiple modalities using a single diffusion process and a shared set of weights.

Multi-Modal Guidance

Inference technique that uses the gradient of a multi-modal classification model to guide the sampling process towards outputs better aligned with a given condition.
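
The definition above matches the usual form of classifier guidance, which can be written as a correction to the predicted noise. The symbols here are assumptions introduced for illustration: $\epsilon_\theta$ is the noise prediction network, $p_\phi(c \mid x_t)$ is a classifier that scores the condition $c$ (for example a text label) on the noisy sample $x_t$, $\bar{\alpha}_t$ is the cumulative signal level of the noise schedule, and $s$ is a guidance scale.

```latex
\hat{\epsilon}(x_t, t, c)
  = \epsilon_\theta(x_t, t)
  - s \, \sqrt{1 - \bar{\alpha}_t} \; \nabla_{x_t} \log p_\phi(c \mid x_t)
```

Setting $s = 0$ recovers unguided sampling, and larger $s$ pushes samples harder toward the condition, typically at some cost in diversity.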

Multi-Arm Diffusion

Architecture where a central diffusion process has specialized 'arms' to handle noise addition and denoising specific to each modality while sharing a common trunk.
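
The trunk-and-arms layout can be sketched with plain matrix-vector products. This is a toy illustration only: the class name `MultiArmDenoiser`, the single linear layer per arm, and the hand-picked weights are all hypothetical, and the timestep input that a real noise predictor would take is left out for brevity.

```python
def linear(weights, x):
    """Matrix-vector product, with weights given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def relu(x):
    return [max(0.0, v) for v in x]

class MultiArmDenoiser:
    """Toy noise predictor: per-modality input and output arms around a shared trunk."""

    def __init__(self, trunk, arms_in, arms_out):
        self.trunk = trunk        # weights shared by every modality
        self.arms_in = arms_in    # modality name -> input projection
        self.arms_out = arms_out  # modality name -> output projection

    def predict_noise(self, modality, x_t):
        h = linear(self.arms_in[modality], x_t)    # modality-specific arm in
        h = relu(linear(self.trunk, h))            # common trunk
        return linear(self.arms_out[modality], h)  # modality-specific arm out

identity2 = [[1.0, 0.0], [0.0, 1.0]]
model = MultiArmDenoiser(
    trunk=identity2,
    arms_in={"image": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], "audio": identity2},
    arms_out={"image": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], "audio": identity2},
)
image_eps = model.predict_noise("image", [1.0, -2.0, 3.0])  # 3-d in, 3-d out
audio_eps = model.predict_noise("audio", [0.5, -0.5])       # 2-d in, 2-d out
```

The arms let modalities of different dimensionality (3 and 2 here) pass through the same trunk, so gradients from every modality would update the shared weights during training.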

Multi-Modal Consistency Loss

Loss function that penalizes semantic inconsistencies between generated modalities, measured for example via cosine distance in the unified latent space.
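
As a minimal sketch of the cosine-distance variant mentioned above, the following computes the mean cosine distance between paired latents from two modalities. The function names are hypothetical, the latents are assumed to be non-zero vectors of equal length, and a real training setup would compute this on batched tensors in an autodiff framework.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; 0 for parallel vectors, 1 for orthogonal, 2 for opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def consistency_loss(latents_a, latents_b):
    """Mean cosine distance over paired latents (pair i describes the same concept)."""
    pairs = list(zip(latents_a, latents_b))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)

# Two image/text latent pairs: the first agrees, the second is orthogonal.
image_latents = [[1.0, 0.0], [0.0, 1.0]]
text_latents = [[1.0, 0.0], [1.0, 0.0]]
loss = consistency_loss(image_latents, text_latents)
```

Because cosine distance ignores vector length, this loss only penalizes disagreement in direction, which is a common choice when latent magnitudes differ across modalities.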

Inter-Modal Sampling

Generation process where one modality is sampled while conditioning on another modality that either already exists or is being generated simultaneously.

Shared Noise Prediction Network

Central component of the diffusion model, often a U-Net architecture, whose lower layers are shared between modalities and whose upper layers are specialized per modality.

Multi-Modal Time Embedding

Representation of the diffusion process timestep that is injected into the model, often conditioned on the modality to handle different noise dynamics.
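
One possible way to condition a timestep embedding on the modality is to add a per-modality vector to a standard sinusoidal embedding. The sketch below assumes exactly that; the `MODALITY_IDS` mapping, the function names, and the fixed `modality_table` values are hypothetical, and in a trained model the table would be learned.

```python
import math

MODALITY_IDS = {"text": 0, "image": 1, "audio": 2}  # hypothetical mapping

def sinusoidal_embedding(t, dim):
    """Transformer-style sinusoidal embedding of timestep t (dim must be even)."""
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]

def modality_time_embedding(t, modality, dim, modality_table):
    """Add a per-modality vector to the time embedding so the network can
    treat the same timestep differently for each modality."""
    time_part = sinusoidal_embedding(t, dim)
    modality_part = modality_table[MODALITY_IDS[modality]]
    return [a + b for a, b in zip(time_part, modality_part)]

# Fixed stand-in for a learned embedding table (one row per modality).
modality_table = [[0.0] * 4, [0.5] * 4, [1.0] * 4]
emb_image = modality_time_embedding(10, "image", 4, modality_table)
emb_audio = modality_time_embedding(10, "audio", 4, modality_table)
```

At the same timestep the image and audio embeddings differ only by their modality vectors, which gives downstream layers a signal for applying modality-specific noise dynamics.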

Multi-Modal Sequence Diffusion

Application of diffusion to sequential data involving multiple modalities, such as video generation (image + time) or synchronized dialogue (audio + text).

Multi-Modal Tokenization

Process of discretizing data from different modalities into a unified sequence of tokens that can be processed by a Transformer-like architecture in the context of diffusion.
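
A common way to build such a unified sequence is to give each modality its own range of ids within one shared vocabulary. The sketch below assumes the discretization itself (for example a learned codebook for images or audio) has already produced per-modality ids; the codebook sizes in `VOCAB_SIZES` and the function names are hypothetical.

```python
VOCAB_SIZES = {"text": 1000, "image": 512, "audio": 256}  # hypothetical sizes

def build_offsets(vocab_sizes):
    """Assign each modality a contiguous id range in one shared vocabulary."""
    offsets, start = {}, 0
    for name, size in vocab_sizes.items():
        offsets[name] = start
        start += size
    return offsets, start  # per-modality offsets and total vocabulary size

def unify(segments, vocab_sizes):
    """Flatten (modality, ids) segments into one sequence of shared-vocabulary ids."""
    offsets, _ = build_offsets(vocab_sizes)
    unified = []
    for modality, ids in segments:
        size = vocab_sizes[modality]
        for tok in ids:
            if not 0 <= tok < size:
                raise ValueError(f"token {tok} is out of range for {modality}")
            unified.append(offsets[modality] + tok)
    return unified

sequence = unify(
    [("text", [5, 7]), ("image", [0, 511]), ("audio", [3])],
    VOCAB_SIZES,
)
```

Here text keeps ids 0 to 999, image ids start at 1000, and audio ids start at 1512, so a single embedding table of size 1768 can serve all three modalities.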
