AI Glossary
The Complete Dictionary of Artificial Intelligence
Classifier-Free Guidance
A technique for improving conditional fidelity that combines predictions from a conditional model and an unconditional model to enhance the impact of the condition on generation.
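The combination can be sketched as follows, using toy NumPy arrays in place of real noise predictions; the function name and the guidance scale `w` are illustrative assumptions, not part of any specific library:

```python
import numpy as np

def classifier_free_guidance(eps_uncond, eps_cond, w):
    """Combine unconditional and conditional noise predictions.

    w = 0 -> purely unconditional, w = 1 -> purely conditional,
    w > 1 -> the condition's influence is amplified (typical in practice).
    """
    return eps_uncond + w * (eps_cond - eps_uncond)

# Toy predictions standing in for two forward passes of the same model,
# one with the condition and one with an empty (null) condition.
eps_uncond = np.array([0.1, 0.2, 0.3])
eps_cond = np.array([0.3, 0.2, 0.1])

guided = classifier_free_guidance(eps_uncond, eps_cond, w=2.0)
```

With `w > 1` the guided prediction is pushed past the conditional one, away from the unconditional one, which is what strengthens the condition's effect.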
Condition Encoding
The process of transforming an input condition (text, image, etc.) into a vector representation that can be integrated into the diffusion network to influence the generation process.
Text-to-Image
An application of conditional diffusion where the condition is a textual description used to generate a matching image.
Image-to-Image
A conditional diffusion task using a source image as a condition to generate a new one, often for stylization, colorization, or modification applications.
ControlNet
A neural network architecture that duplicates and locks the weights of a pre-trained diffusion model while adding layers to interpret precise spatial conditions such as depth maps or sketches.
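The core idea, a frozen pre-trained path plus a trainable copy joined through a zero-initialized layer, can be sketched with toy matrices; all names here are hypothetical, and matrix products stand in for the real convolutional blocks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen weights of a pre-trained block (locked during ControlNet training).
W_locked = rng.standard_normal((4, 4))

# Trainable duplicate, plus a "zero convolution": a layer initialised to
# zero so the added branch contributes nothing before training starts.
W_copy = W_locked.copy()
W_zero = np.zeros((4, 4))

def controlled_block(x, control):
    base = W_locked @ x                  # original, frozen path
    extra = W_zero @ (W_copy @ control)  # control branch through the zero layer
    return base + extra

x = rng.standard_normal(4)
control = rng.standard_normal(4)         # e.g. features of a depth map or sketch
```

Because the joining layer starts at zero, the augmented model initially behaves exactly like the pre-trained one, and training only gradually lets the spatial condition in.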
Negative Embedding
A technique in which the model is given a condition describing what should be avoided in the generation, refining control over the output content.
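In practice this is often combined with classifier-free guidance by replacing the unconditional prediction with one conditioned on the negative embedding; the sketch below uses toy arrays and an assumed function name:

```python
import numpy as np

def guided_with_negative(eps_negative, eps_positive, w):
    """Guidance where the 'unconditional' branch is replaced by a prediction
    conditioned on a negative embedding: the output is pushed toward the
    positive condition and away from the negative one."""
    return eps_negative + w * (eps_positive - eps_negative)

eps_positive = np.array([0.4, 0.0])  # prediction for the desired prompt
eps_negative = np.array([0.0, 0.4])  # prediction for the content to avoid

guided = guided_with_negative(eps_negative, eps_positive, w=1.5)
```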
Conditional Inpainting
A form of conditional diffusion where a partially masked image serves as a condition for the model to fill in the missing areas in a manner consistent with the context.
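A common way to enforce this consistency is to blend, at each denoising step, the model's sample with a (re-noised) copy of the known pixels; a minimal sketch on 1-D toy arrays, with illustrative names:

```python
import numpy as np

def blend_step(x_generated, x_known_noisy, mask):
    """One inpainting blend: keep the (re-noised) known image where
    mask == 1 and the model's current sample where mask == 0 (the hole)."""
    return mask * x_known_noisy + (1 - mask) * x_generated

mask = np.array([1.0, 1.0, 0.0, 0.0])     # 1 = known pixel, 0 = to fill
x_known = np.array([0.9, 0.8, 0.0, 0.0])  # source image (noised to this step)
x_gen = np.array([0.1, 0.2, 0.3, 0.4])    # model's current sample

blended = blend_step(x_gen, x_known, mask)
```

Repeating this blend across the denoising trajectory keeps the known region anchored while the model fills the hole coherently.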
Outpainting
A conditional diffusion process that extends the borders of an existing image by generating new coherent content, using the original image as a condition.
Condition Modulation
A method of integrating the condition into the diffusion model, often via AdaIN (Adaptive Instance Normalization) layers, which adapt feature statistics to the condition.
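AdaIN itself is simple: normalize the features, then rescale them with a mean and standard deviation derived from the condition. A minimal sketch on a 1-D feature vector (in a real network the statistics are computed per channel):

```python
import numpy as np

def adain(features, cond_mean, cond_std, eps=1e-5):
    """Adaptive Instance Normalization: strip the features' own statistics,
    then impose statistics predicted from the condition."""
    mu = features.mean()
    sigma = features.std()
    normalized = (features - mu) / (sigma + eps)
    return cond_std * normalized + cond_mean

features = np.array([1.0, 2.0, 3.0, 4.0])
out = adain(features, cond_mean=5.0, cond_std=0.1)
```

The output carries the condition's statistics (mean 5.0, std ~0.1) while preserving the relative structure of the original features.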
Conditional Fidelity Score
A metric evaluating how well the output generated by a conditional diffusion model aligns with the provided input condition.
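A common instance is a CLIP-style score: embed both the condition and the output and measure their cosine similarity. The embeddings below are toy vectors standing in for real encoder outputs:

```python
import numpy as np

def fidelity_score(cond_emb, out_emb):
    """Cosine similarity between the embedding of the input condition and
    that of the generated output; higher means better alignment."""
    cos = np.dot(cond_emb, out_emb) / (
        np.linalg.norm(cond_emb) * np.linalg.norm(out_emb)
    )
    return float(cos)

cond = np.array([1.0, 0.0, 1.0])          # embedding of the condition
well_aligned = np.array([0.9, 0.1, 1.1])  # embedding of a faithful output
misaligned = np.array([-1.0, 1.0, 0.0])   # embedding of an unrelated output
```

A faithful generation scores close to 1, an unrelated one near or below 0.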
DreamBooth
Fine-tuning technique for a conditional diffusion model on a small set of images to teach it to generate a specific concept or subject, often a person or object.
Textual Inversion
A process that learns a new text embedding token from a set of images, allowing a unique word to be associated with a specific visual style or concept in a diffusion model.
IP-Adapter (Image Prompt Adapter)
Module added to a diffusion model to enable it to use an image as a prompt, by encoding the reference image and integrating it via cross-attention mechanisms.
Multi-Modality Reference
Simultaneous use of multiple types of conditions (e.g., text and image) to guide generation, offering more nuanced and precise control over the final result.