Day 1: The Basics

ComfyUI Fundamentals

The essential glossary for Node-based Generative AI. Stop guessing what "KSampler" does and start building with purpose.

The Graph Architecture

Understanding how ComfyUI processes data through nodes and wires.

Nodes

Individual processing blocks (Load Image, KSampler, VAE Decode) that perform specific actions.

Wires/Links

Connections that pass data (images, latents, models) between nodes. An output can only be wired to an input of the matching type.

Flow Execution

ComfyUI resolves the graph backward from the requested output, running each node once its inputs are ready. Visually, this usually reads left to right.

Workflows (.json)

The saved structure of your graph. It can be embedded inside generated PNGs, so dragging a shared PNG onto the canvas restores the whole workflow.

Queue

The list of prompts waiting to be processed. ComfyUI processes them one by one.
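
Because workflows are plain JSON and the queue is fed over HTTP, you can script submissions. Here is a minimal sketch, assuming a local server at the default 127.0.0.1:8188 and a workflow exported with 'Save (API Format)'; the file name is illustrative.

```python
import json
import urllib.request

# Load a workflow previously exported in API format.
with open("workflow_api.json") as f:
    workflow = json.load(f)

# POST it to the local ComfyUI server's /prompt endpoint to enqueue it.
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))  # the server replies with a prompt_id for the queued job
```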

Core Model Components

The essential building blocks of stable diffusion models.

Checkpoint (safetensors)

The main model file bundling the CLIP text encoder, the UNet, and the VAE. The 'brain' of the generation.

CLIP

Contrastive Language-Image Pre-training. Converts your text prompt into embeddings the model understands.

UNet / Transformer

The core engine that predicts noise and denoises the latent image.

VAE (Variational Autoencoder)

Compresses images into 'Latent Space' for processing and decompresses the result back to pixels.

LoRA (Low-Rank Adaptation)

Small, trainable add-on models that modify styles or characters without changing the main checkpoint.
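
To see how these components chain together, here is a sketch of a checkpoint-plus-LoRA setup in API-format workflow JSON, written as a Python dict. The node IDs and file names are illustrative; CheckpointLoaderSimple and LoraLoader are the stock nodes.

```python
workflow_fragment = {
    "1": {  # yields MODEL (UNet), CLIP, and VAE from a single .safetensors file
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "sd_v1-5.safetensors"},
    },
    "2": {  # patches the MODEL and CLIP in memory; the checkpoint on disk is untouched
        "class_type": "LoraLoader",
        "inputs": {
            "model": ["1", 0],  # wire: node 1, output slot 0 (MODEL)
            "clip": ["1", 1],   # wire: node 1, output slot 1 (CLIP)
            "lora_name": "my_style.safetensors",
            "strength_model": 0.8,
            "strength_clip": 0.8,
        },
    },
}
```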

Latent Space Logic

How images exist as compressed mathematical representations.

Latent Image

A compressed representation (64x64 for a 512x512 image, since the VAE downscales by a factor of 8) where the denoising happens.

Empty Latent Image

The blank starting canvas for Text-to-Image: a latent of zeros that the sampler fills with seeded noise before denoising begins.

VAE Encode

Turning a pixel image into a latent (used in Image-to-Image).

VAE Decode

Turning the processed latent back into a viewable pixel image.

Latent Upscale

Resizing the latent itself before decoding. Fast, but can lose detail compared to upscaling in pixel space.
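
To make the size relationship concrete, here is a tiny sketch of the pixel-to-latent mapping, assuming the SD1.5-family factor-of-8 downscale and 4 latent channels (other model families differ).

```python
def latent_shape(width: int, height: int, factor: int = 8, channels: int = 4):
    """Return (channels, height, width) of the latent for a pixel image."""
    return (channels, height // factor, width // factor)

print(latent_shape(512, 512))   # (4, 64, 64) -- the glossary example above
print(latent_shape(1024, 768))  # (4, 96, 128)
```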

Sampling & Generation

The process of turning noise into a coherent image.

KSampler

The core node that runs the denoising loop step-by-step.

Steps

Number of denoising iterations. More isn't always better; 20-30 is usually the sweet spot.

CFG Scale (Guidance)

How strictly the model follows your prompt. High = rigid, low = creative/random; around 7-8 is a common default.

Scheduler

The strategy for distributing noise levels across the steps (normal, karras, exponential). Karras fits most needs.

Sampler Name

The denoising algorithm (euler_ancestral, dpmpp_2m). DPM++ 2M paired with the Karras scheduler is a popular speed/quality choice.

Seed

The starting random number. Same seed + same settings = same image.

Denoise Strength

For Img2Img: How much to change the original. 1.0 = full replacement, 0.3 = slight edit.
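
Here is how all of these settings sit together on a single KSampler node in API-format JSON; a sketch in which the wired node IDs ("4", "5", "6", "7") are placeholders for the rest of the graph.

```python
ksampler_node = {
    "class_type": "KSampler",
    "inputs": {
        "model": ["4", 0],           # MODEL from the checkpoint/LoRA chain
        "positive": ["6", 0],        # positive conditioning (CLIPTextEncode)
        "negative": ["7", 0],        # negative conditioning
        "latent_image": ["5", 0],    # e.g. EmptyLatentImage for Text-to-Image
        "seed": 123456789,           # same seed + same settings = same image
        "steps": 25,                 # denoising iterations
        "cfg": 7.0,                  # guidance scale
        "sampler_name": "dpmpp_2m",  # the sampling algorithm
        "scheduler": "karras",       # the noise schedule
        "denoise": 1.0,              # 1.0 for txt2img; lower it for img2img edits
    },
}
```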

Advanced Inputs

Controlling the generation with more than just text.

ControlNet

Guidance network that enforces structure (edges, pose, depth) from an input image onto the generation.

IPAdapter

Image Prompt Adapter. Uses an image as a prompt, instead of or alongside text, for style or content transfer.

Masking

Selecting specific areas of an image for Inpainting (editing only that part).

Conditioning

The encoded form of your prompts (positive and negative) that gets fed into the KSampler.
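
As a sketch of how these inputs wire together, here is ControlNet slotted into the conditioning path in API-format JSON. ControlNetApply takes conditioning in and hands modified conditioning out, so it sits between CLIPTextEncode and the KSampler; node IDs and the model file name are illustrative.

```python
controlnet_fragment = {
    "10": {
        "class_type": "ControlNetLoader",
        "inputs": {"control_net_name": "control_canny.safetensors"},
    },
    "11": {
        "class_type": "ControlNetApply",
        "inputs": {
            "conditioning": ["6", 0],  # positive prompt from CLIPTextEncode
            "control_net": ["10", 0],
            "image": ["12", 0],        # the edge/pose/depth guide image
            "strength": 0.9,           # how firmly the structure is enforced
        },
    },
}
```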

Visuals Demystified.

Now that you speak the language of nodes, you're ready to build. Day 2 will cover Advanced Workflows and ControlNet.
