ML Architecture
U-Net Encoder-Decoder for Segmentation
Symmetric encoder-decoder with skip connections for pixel-wise prediction.
Prompt
A U-Net architecture diagram drawn as the canonical "U" shape.
Left side (encoder, downsampling):
- Four levels, each with two 3x3 convolutions + ReLU + 2x2 max pooling.
- Channels double at each level: 64, 128, 256, 512.
- Spatial resolution halves at each level.
Bottom (bottleneck):
- Two 3x3 convolutions with 1024 channels at the lowest spatial resolution.
Right side (decoder, upsampling):
- Four levels mirroring the encoder.
- Each level: 2x2 transposed convolution (or bilinear upsample), concatenation with the corresponding encoder feature map (skip connection drawn as a horizontal arrow), then two 3x3 convolutions.
Output:
- A 1x1 convolution produces a per-pixel class probability map (C output channels).
Annotations:
- Every block labeled with channel count.
- Skip connections drawn as bold horizontal arrows crossing the U.
Style: clean publication-style vector, navy palette with one accent color for skip connections, white background. Suitable for medical imaging / remote-sensing journals.
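To check the channel labels in a generated figure against the prompt, it can help to tabulate what each block should read. The following is a minimal sketch, not a reference implementation: the function name `unet_stage_shapes` and the 256-pixel square input are hypothetical, and it assumes padded ("same") convolutions and an upsampling step that halves the channel count, neither of which the prompt specifies.

```python
def unet_stage_shapes(input_size=256, base_channels=64, depth=4):
    """List (stage, channels, spatial size) for each block of the U.

    Assumption: padded ("same") 3x3 convolutions, so only pooling and
    upsampling change the spatial size. input_size is a hypothetical
    square input width.
    """
    if input_size % (2 ** depth) != 0:
        raise ValueError("input_size must be divisible by 2**depth")

    stages = []
    channels, size = base_channels, input_size

    # Encoder: two 3x3 convs set the channel count at this level,
    # then 2x2 max pooling halves the size for the next level.
    for level in range(1, depth + 1):
        stages.append((f"encoder {level}", channels, size))
        channels *= 2
        size //= 2

    # Bottleneck: two 3x3 convs at the lowest spatial resolution.
    stages.append(("bottleneck", channels, size))

    # Decoder: upsample (assumed to halve the channels), concatenate the
    # matching encoder map, then two 3x3 convs back to `channels`.
    for level in range(depth, 0, -1):
        channels //= 2
        size *= 2
        stages.append((f"decoder {level}", channels, size))

    return stages


for name, channels, size in unet_stage_shapes():
    print(f"{name:<10} {channels:>4} ch  {size}x{size}")
```

With the defaults, the encoder rows carry 64, 128, 256, and 512 channels, the bottleneck 1024, and the decoder mirrors the encoder back to 64 channels at full resolution.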
When to use
Use for semantic segmentation papers, particularly in medical imaging and remote sensing.
Variations
Attention U-Net
Add an attention gate on each skip connection that filters encoder features using the decoder feature as a query. Show the gate as a small AND-style symbol on the skip arrow.
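For readers unsure what the gate symbol stands for, here is a rough single-pixel sketch of an additive attention gate in plain Python. The function name, argument names, and the example weights are all hypothetical, and the sketch ignores the resolution difference between encoder and decoder maps that a real network has to reconcile.

```python
import math


def attention_gate(x, g, w_x, w_g, psi, b=0.0):
    """Additive attention gate for one pixel (illustrative only).

    x: encoder (skip) feature vector; g: decoder (gating) feature vector.
    w_x, w_g: weight matrices (lists of rows) projecting x and g into a
    shared intermediate space; psi: weights reducing it to one score.
    Returns (alpha, gated_x) with alpha strictly between 0 and 1.
    """
    def matvec(w, v):
        return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

    # Project both inputs, add them, and apply ReLU.
    hidden = [max(0.0, a + c) for a, c in zip(matvec(w_x, x), matvec(w_g, g))]
    # Reduce to a scalar score and squash it to (0, 1) with a sigmoid.
    score = sum(p * h for p, h in zip(psi, hidden)) + b
    alpha = 1.0 / (1.0 + math.exp(-score))
    # Scale the encoder features before they are concatenated in the decoder.
    return alpha, [alpha * xi for xi in x]


# Hypothetical 2-channel example with identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
alpha, gated = attention_gate([1.0, 2.0], [0.5, -0.5], identity, identity, [1.0, 1.0])
print(alpha, gated)
```

The single coefficient `alpha` multiplies every channel of the skip feature at that pixel, which is why the diagram can show the gate as one small symbol sitting on the skip arrow.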
Tips
- Channel counts must be labeled. Without them the figure looks generic and uninformative.
- Bold the skip connections: they are the defining feature of U-Net and should pop visually.
- Show the bottleneck explicitly. Many auto-generated U-Nets shrink it into the encoder by accident.
FAQ
How do I show 3D U-Net for volumetric data?
Add this to the prompt: "Use 3x3x3 convolutions and 2x2x2 max pooling; keep channel counts unchanged. Replace 2D feature maps with 3D cuboid icons."
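If the cuboid icons should also carry volume sizes, a small helper can list them per level. This is a sketch under assumptions: the name `volume_shapes` and the 64x128x128 input are hypothetical, and padded convolutions are assumed so that only the 2x2x2 pooling changes the shape.

```python
def volume_shapes(shape=(64, 128, 128), depth=4):
    """Return the (D, H, W) shape at each encoder level plus the bottleneck.

    Assumption: only 2x2x2 pooling changes the shape (padded convolutions).
    """
    shapes = [tuple(shape)]
    for _ in range(depth):
        d, h, w = shapes[-1]
        if d % 2 or h % 2 or w % 2:
            raise ValueError("every dimension must stay divisible by 2")
        # 2x2x2 pooling halves depth, height, and width together.
        shapes.append((d // 2, h // 2, w // 2))
    return shapes


for level, dims in enumerate(volume_shapes(), start=1):
    print(level, dims)
```

Each pooling step cuts the voxel count by a factor of eight, so the cuboids shrink much faster than the 2D feature maps do.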
