Semantic Segmentation

Label each pixel in the image with a category label: we want a classification per pixel
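A minimal sketch of what "classification per pixel" means at the output of a network: given one score map per class, pick the best-scoring class at every pixel. The function name `label_map`, the 2x2 image, and the scores are made up for illustration.

```python
def label_map(scores):
    """scores[c][y][x] is the score of class c at pixel (y, x).

    Returns labels[y][x], the index of the highest-scoring class at each pixel.
    """
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical scores for a 2x2 image with two classes.
scores = [
    [[0.9, 0.2], [0.6, 0.1]],  # class 0
    [[0.1, 0.8], [0.4, 0.9]],  # class 1
]
labels = label_map(scores)
print(labels)  # [[0, 1], [0, 1]]
```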

Fully Convolutional Network

Problems

The effective receptive field size grows only linearly with the number of convolution layers
- With L 3x3 conv layers (stride 1), the receptive field is 1 + 2L
Convolution on a high-resolution image is expensive
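The 1 + 2L formula can be checked with a short sketch. It assumes stride-1 convolutions with no dilation; `receptive_field` and `layers_needed` are illustrative helper names, and the 224-pixel target is just an example image size.

```python
def receptive_field(num_layers, kernel_size=3):
    """Receptive field (pixels per side) of stacked stride-1 convolutions."""
    rf = 1
    for _ in range(num_layers):
        rf += kernel_size - 1  # each layer adds (k - 1) pixels
    return rf


def layers_needed(target, kernel_size=3):
    """Smallest number of stride-1 layers whose receptive field covers `target` pixels."""
    layers = 0
    while receptive_field(layers, kernel_size) < target:
        layers += 1
    return layers


print(receptive_field(5))   # 11, i.e. 1 + 2 * 5
print(layers_needed(224))   # 112 layers to see a whole 224-pixel side
```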

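We use a design with downsampling and upsampling inside the network: downsampling grows the receptive field faster and makes the middle layers cheaper, and upsampling restores the input resolution for the per-pixel output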
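A rough sketch of why downsampling helps, using the standard receptive-field recurrence for strided layers. The helper name `receptive_field_with_strides` and the layer configurations are assumptions chosen for illustration, not a specific architecture.

```python
def receptive_field_with_strides(layers):
    """layers: list of (kernel_size, stride) pairs, ordered from input to output."""
    rf, jump = 1, 1  # jump = spacing, in input pixels, between adjacent outputs
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


same_res = [(3, 1)] * 4     # four 3x3 convs at full resolution
downsampled = [(3, 2)] * 4  # four 3x3 convs, each halving the resolution

print(receptive_field_with_strides(same_res))     # 9
print(receptive_field_with_strides(downsampled))  # 31
```

With stride 2 each layer also works on a feature map with a quarter as many positions as the previous one, which is where the cost saving comes from.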

Bed of Nails - Enlarge each cell and copy its value to the upper-left corner of the enlarged cell; the rest is zero
Nearest Neighbor - Enlarge each cell and copy its value to every position in the enlarged cell
Bilinear Interpolation - Use the two closest neighbors in x and y to construct a linear approximation
- Bicubic Interpolation - Use the four closest neighbors in x and y (a 4x4 neighborhood) to construct a cubic approximation
Max Unpooling - Similar to Bed of Nails, but remember the position of the max element from the corresponding max-pooling layer and place the value at that position
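The fixed (non-learned) methods can be sketched on nested lists. This assumes 2x upsampling, the upper-left convention for Bed of Nails, and non-overlapping pooling windows; the function names and the example values are illustrative only.

```python
def bed_of_nails(x, factor=2):
    """Put each value in the upper-left corner of its enlarged cell, zeros elsewhere."""
    h, w = len(x), len(x[0])
    out = [[0] * (w * factor) for _ in range(h * factor)]
    for i in range(h):
        for j in range(w):
            out[i * factor][j * factor] = x[i][j]
    return out


def nearest_neighbor(x, factor=2):
    """Copy each value to every position of its enlarged cell."""
    h, w = len(x), len(x[0])
    return [[x[i // factor][j // factor] for j in range(w * factor)]
            for i in range(h * factor)]


def max_pool_with_indices(x, size=2):
    """Non-overlapping max pooling that also records where each max came from."""
    pooled, indices = [], []
    for i in range(0, len(x), size):
        pooled_row, index_row = [], []
        for j in range(0, len(x[0]), size):
            window = [(x[a][b], (a, b))
                      for a in range(i, i + size) for b in range(j, j + size)]
            value, position = max(window, key=lambda t: t[0])
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool(x, indices, out_h, out_w):
    """Place each value at the position remembered from max pooling, zeros elsewhere."""
    out = [[0] * out_w for _ in range(out_h)]
    for i, row in enumerate(x):
        for j, value in enumerate(row):
            a, b = indices[i][j]
            out[a][b] = value
    return out


small = [[1, 2], [3, 4]]
print(bed_of_nails(small))      # [[1, 0, 2, 0], [0, 0, 0, 0], [3, 0, 4, 0], [0, 0, 0, 0]]
print(nearest_neighbor(small))  # [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]

features = [[1, 2, 6, 3], [3, 5, 2, 1], [1, 2, 2, 1], [7, 3, 4, 8]]
pooled, indices = max_pool_with_indices(features)
print(pooled)                             # [[5, 6], [7, 8]]
print(max_unpool(small, indices, 4, 4))   # [[0, 0, 2, 0], [0, 1, 0, 0], [0, 0, 0, 0], [3, 0, 0, 4]]
```

In a real network the unpooling layer is paired with the pooling layer at the matching depth, so the stored positions come from the downsampling half of the same forward pass.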
Transposed Convolution (Learnable Upsampling)
- Roughly the reverse of a strided convolution: each input pixel value scales a copy of the filter, the copies are placed in the output one stride apart, and overlapping values are summed
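A minimal single-channel sketch of a transposed convolution with no padding. The name `transposed_conv2d` and the all-ones 3x3 filter are assumptions for illustration; in a network the filter weights would be learned, and frameworks typically add padding options to get an exact 2x output size.

```python
def transposed_conv2d(x, kernel, stride=2):
    """Each input value scales a copy of the kernel; copies are placed `stride`
    apart in the output and summed where they overlap."""
    h, w = len(x), len(x[0])
    k = len(kernel)
    out_h = (h - 1) * stride + k
    out_w = (w - 1) * stride + k
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(w):
            for a in range(k):
                for b in range(k):
                    out[i * stride + a][j * stride + b] += x[i][j] * kernel[a][b]
    return out


x = [[1, 2], [3, 4]]
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # hypothetical weights
for row in transposed_conv2d(x, kernel):
    print(row)
# [1, 1, 3, 2, 2]
# [1, 1, 3, 2, 2]
# [4, 4, 10, 6, 6]
# [3, 3, 7, 4, 4]
# [3, 3, 7, 4, 4]
```

The middle row and column are where neighboring filter copies overlap, which is why those entries are sums of several input values.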

Instance Segmentation - Detect all objects in the image and identify the pixels that belong to each object