Semantic Segmentation
Label each pixel in the image with a category label: we want a classification per pixel.
Fully Convolutional Network
![Screen Shot 2022-07-30 at 12.50.23 PM.png](https://s3-us-west-2.amazonaws.com/secure.notion-static.com/50922e74-d2a1-47cf-a939-5c90a25cd190/Screen_Shot_2022-07-30_at_12.50.23_PM.png)
- A convolutional network with no pooling or FC layers, which makes predictions for all pixels at once
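A minimal NumPy sketch of what "classification per pixel" means for the network output. It assumes the network produces a score map of shape (C, H, W) and is trained with the usual softmax cross-entropy averaged over pixels. The sizes, the random `scores` array and the helper name `pixelwise_cross_entropy` are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical sizes for illustration: 3 classes, 2x4 image.
num_classes, height, width = 3, 2, 4
rng = np.random.default_rng(0)
# Stand-in for the network output: one score per class per pixel.
scores = rng.normal(size=(num_classes, height, width))

# Predicted label map: the highest-scoring class at every pixel.
pred = scores.argmax(axis=0)  # shape (H, W)


def pixelwise_cross_entropy(scores, target):
    """Mean softmax cross-entropy over all pixels.

    scores: (C, H, W) class scores, target: (H, W) integer class ids.
    """
    shifted = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))
    h_idx, w_idx = np.indices(target.shape)
    # Pick the log-probability of the correct class at each pixel.
    return -log_probs[target, h_idx, w_idx].mean()


target = np.array([[0, 0, 1, 1],
                   [2, 2, 1, 1]])
loss = pixelwise_cross_entropy(scores, target)
print(pred.shape, float(loss))
```

With all-zero scores every class is equally likely, so the loss reduces to log(C), which is a quick sanity check for the helper.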
Problems
- Effective receptive field size is only linear in the number of convolution layers
- With L 3x3 conv layers, the receptive field is 1 + 2L, so covering a large image region takes many layers
- Convolution on a high-resolution image is expensive
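The receptive-field claim above can be checked with a short sketch. The helper `receptive_field` is a hypothetical name; it tracks the receptive field along one axis using the standard recurrence for stacked layers. With stride 1 it reproduces 1 + 2L, and it also shows how quickly the field grows once strides greater than 1 are introduced.

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field (along one axis) of a stack of conv/pool layers."""
    if strides is None:
        strides = [1] * len(kernel_sizes)
    rf = 1    # a single output pixel sees one input pixel before any layer
    jump = 1  # distance in input pixels between neighbouring outputs
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump
        jump *= s
    return rf


# Stride-1 3x3 convs: linear growth, 1 + 2L.
for num_layers in (1, 2, 10):
    print(num_layers, receptive_field([3] * num_layers))

# Five 3x3 convs with stride 2 each: much faster growth.
print(receptive_field([3] * 5, strides=[2] * 5))  # 63
```

Ten stride-1 layers only reach 21 pixels, while five stride-2 layers already reach 63.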
We use a Design with Downsampling and Upsampling inside the Network
- Downsampling: Pooling, strided Convolution
![Screen Shot 2022-07-30 at 12.50.41 PM.png](https://s3-us-west-2.amazonaws.com/secure.notion-static.com/d4a4b522-31cb-4daf-8a40-67534b1a4508/Screen_Shot_2022-07-30_at_12.50.41_PM.png)
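A small sketch of the two downsampling options named above. `conv_output_size` applies the standard output-size formula for a convolution or pooling layer, and `max_pool_2x2` is a NumPy-only 2x2 max pool. Both helper names and the example array are assumptions made for illustration.

```python
import numpy as np


def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on an (H, W) array with even H and W."""
    h, w = x.shape
    # Split into 2x2 blocks, then take the max inside each block.
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


# A 3x3 conv with stride 2 and padding 1 halves a 224-pixel side.
print(conv_output_size(224, kernel=3, stride=2, padding=1))  # 112

x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 1, 2, 0],
              [3, 4, 1, 5]])
pooled = max_pool_2x2(x)
print(pooled)  # [[4 8], [9 5]]
```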
UpSampling Methods
- Bed of Nails - Upsize, and copy each value to the upper left of its enlarged cell. The rest is zero
- Nearest Neighbor - Upsize and copy each value into every position of its enlarged cell
- Bilinear Interpolation - use the two closest neighbors in x and in y to construct a linear approximation
- Bicubic Interpolation - use the four closest neighbors in x and in y (a 4x4 neighborhood) to construct a cubic approximation
- Max-Unpooling - Similar to Bed of Nails, but remember the location of the max from the corresponding Max Pooling layer and place the value at that location. The rest is zero
- Transposed Convolution (Learnable Upsampling)
- Roughly the reverse of a strided conv: each input value scales a copy of the filter, the copies are placed in the output one stride apart, and overlapping entries are summed
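A 1D sketch of the transposed convolution described above, written in plain NumPy rather than with any framework. `conv_transpose_1d` is a hypothetical helper and the input and filter values are made up. In a real network the filter weights would be learned.

```python
import numpy as np


def conv_transpose_1d(x, w, stride=2):
    """1D transposed convolution with no padding.

    Each input value scales a copy of the filter; copies are written
    `stride` positions apart and summed where they overlap.
    """
    k = len(w)
    out = np.zeros((len(x) - 1) * stride + k)
    for i, value in enumerate(x):
        out[i * stride:i * stride + k] += value * w
    return out


x = np.array([1.0, 2.0])          # two input values a, b
w = np.array([1.0, 10.0, 100.0])  # filter x, y, z
print(conv_transpose_1d(x, w))    # [ax, ay, az + bx, by, bz]
```

With stride 2 and a length-3 filter, the two scaled copies overlap in the middle output position, which is why that entry is a sum of two products.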
![Screen Shot 2022-07-30 at 1.07.43 PM.png](https://s3-us-west-2.amazonaws.com/secure.notion-static.com/8d5d4443-0623-4a0f-833d-187fbb3a7e5b/Screen_Shot_2022-07-30_at_1.07.43_PM.png)
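The fixed (non-learned) upsampling methods listed earlier can be sketched in NumPy as follows. All function names are hypothetical, the factor is fixed at 2 for the pooling helpers, and the bed-of-nails version places each value in the upper-left corner of its enlarged cell. Bilinear and bicubic interpolation are left out to keep the sketch short.

```python
import numpy as np


def bed_of_nails(x, factor=2):
    """Place each value in the upper-left corner of its enlarged cell."""
    h, w = x.shape
    out = np.zeros((h * factor, w * factor), dtype=x.dtype)
    out[::factor, ::factor] = x
    return out


def nearest_neighbor(x, factor=2):
    """Copy each value into every position of its enlarged cell."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)


def max_pool_2x2_with_indices(x):
    """2x2 max pool that also returns the position (0..3) of each max."""
    h, w = x.shape
    blocks = (x.reshape(h // 2, 2, w // 2, 2)
               .transpose(0, 2, 1, 3)
               .reshape(h // 2, w // 2, 4))
    return blocks.max(axis=-1), blocks.argmax(axis=-1)


def max_unpool_2x2(values, indices):
    """Put each value back where the max was found; zeros elsewhere."""
    h, w = values.shape
    out = np.zeros((h * 2, w * 2), dtype=values.dtype)
    rows, cols = np.indices((h, w))
    out[rows * 2 + indices // 2, cols * 2 + indices % 2] = values
    return out


small = np.array([[1, 2],
                  [3, 4]])
print(bed_of_nails(small))
print(nearest_neighbor(small))

# Positions are remembered from an earlier pooling layer...
features = np.array([[1, 2, 6, 3],
                     [3, 5, 2, 1],
                     [1, 2, 2, 1],
                     [7, 3, 4, 8]])
pooled, indices = max_pool_2x2_with_indices(features)
# ...and reused when unpooling a later 2x2 map.
unpooled = max_unpool_2x2(small, indices)
print(unpooled)
```

Pairing each unpooling layer with its matching pooling layer is what lets max-unpooling put values back at the positions that were selected during downsampling.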
Instance Segmentation
Detect all objects in the image, and identify the pixels that belong to each object