Components of Convolutional Network
Fully connected Layers
- 32 x 32 x 3 image → stretch to 3072 x 1
- Multiply by a weight matrix of dimension 10 x 3072 to output 10 values
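A minimal NumPy sketch of this fully connected layer (the random weights and bias here are just placeholders, not trained values):

```python
import numpy as np

x = np.random.randn(32, 32, 3)   # input image
x_flat = x.reshape(3072)         # stretch 32 x 32 x 3 into a 3072-vector
W = np.random.randn(10, 3072)    # weight matrix: 10 x 3072
b = np.random.randn(10)          # one bias per output
scores = W @ x_flat + b          # 10 output values
print(scores.shape)              # (10,)
```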
Convolution Layers
- Instead of stretching the data, the input image preserves its spatial structure: 32 x 32 x 3
- The weights are small filters with the same kind of dimensions, e.g. a 5 x 5 x 3 filter
- The filter depth must match the input depth (3 channels here)
- 1 output number = result of dot product between filter and small chunk in input data
- Output would be a 1 x 28 x 28 matrix (called an Activation Map), since 32 − 5 + 1 = 28
- Usually there are multiple filters of size 5 x 5 x 3, so the convolution layer's weights would be something like 6 x 5 x 5 x 3, and the output would be 6 activation maps, each of size 1 x 28 x 28
- Each layer's filters also have a bias vector (one bias per filter)
- Also, the input will usually be a batch of images
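The sliding-dot-product idea above can be sketched for a single 5 x 5 x 3 filter on one image (a naive loop, not an efficient implementation; the bias value is illustrative):

```python
import numpy as np

image = np.random.randn(32, 32, 3)  # input keeps its spatial structure
filt = np.random.randn(5, 5, 3)     # filter depth matches input depth
bias = 0.1                          # illustrative bias for this filter

H_out = 32 - 5 + 1                  # 28
activation = np.zeros((H_out, H_out))
for i in range(H_out):
    for j in range(H_out):
        chunk = image[i:i+5, j:j+5, :]                  # small chunk of input
        activation[i, j] = np.sum(chunk * filt) + bias  # dot product -> 1 number
print(activation.shape)             # (28, 28)
```

With 6 such filters, stacking the 6 activation maps gives the 6 x 28 x 28 output described above.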
Convolutional Layer Dimensions
Input: $N\times C_{in}\times H\times W$
- N, H, W: Number of images in the mini-batch, Height, Width
- $C_{in}$ : number of channels for each image (3 for RGB)
Filters: $C_{out}\times C_{in}\times K_h \times K_w$
Output: $N\times C_{out}\times H'\times W'$
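These dimensions can be checked with a naive batched convolution (stride 1, no padding assumed; a shape sketch, not production code):

```python
import numpy as np

N, C_in, H, W = 2, 3, 32, 32        # mini-batch of 2 RGB images
C_out, K_h, K_w = 6, 5, 5           # 6 filters of size 3 x 5 x 5
x = np.random.randn(N, C_in, H, W)
w = np.random.randn(C_out, C_in, K_h, K_w)
b = np.random.randn(C_out)          # one bias per filter

H_out, W_out = H - K_h + 1, W - K_w + 1   # 28, 28 without padding
out = np.zeros((N, C_out, H_out, W_out))
for n in range(N):                  # each image in the batch
    for c in range(C_out):          # each filter
        for i in range(H_out):
            for j in range(W_out):
                out[n, c, i, j] = np.sum(x[n, :, i:i+K_h, j:j+K_w] * w[c]) + b[c]
print(out.shape)                    # (2, 6, 28, 28) = N x C_out x H' x W'
```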
When doing convolution, the output shrinks based on the size of the filter, so we usually add zero padding around the border
- Input: W
- Filter: K
- Padding: P (zeros added on each side)
- Output: $W - K + 1 + 2P$; setting $P = (K-1)/2$ keeps the output the same size as the input
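The output-size arithmetic can be wrapped in a small helper (stride 1 assumed; the function name is illustrative):

```python
def conv_output_size(W, K, P=0):
    """Output width for input width W, filter size K, zero padding P, stride 1."""
    return W - K + 1 + 2 * P

print(conv_output_size(32, 5))        # 28: the output shrinks without padding
print(conv_output_size(32, 5, P=2))   # 32: P = (K - 1) / 2 preserves the size
```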