8 faces with fully connected networks
In this exercise we work with the 8 faces dataset. This dataset has 350 images of 8 celebrities.
To get an overview of the data, open the notebook 8 faces overview and look at the celebrities and the images.
The data is a random sample of 8 persons from the Oxford VGG Face dataset (over 2,600 persons).
For more information see: http://www.robots.ox.ac.uk/~vgg/data/vgg_face/
a) Open the notebook 8 faces only fc, build the network shown below, and then train it.
How good is the model? Look at the training, validation, and test accuracy.
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
dense_1 (Dense) (None, 8) 55304 dense_input_1[0][0]
____________________________________________________________________________________________________
activation_1 (Activation) (None, 8) 0 dense_1[0][0]
====================================================================================================
Total params: 55,304
Trainable params: 55,304
Non-trainable params: 0
____________________________________________________________________________________________________
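The parameter count in the summary above can be checked by hand: a dense layer has one weight per input-output pair plus one bias per output unit. The sketch below assumes that the images are 48 x 48 pixels with 3 colour channels, flattened to 6,912 inputs. This input size is not stated in the exercise; it is inferred from the 55,304 parameters. The helper name dense_params is made up for illustration.

```python
def dense_params(n_inputs, n_outputs):
    """Weights (n_inputs * n_outputs) plus one bias per output unit."""
    return n_inputs * n_outputs + n_outputs

# Assumption: 48 x 48 RGB images, flattened into a single input vector.
n_inputs = 48 * 48 * 3

# One dense layer mapping the flattened image to the 8 classes.
softmax_only = dense_params(n_inputs, 8)
print(n_inputs, softmax_only)
```

If the printed count matches the Param # column, the assumed input size is consistent with the summary.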
b) Now let’s add some hidden layers to the network.
Restart the notebook and build a new network with hidden layers, see below.
How good is this model? Look at the training, validation, and test accuracy.
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
dense_1 (Dense) (None, 400) 2765200 dense_input_1[0][0]
____________________________________________________________________________________________________
activation_1 (Activation) (None, 400) 0 dense_1[0][0]
____________________________________________________________________________________________________
dense_2 (Dense) (None, 200) 80200 activation_1[0][0]
____________________________________________________________________________________________________
activation_2 (Activation) (None, 200) 0 dense_2[0][0]
____________________________________________________________________________________________________
dense_3 (Dense) (None, 8) 1608 activation_2[0][0]
____________________________________________________________________________________________________
activation_3 (Activation) (None, 8) 0 dense_3[0][0]
====================================================================================================
Total params: 2,847,008
Trainable params: 2,847,008
Non-trainable params: 0
____________________________________________________________________________________________________
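The same arithmetic explains why the hidden layers raise the parameter count so sharply: almost all parameters sit in the first layer, which connects every input pixel to 400 units. A minimal sketch, again assuming a flattened 48 x 48 x 3 input (inferred, not stated in the exercise); dense_params and mlp_params are hypothetical helper names.

```python
def dense_params(n_inputs, n_outputs):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_outputs + n_outputs

def mlp_params(layer_sizes):
    """Sum dense-layer parameters over consecutive pairs of layer sizes."""
    pairs = zip(layer_sizes, layer_sizes[1:])
    return sum(dense_params(a, b) for a, b in pairs)

# Assumption: flattened 48 x 48 RGB input, then 400 -> 200 -> 8 units.
layer_sizes = [48 * 48 * 3, 400, 200, 8]

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = mlp_params(layer_sizes)
print(per_layer, total)
```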
8 faces with convolutional neural networks
Hint: training the networks takes some time because we compute only on the CPU
(up to 1 h for the last network).
a) Open the notebook 8 faces cnn, build the network shown below, and then train it.
Do you expect it to be better than the last one with only fully connected layers?
How good is the model? Look at the training, validation, and test accuracy.
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
convolution2d_1 (Convolution2D) (None, 48, 48, 32) 896 convolution2d_input_1[0][0]
____________________________________________________________________________________________________
activation_1 (Activation) (None, 48, 48, 32) 0 convolution2d_1[0][0]
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D) (None, 48, 48, 32) 9248 activation_1[0][0]
____________________________________________________________________________________________________
activation_2 (Activation) (None, 48, 48, 32) 0 convolution2d_2[0][0]
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D) (None, 24, 24, 32) 0 activation_2[0][0]
____________________________________________________________________________________________________
convolution2d_3 (Convolution2D) (None, 24, 24, 64) 18496 maxpooling2d_1[0][0]
____________________________________________________________________________________________________
activation_3 (Activation) (None, 24, 24, 64) 0 convolution2d_3[0][0]
____________________________________________________________________________________________________
convolution2d_4 (Convolution2D) (None, 24, 24, 64) 36928 activation_3[0][0]
____________________________________________________________________________________________________
activation_4 (Activation) (None, 24, 24, 64) 0 convolution2d_4[0][0]
____________________________________________________________________________________________________
maxpooling2d_2 (MaxPooling2D) (None, 12, 12, 64) 0 activation_4[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten) (None, 9216) 0 maxpooling2d_2[0][0]
____________________________________________________________________________________________________
dense_1 (Dense) (None, 200) 1843400 flatten_1[0][0]
____________________________________________________________________________________________________
activation_5 (Activation) (None, 200) 0 dense_1[0][0]
____________________________________________________________________________________________________
dense_2 (Dense) (None, 8) 1608 activation_5[0][0]
____________________________________________________________________________________________________
activation_6 (Activation) (None, 8) 0 dense_2[0][0]
====================================================================================================
Total params: 1,910,576
Trainable params: 1,910,576
Non-trainable params: 0
____________________________________________________________________________________________________
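The output shapes and parameter counts in this summary can be traced layer by layer. The sketch below assumes 3 x 3 kernels, padding that keeps the spatial size unchanged, 2 x 2 max pooling, and a 48 x 48 x 3 input. None of these settings are stated in the exercise; they are inferred from the shapes and counts in the summary. The helper names are made up for illustration.

```python
def conv_params(kernel, in_channels, out_channels):
    """One kernel x kernel filter per (input, output) channel pair, plus biases."""
    return kernel * kernel * in_channels * out_channels + out_channels

def dense_params(n_inputs, n_outputs):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_outputs + n_outputs

# Assumptions: 48 x 48 RGB input, 3 x 3 kernels, size-preserving padding,
# 2 x 2 max pooling (which halves height and width).
size, channels = 48, 3
layers = [("conv", 32), ("conv", 32), ("pool", None),
          ("conv", 64), ("conv", 64), ("pool", None)]

total = 0
for kind, out_channels in layers:
    if kind == "conv":
        total += conv_params(3, channels, out_channels)
        channels = out_channels
    else:
        size //= 2  # pooling has no parameters, it only shrinks the feature map

flat = size * size * channels  # length of the vector after Flatten
total += dense_params(flat, 200) + dense_params(200, 8)
print(flat, total)
```

Note that the four convolutional layers together hold far fewer parameters than the single dense layer after Flatten, because the convolution filters are shared across all image positions.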
b) Now let’s add the tricks which we already used on the MNIST dataset.
Restart the notebook, build the same network as above, and add dropout layers, see below.
How good is the model now? Look at the training, validation, and test accuracy.
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
convolution2d_1 (Convolution2D) (None, 48, 48, 32) 896 convolution2d_input_1[0][0]
____________________________________________________________________________________________________
activation_1 (Activation) (None, 48, 48, 32) 0 convolution2d_1[0][0]
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D) (None, 48, 48, 32) 9248 activation_1[0][0]
____________________________________________________________________________________________________
activation_2 (Activation) (None, 48, 48, 32) 0 convolution2d_2[0][0]
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D) (None, 24, 24, 32) 0 activation_2[0][0]
____________________________________________________________________________________________________
dropout_1 (Dropout) (None, 24, 24, 32) 0 maxpooling2d_1[0][0]
____________________________________________________________________________________________________
convolution2d_3 (Convolution2D) (None, 24, 24, 64) 18496 dropout_1[0][0]
____________________________________________________________________________________________________
activation_3 (Activation) (None, 24, 24, 64) 0 convolution2d_3[0][0]
____________________________________________________________________________________________________
convolution2d_4 (Convolution2D) (None, 24, 24, 64) 36928 activation_3[0][0]
____________________________________________________________________________________________________
activation_4 (Activation) (None, 24, 24, 64) 0 convolution2d_4[0][0]
____________________________________________________________________________________________________
maxpooling2d_2 (MaxPooling2D) (None, 12, 12, 64) 0 activation_4[0][0]
____________________________________________________________________________________________________
dropout_2 (Dropout) (None, 12, 12, 64) 0 maxpooling2d_2[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten) (None, 9216) 0 dropout_2[0][0]
____________________________________________________________________________________________________
dense_1 (Dense) (None, 200) 1843400 flatten_1[0][0]
____________________________________________________________________________________________________
activation_5 (Activation) (None, 200) 0 dense_1[0][0]
____________________________________________________________________________________________________
dropout_3 (Dropout) (None, 200) 0 activation_5[0][0]
____________________________________________________________________________________________________
dense_2 (Dense) (None, 8) 1608 dropout_3[0][0]
____________________________________________________________________________________________________
activation_6 (Activation) (None, 8) 0 dense_2[0][0]
====================================================================================================
Total params: 1,910,576
Trainable params: 1,910,576
Non-trainable params: 0
____________________________________________________________________________________________________
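The dropout layers add no parameters, which is why the total stays at 1,910,576. To illustrate what such a layer does during training, here is a minimal pure-Python sketch of the common "inverted dropout" formulation: each activation is set to zero with probability rate, and the surviving activations are scaled by 1 / (1 - rate) so that the expected sum is unchanged. The rate of 0.5 is an assumption; the exercise does not state which rate the notebook uses. The function is illustrative only and is not the Keras implementation.

```python
import random

def dropout(values, rate, rng, training=True):
    """Inverted dropout: zero each value with probability `rate`,
    scale the survivors by 1 / (1 - rate). Identity at test time."""
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

rng = random.Random(0)       # fixed seed so the run is repeatable
activations = [1.0] * 1000   # toy activations

dropped = dropout(activations, rate=0.5, rng=rng)
kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
print(kept_fraction)

# At test time the layer passes its input through unchanged.
unchanged = dropout(activations, rate=0.5, rng=rng, training=False)
```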
c) Finally, add batch normalization to your network.
Restart the notebook, build the same network as above, and add batch normalization layers, see below.
How good is the model now? Look at the training, validation, and test accuracy.
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
convolution2d_1 (Convolution2D) (None, 48, 48, 32) 896 convolution2d_input_1[0][0]
____________________________________________________________________________________________________
batchnormalization_1 (BatchNorma (None, 48, 48, 32) 128 convolution2d_1[0][0]
____________________________________________________________________________________________________
activation_1 (Activation) (None, 48, 48, 32) 0 batchnormalization_1[0][0]
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D) (None, 48, 48, 32) 9248 activation_1[0][0]
____________________________________________________________________________________________________
batchnormalization_2 (BatchNorma (None, 48, 48, 32) 128 convolution2d_2[0][0]
____________________________________________________________________________________________________
activation_2 (Activation) (None, 48, 48, 32) 0 batchnormalization_2[0][0]
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D) (None, 24, 24, 32) 0 activation_2[0][0]
____________________________________________________________________________________________________
dropout_1 (Dropout) (None, 24, 24, 32) 0 maxpooling2d_1[0][0]
____________________________________________________________________________________________________
convolution2d_3 (Convolution2D) (None, 24, 24, 64) 18496 dropout_1[0][0]
____________________________________________________________________________________________________
batchnormalization_3 (BatchNorma (None, 24, 24, 64) 256 convolution2d_3[0][0]
____________________________________________________________________________________________________
activation_3 (Activation) (None, 24, 24, 64) 0 batchnormalization_3[0][0]
____________________________________________________________________________________________________
convolution2d_4 (Convolution2D) (None, 24, 24, 64) 36928 activation_3[0][0]
____________________________________________________________________________________________________
batchnormalization_4 (BatchNorma (None, 24, 24, 64) 256 convolution2d_4[0][0]
____________________________________________________________________________________________________
activation_4 (Activation) (None, 24, 24, 64) 0 batchnormalization_4[0][0]
____________________________________________________________________________________________________
maxpooling2d_2 (MaxPooling2D) (None, 12, 12, 64) 0 activation_4[0][0]
____________________________________________________________________________________________________
dropout_2 (Dropout) (None, 12, 12, 64) 0 maxpooling2d_2[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten) (None, 9216) 0 dropout_2[0][0]
____________________________________________________________________________________________________
dense_1 (Dense) (None, 200) 1843400 flatten_1[0][0]
____________________________________________________________________________________________________
batchnormalization_5 (BatchNorma (None, 200) 800 dense_1[0][0]
____________________________________________________________________________________________________
activation_5 (Activation) (None, 200) 0 batchnormalization_5[0][0]
____________________________________________________________________________________________________
dropout_3 (Dropout) (None, 200) 0 activation_5[0][0]
____________________________________________________________________________________________________
dense_2 (Dense) (None, 8) 1608 dropout_3[0][0]
____________________________________________________________________________________________________
activation_6 (Activation) (None, 8) 0 dense_2[0][0]
====================================================================================================
Total params: 1,912,144
Trainable params: 1,911,360
Non-trainable params: 784
____________________________________________________________________________________________________
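The 784 non-trainable parameters in this summary come from the batch normalization layers. Each such layer keeps four values per feature: a scale and a shift, which are learned, and a moving mean and a moving variance, which are updated during training but not by gradient descent. The sketch below reproduces the totals under that assumption; the feature counts are read off the summary above, and batchnorm_params is a hypothetical helper name.

```python
def batchnorm_params(n_features):
    """Return (trainable, non_trainable) parameter counts of one layer."""
    trainable = 2 * n_features      # scale (gamma) and shift (beta)
    non_trainable = 2 * n_features  # moving mean and moving variance
    return trainable, non_trainable

# Features normalized by the five batch normalization layers:
# channels of the four convolutions, then the 200 dense units.
bn_features = [32, 32, 64, 64, 200]

base_total = 1_910_576  # parameters of the network without batch normalization

trainable_bn = sum(batchnorm_params(n)[0] for n in bn_features)
non_trainable_bn = sum(batchnorm_params(n)[1] for n in bn_features)

total = base_total + trainable_bn + non_trainable_bn
trainable = base_total + trainable_bn
print(total, trainable, non_trainable_bn)
```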