Deep Learning with Python - Chapters 7 and 8

The chapters "Working with Keras: A Deep Dive" and "Introduction to Deep Learning for Computer Vision" cover the Keras library in some depth and introduce convolutional neural networks for image classification, respectively.

Working with Keras: A Deep Dive

In this chapter, you'll get a complete overview of the key ways to work with Keras APIs.

The Keras API is guided by the principle of progressive disclosure of complexity: make it easy to get started, yet make it possible to handle high-complexity use cases, requiring only incremental learning at each step. Simple use cases should be easy and approachable, and arbitrarily advanced workflows should be possible: no matter how niche and complex the thing you want to do, there should be a clear path to it.

Three APIs for building models in Keras:

  • The Sequential model: the most approachable API. It's limited to simple stacks of layers.
  • The Functional API: focuses on graph-like model architectures. It represents a nice midpoint between usability and flexibility, and as such, it's the most commonly used model-building API.
  • Model subclassing: a low-level option where you write everything yourself from scratch. This is ideal if you want full control over every little thing.
"""
The Sequential Model
"""
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(64,activation="relu"),
    layers.Dense(10, activation="softmax")
])
"""
- It's also possible to build the same model incrementally via the add() method
- The layers only get built (i.e., their weights get created) when they are called for the first time, because the shape of a layer's weights depends on the shape of its input
"""
model = keras.Sequential()
model.add(layers.Dense(64,activation="relu"))
model.add(layers.Dense(10, activation="softmax"))

model.build(input_shape=(None,3))
print(model.weights) # Retrieving the model's weights
out[2]

[<KerasVariable shape=(3, 64), dtype=float32, path=sequential_1/dense_2/kernel>, <KerasVariable shape=(64,), dtype=float32, path=sequential_1/dense_2/bias>, <KerasVariable shape=(64, 10), dtype=float32, path=sequential_1/dense_3/kernel>, <KerasVariable shape=(10,), dtype=float32, path=sequential_1/dense_3/bias>]

"""
After the model is built, you can display its contents via the summary() method
"""
print(model.summary())
"""
You can give names to everything in Keras - every model, every layer
"""
model = keras.Sequential(name="my_example_model")
model.add(layers.Dense(64,activation="relu",name="my_first_layer"))
model.add(layers.Dense(10, activation="softmax",name="my_second_layer"))
model.build(input_shape=(None,3))
print(model.summary())
"""
There is a way to have the Sequential model built on the fly: do this via the Input class
"""
model = keras.Sequential()
# Use the Input class to declare the shape of the inputs. Note that the shape argument must be the shape of each sample, not the shape of one batch
model.add(keras.Input(shape=(3,)))
model.add(layers.Dense(64,activation="relu",name="a"))
model.add(layers.Dense(10, activation="softmax",name="b"))
print(model.summary())
out[3]

Model: "sequential_1"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓

┃ Layer (type)  ┃ Output Shape  ┃  Param # ┃

┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩

│ dense_2 (Dense) │ (None, 64) │ 256 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ dense_3 (Dense) │ (None, 10) │ 650 │

└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘

 Total params: 906 (3.54 KB)

 Trainable params: 906 (3.54 KB)

 Non-trainable params: 0 (0.00 B)

None

Model: "my_example_model"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓

┃ Layer (type)  ┃ Output Shape  ┃  Param # ┃

┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩

│ my_first_layer (Dense) │ (None, 64) │ 256 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ my_second_layer (Dense) │ (None, 10) │ 650 │

└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘

 Total params: 906 (3.54 KB)

 Trainable params: 906 (3.54 KB)

 Non-trainable params: 0 (0.00 B)

None

Model: "sequential_2"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓

┃ Layer (type)  ┃ Output Shape  ┃  Param # ┃

┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩

│ a (Dense) │ (None, 64) │ 256 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ b (Dense) │ (None, 10) │ 650 │

└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘

 Total params: 906 (3.54 KB)

 Trainable params: 906 (3.54 KB)

 Non-trainable params: 0 (0.00 B)

None

"""
Most Keras models in the wild use the Functional API (the Sequential API
is too simplistic to represent most models in the wild).
"""
# Start by declaring an Input
# It holds information about the shape and dtype of the data that the model will process
# Such an object is called a *symbolic tensor*. It doesn't contain any actual data,
# but it encodes the specifications of the actual tensors of data that the model will see
# when you use it. It *stands for* future tensors of data
inputs = keras.Input(shape=(3,),name="my_input")
print("Input Shape:",inputs.shape)
print("Input dtype:",inputs.dtype)
# Create a layer and call it on the input
# All Keras layers can be called on real tensors of data and on symbolic tensors
# In the latter case, they return a new symbolic tensor, with updated shape and dtype information
features = layers.Dense(64,activation="relu")(inputs)
print("Features Shape:",features.shape)
# After obtaining the final outputs, we instantiate the model by specifying
# its inputs and outputs in the `Model` constructor
outputs = layers.Dense(10,activation="softmax")(features)
model = keras.Model(inputs=inputs,outputs=outputs)
print(model.summary())
out[4]

Input Shape: (None, 3)
Input dtype: float32
Features Shape: (None, 64)

Model: "functional_4"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓

┃ Layer (type)  ┃ Output Shape  ┃  Param # ┃

┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩

│ my_input (InputLayer) │ (None, 3) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ dense_4 (Dense) │ (None, 64) │ 256 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ dense_5 (Dense) │ (None, 10) │ 650 │

└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘

 Total params: 906 (3.54 KB)

 Trainable params: 906 (3.54 KB)

 Non-trainable params: 0 (0.00 B)

None

"""
- Most deep learning models don't look like lists - they look like graphs
- For instance, they may have multiple inputs and multiple outputs

Example of a multi-input, multi-output Functional model:
"""
vocabulary_size = 10_000
num_tags = 100
num_departments = 4

"""
Define the model inputs
"""
title = keras.Input(shape=(vocabulary_size,),name="title")
text_body = keras.Input(shape=(vocabulary_size,),name="text_body")
tags = keras.Input(shape=(num_tags,),name="tags")

# Combine input features into a single tensor, features, by concatenating them
features = layers.Concatenate()([title, text_body, tags])
# Apply an intermediate layer to recombine input features into richer representations
features = layers.Dense(64,activation="relu")(features)

# Define the Model Outputs
priority = layers.Dense(1, activation="sigmoid", name="priority")(features)
department = layers.Dense(num_departments, activation="softmax",name="department")(features)

# Create the model by specifying its inputs and outputs
model = keras.Model(inputs=[title,text_body,tags], outputs=[priority,department])

"""
Training a Multi-Input, Multi-Output model
"""
import numpy as np
num_samples = 1280

# Dummy Input Data
title_data = np.random.randint(0,2, size=(num_samples, vocabulary_size))
text_body_data = np.random.randint(0, 2, size=(num_samples, vocabulary_size))
tags_data = np.random.randint(0, 2, size=(num_samples, num_tags))

# Dummy Target Data
priority_data = np.random.random(size=(num_samples, 1))
department_data = np.random.randint(0,2, size=(num_samples, num_departments))

model.compile(optimizer="rmsprop", loss=["mean_squared_error", "categorical_crossentropy"], metrics=[["mean_absolute_error"], ["accuracy"]])

model.fit([title_data, text_body_data, tags_data],[priority_data, department_data],epochs=1)

model.evaluate([title_data, text_body_data, tags_data],[priority_data, department_data])
priority_preds, department_preds = model.predict([title_data, text_body_data, tags_data])
out[5]

40/40 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - department_accuracy: 0.1615 - loss: 44.3773 - priority_mean_absolute_error: 0.4680
40/40 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - department_accuracy: 0.2143 - loss: 8.9119 - priority_mean_absolute_error: 0.4862
40/40 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step

# Visualize the connectivity of the model just defined - the *topology* of the
# model
# The None entries in the tensor shapes represent the batch size: this model
# allows batches of any size
keras.utils.plot_model(model,"ticket_classifier.png",show_shapes=True)
out[6]
Jupyter Notebook Image

<IPython.core.display.Image object>

Access to layer connectivity means that you can inspect and reuse individual nodes (layer calls) in the graph. The model.layers property provides the list of layers that make up the model, and for each layer you can query layer.input and layer.output.

This enables you to do feature extraction: creating models that reuse intermediate features of another model.

print(model.layers)
"""
Creating a new model by reusing intermediate layer outputs
"""
features = model.layers[4].output
difficulty = layers.Dense(3, activation="softmax", name="difficulty")(features)
new_model = keras.Model(
    inputs=[title, text_body, tags],
    outputs=[priority, department, difficulty]
)
# Plotting the model
keras.utils.plot_model(new_model, "updated_ticket_classifier.png",show_shapes=True)
out[8]

[<InputLayer name=title, built=True>, <InputLayer name=text_body, built=True>, <InputLayer name=tags, built=True>, <Concatenate name=concatenate, built=True>, <Dense name=dense_6, built=True>, <Dense name=priority, built=True>, <Dense name=department, built=True>]

Jupyter Notebook Image

<IPython.core.display.Image object>

Subclassing the Model class is pretty similar to creating custom layers:

  • In the __init__() method, define the layers the model will use
  • In the call() method, define the forward pass of the model, reusing the layers previously created.
  • Instantiate your subclass, and call it on data to create its weights.

What's the difference between a Layer subclass and a Model subclass? A "layer" is a building block you use to create models, and a "model" is the top-level object that you will actually train, export for inference, etc. In short, a Model has fit(), evaluate(), and predict() methods (and you can also save a model).

Which Model to Use?

In general, the Functional API provides you with a good trade-off between ease of use and flexibility. It also gives you direct access to layer connectivity, which is very powerful for use cases such as model plotting or feature extraction.

"""
A Simple subclassed model
"""
class CustomerTicketModel(keras.Model):
  def __init__(self,num_departments):
    super().__init__() # Don't forget to call super()
    """
    Define the sublayers in the constructor
    """
    self.concat_layer = layers.Concatenate()
    self.mixing_layer = layers.Dense(64,activation="relu")
    self.priority_scorer = layers.Dense(1,activation="sigmoid")
    self.department_classifier = layers.Dense(num_departments, activation="softmax")
  def call(self,inputs):
    """
    Define the forward pass in the call() method
    """
    title = inputs["title"]
    text_body = inputs["text_body"]
    tags = inputs["tags"]
    features = self.concat_layer([title, text_body, tags])
    features = self.mixing_layer(features)
    priority = self.priority_scorer(features)
    department = self.department_classifier(features)
    return priority, department

model = CustomerTicketModel(num_departments=4)

priority, department = model({"title": title_data, "text_body": text_body_data, "tags": tags_data })
out[10]
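Because the subclassed model is a regular Keras model, it can be compiled and trained much like the Functional version. A minimal sketch, reusing the dummy data defined above; the losses and metrics are listed in the same order as the outputs returned by call():

# Sketch: compiling and fitting the subclassed model on the dummy data
model.compile(optimizer="rmsprop",
              loss=["mean_squared_error", "categorical_crossentropy"],
              metrics=[["mean_absolute_error"], ["accuracy"]])
model.fit({"title": title_data,
           "text_body": text_body_data,
           "tags": tags_data},
          [priority_data, department_data],
          epochs=1)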

Using Built-in Training and Evaluation Loops

There are a couple of ways you can customize a simple workflow:

  • Provide your own custom metrics
  • Pass callbacks to the fit() method to schedule actions to be taken at specific points during training

A Keras metric is a subclass of the keras.metrics.Metric class. A metric has an internal state stored in TensorFlow variables.

"""
Example custom metric that measures RMSE
"""
import tensorflow as tf

class RootMeanSquaredError(keras.metrics.Metric): # Subclass the Metric class
  def __init__(self,name="rmse",**kwargs):
    """
    Define the state variables in the constructor. Like for layers, you have access to the add_weight() method
    """
    super().__init__(name=name,**kwargs)
    self.mse_sum = self.add_weight(name="mse_sum",initializer="zeros")
    self.total_samples = self.add_weight(name="total_samples",initializer="zeros",dtype="int32")

  def update_state(self,y_true,y_pred,sample_weight=None):
    """
    Implement the state update logic in update_state(). The y_true argument is the targets (or labels) for one batch, while y_pred represents the corresponding predictions from the model. You can ignore the sample_weight argument - we won't use it here.
    To match our MNIST model, we expect categorical predictions and integer labels
    """
    y_true = tf.one_hot(y_true, depth=tf.shape(y_pred)[1])
    mse = tf.reduce_sum(tf.square(y_true - y_pred))
    self.mse_sum.assign_add(mse)
    num_samples = tf.shape(y_pred)[0]
    self.total_samples.assign_add(num_samples)

  def result(self):
    """
    Use the `result()` method to return the current value of the metric
    """
    return tf.sqrt(self.mse_sum / tf.cast(self.total_samples, tf.float32 ))

  def reset_state(self):
    """
    You need to expose a way to reset the metric state without having to reinstantiate it - this enables the same metric objects to be used across different epochs of training, or across both training and evaluation. This is done with the reset_state() method.
    """
    self.mse_sum.assign(0.)
    self.total_samples.assign(0)

out[12]
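The custom metric can then be passed to compile() like any built-in metric. A minimal sketch, assuming a typical MNIST-style classifier named model (not defined in these notes):

# Hypothetical usage: `model` is assumed to be a 10-class softmax classifier
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy", RootMeanSquaredError()])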

Using Callbacks

A callback is an object (a class instance implementing specific methods) that is passed to the model in the call to fit() and that is called by the model at various points during training. It has access to all the available data about the state of the model and its performance, and it can take action: interrupt training, save a model, load a different weight set, or otherwise alter the state of the model:

  • Model Checkpointing: saving the current state of the model at different points during training
  • Early Stopping: Interrupting training when the validation loss is no longer improving (and saving the best model obtained during training)
  • Dynamically adjusting the value of certain parameters during training: for example, the learning rate of the optimizer
  • Logging training and validation metrics during training, or visualizing the representations learned by the model as they're updated: the fit() progress bar is actually a callback. The fit() method takes a callbacks keyword argument that accepts a list of callbacks.

Built-in Callbacks:

  • keras.callbacks.ModelCheckpoint
    • Lets you continually save the model during training
  • keras.callbacks.EarlyStopping
    • Stop training when the validation loss is no longer improving
    • The callback interrupts training once a target metric being monitored has stopped improving for a fixed number of epochs. Typically used in combination with ModelCheckpoint
  • keras.callbacks.LearningRateScheduler
  • keras.callbacks.ReduceLROnPlateau
  • keras.callbacks.CSVLogger
  • keras.callbacks.TensorBoard: use tensorboard, specify where to write logs
# Example list of callbacks
callbacks_list = [
    keras.callbacks.EarlyStopping(
        monitor="val_accuracy",
        patience=2,
    ),
    keras.callbacks.ModelCheckpoint(
        filepath="checkpoint_path.keras",
        monitor="val_loss",
        save_best_only=True,
    ),
    keras.callbacks.TensorBoard(
        log_dir="/full_path_to_your_log_dir",
    ),
]
Writing Your Own Callbacks

You can write your own callbacks by subclassing the keras.callbacks.Callback class. You can then implement any of the following methods, which are called at various points during training (a minimal sketch follows the list):

  • on_epoch_begin(epoch, logs): called at the start of every epoch
  • on_epoch_end(epoch, logs): called at the end of every epoch
  • on_batch_begin(batch, logs): called right before processing each batch
  • on_batch_end(batch, logs): called right after processing each batch
  • on_train_begin(logs): called at start of training
  • on_train_end(logs): called at end of training
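
A minimal sketch of a custom callback, using only the standard Callback API; it records the per-batch training loss and prints a summary at the end of each epoch (the class name and behavior are made up for illustration):

from tensorflow import keras

class BatchLossLogger(keras.callbacks.Callback):
  def on_train_begin(self, logs=None):
    self.per_batch_losses = []  # reset the log at the start of training

  def on_batch_end(self, batch, logs=None):
    # logs holds the metrics computed on the batch that just finished
    self.per_batch_losses.append(logs.get("loss"))

  def on_epoch_end(self, epoch, logs=None):
    mean_loss = sum(self.per_batch_losses) / len(self.per_batch_losses)
    print(f"Epoch {epoch}: mean per-batch loss = {mean_loss:.4f}")
    self.per_batch_losses = []

# Hypothetical usage: model.fit(..., callbacks=[BatchLossLogger()])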

The Loop of Progress

Add the @tf.function decorator to any function you want to compile. Compiling your TensorFlow code into a computation graph lets it be globally optimized in a way that code interpreted line by line cannot be, which makes it run faster.
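
A minimal sketch of the idea (the function and tensors here are made up for illustration):

import tensorflow as tf

@tf.function  # compile this function into a TensorFlow graph on first call
def dense_forward(inputs, W, b):
  # A tiny forward pass; the traced graph is reused on subsequent calls
  return tf.nn.relu(tf.matmul(inputs, W) + b)

x = tf.random.normal((4, 3))
W = tf.random.normal((3, 2))
b = tf.zeros((2,))
print(dense_forward(x, W, b).shape)  # (4, 2)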

Introduction to Deep Learning for Computer Vision

Computer vision is the earliest and biggest success story of deep learning.

"""
Instantiating a small convnet
"""
from tensorflow import keras
from tensorflow.keras import layers
inputs = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(inputs)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)
outputs = layers.Dense(10, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
out[14]
print(model.summary())
out[15]

Model: "functional_7"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓

┃ Layer (type)  ┃ Output Shape  ┃  Param # ┃

┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩

│ input_layer_3 (InputLayer) │ (None, 28, 28, 1) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ conv2d (Conv2D) │ (None, 26, 26, 32) │ 320 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ max_pooling2d (MaxPooling2D) │ (None, 13, 13, 32) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ conv2d_1 (Conv2D) │ (None, 11, 11, 64) │ 18,496 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ max_pooling2d_1 (MaxPooling2D) │ (None, 5, 5, 64) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ conv2d_2 (Conv2D) │ (None, 3, 3, 128) │ 73,856 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ flatten (Flatten) │ (None, 1152) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ dense_10 (Dense) │ (None, 10) │ 11,530 │

└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘

 Total params: 104,202 (407.04 KB)

 Trainable params: 104,202 (407.04 KB)

 Non-trainable params: 0 (0.00 B)

None

"""
Training the convnet on MNIST images (p. 229)
"""
from tensorflow.keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype("float32") / 255
test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype("float32") / 255
model.compile(optimizer="rmsprop",
 loss="sparse_categorical_crossentropy",
 metrics=["accuracy"])
model.fit(train_images, train_labels, epochs=5, batch_size=64)
out[16]

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Epoch 1/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 6s 4ms/step - accuracy: 0.8693 - loss: 0.3962
Epoch 2/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 7s 3ms/step - accuracy: 0.9844 - loss: 0.0510
Epoch 3/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.9907 - loss: 0.0304
Epoch 4/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.9923 - loss: 0.0227
Epoch 5/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - accuracy: 0.9945 - loss: 0.0181

<keras.src.callbacks.history.History at 0x781920337c70>

"""
Evaluating the Convnet
"""
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"Test accuracy: {test_acc:.3f}")
out[17]

313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.9902 - loss: 0.0324
Test accuracy: 0.992

The Convolutional Operation

The fundamental difference between a densely connected layer and a convolutional layer is this: Dense layers learn global patterns in their input feature space (for example, for an MNIST digit, patterns involving all pixels), whereas convolutional layers learn local patterns - in the case of images, patterns found in small 2D windows of the inputs.

Local Patterns of Convnets

Convnets have two interesting properties:

  • The patterns they learn are translation-invariant: after learning a pattern in the lower-right corner of a picture, a convnet can recognize it anywhere, for example in the upper-right corner. This makes convnets data-efficient when processing images, because the visual world is fundamentally translation-invariant.
  • They can learn spatial hierarchies of patterns: a first convolution layer will learn small local patterns such as edges, a second convolution layer will learn larger patterns made up of features from the first layer, and so on. This allows convnets to efficiently learn increasingly complex and abstract visual concepts, because the visual world is fundamentally spatially hierarchical.

Spatial Hierarchy of Visual Modules

Convolutions operate over rank-3 tensors called feature maps, with two spatial axes (height and width) as well as a depth axis (also called the channels axis). The convolution operation extracts patches from its input feature map and applies the same transformation to all of these patches, producing an output feature map. The output feature map is still a rank-3 tensor, but the different channels in its depth axis no longer stand for specific colors as in RGB input; rather, they stand for filters. Filters encode specific aspects of the input data: a single filter could encode the concept "presence of a face in the input".

Each channel of the output is a response map of one filter over the input, indicating the response of that filter pattern at different locations in the input.

Response Map

That is what the term feature map means: every dimension in the depth axis is a feature (or filter), and the rank-2 tensor output[:, :, n] is the 2D spatial map of the response of this filter over the input.

  • Convolutions are defined by two key parameters, Conv2D(output_depth, (window_height, window_width)):
    • Size of the patches extracted from the inputs: typically 3×3 or 5×5 (window_height, window_width)
    • Depth of the output feature map: the number of filters computed by the convolution (output_depth)

A convolution works by sliding these windows of size 3×3 or 5×5 over the 3D input feature map, stopping at every possible location, and extracting the 3D patch of surrounding features (of shape (window_height, window_width, input_depth)). Each such 3D patch is then transformed into a 1D vector of shape (output_depth,), which is done via a tensor product with a learned weight matrix, called the convolution kernel - the same kernel is reused across every patch. The vectors are then spatially reassembled into a 3D output of shape (height, width, output_depth).
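
To make the mechanics concrete, here is a naive NumPy sketch of this operation (stride 1, no padding; the function name is made up, and the shapes follow the description above):

import numpy as np

def naive_conv2d(feature_map, kernel):
  # feature_map: (height, width, input_depth)
  # kernel: (window_height, window_width, input_depth, output_depth)
  h, w, _ = feature_map.shape
  wh, ww, _, output_depth = kernel.shape
  out_h, out_w = h - wh + 1, w - ww + 1
  output = np.zeros((out_h, out_w, output_depth))
  for i in range(out_h):
    for j in range(out_w):
      patch = feature_map[i:i + wh, j:j + ww, :]  # extract the 3D patch
      # Tensor product of the patch with the kernel -> vector of shape (output_depth,)
      output[i, j, :] = np.tensordot(patch, kernel, axes=([0, 1, 2], [0, 1, 2]))
  return output

print(naive_conv2d(np.random.random((5, 5, 3)), np.random.random((3, 3, 3, 10))).shape)  # (3, 3, 10)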

How Convolution Works

Padding consists of adding an appropriate number of rows and columns on each side of the input feature map so as to make it possible to fit centered convolution windows around every input tile. In Conv2D layers, padding can be configured with the padding argument, which can be set to "valid", meaning use no padding (only valid window locations will be used), or "same", meaning pad in such a way as to have an output with the same width and height as the input.

Valid Locations of 3×3 Patches in a 5×5 Feature Map

Padding a 5×5 Input

The distance between two successive windows is a parameter of the convolution called its stride, which defaults to 1. It's possible to have a strided convolution: a convolution with a stride higher than 1. Strided convolutions are rarely used in classification models, but they come in handy in some other types of problems. In classification models, instead of strides, we tend to use a max-pooling operation to downsample feature maps.
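
A quick sketch of how the padding and strides arguments affect the output shape of a Conv2D layer (the input tensor here is arbitrary dummy data):

import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal((1, 28, 28, 3))  # a single dummy 28x28 RGB image
print(layers.Conv2D(32, 3, padding="valid")(x).shape)  # (1, 26, 26, 32) - no padding
print(layers.Conv2D(32, 3, padding="same")(x).shape)   # (1, 28, 28, 32) - output keeps input size
print(layers.Conv2D(32, 3, strides=2)(x).shape)        # (1, 13, 13, 32) - strided convolution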

The role of max-pooling layers is to aggressively downsample feature maps, much like strided convolutions. Max pooling consists of extracting windows from the input feature map and outputting the max value of each channel. A big difference from convolution is that max pooling is usually done with 2×2 windows and stride 2, in order to downsample the feature maps by a factor of 2, whereas convolution is typically done with 3×3 windows and no stride.

The reason to use downsampling is to reduce the number of feature-map coefficients to process, as well as to induce spatial-filter hierarchies by making successive convolution layers look at increasingly large windows. Features tend to encode the spatial presence of some pattern or concept over the different tiles of the feature map (hence the term feature map), and it's more informative to look at the maximal presence of different features than at their average presence.
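
And a matching NumPy sketch of 2×2 max pooling with stride 2 (again, the function is an illustrative stand-in, not a library API):

import numpy as np

def naive_max_pool2d(feature_map, pool_size=2):
  # feature_map: (height, width, depth); non-overlapping pool_size x pool_size windows
  h, w, depth = feature_map.shape
  out_h, out_w = h // pool_size, w // pool_size
  output = np.zeros((out_h, out_w, depth))
  for i in range(out_h):
    for j in range(out_w):
      window = feature_map[i * pool_size:(i + 1) * pool_size,
                           j * pool_size:(j + 1) * pool_size, :]
      output[i, j, :] = window.max(axis=(0, 1))  # max over the spatial window, per channel
  return output

print(naive_max_pool2d(np.random.random((26, 26, 32))).shape)  # (13, 13, 32)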

Deep learning models are repurposable by nature: you can take an image-classification or speech-to-text model trained on a large-scale dataset and reuse it on a significantly different problem with only minor changes. Specifically in the case of computer vision, many pretrained models are now publicly available for download and can be used to bootstrap powerful vision models out of very little data.

"""
1. Download the Kaggle API key (kaggle.json) from the Kaggle website (Settings)
"""
from google.colab import files
files.upload()
out[19]

<IPython.core.display.HTML object>

Saving kaggle.json to kaggle.json

{'kaggle.json': b'{"username":"civgaugeinc","key":"5daa4158fa6523671fb71d3de2eb1dcb"}'}

!mkdir ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
!kaggle competitions download -c dogs-vs-cats
out[20]

chmod: invalid mode: ‘kaggle.json’
Try 'chmod --help' for more information.
Warning: Your Kaggle API key is readable by other users on this system! To fix this, you can run 'chmod 600 /root/.kaggle/kaggle.json'
Downloading dogs-vs-cats.zip to /content
100% 809M/812M [00:09<00:00, 131MB/s]
100% 812M/812M [00:09<00:00, 86.7MB/s]

# Uncompress (unzip) the training data, which comes in a zip folder, silently (-qq)
!unzip -qq "/content/dogs-vs-cats.zip"
out[21]
!unzip -qq "/content/test1.zip"
out[22]
!unzip -qq "/content/train.zip"
out[23]
"""
- Dogs vs Cats Dataset (https://www.kaggle.com/competitions/dogs-vs-cats/overview)
- The dataset contains 25,000 images of dogs and cats (12,500 from each class) and is 543 MB (compressed)
"""

"""
Copying images to training, validation, and test directories
"""
import os, shutil, pathlib
original_dir = pathlib.Path("/content/train") # Path to the directory where the original dataset was uncompressed
new_base_dir = pathlib.Path("/content/cats_vs_dogs_small") # Directory where we will store our smaller dataset
out[24]
def make_subset(subset_name, start_index, end_index):
  """
  Utility function to copy cat (and dog) images from index start_index to index end_index to the subdirectory new_base_dir/{subset_name}/cat (or dog).
  """
  for category in ("cat", "dog"):
    dir = new_base_dir / subset_name / category
    os.makedirs(dir)
    fnames = [f"{category}.{i}.jpg" for i in range(start_index, end_index)]
    for i, fname in enumerate(fnames):
      shutil.copyfile(src=original_dir / fname, dst=dir / fname)
# Create a training subset with the first 1_000 images of each category
make_subset("train", start_index=0, end_index=1000)
# Create the validation subset with the next 500 images of each category
make_subset("validation", start_index=1000, end_index=1500)
# Create the test subset with the next 1_000 images of each category
make_subset("test", start_index=1500, end_index=2500)
out[25]
from tensorflow import keras
from tensorflow.keras import layers
"""
The convnet will be a stack of alternated Conv2D and MaxPooling2D layers
- The depth of the feature maps progressively increases in the model, from 32 to 256, whereas the size of the feature maps decreases from 180×180 to 7×7. This is a pattern you will see in almost all convnets
"""
# The model expects RGB images of size 180 x 180
inputs = keras.Input(shape=(180,180,3))
# Rescale inputs to [0,1] range by dividing them by 255
x = layers.Rescaling(1./255)(inputs)
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
out[26]
print(model.summary())
out[27]

Model: "functional_8"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓

┃ Layer (type)  ┃ Output Shape  ┃  Param # ┃

┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩

│ input_layer_4 (InputLayer) │ (None, 180, 180, 3) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ rescaling (Rescaling) │ (None, 180, 180, 3) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ conv2d_3 (Conv2D) │ (None, 178, 178, 32) │ 896 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ max_pooling2d_2 (MaxPooling2D) │ (None, 89, 89, 32) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ conv2d_4 (Conv2D) │ (None, 87, 87, 64) │ 18,496 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ max_pooling2d_3 (MaxPooling2D) │ (None, 43, 43, 64) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ conv2d_5 (Conv2D) │ (None, 41, 41, 128) │ 73,856 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ max_pooling2d_4 (MaxPooling2D) │ (None, 20, 20, 128) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ conv2d_6 (Conv2D) │ (None, 18, 18, 256) │ 295,168 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ max_pooling2d_5 (MaxPooling2D) │ (None, 9, 9, 256) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ conv2d_7 (Conv2D) │ (None, 7, 7, 256) │ 590,080 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ flatten_1 (Flatten) │ (None, 12544) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ dense_11 (Dense) │ (None, 1) │ 12,545 │

└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘

 Total params: 991,041 (3.78 MB)

 Trainable params: 991,041 (3.78 MB)

 Non-trainable params: 0 (0.00 B)

None

model.compile(loss="binary_crossentropy",optimizer="rmsprop",metrics=["accuracy"])
out[28]

Data Preprocessing

The steps to convert jpg files to appropriate format:

  1. Read the picture files
  2. Decode the JPEG content to an RGB grid of pixels
  3. Convert these into floating point tensors
  4. Resize them to a shared size (we use 180x180)
  5. Pack them into batches (we use batches of 32 images)

Keras has utilities to take care of these steps automatically. The utility function image_dataset_from_directory() lets you set up a quick data pipeline that can automatically turn image files on disk into batches of preprocessed tensors.

The Dataset object:

The Dataset object is an iterator: you can pass it directly to the fit() method of a Keras model. It handles many features that would otherwise be cumbersome to implement yourself - in particular, asynchronous data prefetching (preprocessing the next batch of data while the previous one is being handled by the model, which keeps execution flowing without interruptions).
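
A small illustration of the Dataset idiom on plain NumPy data (not specific to images; the array here is dummy data):

import numpy as np
import tensorflow as tf

random_numbers = np.random.normal(size=(1000, 16))
dataset = tf.data.Dataset.from_tensor_slices(random_numbers)  # one element per row
batched_dataset = dataset.batch(32)
for i, element in enumerate(batched_dataset):
  print(element.shape)  # (32, 16), except possibly a smaller final batch
  if i >= 2:
    break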

from tensorflow.keras.utils import image_dataset_from_directory
train_dataset = image_dataset_from_directory(new_base_dir / "train",image_size=(180, 180),batch_size=32)
validation_dataset = image_dataset_from_directory(new_base_dir / "validation",image_size=(180, 180),batch_size=32)
test_dataset = image_dataset_from_directory(new_base_dir / "test",image_size=(180, 180),batch_size=32)
out[30]

Found 2000 files belonging to 2 classes.
Found 1000 files belonging to 2 classes.
Found 2000 files belonging to 2 classes.

for data_batch, labels_batch in train_dataset:
  print("data batch shape:",data_batch.shape)
  print("labels batch shape:",labels_batch.shape)
  break
out[31]

data batch shape: (32, 180, 180, 3)
labels batch shape: (32,)

# Save the model during training (only when val_loss improves)
callbacks = [
    keras.callbacks.ModelCheckpoint(
        filepath="/content/models/convert_from_scrath.keras",
        save_best_only=True,
        monitor="val_loss" # Overwrite the current file when the current value of the val_loss metric is lower than at any previous time during training.
    )
]

history = model.fit(train_dataset,epochs=30,validation_data=validation_dataset,callbacks=callbacks)
out[32]

Epoch 1/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 16s 170ms/step - accuracy: 0.5197 - loss: 0.7321 - val_accuracy: 0.5000 - val_loss: 0.6919
Epoch 2/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 108ms/step - accuracy: 0.5082 - loss: 0.6999 - val_accuracy: 0.5000 - val_loss: 0.7845
Epoch 3/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 54ms/step - accuracy: 0.5401 - loss: 0.7019 - val_accuracy: 0.5970 - val_loss: 0.6754
Epoch 4/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 84ms/step - accuracy: 0.5826 - loss: 0.6841 - val_accuracy: 0.5880 - val_loss: 0.6631
Epoch 5/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 105ms/step - accuracy: 0.6437 - loss: 0.6369 - val_accuracy: 0.5840 - val_loss: 0.7948
Epoch 6/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 54ms/step - accuracy: 0.6676 - loss: 0.6294 - val_accuracy: 0.6550 - val_loss: 0.6178
Epoch 7/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 53ms/step - accuracy: 0.7004 - loss: 0.5722 - val_accuracy: 0.6670 - val_loss: 0.6414
Epoch 8/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 92ms/step - accuracy: 0.7025 - loss: 0.5831 - val_accuracy: 0.6700 - val_loss: 0.6234
Epoch 9/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 52ms/step - accuracy: 0.7271 - loss: 0.5348 - val_accuracy: 0.6370 - val_loss: 0.6707
Epoch 10/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 53ms/step - accuracy: 0.7616 - loss: 0.4783 - val_accuracy: 0.6740 - val_loss: 0.6416
Epoch 11/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 9s 107ms/step - accuracy: 0.7753 - loss: 0.4712 - val_accuracy: 0.6970 - val_loss: 0.5911
Epoch 12/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 57ms/step - accuracy: 0.8135 - loss: 0.4154 - val_accuracy: 0.7110 - val_loss: 0.5862
Epoch 13/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 52ms/step - accuracy: 0.8503 - loss: 0.3584 - val_accuracy: 0.7320 - val_loss: 0.6424
Epoch 14/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 83ms/step - accuracy: 0.8909 - loss: 0.2836 - val_accuracy: 0.7180 - val_loss: 0.7397
Epoch 15/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 89ms/step - accuracy: 0.8986 - loss: 0.2467 - val_accuracy: 0.7380 - val_loss: 0.6902
Epoch 16/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 60ms/step - accuracy: 0.9207 - loss: 0.2048 - val_accuracy: 0.7240 - val_loss: 0.8791
Epoch 17/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 58ms/step - accuracy: 0.9244 - loss: 0.1811 - val_accuracy: 0.6950 - val_loss: 1.2544
Epoch 18/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.9131 - loss: 0.2327 - val_accuracy: 0.7300 - val_loss: 0.9690
Epoch 19/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 89ms/step - accuracy: 0.9644 - loss: 0.0914 - val_accuracy: 0.7640 - val_loss: 1.1060
Epoch 20/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 79ms/step - accuracy: 0.9350 - loss: 0.1765 - val_accuracy: 0.7300 - val_loss: 1.2227
Epoch 21/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 55ms/step - accuracy: 0.9746 - loss: 0.0705 - val_accuracy: 0.7290 - val_loss: 1.2965
Epoch 22/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 104ms/step - accuracy: 0.9742 - loss: 0.0629 - val_accuracy: 0.7400 - val_loss: 1.3968
Epoch 23/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 97ms/step - accuracy: 0.9873 - loss: 0.0380 - val_accuracy: 0.7460 - val_loss: 1.4535
Epoch 24/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 93ms/step - accuracy: 0.9870 - loss: 0.0421 - val_accuracy: 0.7090 - val_loss: 1.4770
Epoch 25/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.9859 - loss: 0.0360 - val_accuracy: 0.7450 - val_loss: 1.6772
Epoch 26/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 52ms/step - accuracy: 0.9800 - loss: 0.0600 - val_accuracy: 0.7320 - val_loss: 1.7400
Epoch 27/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 91ms/step - accuracy: 0.9804 - loss: 0.0770 - val_accuracy: 0.7170 - val_loss: 2.2931
Epoch 28/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 52ms/step - accuracy: 0.9902 - loss: 0.0261 - val_accuracy: 0.7340 - val_loss: 2.0403
Epoch 29/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 53ms/step - accuracy: 0.9898 - loss: 0.0420 - val_accuracy: 0.7170 - val_loss: 1.9197
Epoch 30/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 97ms/step - accuracy: 0.9845 - loss: 0.0598 - val_accuracy: 0.7140 - val_loss: 2.0113

"""
Plot the loss and accuracy of the model over the training and validation data
"""
import matplotlib.pyplot as plt
accuracy = history.history["accuracy"]
val_accuracy = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(accuracy) + 1)
plt.plot(epochs, accuracy, "bo", label="Training accuracy")
plt.plot(epochs, val_accuracy, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.legend()
plt.figure()
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()
plt.show()
out[33]
Jupyter Notebook Image

<Figure size 640x480 with 1 Axes>

Jupyter Notebook Image

<Figure size 640x480 with 1 Axes>

The plots above are characteristic of overfitting. Because we have relatively few training samples (2,000), overfitting is the number one concern.

Overfitting is caused by having too few samples to learn from, rendering you unable to train a model that can generalize to new data. Data augmentation takes the approach of generating more training data from existing training samples by augmenting the samples via a number of random transformations that yield believable-looking images. In Keras, this can be done by adding a number of data augmentation layers at the start of the model.

Note that data augmentation does not produce new information - it just remixes existing information. As such, it might not completely get rid of overfitting.

curr_model = "/content/models/convert_from_scrath.keras"
test_model = keras.models.load_model(curr_model)
test_loss, test_acc = test_model.evaluate(test_dataset)
print(f"Test accuracy: {test_acc:.3f}")
out[35]

63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 46ms/step - accuracy: 0.7109 - loss: 0.5671
Test accuracy: 0.725

"""
Define a data augmentation stage to add to an image model
"""
data_augmentation = keras.Sequential(
 [
 layers.RandomFlip("horizontal"), # Applies horizontal flipping to a random 50% of the images that go through it
 layers.RandomRotation(0.1), # Rotates the input images by a random value in the range `[-10%, +10%]` (as a fraction of a full circle)
 layers.RandomZoom(0.2), # Zooms in or out of the image by a random factor in the range `[-20%, +20%]`
 ]
)
out[36]
plt.figure(figsize=(10, 10))
for images, _ in train_dataset.take(1): # take(N) to only sample N batches from the dataset
  for i in range(9):
    augmented_images = data_augmentation(images) # Apply the augmentation stage to the batch of images
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(augmented_images[0].numpy().astype("uint8")) # Display the first image in the output batch. For each of the nine iterations, this is a different augmentation of the same image
    plt.axis("off")
out[37]
Jupyter Notebook Image

<Figure size 1000x1000 with 9 Axes>

"""
Defining a new convnet that includes image augmentation and dropout
"""
inputs = keras.Input(shape=(180, 180, 3))
x = data_augmentation(inputs)
x = layers.Rescaling(1./255)(x)
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(loss="binary_crossentropy",
 optimizer="rmsprop",
 metrics=["accuracy"])
"""
Training the regularized convnet
"""
callbacks = [
 keras.callbacks.ModelCheckpoint(
 filepath="convnet_from_scratch_with_augmentation.keras",
 save_best_only=True,
 monitor="val_loss")
]
history = model.fit(
 train_dataset,
 epochs=100,
 validation_data=validation_dataset,
 callbacks=callbacks)
"""
Plotting the results again
"""
import matplotlib.pyplot as plt
accuracy = history.history["accuracy"]
val_accuracy = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(accuracy) + 1)
plt.plot(epochs, accuracy, "bo", label="Training accuracy")
plt.plot(epochs, val_accuracy, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.legend()
plt.figure()
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()
plt.show()
out[38]

Epoch 1/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 10s 92ms/step - accuracy: 0.4687 - loss: 0.7620 - val_accuracy: 0.6270 - val_loss: 0.6924
Epoch 2/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 62ms/step - accuracy: 0.5302 - loss: 0.6931 - val_accuracy: 0.5800 - val_loss: 0.6910
Epoch 3/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 94ms/step - accuracy: 0.5248 - loss: 0.6935 - val_accuracy: 0.5110 - val_loss: 0.6880
Epoch 4/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 97ms/step - accuracy: 0.5317 - loss: 0.6911 - val_accuracy: 0.5330 - val_loss: 0.6831
Epoch 5/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 9s 69ms/step - accuracy: 0.5653 - loss: 0.6790 - val_accuracy: 0.6200 - val_loss: 0.6558
Epoch 6/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 84ms/step - accuracy: 0.6191 - loss: 0.6662 - val_accuracy: 0.6520 - val_loss: 0.6278
Epoch 7/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 95ms/step - accuracy: 0.6278 - loss: 0.6402 - val_accuracy: 0.6370 - val_loss: 0.6357
Epoch 8/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 56ms/step - accuracy: 0.6260 - loss: 0.6361 - val_accuracy: 0.6180 - val_loss: 0.6981
Epoch 9/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 84ms/step - accuracy: 0.6663 - loss: 0.6197 - val_accuracy: 0.6630 - val_loss: 0.5977
Epoch 10/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 94ms/step - accuracy: 0.6705 - loss: 0.6104 - val_accuracy: 0.6680 - val_loss: 0.5877
Epoch 11/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 60ms/step - accuracy: 0.6738 - loss: 0.5980 - val_accuracy: 0.7070 - val_loss: 0.5623
Epoch 12/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 61ms/step - accuracy: 0.6894 - loss: 0.5899 - val_accuracy: 0.6670 - val_loss: 0.6051
Epoch 13/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 104ms/step - accuracy: 0.6936 - loss: 0.5927 - val_accuracy: 0.7130 - val_loss: 0.5437
Epoch 14/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - accuracy: 0.6834 - loss: 0.5848 - val_accuracy: 0.6800 - val_loss: 0.6271
Epoch 15/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 60ms/step - accuracy: 0.7156 - loss: 0.5700 - val_accuracy: 0.6950 - val_loss: 0.5689
Epoch 16/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 55ms/step - accuracy: 0.7313 - loss: 0.5531 - val_accuracy: 0.7320 - val_loss: 0.5229
Epoch 17/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 72ms/step - accuracy: 0.7207 - loss: 0.5635 - val_accuracy: 0.6780 - val_loss: 0.6705
Epoch 18/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 108ms/step - accuracy: 0.7330 - loss: 0.5436 - val_accuracy: 0.7350 - val_loss: 0.5053
Epoch 19/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 59ms/step - accuracy: 0.7525 - loss: 0.5220 - val_accuracy: 0.6950 - val_loss: 0.5866
Epoch 20/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.7520 - loss: 0.5137 - val_accuracy: 0.6920 - val_loss: 0.6136
Epoch 21/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 55ms/step - accuracy: 0.7377 - loss: 0.5123 - val_accuracy: 0.6870 - val_loss: 0.5851
Epoch 22/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 94ms/step - accuracy: 0.7507 - loss: 0.5117 - val_accuracy: 0.7810 - val_loss: 0.4773
Epoch 23/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 56ms/step - accuracy: 0.7698 - loss: 0.4989 - val_accuracy: 0.7760 - val_loss: 0.4564
Epoch 24/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 75ms/step - accuracy: 0.7615 - loss: 0.4849 - val_accuracy: 0.7660 - val_loss: 0.4816
Epoch 25/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 95ms/step - accuracy: 0.7934 - loss: 0.4507 - val_accuracy: 0.7720 - val_loss: 0.5284
Epoch 26/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 55ms/step - accuracy: 0.7726 - loss: 0.5096 - val_accuracy: 0.7800 - val_loss: 0.4759
Epoch 27/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 62ms/step - accuracy: 0.7887 - loss: 0.4609 - val_accuracy: 0.7940 - val_loss: 0.4678
Epoch 28/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 107ms/step - accuracy: 0.8008 - loss: 0.4404 - val_accuracy: 0.8050 - val_loss: 0.4672
Epoch 29/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 76ms/step - accuracy: 0.7965 - loss: 0.4345 - val_accuracy: 0.7810 - val_loss: 0.4727
Epoch 30/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 62ms/step - accuracy: 0.7926 - loss: 0.4442 - val_accuracy: 0.8040 - val_loss: 0.4232
Epoch 31/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 64ms/step - accuracy: 0.8213 - loss: 0.4288 - val_accuracy: 0.7650 - val_loss: 0.5629
Epoch 32/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 108ms/step - accuracy: 0.8048 - loss: 0.4345 - val_accuracy: 0.7740 - val_loss: 0.4736
Epoch 33/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 74ms/step - accuracy: 0.7895 - loss: 0.4331 - val_accuracy: 0.7810 - val_loss: 0.5175
Epoch 34/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 55ms/step - accuracy: 0.8428 - loss: 0.3809 - val_accuracy: 0.7930 - val_loss: 0.5304
Epoch 35/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 84ms/step - accuracy: 0.8139 - loss: 0.4275 - val_accuracy: 0.8050 - val_loss: 0.4250
Epoch 36/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 12s 109ms/step - accuracy: 0.8274 - loss: 0.3880 - val_accuracy: 0.7880 - val_loss: 0.5299
Epoch 37/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 10s 109ms/step - accuracy: 0.8401 - loss: 0.3860 - val_accuracy: 0.8170 - val_loss: 0.4034
Epoch 38/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 75ms/step - accuracy: 0.8321 - loss: 0.3741 - val_accuracy: 0.7160 - val_loss: 0.6293
Epoch 39/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 54ms/step - accuracy: 0.8372 - loss: 0.3921 - val_accuracy: 0.7090 - val_loss: 0.6756
Epoch 40/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 55ms/step - accuracy: 0.8479 - loss: 0.3633 - val_accuracy: 0.8250 - val_loss: 0.4724
Epoch 41/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 76ms/step - accuracy: 0.8340 - loss: 0.3505 - val_accuracy: 0.7410 - val_loss: 0.6809
Epoch 42/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 101ms/step - accuracy: 0.8428 - loss: 0.3705 - val_accuracy: 0.8140 - val_loss: 0.4462
Epoch 43/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.8451 - loss: 0.3382 - val_accuracy: 0.7620 - val_loss: 0.6549
Epoch 44/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 62ms/step - accuracy: 0.8381 - loss: 0.3596 - val_accuracy: 0.7940 - val_loss: 0.4908
Epoch 45/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.8572 - loss: 0.3370 - val_accuracy: 0.8060 - val_loss: 0.4815
Epoch 46/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 93ms/step - accuracy: 0.8487 - loss: 0.3323 - val_accuracy: 0.8150 - val_loss: 0.4854
Epoch 47/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 55ms/step - accuracy: 0.8585 - loss: 0.3395 - val_accuracy: 0.8110 - val_loss: 0.4829
Epoch 48/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.8697 - loss: 0.3586 - val_accuracy: 0.8410 - val_loss: 0.4034
Epoch 49/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 89ms/step - accuracy: 0.8720 - loss: 0.2998 - val_accuracy: 0.8050 - val_loss: 0.4507
Epoch 50/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 55ms/step - accuracy: 0.8664 - loss: 0.3075 - val_accuracy: 0.7430 - val_loss: 1.2748
Epoch 51/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 55ms/step - accuracy: 0.8774 - loss: 0.3364 - val_accuracy: 0.7850 - val_loss: 0.5878
Epoch 52/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 73ms/step - accuracy: 0.8524 - loss: 0.3248 - val_accuracy: 0.7760 - val_loss: 0.6667
Epoch 53/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 108ms/step - accuracy: 0.8688 - loss: 0.3217 - val_accuracy: 0.8120 - val_loss: 0.4562
Epoch 54/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 56ms/step - accuracy: 0.8882 - loss: 0.2792 - val_accuracy: 0.8190 - val_loss: 0.4966
Epoch 55/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 57ms/step - accuracy: 0.8868 - loss: 0.2806 - val_accuracy: 0.8260 - val_loss: 0.4639
Epoch 56/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 94ms/step - accuracy: 0.8951 - loss: 0.2660 - val_accuracy: 0.8560 - val_loss: 0.4327
Epoch 57/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 55ms/step - accuracy: 0.8811 - loss: 0.3163 - val_accuracy: 0.8030 - val_loss: 0.5789
Epoch 58/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.8862 - loss: 0.2582 - val_accuracy: 0.8270 - val_loss: 0.5020
Epoch 59/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 106ms/step - accuracy: 0.9051 - loss: 0.2412 - val_accuracy: 0.8440 - val_loss: 0.4804
Epoch 60/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 85ms/step - accuracy: 0.9000 - loss: 0.2305 - val_accuracy: 0.8110 - val_loss: 0.4972
Epoch 61/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 9s 62ms/step - accuracy: 0.9034 - loss: 0.2425 - val_accuracy: 0.8000 - val_loss: 0.5581
Epoch 62/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 96ms/step - accuracy: 0.8876 - loss: 0.2765 - val_accuracy: 0.8120 - val_loss: 0.6195
Epoch 63/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 80ms/step - accuracy: 0.9082 - loss: 0.2522 - val_accuracy: 0.8350 - val_loss: 0.4187
Epoch 64/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 55ms/step - accuracy: 0.9073 - loss: 0.2280 - val_accuracy: 0.7340 - val_loss: 1.3465
Epoch 65/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 56ms/step - accuracy: 0.9073 - loss: 0.2654 - val_accuracy: 0.8420 - val_loss: 0.4577
Epoch 66/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 67ms/step - accuracy: 0.9194 - loss: 0.2186 - val_accuracy: 0.8070 - val_loss: 0.6530
Epoch 67/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 99ms/step - accuracy: 0.9102 - loss: 0.2170 - val_accuracy: 0.8260 - val_loss: 0.5703
Epoch 68/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 10s 89ms/step - accuracy: 0.8877 - loss: 0.2623 - val_accuracy: 0.8530 - val_loss: 0.4752
Epoch 69/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 11s 96ms/step - accuracy: 0.9114 - loss: 0.2264 - val_accuracy: 0.8370 - val_loss: 0.4746
Epoch 70/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 84ms/step - accuracy: 0.9186 - loss: 0.1939 - val_accuracy: 0.7830 - val_loss: 0.9465
Epoch 71/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 55ms/step - accuracy: 0.9161 - loss: 0.2612 - val_accuracy: 0.7740 - val_loss: 0.8024
Epoch 72/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.8990 - loss: 0.2632 - val_accuracy: 0.8390 - val_loss: 0.5292
Epoch 73/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 67ms/step - accuracy: 0.9107 - loss: 0.2217 - val_accuracy: 0.8460 - val_loss: 0.4928
Epoch 74/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 106ms/step - accuracy: 0.9231 - loss: 0.2097 - val_accuracy: 0.8390 - val_loss: 0.5487
Epoch 75/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 70ms/step - accuracy: 0.9272 - loss: 0.2073 - val_accuracy: 0.8000 - val_loss: 0.6881
Epoch 76/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 55ms/step - accuracy: 0.9279 - loss: 0.2168 - val_accuracy: 0.8160 - val_loss: 0.5814
Epoch 77/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 55ms/step - accuracy: 0.9090 - loss: 0.2252 - val_accuracy: 0.8430 - val_loss: 0.5889
Epoch 78/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 95ms/step - accuracy: 0.9221 - loss: 0.2060 - val_accuracy: 0.8570 - val_loss: 0.4378
Epoch 79/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 5s 74ms/step - accuracy: 0.9248 - loss: 0.1756 - val_accuracy: 0.8220 - val_loss: 0.5584
Epoch 80/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 55ms/step - accuracy: 0.9302 - loss: 0.2041 - val_accuracy: 0.8270 - val_loss: 0.5261
Epoch 81/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 71ms/step - accuracy: 0.9354 - loss: 0.1802 - val_accuracy: 0.8210 - val_loss: 0.6228
Epoch 82/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 109ms/step - accuracy: 0.9308 - loss: 0.2022 - val_accuracy: 0.8390 - val_loss: 0.5088
Epoch 83/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 62ms/step - accuracy: 0.9267 - loss: 0.1831 - val_accuracy: 0.8550 - val_loss: 0.4234
Epoch 84/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.9327 - loss: 0.1859 - val_accuracy: 0.7980 - val_loss: 0.9202
Epoch 85/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 107ms/step - accuracy: 0.9237 - loss: 0.2413 - val_accuracy: 0.8490 - val_loss: 0.5851
Epoch 86/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 56ms/step - accuracy: 0.9343 - loss: 0.1706 - val_accuracy: 0.8600 - val_loss: 0.5009
Epoch 87/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.9392 - loss: 0.1608 - val_accuracy: 0.8450 - val_loss: 0.5331
Epoch 88/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 94ms/step - accuracy: 0.9332 - loss: 0.1990 - val_accuracy: 0.8240 - val_loss: 0.5896
Epoch 89/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 56ms/step - accuracy: 0.9358 - loss: 0.1932 - val_accuracy: 0.8500 - val_loss: 0.4994
Epoch 90/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 56ms/step - accuracy: 0.9262 - loss: 0.1759 - val_accuracy: 0.8630 - val_loss: 0.4711
Epoch 91/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 94ms/step - accuracy: 0.9414 - loss: 0.1457 - val_accuracy: 0.8250 - val_loss: 0.6283
Epoch 92/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 55ms/step - accuracy: 0.9350 - loss: 0.1620 - val_accuracy: 0.8410 - val_loss: 0.7176
Epoch 93/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.9115 - loss: 0.2907 - val_accuracy: 0.8410 - val_loss: 0.4362
Epoch 94/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 102ms/step - accuracy: 0.9390 - loss: 0.1601 - val_accuracy: 0.7890 - val_loss: 1.0193
Epoch 95/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 87ms/step - accuracy: 0.9377 - loss: 0.1916 - val_accuracy: 0.8590 - val_loss: 0.5984
Epoch 96/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 8s 55ms/step - accuracy: 0.9416 - loss: 0.1627 - val_accuracy: 0.8530 - val_loss: 0.6074
Epoch 97/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 95ms/step - accuracy: 0.9456 - loss: 0.1484 - val_accuracy: 0.8470 - val_loss: 0.5261
Epoch 98/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 6s 87ms/step - accuracy: 0.9411 - loss: 0.1606 - val_accuracy: 0.8070 - val_loss: 0.8797
Epoch 99/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 9s 61ms/step - accuracy: 0.9289 - loss: 0.2210 - val_accuracy: 0.8310 - val_loss: 0.9120
Epoch 100/100
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 91ms/step - accuracy: 0.9358 - loss: 0.2214 - val_accuracy: 0.8530 - val_loss: 0.6586

Jupyter Notebook Image

<Figure size 640x480 with 1 Axes>

Jupyter Notebook Image

<Figure size 640x480 with 1 Axes>

"""
Evaluating the model on the test set
"""
test_model = keras.models.load_model(
 "convnet_from_scratch_with_augmentation.keras")
test_loss, test_acc = test_model.evaluate(test_dataset)
print(f"Test accuracy: {test_acc:.3f}")
out[39]

Leveraging a pretrained model

A common and highly effective approach to deep learning on small image datasets is to use a pretrained model. A pretrained model is a model that was previously trained on a large dataset, typically on a large-scale image-classification task.

If this original dataset is large enough and general enough, the spatial hierarchy of features learned by the pretrained model can effectively act as a generic model of the visual world, and hence, its features can prove useful for many different computer vision problems, even though these new problems may involve completely different classes than those of the original task.

There are two ways to use a pretrained model: feature extraction and fine-tuning.

Feature extraction consists of using the representations learned by a previously trained model to extract interesting features from new samples. These features are then run through a new classifier, which is trained from scratch. Convnets for image classification typically consist of a series of convolutional and pooling layers followed by a densely connected classifier. The first part is called the convolutional base of the model. In the case of convnets, feature extraction consists of taking the convolutional base of a previously trained network, running the new data through it, and training a new classifier on top of the output.

Representations learned by the convolutional base are likely to be more generic and therefore more reusable: the feature maps of a convnet are presence maps of generic concepts over a picture, which are likely to be useful regardless of the computer vision problem at hand.

Swapping Classifiers While Keeping the Same Convolutional Base

The VGG16 model, the pretrained model that we will use, comes prepackaged with Keras. You can import it from the keras.applications module.

"""
Instantiating the VCG16 convolutional base
"""
conv_base = keras.applications.vgg16.VGG16(
 weights="imagenet", # specifies the weight checkpoint from which to initialize the model
 include_top=False, # refers to including (or not) the densly connected classifier on top of the network
 input_shape=(180, 180, 3)) # Shape of the image tensors taht we feed to the network - optional argument
out[41]

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
58889256/58889256 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step

print(conv_base.summary())
out[42]

Model: "vgg16"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓

┃ Layer (type)  ┃ Output Shape  ┃  Param # ┃

┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩

│ input_layer_7 (InputLayer) │ (None, 180, 180, 3) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block1_conv1 (Conv2D) │ (None, 180, 180, 64) │ 1,792 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block1_conv2 (Conv2D) │ (None, 180, 180, 64) │ 36,928 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block1_pool (MaxPooling2D) │ (None, 90, 90, 64) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block2_conv1 (Conv2D) │ (None, 90, 90, 128) │ 73,856 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block2_conv2 (Conv2D) │ (None, 90, 90, 128) │ 147,584 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block2_pool (MaxPooling2D) │ (None, 45, 45, 128) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block3_conv1 (Conv2D) │ (None, 45, 45, 256) │ 295,168 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block3_conv2 (Conv2D) │ (None, 45, 45, 256) │ 590,080 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block3_conv3 (Conv2D) │ (None, 45, 45, 256) │ 590,080 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block3_pool (MaxPooling2D) │ (None, 22, 22, 256) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block4_conv1 (Conv2D) │ (None, 22, 22, 512) │ 1,180,160 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block4_conv2 (Conv2D) │ (None, 22, 22, 512) │ 2,359,808 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block4_conv3 (Conv2D) │ (None, 22, 22, 512) │ 2,359,808 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block4_pool (MaxPooling2D) │ (None, 11, 11, 512) │ 0 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block5_conv1 (Conv2D) │ (None, 11, 11, 512) │ 2,359,808 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block5_conv2 (Conv2D) │ (None, 11, 11, 512) │ 2,359,808 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block5_conv3 (Conv2D) │ (None, 11, 11, 512) │ 2,359,808 │

├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤

│ block5_pool (MaxPooling2D) │ (None, 5, 5, 512) │ 0 │

└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘

 Total params: 14,714,688 (56.13 MB)

 Trainable params: 14,714,688 (56.13 MB)

 Non-trainable params: 0 (0.00 B)

None
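
As a quick sanity check on the summary above, the parameter count of the first convolution layer can be reproduced by hand: block1_conv1 learns 64 filters of size 3 x 3 over 3 input channels, plus one bias per filter. The arithmetic below is just an illustrative sketch, not part of the original listing.

"""
Sketch: reproducing the block1_conv1 parameter count
"""
kernel_params = 3 * 3 * 3 * 64   # filter height * filter width * input channels * number of filters
bias_params = 64                 # one bias per filter
print(kernel_params + bias_params)  # 1792, matching the Param # column above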

Two ways to proceed:

  1. Run the convolutional base over our dataset, record its output to a NumPy array on disk, and then use this data as input to a standalone, densely connected classifier similar to those you saw in chapter 4 of this book. This solution is fast and cheap to run, because it only requires running the convolutional base once for every input image, and the convolutional base is by far the most expensive part of the pipeline. But for the same reason, this technique won't allow us to use data augmentation.
  2. Extend the model we have (conv_base) by adding Dense layers on top, and run the whole thing from end to end on the input data. This will allow us to use data augmentation, because every input image goes through the convolutional base every time it's seen by the model. But for the same reason, this technique is far more expensive than the first.
"""
Exracting the CG16 features and corresponding labels
"""
import numpy as np
def get_features_and_labels(dataset):
 all_features = []
 all_labels = []
 for images, labels in dataset:
  preprocessed_images = keras.applications.vgg16.preprocess_input(images)
  features = conv_base.predict(preprocessed_images)
  all_features.append(features)
  all_labels.append(labels)
 return np.concatenate(all_features), np.concatenate(all_labels)

train_features, train_labels = get_features_and_labels(train_dataset)
val_features, val_labels = get_features_and_labels(validation_dataset)
test_features, test_labels = get_features_and_labels(test_dataset)
out[44]

(158 lines of `1/1 ━━━━━━━━━━━━━━━━━━━━ ... /step` predict() progress output - one line per batch of the train, validation, and test datasets - omitted)

print(train_features.shape)
out[45]

(2000, 5, 5, 512)
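
The first dimension is the 2,000 training images; 5 x 5 x 512 is the output shape of VGG16's block5_pool layer for 180 x 180 inputs. As a small illustrative sketch (not part of the original notebook), the 5 x 5 spatial size follows from the five 2 x 2 max-pooling stages, each of which halves the spatial dimensions (with flooring):

"""
Sketch: how the 5 x 5 spatial size arises from five pooling stages
"""
size = 180
for _ in range(5):
    size //= 2  # 180 -> 90 -> 45 -> 22 -> 11 -> 5
print(size)  # 5, matching the (2000, 5, 5, 512) feature shape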

"""
Defining and traing the densely connetced classifier
"""
inputs = keras.Input(shape=(5, 5, 512))
# Note the use of the Flatten layer before passing the features to a Dense layer
x = layers.Flatten()(inputs)
x = layers.Dense(256)(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)

model.compile(loss="binary_crossentropy",
 optimizer="rmsprop",
 metrics=["accuracy"])
callbacks = [
 keras.callbacks.ModelCheckpoint(
 filepath="feature_extraction.keras",
 save_best_only=True,
 monitor="val_loss")
]
history = model.fit(
 train_features, train_labels,
 epochs=20,
 validation_data=(val_features, val_labels),
 callbacks=callbacks)

# Note that training is very fast because we only have to deal with two Dense layers - an epoch takes less than one second even on CPU

"""
Plotting the results
"""
import matplotlib.pyplot as plt
acc = history.history["accuracy"]
val_acc = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(acc) + 1)
plt.plot(epochs, acc, "bo", label="Training accuracy")
plt.plot(epochs, val_acc, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.legend()
plt.figure()
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()
plt.show()
out[46]

Epoch 1/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 4s 34ms/step - accuracy: 0.8593 - loss: 33.5145 - val_accuracy: 0.9610 - val_loss: 5.8598
Epoch 2/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - accuracy: 0.9807 - loss: 2.9287 - val_accuracy: 0.9740 - val_loss: 4.1440
Epoch 3/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9753 - loss: 5.3033 - val_accuracy: 0.9320 - val_loss: 16.8070
Epoch 4/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.9872 - loss: 1.5244 - val_accuracy: 0.9760 - val_loss: 4.0692
Epoch 5/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9914 - loss: 1.4569 - val_accuracy: 0.9770 - val_loss: 5.5388
Epoch 6/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9959 - loss: 0.5370 - val_accuracy: 0.9780 - val_loss: 5.1549
Epoch 7/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9978 - loss: 0.4119 - val_accuracy: 0.9700 - val_loss: 5.8892
Epoch 8/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9956 - loss: 1.0025 - val_accuracy: 0.9770 - val_loss: 5.8722
Epoch 9/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9980 - loss: 0.1450 - val_accuracy: 0.9770 - val_loss: 5.4295
Epoch 10/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9992 - loss: 0.0216 - val_accuracy: 0.9780 - val_loss: 5.2768
Epoch 11/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9992 - loss: 0.0426 - val_accuracy: 0.9800 - val_loss: 4.7593
Epoch 12/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 1.0000 - loss: 0.0012 - val_accuracy: 0.9810 - val_loss: 4.4106
Epoch 13/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9976 - loss: 0.2752 - val_accuracy: 0.9720 - val_loss: 5.8832
Epoch 14/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9967 - loss: 0.2898 - val_accuracy: 0.9760 - val_loss: 5.1115
Epoch 15/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.9993 - loss: 0.0432 - val_accuracy: 0.9760 - val_loss: 4.8862
Epoch 16/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9994 - loss: 0.0116 - val_accuracy: 0.9810 - val_loss: 4.4429
Epoch 17/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 1.0000 - loss: 7.8161e-14 - val_accuracy: 0.9810 - val_loss: 4.4429
Epoch 18/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 1.0000 - loss: 1.8860e-21 - val_accuracy: 0.9810 - val_loss: 4.4429
Epoch 19/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.9999 - loss: 0.0090 - val_accuracy: 0.9720 - val_loss: 5.5274
Epoch 20/20
63/63 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.9977 - loss: 0.0069 - val_accuracy: 0.9780 - val_loss: 4.3776

Jupyter Notebook Image

<Figure size 640x480 with 1 Axes>

Jupyter Notebook Image

<Figure size 640x480 with 1 Axes>

The technique above gets us to a validation accuracy of about 97% - much better than the small convnet we trained from scratch. However, the plots indicate that we're overfitting from the start - that's because this technique doesn't use data augmentation, which is essential for preventing overfitting with small image datasets.

Feature Extraction Together with Data Augmentation

We now create a model that chains the conv_base with a new dense classifier and train it end to end on the inputs. To do this, we first freeze the convolutional base. Freezing a layer or set of layers means preventing their weights from being updated during training. If we don't do this, the representations previously learned by the convolutional base will be modified during training, defeating the point of using a pretrained model.

Note that feature extraction together with data augmentation is much more expensive to run than plain feature extraction, because every input image passes through the convolutional base on every epoch; it is generally intractable on CPU and effectively requires a GPU.
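
A quick way to confirm whether TensorFlow can see a GPU before launching this kind of run is shown below (a minimal sketch, not part of the original notebook):

"""
Sketch: checking for an available GPU
"""
import tensorflow as tf
print(tf.config.list_physical_devices("GPU"))  # an empty list means training would fall back to CPU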

"""
Instantianting and freezing the VCG16 cnvolutional base
"""
conv_base = keras.applications.vgg16.VGG16(weights="imagenet", include_top=False)
# Setting trainable to `False` empties the list of trainable weights ofthe layer or model
conv_base.trainable = False
out[48]
"""
Printing the list of trainable weights before and after freezing
"""
conv_base.trainable = True
print("This is the number of trainable weights before freezing the conv base:", len(conv_base.trainable_weights))
conv_base.trainable = False
print("This is the number of trainable weights after freezing the conv base:", len(conv_base.trainable_weights))
out[49]

This is the number of trainable weights before freezing the conv base: 26
This is the number of trainable weights after freezing the conv base: 0
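
The 26 comes from VGG16's 13 Conv2D layers, each owning a kernel and a bias (the input and pooling layers have no weights). A minimal sketch to verify this, assuming conv_base is the VGG16 base instantiated above:

"""
Sketch: where the 26 trainable weight variables come from
"""
from tensorflow.keras import layers
n_conv = sum(isinstance(layer, layers.Conv2D) for layer in conv_base.layers)
print(n_conv, 2 * n_conv)  # 13 Conv2D layers, each with a kernel and a bias -> 26 weight variables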

"""
Adding a data augmentation stage and a classifier to a convolutional base
"""
data_augmentation = keras.Sequential(
 [
 layers.RandomFlip("horizontal"),
 layers.RandomRotation(0.1),
 layers.RandomZoom(0.2),
 ]
)
inputs = keras.Input(shape=(180, 180, 3))
x = data_augmentation(inputs) # Apply data augmentation
x = keras.applications.vgg16.preprocess_input(x) # Apply input value scaling
x = conv_base(x)
x = layers.Flatten()(x)
x = layers.Dense(256)(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
model.compile(loss="binary_crossentropy",
 optimizer="rmsprop",
 metrics=["accuracy"])

callbacks = [
 keras.callbacks.ModelCheckpoint(
 filepath="feature_extraction_with_data_augmentation.keras",
 save_best_only=True,
 monitor="val_loss")
]

history = model.fit(
 train_dataset,
 epochs=50,
 validation_data=validation_dataset,
 callbacks=callbacks)

import matplotlib.pyplot as plt
acc = history.history["accuracy"]
val_acc = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(acc) + 1)
plt.plot(epochs, acc, "bo", label="Training accuracy")
plt.plot(epochs, val_acc, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.legend()
plt.figure()
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()
plt.show()

test_model = keras.models.load_model(
 "feature_extraction_with_data_augmentation.keras")
test_loss, test_acc = test_model.evaluate(test_dataset)
print(f"Test accuracy: {test_acc:.3f}")
out[50]

Epoch 1/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 19s 233ms/step - accuracy: 0.8068 - loss: 45.1071 - val_accuracy: 0.9750 - val_loss: 2.6951
Epoch 2/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 12s 194ms/step - accuracy: 0.9557 - loss: 4.5008 - val_accuracy: 0.9710 - val_loss: 3.9373
Epoch 3/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 19s 165ms/step - accuracy: 0.9528 - loss: 4.8248 - val_accuracy: 0.9760 - val_loss: 3.2888
Epoch 4/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 22s 193ms/step - accuracy: 0.9640 - loss: 3.8937 - val_accuracy: 0.9660 - val_loss: 6.0385
Epoch 5/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 12s 193ms/step - accuracy: 0.9484 - loss: 6.8233 - val_accuracy: 0.9640 - val_loss: 5.6102
Epoch 6/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 20s 192ms/step - accuracy: 0.9601 - loss: 5.1158 - val_accuracy: 0.9730 - val_loss: 3.7353
Epoch 7/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 10s 165ms/step - accuracy: 0.9722 - loss: 3.4475 - val_accuracy: 0.9790 - val_loss: 3.2312
Epoch 8/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 22s 192ms/step - accuracy: 0.9746 - loss: 2.0627 - val_accuracy: 0.9780 - val_loss: 2.7985
Epoch 9/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 19s 166ms/step - accuracy: 0.9773 - loss: 2.0710 - val_accuracy: 0.9760 - val_loss: 3.4787
Epoch 10/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 22s 196ms/step - accuracy: 0.9862 - loss: 1.5714 - val_accuracy: 0.9640 - val_loss: 7.2224
Epoch 11/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 10s 165ms/step - accuracy: 0.9727 - loss: 2.1822 - val_accuracy: 0.9730 - val_loss: 4.6338
Epoch 12/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 11s 169ms/step - accuracy: 0.9732 - loss: 2.0730 - val_accuracy: 0.9680 - val_loss: 4.6297
Epoch 13/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 20s 164ms/step - accuracy: 0.9802 - loss: 1.3970 - val_accuracy: 0.9760 - val_loss: 3.3570
Epoch 14/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 10s 164ms/step - accuracy: 0.9775 - loss: 2.6104 - val_accuracy: 0.9720 - val_loss: 3.5636
Epoch 15/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 12s 195ms/step - accuracy: 0.9830 - loss: 1.2946 - val_accuracy: 0.9530 - val_loss: 8.9430
Epoch 16/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 18s 162ms/step - accuracy: 0.9765 - loss: 1.8776 - val_accuracy: 0.9790 - val_loss: 2.7998
Epoch 17/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 23s 206ms/step - accuracy: 0.9801 - loss: 1.3697 - val_accuracy: 0.9830 - val_loss: 2.3681
Epoch 18/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 18s 167ms/step - accuracy: 0.9877 - loss: 0.8332 - val_accuracy: 0.9780 - val_loss: 3.3741
Epoch 19/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 21s 176ms/step - accuracy: 0.9687 - loss: 2.4666 - val_accuracy: 0.9800 - val_loss: 2.2856
Epoch 20/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 10s 164ms/step - accuracy: 0.9762 - loss: 1.6879 - val_accuracy: 0.9830 - val_loss: 2.3138
Epoch 21/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 20s 164ms/step - accuracy: 0.9853 - loss: 1.2945 - val_accuracy: 0.9830 - val_loss: 2.3295
Epoch 22/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 22s 194ms/step - accuracy: 0.9818 - loss: 1.3945 - val_accuracy: 0.9700 - val_loss: 2.9231
Epoch 23/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 19s 169ms/step - accuracy: 0.9884 - loss: 0.5049 - val_accuracy: 0.9550 - val_loss: 6.8600
Epoch 24/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 11s 177ms/step - accuracy: 0.9728 - loss: 2.1937 - val_accuracy: 0.9760 - val_loss: 2.0076
Epoch 25/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 13s 204ms/step - accuracy: 0.9788 - loss: 1.0911 - val_accuracy: 0.9800 - val_loss: 1.8782
Epoch 26/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 18s 164ms/step - accuracy: 0.9816 - loss: 0.9337 - val_accuracy: 0.9800 - val_loss: 1.9663
Epoch 27/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 20s 165ms/step - accuracy: 0.9856 - loss: 1.0523 - val_accuracy: 0.9690 - val_loss: 2.8967
Epoch 28/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 20s 165ms/step - accuracy: 0.9831 - loss: 1.1668 - val_accuracy: 0.9780 - val_loss: 2.3538
Epoch 29/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 22s 193ms/step - accuracy: 0.9869 - loss: 0.6647 - val_accuracy: 0.9670 - val_loss: 3.5220
Epoch 30/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 10s 164ms/step - accuracy: 0.9897 - loss: 0.6342 - val_accuracy: 0.9810 - val_loss: 1.9567
Epoch 31/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 13s 204ms/step - accuracy: 0.9874 - loss: 0.8583 - val_accuracy: 0.9850 - val_loss: 1.2944
Epoch 32/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 20s 194ms/step - accuracy: 0.9900 - loss: 0.6442 - val_accuracy: 0.9850 - val_loss: 1.5957
Epoch 33/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 12s 192ms/step - accuracy: 0.9899 - loss: 0.7120 - val_accuracy: 0.9830 - val_loss: 1.6378
Epoch 34/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 21s 194ms/step - accuracy: 0.9818 - loss: 1.0647 - val_accuracy: 0.9820 - val_loss: 1.7402
Epoch 35/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 12s 193ms/step - accuracy: 0.9848 - loss: 0.7367 - val_accuracy: 0.9820 - val_loss: 1.9459
Epoch 36/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 19s 164ms/step - accuracy: 0.9879 - loss: 0.6302 - val_accuracy: 0.9790 - val_loss: 1.7937
Epoch 37/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 21s 166ms/step - accuracy: 0.9862 - loss: 0.6283 - val_accuracy: 0.9820 - val_loss: 1.9308
Epoch 38/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 21s 168ms/step - accuracy: 0.9931 - loss: 0.5184 - val_accuracy: 0.9800 - val_loss: 1.6820
Epoch 39/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 12s 194ms/step - accuracy: 0.9811 - loss: 0.7753 - val_accuracy: 0.9790 - val_loss: 1.7990
Epoch 40/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 20s 192ms/step - accuracy: 0.9916 - loss: 0.4037 - val_accuracy: 0.9820 - val_loss: 1.6276
Epoch 41/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 19s 166ms/step - accuracy: 0.9915 - loss: 0.7177 - val_accuracy: 0.9780 - val_loss: 1.9409
Epoch 42/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 21s 168ms/step - accuracy: 0.9828 - loss: 0.7589 - val_accuracy: 0.9810 - val_loss: 1.8210
Epoch 43/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 20s 168ms/step - accuracy: 0.9848 - loss: 0.8378 - val_accuracy: 0.9790 - val_loss: 1.5759
Epoch 44/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 22s 196ms/step - accuracy: 0.9929 - loss: 0.3317 - val_accuracy: 0.9710 - val_loss: 3.1422
Epoch 45/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 12s 193ms/step - accuracy: 0.9812 - loss: 0.7095 - val_accuracy: 0.9790 - val_loss: 1.7247
Epoch 46/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 21s 201ms/step - accuracy: 0.9821 - loss: 0.9309 - val_accuracy: 0.9830 - val_loss: 1.2739
Epoch 47/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 11s 175ms/step - accuracy: 0.9903 - loss: 0.2938 - val_accuracy: 0.9810 - val_loss: 1.1891
Epoch 48/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 11s 168ms/step - accuracy: 0.9921 - loss: 0.3357 - val_accuracy: 0.9830 - val_loss: 1.3706
Epoch 49/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 20s 165ms/step - accuracy: 0.9867 - loss: 0.4941 - val_accuracy: 0.9730 - val_loss: 2.0314
Epoch 50/50
63/63 ━━━━━━━━━━━━━━━━━━━━ 12s 194ms/step - accuracy: 0.9940 - loss: 0.2132 - val_accuracy: 0.9790 - val_loss: 1.6718

Jupyter Notebook Image

<Figure size 640x480 with 1 Axes>

Jupyter Notebook Image

<Figure size 640x480 with 1 Axes>

63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 107ms/step - accuracy: 0.9780 - loss: 1.9747
Test accuracy: 0.980

Another widely used technique for model reuse, complementary to feature extraction, is fine-tuning. Fine-tuning consists of unfreezing a few of the top layers of the frozen model base used for feature extraction and jointly training both the newly added part of the model and these top layers. This is called fine-tuning because it slightly adjusts the more abstract representations of the model being reused in order to make them more relevant for the problem at hand.

Fine-Tuning the Last Convolutional Block of VGG16

Steps for fine-tuning a network:

  1. Add your custom network on top of an already-trained base network.
  2. Freeze the base network.
  3. Train the part we added.
  4. Unfreeze some layers in the base network.
  5. Jointly train both these layers and the part we just added.

Why not fine-tune the entire convolutional base?

  • Earlier layers in the convolutional base encode more generic, reusable features, whereas layers higher up encode more specialized features. It's more useful to fine-tune the more specialized features, because these are the ones that need to be repurposed on the new problem.
  • The more parameters you're training, the more you're at risk of overfitting.
"""
Freexing all layers until the fourth fro the last
"""
conv_base.trainable = True
for layer in conv_base.layers[:-4]:
  layer.trainable = False
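
As a quick check (a sketch, assuming conv_base is configured as above), we can list which layers will actually be updated during fine-tuning - only the last four should remain trainable, and of those only the three Conv2D layers carry weights:

"""
Sketch: listing the layers left trainable for fine-tuning
"""
for layer in conv_base.layers:
    if layer.trainable:
        print(layer.name)  # expected: block5_conv1, block5_conv2, block5_conv3, block5_pool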

"""
Fine tuning the model
"""
model.compile(loss="binary_crossentropy",
 optimizer=keras.optimizers.RMSprop(learning_rate=1e-5), # Note that the reason for using a low learning rate s that we want to limit the magnitude of the modifications we make to the representations of the trhree layers we're fine-tuning
 metrics=["accuracy"])
callbacks = [
 keras.callbacks.ModelCheckpoint(
  filepath="fine_tuning.keras",
  save_best_only=True,
  monitor="val_loss")
]
history = model.fit(
 train_dataset,
 epochs=30,
 validation_data=validation_dataset,
 callbacks=callbacks)

"""
Evaluate the model on the test data
"""
model = keras.models.load_model("fine_tuning.keras")
test_loss, test_acc = model.evaluate(test_dataset)
print(f"Test accuracy: {test_acc:.3f}")
out[52]

Epoch 1/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 15s 203ms/step - accuracy: 0.9867 - loss: 0.7392 - val_accuracy: 0.9800 - val_loss: 1.1916
Epoch 2/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 20s 200ms/step - accuracy: 0.9949 - loss: 0.1607 - val_accuracy: 0.9830 - val_loss: 1.1518
Epoch 3/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 12s 194ms/step - accuracy: 0.9906 - loss: 0.5825 - val_accuracy: 0.9840 - val_loss: 1.0133
Epoch 4/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 21s 207ms/step - accuracy: 0.9923 - loss: 0.2672 - val_accuracy: 0.9830 - val_loss: 1.0147
Epoch 5/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 21s 210ms/step - accuracy: 0.9852 - loss: 0.2680 - val_accuracy: 0.9760 - val_loss: 1.6889
Epoch 6/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 19s 186ms/step - accuracy: 0.9875 - loss: 0.5628 - val_accuracy: 0.9810 - val_loss: 1.1799
Epoch 7/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 26s 275ms/step - accuracy: 0.9933 - loss: 0.2585 - val_accuracy: 0.9840 - val_loss: 0.9170
Epoch 8/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 15s 186ms/step - accuracy: 0.9933 - loss: 0.1060 - val_accuracy: 0.9860 - val_loss: 0.9633
Epoch 9/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 21s 187ms/step - accuracy: 0.9961 - loss: 0.1190 - val_accuracy: 0.9850 - val_loss: 0.9709
Epoch 10/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 26s 278ms/step - accuracy: 0.9920 - loss: 0.3242 - val_accuracy: 0.9830 - val_loss: 0.9117
Epoch 11/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 15s 182ms/step - accuracy: 0.9936 - loss: 0.1565 - val_accuracy: 0.9800 - val_loss: 1.4414
Epoch 12/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 12s 187ms/step - accuracy: 0.9982 - loss: 0.0400 - val_accuracy: 0.9780 - val_loss: 1.1673
Epoch 13/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 27s 291ms/step - accuracy: 0.9937 - loss: 0.2057 - val_accuracy: 0.9880 - val_loss: 0.7395
Epoch 14/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 14s 186ms/step - accuracy: 0.9926 - loss: 0.2263 - val_accuracy: 0.9820 - val_loss: 0.9659
Epoch 15/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 12s 199ms/step - accuracy: 0.9934 - loss: 0.1389 - val_accuracy: 0.9870 - val_loss: 0.6290
Epoch 16/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 19s 183ms/step - accuracy: 0.9941 - loss: 0.0860 - val_accuracy: 0.9850 - val_loss: 0.8048
Epoch 17/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 21s 184ms/step - accuracy: 0.9945 - loss: 0.1363 - val_accuracy: 0.9820 - val_loss: 0.9074
Epoch 18/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 21s 188ms/step - accuracy: 0.9965 - loss: 0.0825 - val_accuracy: 0.9830 - val_loss: 0.6554
Epoch 19/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 20s 187ms/step - accuracy: 0.9983 - loss: 0.0404 - val_accuracy: 0.9860 - val_loss: 0.9429
Epoch 20/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 22s 215ms/step - accuracy: 0.9876 - loss: 0.5480 - val_accuracy: 0.9820 - val_loss: 0.9502
Epoch 21/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 19s 185ms/step - accuracy: 0.9988 - loss: 0.0172 - val_accuracy: 0.9790 - val_loss: 1.0249
Epoch 22/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 20s 182ms/step - accuracy: 0.9937 - loss: 0.1540 - val_accuracy: 0.9840 - val_loss: 0.8283
Epoch 23/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 22s 209ms/step - accuracy: 0.9972 - loss: 0.0901 - val_accuracy: 0.9840 - val_loss: 0.8751
Epoch 24/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 11s 182ms/step - accuracy: 0.9992 - loss: 0.0113 - val_accuracy: 0.9880 - val_loss: 0.7144
Epoch 25/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 20s 179ms/step - accuracy: 0.9916 - loss: 0.2318 - val_accuracy: 0.9890 - val_loss: 0.6921
Epoch 26/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 21s 183ms/step - accuracy: 0.9972 - loss: 0.0529 - val_accuracy: 0.9780 - val_loss: 1.2277
Epoch 27/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 21s 185ms/step - accuracy: 0.9966 - loss: 0.0369 - val_accuracy: 0.9790 - val_loss: 1.1668
Epoch 28/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 21s 186ms/step - accuracy: 0.9977 - loss: 0.0700 - val_accuracy: 0.9840 - val_loss: 0.9010
Epoch 29/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 13s 208ms/step - accuracy: 0.9964 - loss: 0.0400 - val_accuracy: 0.9840 - val_loss: 1.0589
Epoch 30/30
63/63 ━━━━━━━━━━━━━━━━━━━━ 19s 181ms/step - accuracy: 0.9964 - loss: 0.1354 - val_accuracy: 0.9810 - val_loss: 1.0691
63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 107ms/step - accuracy: 0.9740 - loss: 1.7638
Test accuracy: 0.978