Hands-On Machine Learning Chapter 12 - Custom Models and Training with TensorFlow

I am going to re-read Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow because I don't feel that I got a good grasp of machine learning the first time I read it, and I skipped neural networks entirely on that first pass.

Custom Models and Training with TensorFlow

In 95% of the use cases you will encounter, all you will need is tf.keras (and tf.data - see the next chapter). In this chapter, we dive deeper into TensorFlow and take a look at its lower-level Python API. This is useful when you need extra control: to write custom loss functions, custom metrics, layers, models, initializers, regularizers, weight constraints, and more. You may even need to fully control the training loop itself, for example to apply special transformations or constraints to the gradients (beyond just clipping them), or to use multiple optimizers for different parts of the network.
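As a taste of what controlling the training loop means, here is a minimal sketch of a single custom training step using tf.GradientTape (the model, optimizer, loss_fn, and batch names are placeholders, not code from the book):

import tensorflow as tf

def train_step(model, optimizer, loss_fn, x_batch, y_batch):
    # Record the forward pass so gradients can be computed
    with tf.GradientTape() as tape:
        y_pred = model(x_batch, training=True)
        loss = loss_fn(y_batch, y_pred)
    # Gradients of the loss with respect to each trainable variable
    grads = tape.gradient(loss, model.trainable_variables)
    # This is the point where you could transform or constrain the gradients
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss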

Quick Tour of TensorFlow

TensorFlow is a powerful library for numerical computation, particularly well suited for large-scale Machine Learning. It was developed by the Google Brain team and open sourced in 2015. It is now the most popular deep learning library: countless projects use TensorFlow for all sorts of Machine Learning tasks, such as image classification, natural language processing, recommender systems, time series forecasting, and much more. What does it support?

  • Similar to NumPy, but with GPU support
  • Supports distributed computing
  • Has a kind of just-in-time compiler that allows it to optimize computations for speed and memory usage
  • Computation graphs can be exported to a portable format
  • It implements autodiff and some excellent optimizers (see the sketch after this list)
  • Offers many more features built on top of the core features: the most important is Keras, but it also has data loading and preprocessing ops, image processing ops, signal processing ops, and more (see image below)
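To make the autodiff bullet concrete, here is a tiny sketch: tf.GradientTape records operations so TensorFlow can compute gradients automatically (the function 3*w1^2 + 2*w1*w2 is just an illustrative choice):

import tensorflow as tf

w1, w2 = tf.Variable(5.), tf.Variable(3.)
with tf.GradientTape() as tape:
    z = 3 * w1 ** 2 + 2 * w1 * w2
print(tape.gradient(z, [w1, w2]))  # dz/dw1 = 6*w1 + 2*w2 = 36.0, dz/dw2 = 2*w1 = 10.0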

TensorFlow's Python API

At its lowest level, each TensorFlow operation is implemented using highly efficient C++ code. Many operations have multiple implementations, called kernels: each kernel is dedicated to a specific device type (such as CPUs, GPUs, or even TPUs, Tensor Processing Units). GPUs can dramatically speed up computations by splitting them into many smaller chunks and running them in parallel across many GPU threads. TPUs are even faster. You can purchase your own GPU devices, but TPUs are only available on Google Cloud Machine Learning Engine. TensorFlow's architecture can be seen below. TensorFlow runs on every major OS, on mobile devices, and in the browser.

TensorFlow's Architecture
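As a quick sanity check of which devices TensorFlow can actually see on a given machine, a small sketch (the output depends entirely on your hardware):

import tensorflow as tf

print(tf.config.list_physical_devices())  # e.g. [PhysicalDevice(name='/physical_device:CPU:0', ...)]
t = tf.constant([1., 2., 3.])
print(t.device)  # the device this tensor lives on, e.g. CPU:0 or GPU:0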

Using TensorFlow Like NumPy

TensorFlow's API revolves around tensors. A tensor is usually a multidimensional array (like NumPy's ndarray), but it can also hold a scalar. See the code below for examples. As you can see, TensorFlow is very similar to NumPy. Many functions and classes have aliases (for example, tf.add() and tf.math.add() are the same function); this allows TensorFlow to have concise names for the most common operations while preserving well-organized packages. TensorFlow uses 32-bit precision by default, because this is generally more than enough for neural networks, and it runs faster and uses less RAM. TensorFlow does not perform any type conversions automatically: it just raises an exception if you try to execute an operation on tensors with incompatible types.
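For example, adding a float32 tensor to an int32 tensor raises an error; you have to cast explicitly with tf.cast(). A quick sketch:

import tensorflow as tf

# tf.constant(2.) + tf.constant(40)       # InvalidArgumentError: float32 + int32
t = tf.constant(40)                       # int32 by default
tf.constant(2.) + tf.cast(t, tf.float32)  # explicit cast: returns 42.0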

Building Custom Models

I don't see myself doing this anytime soon. I will come back to it when I need to.

Custom Model Example

Autograph and Tracing

A TF function (created with tf.function) generates a new graph for every unique set of input shapes and data types, and it caches it for subsequent calls. How does TensorFlow generate graphs? It starts by analyzing the Python function's source code to capture all the control flow statements, such as for loops and while loops, if statements, as well as break, continue, and return statements. This first step is called autograph. The reason TensorFlow has to analyze the source code is that Python does not provide any other way to capture control flow statements: it offers magic methods like __add__() or __mul__() to capture operators like + and *, but there are no __while__() or __if__() magic methods. After analyzing the function's code, autograph outputs an upgraded version of that function in which all the control flow statements are replaced with the appropriate TensorFlow operations, such as tf.while_loop() for loops and tf.cond() for if statements. See the example below.
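A minimal sketch of autograph at work (sum_squares is just an illustrative function; tf.autograph.to_code() lets you inspect the generated source):

import tensorflow as tf

@tf.function
def sum_squares(n):
    s = 0
    for i in tf.range(n + 1):  # autograph rewrites this loop as a tf.while_loop()
        s += i ** 2
    return s

print(sum_squares(tf.constant(10)))                       # tf.Tensor(385, ...)
print(tf.autograph.to_code(sum_squares.python_function))  # the upgraded source autograph generated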

How TensorFlow Generates Graphs Using Autograph and Tracing

Next, TensorFlow calls this "upgraded" function, but instead of passing the actual arguments, it passes a symbolic tensor: a tensor without any actual value, only a name, a data type, and a shape. The function runs in graph mode, meaning that each TensorFlow operation just adds a node to the graph to represent itself and its output tensor(s) (as opposed to the regular mode, called eager execution, or eager mode). In graph mode, TF operations do not perform any actual computations.
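You can watch tracing happen: a plain Python side effect such as print() runs only while the graph is being built, not on later calls with a matching input signature. A small sketch (f is a hypothetical example function):

import tensorflow as tf

@tf.function
def f(x):
    print("Tracing with", x)  # Python side effect: runs only during tracing
    return x ** 2

f(tf.constant(2.))  # prints "Tracing with Tensor(...)": a new graph is traced
f(tf.constant(3.))  # same shape and dtype: the cached graph is reused, nothing printed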

import tensorflow as tf 
import numpy as np
# Create a Tensor with `tf.constant`
tf.constant([[1., 2., 3.], [4., 5., 6.]])  # matrix
tf.constant(42)  # scalar
t = tf.constant([[1., 2., 3.], [4., 5., 6.]])
# Tensors have shapes and dtype like ndarray
print(t.shape)
print(t.dtype)
# Indexing works like NumPy
t[:,1:]
t[...,1,tf.newaxis]
# All sorts of operations are available 
t + 10
tf.square(t)
t @ tf.transpose(t)

# Tensors interoperate with NumPy arrays
a = np.array([2., 4., 5.])
tf.constant(a)  # note: NumPy uses 64-bit precision by default, so this tensor is float64
t.numpy()  # convert back to a NumPy array (or use np.array(t))

## Variables
# tf.Variable is a mutable tensor; assign() modifies it in place
v = tf.Variable([[1., 2., 3.], [4., 5., 6.]])
print(v.assign(2 * v))  # output shown at the bottom

## Custom Loss Functions
# Huber Loss 
def huber_fn(y_true, y_pred):
    error = y_true - y_pred
    is_small_error = tf.abs(error) < 1
    squared_loss = tf.square(error) / 2  # quadratic for small errors
    linear_loss = tf.abs(error) - 0.5    # linear for large errors (less sensitive to outliers)
    return tf.where(is_small_error, squared_loss, linear_loss)

# model.compile(loss=huber_fn, optimizer="nadam")
# ...
Output:

(2, 3)
<dtype: 'float32'>
<tf.Variable 'UnreadVariable' shape=(2, 3) dtype=float32, numpy=
array([[ 2.,  4.,  6.],
       [ 8., 10., 12.]], dtype=float32)>