How to tune learning rate for machine learning / deep learning?



In Udacity's Intro to TensorFlow for Deep Learning course, Aurélien Géron introduces an interesting and simple way to automate, to some extent, the selection of a learning rate for training your model.

The idea is to set up a LearningRateScheduler that tries several learning rates (in increasing order) over n epochs, then plot the loss against the learning rate to see which range of values is suitable, based on where the loss starts to jiggle and eventually blows up.

Here is a code snippet that trains for 100 epochs with the learning rate increased by a factor of 10 every 30 epochs, starting from 1e-6, i.e., 1e-6 => 1e-5 (from epoch 30) => 1e-4 (from epoch 60) => 1e-3 (from epoch 90).

# TensorFlow 2.9.1
import tensorflow as tf
import matplotlib.pyplot as plt

# Increase the learning rate by a factor of 10 every 30 epochs, starting
# from 1e-6: 1e-6 -> 1e-5 (epoch 30) -> 1e-4 (epoch 60) -> 1e-3 (epoch 90).
lr_schedule = tf.keras.callbacks.LearningRateScheduler(
    lambda epoch: 1e-6 * 10**(epoch / 30))

optimizer = tf.keras.optimizers.SGD(learning_rate=1e-6, momentum=0.9)
model.compile(loss=tf.keras.losses.Huber(),
              optimizer=optimizer,
              metrics=["mae"])
history = model.fit(train_set, epochs=100, callbacks=[lr_schedule])

# Plot loss against learning rate on a log-scaled x-axis.
plt.semilogx(history.history["lr"], history.history["loss"])
plt.axis([1e-6, 1e-3, 0, 20])
plt.show()
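
If you prefer not to eyeball the plot, a rough programmatic shortcut (not from the course, just a common heuristic) is to take the learning rate at which the sweep's loss was lowest and back off by about an order of magnitude to stay in the stable region:

import numpy as np

lrs = np.array(history.history["lr"])
losses = np.array(history.history["loss"])

# Learning rate at which the (unsmoothed) loss was lowest during the sweep.
best_lr = lrs[np.argmin(losses)]

# Back off by an order of magnitude as a safety margin.
chosen_lr = best_lr / 10
print(f"Lowest loss at lr={best_lr:.1e}; using lr={chosen_lr:.1e}")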

Based on the training loss history shown below, we can see that from around 1e-4 the loss starts to become unstable, and it increases sharply towards 1e-3. Therefore, we can choose 1e-5 as an appropriate value for our learning rate.
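
Once a value has been picked, the model can be recompiled without the scheduler and trained at the fixed learning rate. A minimal sketch, assuming the same model and train_set as above (the number of epochs here is arbitrary):

# Retrain with the chosen fixed learning rate (no LearningRateScheduler).
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-5, momentum=0.9)
model.compile(loss=tf.keras.losses.Huber(),
              optimizer=optimizer,
              metrics=["mae"])
history = model.fit(train_set, epochs=500)

In practice you would usually rebuild the model (or reset its weights) before this run, since the learning rate sweep has already updated them.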