
Hyper_Parameter_Tuning

Brief description of the submodule

This submodule contains the functions used to select hyper-parameters for training the implemented networks.

HP_Tuning()

Function to execute the hyper-parameter tuning for the UNet model on the Cashew Dataset.

It receives the possible values of the hyperparameters as lists and returns a dataframe with the results of each possible combination.

The metrics considered are:

  • Validation F1-Score: Highest value of validation F1-Score obtained during training.
  • Training time: Time spent on training.
  • Training rho: Spearman coefficient used to check that the training accuracy increases steadily with the epochs (see the sketch after this list).
  • NO Learning: Boolean indicating whether the accuracy improved compared to the value calculated at epoch 0.
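
The Spearman coefficient itself is returned by training_loop rather than computed in this file, but the idea can be sketched as follows, assuming rho is measured between the epoch index and the per-epoch training accuracy (the accuracy values below are invented for illustration):

from scipy.stats import spearmanr

# Hypothetical per-epoch training accuracies from a short run.
train_acc = [0.52, 0.55, 0.61, 0.64, 0.66, 0.70, 0.71, 0.74]

# rho close to 1 means accuracy increases monotonically with the epochs;
# rho near 0 or negative suggests the network is not learning.
rho, p_value = spearmanr(range(len(train_acc)), train_acc)
print(rho)  # 1.0 for a strictly increasing sequence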

The calculation of each of the metrics is done using 12 epochs and a Linear normalization ('Linear_1_99') of the Cashew dataset. For more information on this dataset, go here.

Params

  • dir: (str) Path to the directory with the dataset to be used.
  • BS: (list) List with values of batch_size to be considered during HP tuning.
  • LR: (list) List with values of learning rate to be considered during HP tuning.
  • STCh: (list) List with values of starting number of channels to be considered during HP tuning.
  • MU: (list) List with values of momentum to be considered during HP tuning.
  • Bi: (list) List with values of bilinear upsampling to be considered during HP tuning. (Only True or False possible)
  • gamma: (list) List with values of gamma for the focal loss to be considered during HP tuning.
  • VI: (list) List with values of vegetation indices (True or False)
  • decay: (list) List with values of the decay rate of learning rate.
  • atts: (list) List with booleans for inclusion or not of Attention gates.
  • res: (list) List with booleans for inclusion or not of residual connections on the double convolutional blocks.
  • tr_size: (float: [0-1]) Fraction of the training set considered for HP tuning. Default is 0.15.
  • val_size: (float: [0-1]) Fraction of the validation set considered for HP tuning. Default is 0.75.

Outputs

  • HP_values: (pandas.DataFrame) Dataframe with the results of each iteration of the hyperparameter tuning.
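
A usage sketch (the dataset directory and candidate values below are hypothetical; the number of training runs is the product of the lengths of all lists, eight in this example):

HP_values = HP_Tuning(
    'CashewDataset',   # hypothetical dataset directory
    BS = [8, 16],
    LR = [1e-3, 1e-4],
    STCh = [16],
    MU = [0.9],
    Bi = [True],
    gamma = [1, 2],
    VI = [False],
    decay = [0.9],
    atts = [False],
    res = [False],
)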

Dependencies used

import pandas as pd
import time
from torchmetrics.classification import BinaryF1Score

from Dataset.Transforms import getTransforms
from Dataset.ReadyToTrain_DS import getDataLoaders
from Models.U_Net import UNet
from Models.Loss_Functions import FocalLoss

Source code

def HP_Tuning(dir, BS, LR, STCh, MU, Bi, gamma, VI, decay, atts, res, tr_size = 0.15, val_size = 0.75):
    """
    Function to perform hyperparameter tuning for the networks to be trained.

    Input:
        - dir: Directory with the dataset to be used.
        - BS: List with values of batch_size to be considered during HP tuning.
        - LR: List with values of learning rate to be considered during HP tuning.
        - STCh: List with values of starting number of channels to be considered during HP tuning.
        - MU: List with values of momentum to be considered during HP tuning.
        - Bi: List with values of bilinear to be considered during HP tuning. (Only True or False possible)
        - gamma: List with values of gamma for the focal loss to be considered during HP tuning.
        - VI: List with values of vegetation indices (True or False).
        - decay: List with values of the decay rate of the learning rate.
        - atts: List with booleans for inclusion or not of attention gates.
        - res: List with booleans for inclusion or not of residual connections on the convolutional blocks.
        - tr_size: Fraction of the training set considered.
        - val_size: Fraction of the validation set considered.

    Output:
        - HP_values: (pandas.DataFrame) Dataframe with the results of each iteration of the hyperparameter tuning.
    """

    transforms = getTransforms()
    normalization = 'Linear_1_99'
    epochs = 12

    rows = []

    # Grid search: iterate over the Cartesian product of all hyperparameter lists.
    for bs in BS:
        for lr in LR:
            for stch in STCh:
                for mu in MU:
                    for bi in Bi:
                        for g in gamma:
                            for vi in VI:
                                for de in decay:
                                    for at in atts:
                                        for re in res:
                                            train_loader, val_loader, test_loader = getDataLoaders(dir, bs, transforms, normalization, vi, train_split_size = tr_size, val_split_size = val_size)
                                            n_channels = next(enumerate(train_loader))[1][0].shape[1]  # Get band number from actual data.
                                            n_classes = 2

                                            loss_function = FocalLoss(gamma = g)

                                            # Define the network.
                                            network = UNet(n_channels, n_classes, bi, stch, up_layer = 4, attention = at, resunet = re)

                                            start = time.time()
                                            f1_val, network_trained, spearman, no_l = training_loop(network, train_loader, val_loader, lr, mu, epochs, loss_function, decay = de, plot = False)
                                            end = time.time()

                                            rows.append([bs, lr, stch, mu, bi, g, vi, de, at, re, f1_val, end-start, spearman, no_l])

    HP_values = pd.DataFrame(rows)
    HP_values.columns = ['BatchSize', 'LR', 'StartCh', 'Momentum', 'Bilinear', 'gamma', 'VI', 'decay', 'attention', 'resnet', 'ValF1Score', 'Training time', 'Training rho', 'No_L']
    HP_values.to_csv('TempHyperParamTuning_' + dir + '.csv')

    return HP_values
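
The ten nested loops above enumerate the full Cartesian product of the candidate lists. An equivalent, flatter formulation, shown here only as a sketch, would use itertools.product:

from itertools import product

# One iteration per hyperparameter combination; the loop body would be
# identical to the innermost block of HP_Tuning above.
for bs, lr, stch, mu, bi, g, vi, de, at, re in product(BS, LR, STCh, MU, Bi, gamma, VI, decay, atts, res):
    pass  # train and record one combination, as in the loop body above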

LoveDA_HP_Tuning()

Function to execute the hyper-parameter tuning for the UNet model on the LoveDA dataset.

It receives the possible values of the hyperparameters as lists and returns a dataframe with the results of each possible combination.

The metrics used are:

  • Validation mIOU: Highest value of mIOU obtained during training.
  • Training time: Time spent on training.
  • Training rho: Spearman coefficient used to check that the training accuracy increases steadily with the epochs.
  • NO Learning: Boolean indicating whether the accuracy improved compared to the value calculated at epoch 0.

The calculation of each of the metrics is done using 15 epochs.
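
The validation mIOU is computed through torchmetrics' JaccardIndex, with the same class count and ignore_index used in the source code below. A minimal sketch of the metric on its own (tensor shapes are illustrative):

import torch
from torchmetrics import JaccardIndex

# Multiclass Jaccard (mIOU) over the 8 LoveDA classes; class 0 (no-data)
# is ignored, matching accu_function in LoveDA_HP_Tuning below.
miou = JaccardIndex(task = 'multiclass', num_classes = 8, ignore_index = 0)

preds = torch.randint(0, 8, (4, 256, 256))   # predicted class per pixel
target = torch.randint(0, 8, (4, 256, 256))  # ground-truth class per pixel
print(miou(preds, target))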

Params

  • domain: Domain(s) of the LoveDA dataset (e.g. urban, rural) passed to get_LOVE_DataLoaders.
  • BS: (list) List with values of batch_size to be considered during HP tuning.
  • LR: (list) List with values of learning rate to be considered during HP tuning.
  • STCh: (list) List with values of starting number of channels to be considered during HP tuning.
  • MU: (list) List with values of momentum to be considered during HP tuning.
  • Bi: (list) List with values of bilinear upsampling to be considered during HP tuning. (Only True or False possible)
  • gamma: (list) List with values of gamma values for the focal loss to be considered during HP tuning.
  • decay: (list) List with values of the decay rate of learning rate.
  • atts: (list) List with booleans for inclusion or not of Attention gates.
  • res: (list) List with booleans for inclusion or not of residual connections on the double convolutional blocks.
  • tr_size: (float: [0-1]) Amount of training set considered for HP tuning. Default is 0.15.
  • val_size: (float: [0-1]) Amount of validation set considered for HP tuning. Default is 0.75.

Outputs

  • HP_values: (pandas.DataFrame) Dataframe with the results of each iteration of the hyperparameter tuning.
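
A usage sketch (the domain value and candidates below are hypothetical; the exact format of domain depends on get_LOVE_DataLoaders):

HP_values = LoveDA_HP_Tuning(
    ['Urban'],   # hypothetical LoveDA domain selection
    BS = [8],
    LR = [1e-3, 1e-4],
    STCh = [16, 32],
    MU = [0.9],
    Bi = [True],
    gamma = [2],
    decay = [0.9],
    atts = [False, True],
    res = [False],
)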

Dependencies used

import pandas as pd
import time
from torchmetrics.classification import BinaryF1Score
from torchmetrics import JaccardIndex

from Dataset.Transforms import getTransforms
from Dataset.ReadyToTrain_DS import get_LOVE_DataLoaders
from Models.U_Net import UNet
from Models.Loss_Functions import FocalLoss

Source code

def LoveDA_HP_Tuning(domain, BS, LR, STCh, MU, Bi, gamma, decay, atts, res, tr_size = 0.15, val_size = 0.75):
    """
    Function to perform hyperparameter tuning for the networks to be trained on the LoveDA dataset.

    Input:
        - domain: Domain(s) of the LoveDA dataset to be used.
        - BS: List with values of batch_size to be considered during HP tuning.
        - LR: List with values of learning rate to be considered during HP tuning.
        - STCh: List with values of starting number of channels to be considered during HP tuning.
        - MU: List with values of momentum to be considered during HP tuning.
        - Bi: List with values of bilinear to be considered during HP tuning. (Only True or False possible)
        - gamma: List with values of gamma for the focal loss to be considered during HP tuning.
        - decay: List with values of the decay rate of the learning rate.
        - atts: List with booleans for inclusion or not of attention gates.
        - res: List with booleans for inclusion or not of residual connections on the convolutional blocks.
        - tr_size: Fraction of the training set considered.
        - val_size: Fraction of the validation set considered.

    Output:
        - HP_values: (pandas.DataFrame) Dataframe with the results of each iteration of the hyperparameter tuning.
    """

    transforms = getTransforms()
    epochs = 15

    rows = []

    # Grid search: iterate over the Cartesian product of all hyperparameter lists.
    for bs in BS:
        for lr in LR:
            for stch in STCh:
                for mu in MU:
                    for bi in Bi:
                        for g in gamma:
                            for de in decay:
                                for at in atts:
                                    for re in res:
                                        train_loader, val_loader, test_loader = get_LOVE_DataLoaders(domain, bs, train_split_size = tr_size, val_split_size = val_size)
                                        n_channels = next(enumerate(train_loader))[1]['image'].shape[1]  # Get band number from actual data.
                                        n_classes = 8

                                        loss_function = FocalLoss(gamma = g, ignore_index = 0)

                                        # Define the network.
                                        network = UNet(n_channels, n_classes, bi, stch, up_layer = 4, attention = at, resunet = re)

                                        start = time.time()
                                        f1_val, network_trained, spearman, no_l = training_loop(network, train_loader, val_loader, lr, mu, epochs, loss_function, decay = de, plot = False, accu_function = JaccardIndex(task = 'multiclass', num_classes = n_classes, ignore_index = 0), Love = True)
                                        end = time.time()

                                        rows.append([bs, lr, stch, mu, bi, g, de, at, re, f1_val, end-start, spearman, no_l])

    HP_values = pd.DataFrame(rows)
    # Note: with JaccardIndex as the accuracy function, 'ValF1Score' holds the validation mIOU.
    HP_values.columns = ['BatchSize', 'LR', 'StartCh', 'Momentum', 'Bilinear', 'gamma', 'decay', 'attention', 'resunet', 'ValF1Score', 'Training time', 'Training rho', 'No_L']
    HP_values.to_csv('TempHyperParamTuning_LOVE.csv')

    return HP_values
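
A short post-processing sketch on the returned dataframe, picking the best-scoring combination (the 'No_L' and 'Training rho' columns can be inspected as additional sanity checks):

best = HP_values.sort_values('ValF1Score', ascending = False).iloc[0]
print(best[['BatchSize', 'LR', 'StartCh', 'Momentum', 'ValF1Score']])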