Tuner#
Tuner is one of the two key components of Finetuner. Given an embedding model and labeled dataset (see the guide on data formats for more information), Tuner trains the model to fit the data.
With Tuner, you can customize the training process to best fit your data, and track your experiments in a clear and transparent manner. You can do things like:
Choose between different loss functions, use hard negative mining for triplets/pairs
Set your own optimizers and learning rates
Track the training and evaluation metrics with Weights and Biases
Save checkpoints during training
Write custom callbacks
As part of the training process, you can also compute IR-related evaluation metrics, using the standalone Evaluator component.
You can read more on these different options here or in these sub-sections:
The Tuner class#
All the functionality is exposed through the framework-specific *Tuner classes - PytorchTuner, KerasTuner and PaddleTuner. A *Tuner instance also gets constructed under the hood when you call finetuner.fit().
When initializing a *Tuner class, you are required to pass the embedding model, but you can also customize other parts of the training configuration.
You can then finetune your model using the .fit() method, to which you pass the training and evaluation data (both of which should be labeled datasets), as well as any other data-related configuration.
A minimal example looks like this:
import torch

from finetuner.toydata import generate_fashion
from finetuner.tuner.pytorch import PytorchTuner

embed_model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(in_features=28 * 28, out_features=128),
)

tuner = PytorchTuner(embed_model)
tuner.fit(generate_fashion())
import tensorflow as tf

from finetuner.toydata import generate_fashion
from finetuner.tuner.keras import KerasTuner

embed_model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
])

tuner = KerasTuner(embed_model)
tuner.fit(generate_fashion())
import paddle

from finetuner.toydata import generate_fashion
from finetuner.tuner.paddle import PaddleTuner

embed_model = paddle.nn.Sequential(
    paddle.nn.Flatten(),
    paddle.nn.Linear(in_features=28 * 28, out_features=128),
)

tuner = PaddleTuner(embed_model)
tuner.fit(generate_fashion())
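Beyond this minimal call, .fit() also accepts an evaluation dataset and further data-related settings. The sketch below reuses the generate_fashion() helper and the epochs and num_items_per_class parameters that appear in the full example later in this guide, and assumes a tuner constructed as above:

from finetuner.toydata import generate_fashion

train_data = generate_fashion()
eval_data = generate_fashion(is_testset=True)

# evaluation data is optional; epochs and num_items_per_class control
# the training length and how batches are composed
tuner.fit(train_data, eval_data, epochs=5, num_items_per_class=32)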
Customize optimization#
You can provide your own optimizer and learning rate scheduler (by default, the Adam optimizer with a fixed learning rate will be used) using the configure_optimizer argument of the Tuner constructor.
For PyTorch and PaddlePaddle, you can also use the scheduler_step argument to set whether the learning rate scheduler steps on each batch or on each epoch. For Keras this is not available; there you set the frequency, in terms of batches, in the scheduler itself.
Here's an example of how you can do this:
from torch.optim import Adam
from torch.optim.lr_scheduler import MultiStepLR

from finetuner.tuner.pytorch import PytorchTuner


def configure_optimizer(model):
    optimizer = Adam(model.parameters(), lr=5e-4)
    scheduler = MultiStepLR(optimizer, milestones=[30, 60], gamma=0.5)
    return optimizer, scheduler


tuner = PytorchTuner(
    ..., configure_optimizer=configure_optimizer, scheduler_step='epoch'
)
import tensorflow as tf

from finetuner.tuner.keras import KerasTuner


def configure_optimizer(model):
    lr = tf.keras.optimizers.schedules.ExponentialDecay(
        1.0, decay_steps=1, decay_rate=0.1
    )
    optimizer = tf.keras.optimizers.SGD(learning_rate=lr)
    return optimizer, lr


tuner = KerasTuner(..., configure_optimizer=configure_optimizer)
from paddle import optimizer

from finetuner.tuner.paddle import PaddleTuner


def configure_optimizer(model):
    scheduler = optimizer.lr.MultiStepDecay(
        learning_rate=5e-4, milestones=[30, 60], gamma=0.5
    )
    # use a different name so we don't shadow the imported ``optimizer`` module
    opt = optimizer.Adam(learning_rate=scheduler, parameters=model.parameters())
    return opt, scheduler


tuner = PaddleTuner(
    ..., configure_optimizer=configure_optimizer, scheduler_step='epoch'
)
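If your scheduler is meant to be stepped after every batch rather than every epoch, pass the corresponding value to scheduler_step. Below is a minimal PyTorch sketch; the StepLR schedule, its step_size of 1000 batches, and the literal value 'batch' are illustrative assumptions based on the description above:

from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR

from finetuner.tuner.pytorch import PytorchTuner


def configure_optimizer(model):
    optimizer = Adam(model.parameters(), lr=5e-4)
    # decay the learning rate by 10% every 1000 batches
    scheduler = StepLR(optimizer, step_size=1000, gamma=0.9)
    return optimizer, scheduler


tuner = PytorchTuner(
    ..., configure_optimizer=configure_optimizer, scheduler_step='batch'
)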
Saving the model#
After a model is tuned, you can save it by calling the .save(save_path) method.
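For example, to write the tuned model to disk (a minimal sketch - 'tuned_model' is just an arbitrary path):

# persist the tuned embedding model under the given path
tuner.save('tuned_model')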
Example - full training#
In the example below we’ll demonstrate how to make full use of the available Tuner features, as you would in any realistic setting.
We will be finetuning a simple MLP model on the Fashion MNIST data, and we will be using:
TripletLoss with an easy positive and semihard negative mining strategy
A custom learning rate schedule
Tracking of the experiment on Weights and Biases using the WandBLogger callback
Random augmentation using preprocess_fn
Tip
Before trying out the example, make sure you have wandb installed and have logged into your account.
Let’s start with the dataset - we’ll use the generate_fashion() helper function, which will produce a class dataset.
import numpy as np

from docarray import Document

from finetuner.toydata import generate_fashion

train_data = generate_fashion()
eval_data = generate_fashion(is_testset=True)


def preprocess_fn(doc: Document) -> np.ndarray:
    """Add some noise to the image"""
    new_image = doc.tensor + np.random.normal(scale=0.01, size=doc.tensor.shape)
    return new_image.astype(np.float32)


print(f'Size of train data: {len(train_data)}')
print(f'Size of eval data: {len(eval_data)}')
print(f'Example of label: {train_data[0].tags.json()}')

tensor = train_data[0].tensor
print(f'Example of tensor: {tensor.shape} shape, type {tensor.dtype}')
Size of train data: 60000
Size of eval data: 10000
Example of label: {
  "finetuner_label": 9.0
}
Example of tensor: (28, 28) shape, type float32
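As a quick sanity check, you can call preprocess_fn on a single Document and confirm that it returns a noisy float32 array of the same shape (a small sketch, not required for training):

noisy = preprocess_fn(train_data[0])
print(noisy.shape, noisy.dtype)  # (28, 28) float32, with Gaussian noise added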
Next, we prepare the model - just a simple MLP in this case
import torch

embed_model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(in_features=28 * 28, out_features=128),
    torch.nn.ReLU(),
    torch.nn.Linear(in_features=128, out_features=32),
)
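To double-check that the model produces 32-dimensional embeddings, you can pass a dummy batch through it (a quick sketch, not part of the training setup):

dummy_batch = torch.rand(4, 28, 28)  # four fake Fashion MNIST images
print(embed_model(dummy_batch).shape)  # torch.Size([4, 32])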
Then we can create the PytorchTuner object. In this step we specify all the training configuration. We'll be using:
Triplet loss with a miner using the easy positive and semihard negative strategies
The Adam optimizer with an initial learning rate of 0.0005, which will be halved at epochs 30 and 60
WandB for tracking the experiment
A TrainingCheckpoint callback to save a checkpoint every epoch - if training is interrupted we can later continue from this checkpoint. We need to create a checkpoints/ folder inside our current directory to store the checkpoints.
from torch.optim import Adam
from torch.optim.lr_scheduler import MultiStepLR

from finetuner.tuner.callback import WandBLogger, TrainingCheckpoint
from finetuner.tuner.pytorch import PytorchTuner
from finetuner.tuner.pytorch.losses import TripletLoss
from finetuner.tuner.pytorch.miner import TripletEasyHardMiner


def configure_optimizer(model):
    optimizer = Adam(model.parameters(), lr=5e-4)
    scheduler = MultiStepLR(optimizer, milestones=[30, 60], gamma=0.5)
    return optimizer, scheduler


loss = TripletLoss(
    miner=TripletEasyHardMiner(pos_strategy='easy', neg_strategy='semihard')
)
logger_callback = WandBLogger()
checkpoint = TrainingCheckpoint('checkpoints')

tuner = PytorchTuner(
    embed_model,
    loss=loss,
    configure_optimizer=configure_optimizer,
    scheduler_step='epoch',
    callbacks=[logger_callback, checkpoint],
    device='cpu',
)
Finally, let’s put it all together and run the training
import numpy as np
import torch
from docarray import Document
from torch.optim import Adam
from torch.optim.lr_scheduler import MultiStepLR

from finetuner.toydata import generate_fashion
from finetuner.tuner.callback import WandBLogger, TrainingCheckpoint
from finetuner.tuner.pytorch import PytorchTuner
from finetuner.tuner.pytorch.losses import TripletLoss
from finetuner.tuner.pytorch.miner import TripletEasyHardMiner

train_data = generate_fashion()
eval_data = generate_fashion(is_testset=True)


def preprocess_fn(doc: Document) -> np.ndarray:
    """Add some noise to the image"""
    new_image = doc.tensor + np.random.normal(scale=0.01, size=doc.tensor.shape)
    return new_image.astype(np.float32)


embed_model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(in_features=28 * 28, out_features=128),
    torch.nn.ReLU(),
    torch.nn.Linear(in_features=128, out_features=32),
)


def configure_optimizer(model):
    optimizer = Adam(model.parameters(), lr=5e-4)
    scheduler = MultiStepLR(optimizer, milestones=[30, 60], gamma=0.5)
    return optimizer, scheduler


loss = TripletLoss(
    miner=TripletEasyHardMiner(pos_strategy='easy', neg_strategy='semihard')
)
logger_callback = WandBLogger()
checkpoint = TrainingCheckpoint('checkpoints')

tuner = PytorchTuner(
    embed_model,
    loss=loss,
    configure_optimizer=configure_optimizer,
    scheduler_step='epoch',
    callbacks=[logger_callback, checkpoint],
    device='cpu',
)

tuner.fit(
    train_data, eval_data, preprocess_fn=preprocess_fn, epochs=90, num_items_per_class=32
)
We can monitor the training by watching the progress bar, or we can log into our WandB account and see the live updates there. Here's an example of what we might see there: