finetuner.tuner package

Module contents

finetuner.tuner.fit(embed_model, train_data, eval_data=None, preprocess_fn=None, collate_fn=None, epochs=10, batch_size=256, num_items_per_class=None, loss='SiameseLoss', optimizer=None, learning_rate=0.001, device='cpu', **kwargs)[source]

Finetune the model on the training data.

Parameters
  • embed_model (AnyDNN) – The embedding model to fine-tune

  • train_data (DocumentSequence) – Data on which to train the model

  • eval_data (Optional[DocumentSequence]) – Data on which to evaluate the model at the end of each epoch

  • preprocess_fn (Optional[Callable]) – A preprocessing function, applied to each document on the fly. It should take a document from the dataset as input and output whatever content the framework-specific dataloader (and model) accepts (see the example after this parameter list).

  • collate_fn (Optional[Callable]) – The collation function that merges the content of individual items into a batch. It should accept a list with the content of each item and output a tensor (or a list/dict of tensors) that feeds directly into the embedding model, as shown in the example below.

  • epochs (int) – Number of epochs to train the model

  • batch_size (int) – The batch size to use for training and evaluation

  • num_items_per_class (Optional[int]) – Number of items from a single class to include in the batch. Only relevant for class datasets

  • loss (Union[str, BaseLoss]) – The loss function to use in training. Supported losses are SiameseLoss (for a Siamese network) and TripletLoss (for a Triplet network)

  • optimizer (Optional[Optimizer]) – The optimizer to use for training. If none is passed, an Adam optimizer is used by default, with the learning rate specified by the learning_rate parameter.

  • learning_rate (float) – Learning rate for the default optimizer. If you provide a custom optimizer, this learning rate does not apply.

  • device (str) – The device to which to move the model. Supported options are "cpu" and "cuda" (for GPU)
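
Example

A minimal sketch of a matching preprocess_fn and collate_fn pair, assuming a PyTorch backend and Documents that expose their content as doc.blob (the attribute name varies across DocArray versions, so treat this as illustrative):

    import torch

    def preprocess_fn(doc):
        # Pull the raw content out of the Document. We assume a float
        # vector stored in ``doc.blob``; adapt this to your data format.
        return doc.blob.astype('float32')

    def collate_fn(contents):
        # Merge the list of preprocessed items into one batch tensor
        # that feeds directly into the embedding model.
        return torch.stack([torch.as_tensor(c) for c in contents])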

Return type

Summary
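
Example

A minimal end-to-end sketch, assuming a PyTorch backend and DocArray-style Documents. The import paths and the tag key used for class labels ('finetuner_label') are assumptions; check the data-format guide for your finetuner version:

    import numpy as np
    import torch
    import finetuner
    from docarray import Document, DocumentArray

    # A toy embedding model: 128-d input -> 32-d embedding.
    embed_model = torch.nn.Sequential(
        torch.nn.Linear(128, 64),
        torch.nn.ReLU(),
        torch.nn.Linear(64, 32),
    )

    # A class-format dataset: each Document carries a class label in its tags.
    train_data = DocumentArray(
        Document(blob=np.random.rand(128).astype('float32'),
                 tags={'finetuner_label': str(i % 4)})
        for i in range(64)
    )

    summary = finetuner.tuner.fit(
        embed_model,
        train_data,
        epochs=5,
        batch_size=16,
        num_items_per_class=4,  # 4 classes per batch of 16
        loss='TripletLoss',
        learning_rate=1e-3,
        device='cuda' if torch.cuda.is_available() else 'cpu',
    )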

finetuner.tuner.save(embed_model, model_path, *args, **kwargs)[source]

Save the embedding model.

Parameters
  • embed_model (AnyDNN) – The embedding model to save

  • model_path (str) – Path to the file or folder where the model will be saved

  • args – Arguments to pass to the framework-specific tuner’s save method

  • kwargs – Keyword arguments to pass to the framework-specific tuner’s save method

Return type

None
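
Example

A minimal sketch of saving a tuned PyTorch model; the file name here is illustrative, and the on-disk format depends on the framework-specific tuner:

    import finetuner

    # Persist the tuned model to disk. Extra positional and keyword
    # arguments are forwarded to the framework-specific save method.
    finetuner.tuner.save(embed_model, 'tuned_model.pt')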