finetuner.labeler package

Submodules

Module contents

finetuner.labeler.fit(embed_model, train_data, clear_labels_on_start=False, port_expose=None, runtime_backend='thread', loss='SiameseLoss', preprocess_fn=None, collate_fn=None, **kwargs)

Fit the model in an interactive UI.

Parameters
  • embed_model (AnyDNN) – The embedding model to fine-tune

  • train_data (DocumentSequence) – Data on which to train the model

  • clear_labels_on_start (bool) – If True, remove all labeled data on start.

  • port_expose (Optional[int]) – The port to expose.

  • runtime_backend (str) – The parallel backend of the runtime inside the Pea, either thread or process.

  • loss (str) – Which loss to use in training. Supported losses are:

      - SiameseLoss for Siamese network with cosine distance
      - TripletLoss for Triplet network with cosine distance

  • preprocess_fn (Optional[ForwardRef]) – A pre-processing function, to apply pre-processing to documents on the fly. It should take as input the document in the dataset, and output whatever content the framework-specific dataloader (and model) would accept.

  • collate_fn (Optional[ForwardRef]) – The collation function to merge the content of individual items into a batch. It should accept a list with the content of each item, and output a tensor (or a list/dict of tensors) that feeds directly into the embedding model.

  • kwargs – Additional keyword arguments.

Return type

None
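
The contracts described for preprocess_fn and collate_fn can be illustrated with a minimal sketch. Everything in it except the two parameter names is an assumption made for illustration: FakeDoc is a hypothetical stand-in for a document in train_data, VOCAB is a toy vocabulary, and plain nested lists stand in for the tensors a real framework would expect.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FakeDoc:
    """Hypothetical stand-in for one document in train_data; only .text is used."""
    text: str


# Hypothetical toy vocabulary: id 0 is padding, id 1 marks unknown tokens.
VOCAB: Dict[str, int] = {"<pad>": 0, "<unk>": 1, "hello": 2, "world": 3, "finetuner": 4}


def preprocess_fn(doc: FakeDoc) -> List[int]:
    """Turn one document into the content the dataloader would receive."""
    return [VOCAB.get(tok, VOCAB["<unk>"]) for tok in doc.text.lower().split()]


def collate_fn(items: List[List[int]]) -> List[List[int]]:
    """Merge per-item content into one rectangular batch by right-padding."""
    max_len = max(len(item) for item in items)
    return [item + [VOCAB["<pad>"]] * (max_len - len(item)) for item in items]


docs = [FakeDoc("Hello world"), FakeDoc("hello finetuner labeler demo")]
batch = collate_fn([preprocess_fn(d) for d in docs])
print(batch)  # [[2, 3, 0, 0], [2, 4, 1, 1]]
```

In a real setup, collate_fn would typically convert the padded batch into a framework tensor (for example with torch.tensor(...) when the embedding model is a PyTorch module), and both functions would then be passed along as finetuner.labeler.fit(embed_model, train_data, preprocess_fn=preprocess_fn, collate_fn=collate_fn).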