When you use an off-the-shelf pre-trained model to encode your data into embeddings, you are likely to get irrelevant search results.
Pre-trained deep learning models are usually trained on large-scale datasets that have a different data distribution from your own datasets or domains.
This is referred to as a distribution shift.
Finetuner solves this problem by taking a model pre-trained on a large dataset and fine-tuning its parameters on your own dataset.
Once fine-tuning is done, you get a model adapted to your domain that delivers better search performance on your task of interest.
Fine-tuning a pre-trained model is complex: it requires Machine Learning expertise as well as domain knowledge (NLP, Computer Vision, etc.),
making it a non-trivial task for business owners and engineers who lack practical deep-learning experience. Finetuner
addresses this by providing a simple interface, which can be as easy as:
```python
import finetuner
from finetuner import DocumentArray

# Login to Jina AI Cloud
finetuner.login()

# Prepare training data
train_data = DocumentArray(...)

# Fine-tune in the cloud
run = finetuner.fit(
    model='resnet50',
    train_data=train_data,
    epochs=5,
    batch_size=128,
)

print(run.name)
for log_entry in run.stream_logs():
    print(log_entry)

# When ready
run.save_artifact(directory='experiment')
```
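Because the job runs remotely, you can also reconnect to it from a fresh session and check how it is doing. The sketch below assumes the `finetuner.get_run()` and `Run.status()` helpers as exposed in recent Finetuner releases; the exact names and signatures may differ in your installed version.

```python
import finetuner

# Log in again if you are in a new session
finetuner.login()

# Reconnect to the run by the name printed when it was submitted
# (assumes `get_run` is available, as in recent Finetuner releases)
run = finetuner.get_run('the-run-name-printed-earlier')

# Query the current state of the run on the Jina AI Cloud
print(run.status())
```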
Submitted fine-tuning jobs run efficiently on the Jina AI Cloud on either CPU- or GPU-enabled hardware.
Finetuner takes care of the complexity of setting up and maintaining the model-training infrastructure, as well as of delivering state-of-the-art (SOTA) training methods to production use cases.
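Once the run has finished and the artifact is saved, the fine-tuned model can be pulled back and used to encode your documents. The snippet below is a sketch of that inference step; it assumes Finetuner's `get_model()` and `encode()` helpers and uses a placeholder artifact ID, so adjust the names to your own setup.

```python
import finetuner
from finetuner import Document, DocumentArray

finetuner.login()

# Pull the fine-tuned model back from the Jina AI Cloud.
# The artifact ID below is a placeholder; use `run.artifact_id`
# or the directory you saved the artifact to.
model = finetuner.get_model(artifact='<your-artifact-id>')

# Encode query documents with the fine-tuned model so that
# embeddings reflect your domain rather than the pre-training data
query = DocumentArray([Document(text='a query from my domain')])
finetuner.encode(model=model, data=query)

print(query.embeddings.shape)
```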
Please check out the following steps for more information: