finetuner.toydata module
- finetuner.toydata.generate_qa(num_total=481, num_neg=0, pos_value=1, neg_value=-1, is_testset=None)
Get a generator of QA data with synthetic negative matches.
Each document in the array has its text saved in the text attribute, and each match has its label saved as a tag under tags['finetuner__label'].
- Parameters
num_total (int) – the total number of documents to return
num_neg (int) – the number of negative matches per document
pos_value (int) – the label value of the positive matches
neg_value (int) – the label value of the negative matches
max_seq_len – the maximum sequence length of each text
is_testset (Optional[bool]) – whether to generate test data; if set to None, all data is returned
- Return type
DocumentArray
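The labeling scheme described above can be pictured with plain Python dictionaries. This is only a sketch, not finetuner's implementation: build_labeled_doc is a hypothetical helper, the dict layout merely mimics a document with text, matches and tags, and it assumes one positive match (the correct answer) per question.

```python
LABEL_KEY = "finetuner__label"  # tag name taken from the documentation above


def build_labeled_doc(question, answer, wrong_answers, num_neg=0,
                      pos_value=1, neg_value=-1):
    """Hypothetical helper: mimic one QA document and its labeled matches."""
    # Assumption: exactly one positive match (the correct answer) per document.
    matches = [{"text": answer, "tags": {LABEL_KEY: pos_value}}]
    # Add up to `num_neg` negative matches, each tagged with `neg_value`.
    for wrong in wrong_answers[:num_neg]:
        matches.append({"text": wrong, "tags": {LABEL_KEY: neg_value}})
    return {"text": question, "matches": matches}


doc = build_labeled_doc(
    "What is fashion-mnist?",
    "A dataset of clothing images.",
    ["A QA corpus.", "A tokenizer.", "An optimizer."],
    num_neg=2,
)
labels = [m["tags"][LABEL_KEY] for m in doc["matches"]]
print(labels)  # [1, -1, -1]
```

With the default num_neg=0, the sketch yields only the positive match, which mirrors the default in the signature above.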
- finetuner.toydata.generate_fashion(num_total=60000, upsampling=1, channels=0, channel_axis=-1, is_testset=False, download_proxy=None)
Get a generator of fashion-mnist Documents.
Each document in the array has its image content saved as tensor, and its label saved as a tag under tags['finetuner__label'].
- Parameters
num_total (int) – the total number of documents to return
upsampling (int) – the rescale factor; must be an integer >= 1. It rescales each image into a bigger image. For example, upsampling=2 gives 56 x 56 images.
channels (int) – fashion-mnist data is grayscale and has no channel axis. Set channels to 1 or 3 to simulate real grayscale or RGB images.
channel_axis (int) – the axis for channels; e.g. for PyTorch, which expects B*C*W*H, the channel axis should be 1.
is_testset (bool) – whether to generate test data
- Return type
DocumentArray
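How upsampling, channels and channel_axis interact is easiest to see from the resulting shapes. The sketch below only computes the expected shape of a batch of images. fashion_batch_shape is a hypothetical helper, not part of finetuner, and it assumes that channel_axis indexes the batched shape, as the B*C*W*H example above suggests.

```python
def fashion_batch_shape(batch_size, upsampling=1, channels=0, channel_axis=-1):
    """Hypothetical helper: shape of an image batch under the documented parameters."""
    if not isinstance(upsampling, int) or upsampling < 1:
        raise ValueError("upsampling must be an integer >= 1")
    side = 28 * upsampling  # fashion-mnist images are 28 x 28
    shape = [batch_size, side, side]
    if channels > 0:
        # Assumption: channel_axis refers to the batched shape; negative
        # values count from the end of the shape after the channel is added.
        axis = channel_axis if channel_axis >= 0 else channel_axis + len(shape) + 1
        shape.insert(axis, channels)
    return tuple(shape)


print(fashion_batch_shape(8))                              # (8, 28, 28)
print(fashion_batch_shape(8, upsampling=2))                # (8, 56, 56)
print(fashion_batch_shape(8, channels=3, channel_axis=1))  # (8, 3, 28, 28)
print(fashion_batch_shape(8, channels=1))                  # (8, 28, 28, 1)
```

Under this reading, the default channels=0 leaves the images without a channel axis, and channel_axis=1 gives the channel-first layout mentioned for PyTorch.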