I’m currently trying to implement in-batch training/sampling, in which one set of gold candidates plus negatives is shared by all examples in a batch. This reduces the total number of negatives and the GPU resources required.
Can such an implementation be done with an iterator class? I haven’t been able to find any ideas, or even a direction, for implementing this.
My current implementation is something like

import numpy as np
from allennlp.data.fields import ArrayField

fields['gold_and_negs_feature'] = ArrayField(np.array(data['gold_and_negs_feature'], dtype='float32'))
# data['gold_and_neg_mask'] = [1, 0, 0, ..., 0]
fields['gold_and_neg_mask'] = ArrayField(np.array(data['gold_and_neg_mask'], dtype='int32'))

for building each Instance. This implementation samples negatives for each data point independently, so negatives are not shared within a batch.
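To make the goal concrete, here is a minimal sketch of what I mean by sharing candidates within a batch. It is plain Python and not tied to any AllenNLP API; the function name `build_shared_batches` and the `gold_id` key are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterator, List, Sequence


def build_shared_batches(
    examples: Sequence[Dict], batch_size: int
) -> Iterator[Dict]:
    """Yield batches whose candidate pool is shared by every example in the batch.

    Assumption: each example is a dict with a (hypothetical) 'gold_id' key.
    """
    for start in range(0, len(examples), batch_size):
        chunk = examples[start:start + batch_size]
        # Deduplicate gold ids while keeping first-seen order.
        pool: List[int] = list(dict.fromkeys(ex["gold_id"] for ex in chunk))
        position = {gold_id: i for i, gold_id in enumerate(pool)}
        # One row per example: 1 at its own gold, 0 at the golds of the
        # other examples, which serve as its negatives.
        masks = [
            [1 if i == position[ex["gold_id"]] else 0 for i in range(len(pool))]
            for ex in chunk
        ]
        yield {"candidate_ids": pool, "gold_mask": masks}


examples = [{"gold_id": g} for g in [7, 3, 7, 9, 4]]
batches = list(build_shared_batches(examples, batch_size=3))
```

With this layout, candidate features would only need to be looked up once per batch instead of once per data point, and the per-example mask becomes a row in a batch-level matrix. What I don't know is where this grouping step should live in the iterator.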
If you have any hints or advice about this, I’d appreciate it.