data_engine package


prepare_data module

data_engine.prepare_data.keep_n_captions(ds, repeat, n=1, set_names=None)

Keeps only n captions per image and stores the rest in dictionaries for later evaluation.

  • ds – Dataset object
  • repeat – Number of input samples per output
  • n – Number of outputs to keep
  • set_names – Set name
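The idea behind keep_n_captions can be illustrated on plain Python lists. This is only a sketch, not the library's implementation: the helper name keep_n_per_image, the flat-list layout (each image owning `repeat` consecutive captions), and the dictionary keyed by image index are all assumptions made for the example.

```python
def keep_n_per_image(captions, repeat, n=1):
    """Keep the first n captions of every image and hold out the rest.

    Hypothetical helper: `captions` is assumed to be a flat list in which
    each image owns `repeat` consecutive entries.
    """
    if repeat <= 0 or len(captions) % repeat != 0:
        raise ValueError("len(captions) must be a positive multiple of repeat")
    if not 0 < n <= repeat:
        raise ValueError("n must satisfy 0 < n <= repeat")

    kept = []       # captions that stay in the dataset
    held_out = {}   # image index -> captions set aside for evaluation
    for start in range(0, len(captions), repeat):
        group = captions[start:start + repeat]
        kept.extend(group[:n])
        held_out[start // repeat] = group[n:]
    return kept, held_out


captions = ["a dog", "a brown dog", "a cat", "a sleeping cat"]
kept, held_out = keep_n_per_image(captions, repeat=2, n=1)
print(kept)
print(held_out)
```

Here repeat=2 means every image has two captions; with n=1 one caption per image stays in the data and the other is set aside.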

data_engine.prepare_data.update_dataset_from_file(ds, input_text_filename, params, splits=None, output_text_filename=None, remove_outputs=False, compute_state_below=False, recompute_references=False)

Updates the dataset instance from a text file according to the given params. Used for sampling.

  • ds – Dataset instance
  • input_text_filename – New inputs.
  • params – Parameters for building the dataset
  • splits – Splits to sample
  • output_text_filename – New output sentences
  • remove_outputs – Remove outputs from dataset (if True, will ignore the output_text_filename parameter)
  • compute_state_below – Compute state below input (shifted target text for teacher forcing)
  • recompute_references – Whether we should rebuild the references of the dataset or not.

Returns: Dataset object with the processed data
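The compute_state_below option refers to the shifted target text that a decoder receives as input during training, so that at each step it sees the previous reference token. A minimal sketch of that shift follows; the helper name shift_right and the "<start>" token are assumptions for illustration, and the library may use a different convention (for example a padding index).

```python
def shift_right(target_tokens, start_token="<start>"):
    """Return the target sequence shifted one position to the right.

    Hypothetical helper: the first position is filled with `start_token`
    and the last target token is dropped, so both sequences have equal length.
    """
    if not target_tokens:
        return []
    return [start_token] + list(target_tokens[:-1])


target = ["the", "cat", "sleeps", "<eos>"]
state_below = shift_right(target)

# At step t the decoder reads state_below[t] and is trained to predict target[t].
for previous_token, expected_token in zip(state_below, target):
    print(f"{previous_token!r} -> {expected_token!r}")
```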

Module contents