sciwing.engine

engine

class sciwing.engine.engine.Engine(model: torch.nn.Module, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, optimizer: torch.optim.Optimizer, batch_size: int, save_dir: str, num_epochs: int, save_every: int, log_train_metrics_every: int, train_metric: sciwing.metrics.BaseMetric.BaseMetric, validation_metric: sciwing.metrics.BaseMetric.BaseMetric, test_metric: sciwing.metrics.BaseMetric.BaseMetric, experiment_name: Optional[str] = None, experiment_hyperparams: Optional[Dict[str, Any]] = None, tensorboard_logdir: str = None, track_for_best: str = 'loss', collate_fn=list, device: Union[torch.device, str] = torch.device('cpu'), gradient_norm_clip_value: Optional[float] = 5.0, lr_scheduler: Optional[torch.optim.lr_scheduler._LRScheduler] = None, use_wandb: bool = False, sample_proportion: float = 1.0, seeds: Dict[str, int] = None)

Bases: sciwing.utils.class_nursery.ClassNursery

__init__(model: torch.nn.Module, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, optimizer: torch.optim.Optimizer, batch_size: int, save_dir: str, num_epochs: int, save_every: int, log_train_metrics_every: int, train_metric: sciwing.metrics.BaseMetric.BaseMetric, validation_metric: sciwing.metrics.BaseMetric.BaseMetric, test_metric: sciwing.metrics.BaseMetric.BaseMetric, experiment_name: Optional[str] = None, experiment_hyperparams: Optional[Dict[str, Any]] = None, tensorboard_logdir: str = None, track_for_best: str = 'loss', collate_fn=list, device: Union[torch.device, str] = torch.device('cpu'), gradient_norm_clip_value: Optional[float] = 5.0, lr_scheduler: Optional[torch.optim.lr_scheduler._LRScheduler] = None, use_wandb: bool = False, sample_proportion: float = 1.0, seeds: Dict[str, int] = None)

Engine runs the models end to end. It iterates through the train dataset and passes it through the model. During training it tracks many parameters for the run and saves them. It also reports validation and test metrics from time to time. Many of the utilities required for end-to-end running of the model are here.

Parameters:
  • model (nn.Module) – A pytorch module defining a model to be run
  • datasets_manager (DatasetsManager) – A datasets manager that handles all the different datasets
  • optimizer (torch.optim) – Any Optimizer object instantiated using torch.optim
  • batch_size (int) – Batch size for the dataset. The same batch size is used for the train, validation and test datasets
  • save_dir (str) – The experiments are saved in save_dir. We save checkpoints, the best model, logs and other information into the save dir
  • num_epochs (int) – The number of epochs to run the training
  • save_every (int) – The model will be checkpointed every save_every iterations
  • log_train_metrics_every (int) – The train metrics will be reported every log_train_metrics_every iterations during training
  • train_metric (BaseMetric) – Anything that is an instance of BaseMetric for calculating training metrics
  • validation_metric (BaseMetric) – Anything that is an instance of BaseMetric for calculating validation metrics
  • test_metric (BaseMetric) – Anything that is an instance of BaseMetric for calculating test metrics
  • experiment_name (str) – The experiment should be given a name for ease of tracking. If an experiment name is not given, we generate a unique 10-digit sha for the experiment.
  • experiment_hyperparams (Dict[str, Any]) – This is mostly used for tracking the different hyper-params of the experiment being run. This may be used by wandb to save the hyper-params
  • tensorboard_logdir (str) – The directory where all the tensorboard runs are stored. If None is passed then it defaults to the tensorboard default of storing the log in the current directory.
  • track_for_best (str) – The metric to track for deciding the best model. Anything that the metric emits as a single value can be used for tracking. The default value is loss; for loss, the best value is the lowest one. For some other metrics like macro_fscore, the best value might be the highest one
  • collate_fn (Callable[[List[Any]], List[Any]]) – Collates the different examples into a single batch of examples. This is the same terminology adopted from pytorch; it works no differently here
  • device (torch.device) – The device on which the model will be placed. If this is “cpu”, then the model and the tensors will all be on cpu. If this is “cuda:0”, then the model and the tensors will be placed on cuda device 0. You can mention any other cuda device that is suitable for your environment
  • gradient_norm_clip_value (float) – To avoid gradient explosion, the gradients will be clipped if their norm exceeds this value
  • lr_scheduler (torch.optim.lr_scheduler) – Any pytorch lr_scheduler can be used for reducing the learning rate when performance on the validation set degrades.
  • use_wandb (bool) – wandb (Weights & Biases) is a tool used to track experiments online. SciWING comes with inbuilt functionality to track experiments on Weights & Biases
  • sample_proportion (float) – The proportion of the dataset to sample for the run. A value of 1.0 uses the full dataset; smaller values are handy for quick debugging runs
  • seeds (Dict[str, int]) – The dict of seeds to be set: random_seed, pytorch_seed and numpy_seed. This follows the seed-setting utility in https://github.com/allenai/allennlp/blob/master/allennlp/common/util.py
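
For orientation, here is a minimal sketch of constructing and running an Engine. The names my_model, my_datasets_manager and my_metric are hypothetical placeholders for an nn.Module, a DatasetsManager and BaseMetric instances; only the keyword arguments mirror the signature above.

    import torch
    import torch.optim as optim

    from sciwing.engine.engine import Engine

    optimizer = optim.Adam(params=my_model.parameters(), lr=1e-3)

    engine = Engine(
        model=my_model,                        # any nn.Module (placeholder)
        datasets_manager=my_datasets_manager,  # a DatasetsManager (placeholder)
        optimizer=optimizer,
        batch_size=32,
        save_dir="./output/debug_run",
        num_epochs=10,
        save_every=5,                          # checkpoint every 5 iterations
        log_train_metrics_every=10,            # report train metrics every 10 iterations
        train_metric=my_metric,                # any BaseMetric (placeholder)
        validation_metric=my_metric,
        test_metric=my_metric,
        track_for_best="macro_fscore",         # higher is better; "loss" tracks the lowest
        device="cuda:0" if torch.cuda.is_available() else "cpu",
        seeds={"random_seed": 17290, "pytorch_seed": 1729, "numpy_seed": 4},
    )
    engine.run()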
static get_iter(loader: DataLoader) → Iterator

Returns the iterator for a pytorch data loader.

The loader is a pytorch DataLoader that iterates over the dataset in batches and employs many strategies to do so. We want an iterator that returns the dataset in batches. The end of the iterator would signify the end of an epoch and then we can use that information to perform house-keeping.

Parameters:loader (DataLoader) – a pytorch data loader
Returns:An iterator over the data loader
Return type:Iterator
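
As a sketch of this pattern (the iterate_one_epoch function below is illustrative, not part of the API), one can wrap the loader in an iterator and treat StopIteration as the epoch boundary:

    from torch.utils.data import DataLoader

    def iterate_one_epoch(loader: DataLoader):
        batch_iter = iter(loader)      # what get_iter returns for a loader
        while True:
            try:
                yield next(batch_iter)
            except StopIteration:      # end of the epoch: do house-keeping here
                return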
get_loader(dataset: Dataset) → DataLoader

Returns the DataLoader for the Dataset

Parameters:dataset (Dataset) – The dataset for which a loader is returned
Returns:A pytorch DataLoader
Return type:DataLoader
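
A minimal sketch of a loader configured the way the Engine signature suggests, with collate_fn=list so that a batch is simply a list of examples (the make_loader helper and the shuffle=True choice are assumptions, not documented behaviour):

    from torch.utils.data import DataLoader, Dataset

    def make_loader(dataset: Dataset, batch_size: int) -> DataLoader:
        return DataLoader(
            dataset,
            batch_size=batch_size,
            shuffle=True,      # assumption; the doc does not state shuffling
            collate_fn=list,   # default collate_fn in the Engine signature
        )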
get_test_dataset()

Returns the test dataset of the experiment

Returns:Anything that conforms to the pytorch style dataset.
Return type:Dataset
get_train_dataset()

Returns the train dataset of the experiment

Returns:Anything that conforms to the pytorch style dataset.
Return type:Dataset
get_validation_dataset()

Returns the validation dataset of the experiment

Returns:Anything that conforms to the pytorch style dataset.
Return type:Dataset
is_best_higher(current_best=None)

Returns True if the current value of the metric is higher than the best metric so far. This is useful for tracking metrics like fscore, where the higher the value, the better it is

Parameters:current_best (float) – The current value for the metric that is being tracked
Returns:True if the current value of the tracked metric is higher than the best value so far
Return type:bool
is_best_lower(current_best=None)

Returns True if the current value of the metric is lower than the best metric so far. This is useful for tracking metrics like loss, where the lower the value, the better it is

Parameters:current_best (float) – The current value for the metric that is being tracked
Returns:True if the current value of the tracked metric is lower than the best value so far
Return type:bool
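
A sketch of how the two checks plausibly combine with track_for_best; the is_new_best helper and the track_for_best attribute access are assumptions, and only the two method names come from the API above:

    def is_new_best(engine, current_value: float) -> bool:
        # "loss" is tracked downwards, everything else (e.g. macro_fscore)
        # upwards, per the track_for_best parameter documentation.
        if engine.track_for_best == "loss":
            return engine.is_best_lower(current_best=current_value)
        return engine.is_best_higher(current_best=current_value)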
load_model_from_file(filename: str)
run()

Runs the engine end to end: it trains the model for num_epochs, validating along the way, and finally runs the test dataset.

set_best_track_value(current_best=None)

Sets the best value of the metric being tracked

Parameters:current_best (float) – The current value that is best
test_epoch(epoch_num: int)

Runs the test epoch for epoch_num

Loads the best model that is saved during the training and runs the test dataset.

Parameters:epoch_num (int) – Zero-based epoch number for which the test dataset is run. This happens after the last training epoch.
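
Hypothetical usage after training has finished (test_epoch itself loads the best saved model before evaluating):

    # Zero-based epoch number; here the last of 10 training epochs.
    engine.test_epoch(epoch_num=9)
    engine.test_epoch_end(epoch_num=9)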
test_epoch_end(epoch_num: int)

Performs house-keeping at the end of the test epoch

It reports the metric that is being tracked at the end of the test epoch

Parameters:epoch_num (int) – Epoch num after which the test dataset is run
train_epoch(epoch_num: int)

Runs the training for one epoch

Parameters:epoch_num (int) – The current epoch number
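
A sketch of the clipping applied inside a training step, assuming the engine uses the standard torch.nn.utils.clip_grad_norm_ utility with the configured gradient_norm_clip_value; the names model, optimizer and loss stand in for the engine's internals:

    import torch

    def training_step(model, optimizer, loss, clip_value: float = 5.0):
        optimizer.zero_grad()
        loss.backward()
        # Rescale gradients in place so their total norm stays <= clip_value
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=clip_value)
        optimizer.step()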

train_epoch_end(epoch_num: int)

Performs house-keeping at the end of a training epoch

It reports the average loss, the average metric and other information at the end of the training epoch.

Parameters:epoch_num (int) – The current epoch number (0 based)
validation_epoch(epoch_num: int)

Runs one validation epoch on the validation dataset

Parameters:epoch_num (int) – The current epoch number (0-based)
validation_epoch_end(epoch_num: int)

Performs house-keeping at the end of validation epoch

Parameters:epoch_num (int) – The current epoch number