sciwing.engine
engine
-
class sciwing.engine.engine.Engine(model: nn.Module, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, optimizer: torch.optim, batch_size: int, save_dir: str, num_epochs: int, save_every: int, log_train_metrics_every: int, train_metric: sciwing.metrics.BaseMetric.BaseMetric, validation_metric: sciwing.metrics.BaseMetric.BaseMetric, test_metric: sciwing.metrics.BaseMetric.BaseMetric, experiment_name: Optional[str] = None, experiment_hyperparams: Optional[Dict[str, Any]] = None, tensorboard_logdir: str = None, track_for_best: str = 'loss', collate_fn=<class 'list'>, device: Union[torch.device, str] = <default not shown>, gradient_norm_clip_value: Optional[float] = 5.0, lr_scheduler: Optional[torch.optim.lr_scheduler] = None, use_wandb: bool = False, sample_proportion: float = 1.0, seeds: Dict[str, int] = None)
Bases: sciwing.utils.class_nursery.ClassNursery
-
__init__(model: nn.Module, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, optimizer: torch.optim, batch_size: int, save_dir: str, num_epochs: int, save_every: int, log_train_metrics_every: int, train_metric: sciwing.metrics.BaseMetric.BaseMetric, validation_metric: sciwing.metrics.BaseMetric.BaseMetric, test_metric: sciwing.metrics.BaseMetric.BaseMetric, experiment_name: Optional[str] = None, experiment_hyperparams: Optional[Dict[str, Any]] = None, tensorboard_logdir: str = None, track_for_best: str = 'loss', collate_fn=<class 'list'>, device: Union[torch.device, str] = <default not shown>, gradient_norm_clip_value: Optional[float] = 5.0, lr_scheduler: Optional[torch.optim.lr_scheduler] = None, use_wandb: bool = False, sample_proportion: float = 1.0, seeds: Dict[str, int] = None)
Engine runs a model end to end. It iterates over the train dataset and passes each batch through the model. During training it tracks many parameters of the run and saves them. It also periodically reports validation and test results. Many of the utilities required to run a model end to end live here.
Parameters:
- model (nn.Module) – A PyTorch module defining the model to be run.
- datasets_manager (DatasetsManager) – A datasets manager that handles all the different datasets.
- optimizer (torch.optim) – Any optimizer object instantiated using torch.optim.
- batch_size (int) – Batch size for the dataset. The same batch size is used for the train, valid and test datasets.
- save_dir (str) – The directory in which the experiment is saved. Checkpoints, the best model, logs and other information are written to save_dir.
- num_epochs (int) – The number of epochs to train for.
- save_every (int) – The model is checkpointed every save_every iterations.
- log_train_metrics_every (int) – The train metrics are reported every log_train_metrics_every iterations during training.
- train_metric (BaseMetric) – Any instance of BaseMetric, used for calculating training metrics.
- validation_metric (BaseMetric) – Any instance of BaseMetric, used for calculating validation metrics.
- test_metric (BaseMetric) – Any instance of BaseMetric, used for calculating test metrics.
- experiment_name (str) – A name for the experiment, for ease of tracking. If no experiment name is given, a unique 10-digit SHA is generated for the experiment.
- experiment_hyperparams (Dict[str, Any]) – Mostly used for tracking the hyper-parameters of the experiment being run. This may be used by wandb to save the hyper-parameters.
- tensorboard_logdir (str) – The directory where all the TensorBoard runs are stored. If None is passed, it defaults to the TensorBoard default of storing the log in the current directory.
- track_for_best (str) – The metric that is tracked to decide which model is the best. Anything that the metric emits and that is a single value can be tracked. The default value is loss. If it is loss, the best value is the lowest one. For some other metrics, like macro_fscore, the best value might be the highest one.
- collate_fn (Callable[[List[Any]], List[Any]]) – Collates the different examples into a single batch of examples. The term is the same one used in PyTorch; there is no difference.
- device (torch.device) – The device on which the model will be placed. If this is "cpu", the model and the tensors will all be on the CPU. If this is "cuda:0", the model and the tensors will be placed on CUDA device 0. Any other CUDA device that suits your environment can be given.
- gradient_norm_clip_value (float) – To avoid exploding gradients, the gradients are clipped if the gradient norm exceeds this value.
- lr_scheduler (torch.optim.lr_scheduler) – Any PyTorch lr_scheduler can be used to reduce the learning rate if the performance on the validation set degrades.
- use_wandb (bool) – wandb (Weights and Biases) is a tool used to track experiments online. Sciwing comes with built-in functionality to track experiments on Weights and Biases.
- seeds (Dict[str, int]) – The dict of seeds to be set: random_seed, pytorch_seed and numpy_seed. The approach is found in https://github.com/allenai/allennlp/blob/master/allennlp/common/util.py
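To make the gradient_norm_clip_value description concrete, here is a minimal plain-Python sketch of clipping by norm. The function name clip_by_global_norm and the flat list of gradient values are assumptions made for illustration; this page does not show how Engine performs the clipping internally. In PyTorch, the same operation is commonly done with torch.nn.utils.clip_grad_norm_.

```python
import math
from typing import List


def clip_by_global_norm(grads: List[float], max_norm: float) -> List[float]:
    """Rescale `grads` so that their L2 norm does not exceed `max_norm`.

    Hypothetical helper for illustration only; not part of sciwing.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        # Norm is within the limit: gradients are left unchanged.
        return list(grads)
    # Norm is too large: shrink every gradient by the same factor,
    # which preserves the direction of the update.
    scale = max_norm / total_norm
    return [g * scale for g in grads]


# The norm of [6.0, 8.0] is 10.0, which exceeds the limit of 5.0.
clipped = clip_by_global_norm([6.0, 8.0], max_norm=5.0)
# The norm of [0.3, 0.4] is 0.5, so nothing is changed.
untouched = clip_by_global_norm([0.3, 0.4], max_norm=5.0)
```

Scaling all gradients by one factor, rather than clamping each value separately, keeps the update direction intact while bounding its size.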
-
static get_iter(loader: DataLoader) → Iterator[T_co]
Returns the iterator for a PyTorch data loader.
The loader is a PyTorch DataLoader that iterates over the dataset in batches and employs many strategies to do so. We want an iterator that returns the dataset in batches. The end of the iterator signifies the end of an epoch, and we can use that information to perform house-keeping.
Parameters: loader (DataLoader) – A PyTorch data loader
Returns: An iterator over the data loader
Return type: Iterator
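The idea that an exhausted iterator marks the end of an epoch can be sketched without PyTorch. In the example below, a plain list of batches stands in for a DataLoader, and the loop structure is an assumption for illustration rather than the actual body of Engine.

```python
from typing import Iterable, Iterator, List


def get_iter(loader: Iterable[List[int]]) -> Iterator[List[int]]:
    """Return an iterator over `loader` (a stand-in for a DataLoader)."""
    return iter(loader)


# Hypothetical stand-in for a DataLoader: three batches of examples.
batches = [[1, 2], [3, 4], [5]]

num_epochs = 2
batches_seen = []  # number of batches consumed in each epoch
for epoch in range(num_epochs):
    batch_iter = get_iter(batches)  # fresh iterator for every epoch
    count = 0
    while True:
        try:
            batch = next(batch_iter)
        except StopIteration:
            # Iterator exhausted: the epoch is over.
            # End-of-epoch house-keeping would go here.
            break
        count += 1
    batches_seen.append(count)
```

A new iterator is requested at the start of each epoch because an exhausted iterator cannot be rewound.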
-
get_loader(dataset: Dataset) → DataLoader
Returns the DataLoader for the Dataset.
Parameters: dataset (Dataset) – The dataset for which the DataLoader is returned
Returns: A PyTorch DataLoader
Return type: DataLoader
-
get_test_dataset()
Returns the test dataset of the experiment.
Returns: Anything that conforms to the PyTorch-style dataset
Return type: Dataset
-
get_train_dataset()
Returns the train dataset of the experiment.
Returns: Anything that conforms to the PyTorch-style dataset
Return type: Dataset
-
get_validation_dataset()
Returns the validation dataset of the experiment.
Returns: Anything that conforms to the PyTorch-style dataset
Return type: Dataset
-
is_best_higher(current_best=None)
Returns True if the current value of the metric is higher than the best value so far. This is useful for tracking metrics like F-score, where a higher value is better.
Parameters: current_best (float) – The current value for the metric that is being tracked
Return type: bool
-
is_best_lower(current_best=None)
Returns True if the current value of the metric is lower than the best value so far. This is useful for tracking metrics like loss, where a lower value is better.
Parameters: current_best (float) – The current value for the metric that is being tracked
Return type: bool
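The two comparison helpers can be pictured with a small stand-alone tracker. This is only a sketch: the BestTracker class, its update method, and the rule that every metric other than loss counts as "higher is better" are assumptions for illustration, not sciwing's implementation.

```python
from typing import Optional


class BestTracker:
    """Hypothetical helper that remembers the best value of one metric."""

    def __init__(self, track_for_best: str = "loss") -> None:
        self.track_for_best = track_for_best
        self.best: Optional[float] = None  # no value recorded yet

    def is_best_lower(self, current: float) -> bool:
        # Lower is better (for example, loss).
        return self.best is None or current < self.best

    def is_best_higher(self, current: float) -> bool:
        # Higher is better (for example, macro_fscore).
        return self.best is None or current > self.best

    def update(self, current: float) -> bool:
        """Record `current` if it improves on the best value so far."""
        # Assumption: only "loss" is treated as lower-is-better.
        if self.track_for_best == "loss":
            improved = self.is_best_lower(current)
        else:
            improved = self.is_best_higher(current)
        if improved:
            self.best = current
        return improved


loss_tracker = BestTracker("loss")
loss_updates = [loss_tracker.update(v) for v in [0.9, 0.7, 0.8]]

fscore_tracker = BestTracker("macro_fscore")
fscore_updates = [fscore_tracker.update(v) for v in [0.5, 0.4, 0.6]]
```

Each True in the returned lists marks a point at which a training loop would typically save the model as the new best.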
-
load_model_from_file(filename: str)
-
run()
Runs the engine.
-
set_best_track_value(current_best=None)
Sets the best value of the metric being tracked.
Parameters: current_best (float) – The current value that is best
-
test_epoch(epoch_num: int)
Runs the test epoch for epoch_num.
Loads the best model saved during training and runs it on the test dataset.
Parameters: epoch_num (int) – Zero-based epoch number for which the test dataset is run. This is after the last training epoch.
-
test_epoch_end(epoch_num: int)
Performs house-keeping at the end of the test epoch.
It reports the metric that is being tracked at the end of the test epoch.
Parameters: epoch_num (int) – Epoch number after which the test dataset is run
-
train_epoch(epoch_num: int)
Runs the training for one epoch.
Parameters: epoch_num (int) – The current epoch number
-
train_epoch_end(epoch_num: int)
Performs house-keeping at the end of a training epoch.
It reports the average loss, the average metric and other information.
Parameters: epoch_num (int) – The current epoch number (0-based)
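As a rough illustration of the end-of-epoch reporting described above, the sketch below accumulates per-batch losses and averages them when the epoch ends. EpochLossMeter is a hypothetical name, and the unweighted mean over batches is an assumption; this page does not say how Engine computes its average.

```python
from typing import List


class EpochLossMeter:
    """Hypothetical helper that averages per-batch losses over one epoch."""

    def __init__(self) -> None:
        self.losses: List[float] = []

    def add(self, loss: float) -> None:
        # Called once per training batch.
        self.losses.append(loss)

    def average(self) -> float:
        # Unweighted mean over the batches seen so far in this epoch.
        if not self.losses:
            raise ValueError("no batches were recorded in this epoch")
        return sum(self.losses) / len(self.losses)

    def reset(self) -> None:
        # Clear the state so that the next epoch starts fresh.
        self.losses.clear()


meter = EpochLossMeter()
for batch_loss in [1.0, 0.5, 0.75]:
    meter.add(batch_loss)

epoch_average = meter.average()  # value reported at the end of the epoch
meter.reset()
```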
-
validation_epoch(epoch_num: int)
Runs one validation epoch on the validation dataset.
Parameters: epoch_num (int) – Epoch number (0-based)
-
validation_epoch_end(epoch_num: int)
Performs house-keeping at the end of the validation epoch.
Parameters: epoch_num (int) – The current epoch number
-