sciwing.infer.seq_label_inference

BaseSeqLabelInference

class sciwing.infer.seq_label_inference.BaseSeqLabelInference.BaseSeqLabelInference(model: torch.nn.Module, model_filepath: str, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, device: Union[str, torch.device, None] = ...)

Bases: object

Abstract base class for sequence labeling inference. BaseSeqLabelInference provides a skeleton for concrete classes that perform inference for a sequence labeling task.
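The skeleton-plus-concrete-class arrangement described above can be illustrated with a minimal, self-contained sketch. Nothing here is SciWING code: InferenceSkeleton, UppercaseTagger, the "predicted_tags" key and the space-joined tag strings are all hypothetical stand-ins chosen for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class InferenceSkeleton(ABC):
    """Hypothetical stand-in for an abstract inference base class."""

    def __init__(self, model: Any, model_filepath: str):
        self.model = model
        self.model_filepath = model_filepath

    @abstractmethod
    def run_inference(self) -> Dict[str, Any]:
        """Run the model over the test data and collect the outputs."""

    @abstractmethod
    def infer_batch(self, lines: List[str]) -> Dict[str, List[str]]:
        """Return one tag string per input line, keyed by a namespace."""


class UppercaseTagger(InferenceSkeleton):
    """Toy subclass: tags a token 'CAP' if it is capitalized, else 'O'."""

    def infer_batch(self, lines: List[str]) -> Dict[str, List[str]]:
        tags = [
            " ".join("CAP" if tok[:1].isupper() else "O" for tok in line.split())
            for line in lines
        ]
        # "predicted_tags" is an assumed key, not the library's naming scheme.
        return {"predicted_tags": tags}

    def run_inference(self) -> Dict[str, Any]:
        test_lines = ["Alice visited Paris", "it rained"]
        return {"lines": test_lines, **self.infer_batch(test_lines)}
```

The base class cannot be instantiated directly; only a subclass that implements every abstract method can.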

__init__(model: torch.nn.Module, model_filepath: str, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, device: Union[str, torch.device, None] = ...)
Parameters:
  • model (nn.Module) – A PyTorch module
  • model_filepath (str) – The path where the parameters of the best model are stored. This is usually best_model.pt in an experiment directory
  • datasets_manager (DatasetsManager) – Any dataset that conforms to the PyTorch Dataset specification
  • device (Optional[Union[str, torch.device]]) – Either a string such as cpu or cuda:0, or a torch.device object
get_misclassified_sentences(true_label_idx: int, pred_label_idx: int) → List[str]
get_true_label_indices_names(labels: List[sciwing.data.seq_label.SeqLabel]) -> (typing.Dict[str, typing.List[int]], typing.Dict[str, typing.List[str]])

Given a list of labels, returns the indices and the names of the labels.

Parameters:labels (List[SeqLabel]) – The labels returned by a dataset
Returns:A mapping between a label namespace and a list of integers that represent the true classes, and a mapping between a label namespace and a list of strings that represent the true classes
Return type:(Dict[str, List[int]], Dict[str, List[str]])
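The shape of the returned pair can be sketched as follows. This is an illustration under assumptions, not the library's implementation: plain dictionaries stand in for SeqLabel objects, and LABEL_TO_IDX is a hypothetical per-namespace vocabulary.

```python
from typing import Dict, List, Tuple

# Hypothetical label vocabularies, one per namespace.
LABEL_TO_IDX: Dict[str, Dict[str, int]] = {
    "NER": {"O": 0, "B-PER": 1, "I-PER": 2},
    "POS": {"NOUN": 0, "VERB": 1},
}


def true_indices_and_names(
    labels: List[Dict[str, List[str]]],
) -> Tuple[Dict[str, List[int]], Dict[str, List[str]]]:
    """Collect, per namespace, the indices and names of all true tags."""
    indices: Dict[str, List[int]] = {ns: [] for ns in LABEL_TO_IDX}
    names: Dict[str, List[str]] = {ns: [] for ns in LABEL_TO_IDX}
    for label in labels:  # each dict stands in for one SeqLabel
        for namespace, tags in label.items():
            for tag in tags:
                indices[namespace].append(LABEL_TO_IDX[namespace][tag])
                names[namespace].append(tag)
    return indices, names
```

Both dictionaries share the same namespace keys, so the index list and the name list for a namespace stay aligned position by position.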
infer_batch(lines: Union[List[sciwing.data.line.Line], List[str]]) → Dict[str, List[str]]
load_model()

Loads the best_model from the model_filepath.

model_forward_on_lines(lines: List[sciwing.data.line.Line])

Performs the model forward pass on the given lines.

Parameters:lines (List[Line]) – The lines to run through the model
model_output_dict_to_prediction_indices_names(model_output_dict: Dict[str, Any]) -> (typing.List[int], typing.List[str])

Given a model_output_dict, returns the predicted class indices and names.

Parameters:model_output_dict (Dict[str, Any]) – output dictionary from a model
Returns:A list of integers that represent the predicted classes, and a list of strings that represent the predicted classes
Return type:(List[int], List[str])
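One common way to turn per-token scores into indices and names is an argmax followed by a vocabulary lookup. The sketch below assumes this approach; the "scores" key and the IDX_TO_TAG vocabulary are hypothetical, and the actual contents of model_output_dict depend on the model.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical tag vocabulary.
IDX_TO_TAG = {0: "O", 1: "B-PER", 2: "I-PER"}


def output_to_indices_and_names(
    model_output_dict: Dict[str, Any],
) -> Tuple[List[int], List[str]]:
    """Pick the highest-scoring tag for every token."""
    # "scores" is an assumed key: one list of per-tag scores per token.
    scores: List[List[float]] = model_output_dict["scores"]
    indices = [max(range(len(row)), key=row.__getitem__) for row in scores]
    names = [IDX_TO_TAG[i] for i in indices]
    return indices, names
```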
on_user_input(line: Union[sciwing.data.line.Line, str]) → Dict[str, List[str]]
print_confusion_matrix()
report_metrics()

Reports the metrics for the dataset.

run_inference()

Should run inference on the test dataset.

This method should run the model through the test dataset. It should perform inference and collect the metrics and data that are necessary for further use.

Returns:The metrics and data collected during inference
Return type:Dict[str, Any]
run_test()

CONLL Inference

class sciwing.infer.seq_label_inference.conll_inference.Conll2003Inference(model: torch.nn.Module, model_filepath: str, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, device: Union[str, torch.device, None] = ..., predicted_tags_namespace_prefix: str = 'predicted_tags')

Bases: sciwing.infer.seq_label_inference.seq_label_inference.SequenceLabellingInference

generate_predictions_for(task: str, test_filename: str, output_filename: str)
Parameters:
  • task (str) – One of pos, dep or ner. The task for which predictions are made using the current model
  • test_filename (str) – This is the eng.testb of the CoNLL 2003 dataset
  • output_filename (str) – The file where you want to store predictions
Returns:None. Writes the predictions to output_filename. The output file is meant to be used with the conlleval.perl script (./conlleval < output_filename), which expects the correct tag and the predicted tag in the last two columns, in that order. The first column is the token for which the prediction is made.
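The column layout described above can be sketched with a small helper. This is an illustration only: format_conlleval_lines is a hypothetical function, not part of SciWING, and it assumes space-separated columns.

```python
from typing import List


def format_conlleval_lines(
    tokens: List[str], true_tags: List[str], pred_tags: List[str]
) -> List[str]:
    """Build one 'token true_tag predicted_tag' line per token."""
    if not (len(tokens) == len(true_tags) == len(pred_tags)):
        raise ValueError("tokens and tags must have the same length")
    # Token first; correct tag and predicted tag in the last two columns.
    return [
        f"{token} {gold} {pred}"
        for token, gold, pred in zip(tokens, true_tags, pred_tags)
    ]


lines = format_conlleval_lines(
    ["John", "lives", "here"],
    ["B-PER", "O", "O"],
    ["B-PER", "O", "B-LOC"],
)
print("\n".join(lines))
```

Writing these lines to a file, with a blank line between sentences, gives input in the layout the evaluation script reads.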

SeqLabel Inference

class sciwing.infer.seq_label_inference.seq_label_inference.SequenceLabellingInference(model: torch.nn.Module, model_filepath: str, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, device: Union[str, torch.device, None] = ..., predicted_tags_namespace_prefix: str = 'predicted_tags')

Bases: sciwing.infer.seq_label_inference.BaseSeqLabelInference.BaseSeqLabelInference

generate_scienceie_prediction_folder(dev_folder: pathlib.Path, pred_folder: pathlib.Path)

Generates the predicted folder for the dataset in the test folder for ScienceIE. This is very specific to ScienceIE and is not meant to be used with other tasks.

ScienceIE is a SemEval Task that needs the files to be written into a folder and it reports metrics by reading files from that folder. This method generates the predicted folder given the dev folder

Parameters:
  • dev_folder (pathlib.Path) – The path where the dev files are present
  • pred_folder (pathlib.Path) – The path where the predicted files will be written
get_misclassified_sentences(true_label_idx: int, pred_label_idx: int)
get_true_label_indices_names(labels: List[sciwing.data.seq_label.SeqLabel]) -> (typing.Dict[str, typing.List[int]], typing.Dict[str, typing.List[str]])

Given a list of labels, returns the indices and the names of the labels.

Parameters:labels (List[SeqLabel]) – The labels returned by a dataset
Returns:A mapping between a label namespace and a list of integers that represent the true classes, and a mapping between a label namespace and a list of strings that represent the true classes
Return type:(Dict[str, List[int]], Dict[str, List[str]])
infer_batch(lines: Union[List[sciwing.data.line.Line], List[str]]) → Dict[str, List[str]]
model_forward_on_lines(lines: List[sciwing.data.line.Line])

Performs the model forward pass on the given lines.

Parameters:lines (List[Line]) – The lines to run through the model
model_output_dict_to_prediction_indices_names(model_output_dict: Dict[str, Any]) -> (typing.Dict[str, typing.List[int]], typing.Dict[str, typing.List[str]])

Given a model_output_dict, returns the predicted class indices and names.

Parameters:model_output_dict (Dict[str, Any]) – output dictionary from a model
Returns:A mapping between a label namespace and a list of integers that represent the predicted classes, and a mapping between a label namespace and a list of strings that represent the predicted classes
Return type:(Dict[str, List[int]], Dict[str, List[str]])
on_user_input(line: Union[sciwing.data.line.Line, str]) → Dict[str, List[str]]
print_confusion_matrix()

Prints the confusion matrix for the entire dataset.

Return type:None
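For readers unfamiliar with the term, a confusion matrix over tag indices can be computed as in the sketch below. It is a generic illustration, not the library's implementation; the convention that rows are true classes and columns are predicted classes is an assumption.

```python
from collections import Counter
from typing import List


def confusion_matrix(
    true_idx: List[int], pred_idx: List[int], num_classes: int
) -> List[List[int]]:
    """Count (true, predicted) pairs; rows are true, columns are predicted."""
    counts = Counter(zip(true_idx, pred_idx))
    return [
        [counts[(t, p)] for p in range(num_classes)]
        for t in range(num_classes)
    ]


matrix = confusion_matrix([0, 0, 1, 2], [0, 1, 1, 2], num_classes=3)
for row in matrix:
    print(row)
```

An off-diagonal cell (t, p) counts tokens whose true class is t but were predicted as p, which is the kind of pair that get_misclassified_sentences takes as true_label_idx and pred_label_idx.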

report_metrics()

Reports the metrics for the dataset.

run_inference()

Should run inference on the test dataset.

This method should run the model through the test dataset. It should perform inference and collect the metrics and data that are necessary for further use.

Returns:The metrics and data collected during inference
Return type:Dict[str, Any]
run_test()