sciwing.infer.seq_label_inference

BaseSeqLabelInference

class sciwing.infer.seq_label_inference.BaseSeqLabelInference.BaseSeqLabelInference(model: nn.Module, model_filepath: str, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, device: Union[str, torch.device, None] = ...)

Bases: object

Abstract base class for sequence labeling inference. BaseSeqLabelInference provides a skeleton for concrete classes that perform inference for a sequence labeling task.
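
The skeleton idea can be illustrated with a minimal, self-contained sketch. It does not use SciWING itself: `SketchSeqLabelInference` and `UppercaseTagger` are hypothetical stand-ins, the `predicted_tags_ner` namespace is an assumed combination of the documented `predicted_tags` prefix with an invented `ner` suffix, and routing `on_user_input` through `infer_batch` is an assumption about how such a base class might be organised.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class SketchSeqLabelInference(ABC):
    """Hypothetical stand-in mirroring some method names documented here."""

    def __init__(self, model: Any, model_filepath: str) -> None:
        self.model = model
        self.model_filepath = model_filepath

    @abstractmethod
    def run_inference(self) -> Dict[str, Any]:
        """Run the model over the test data and collect outputs."""

    @abstractmethod
    def infer_batch(self, lines: List[str]) -> Dict[str, List[str]]:
        """Return the predicted tags per namespace for a batch of lines."""

    def on_user_input(self, line: str) -> Dict[str, List[str]]:
        # Assumption: a single line is treated as a batch of size one.
        return self.infer_batch([line])


class UppercaseTagger(SketchSeqLabelInference):
    """Toy subclass: capitalised tokens are tagged ENT, all others O."""

    def infer_batch(self, lines: List[str]) -> Dict[str, List[str]]:
        tagged = []
        for line in lines:
            tags = ["ENT" if tok[:1].isupper() else "O" for tok in line.split()]
            tagged.append(" ".join(tags))
        # "predicted_tags_ner" is a hypothetical namespace name.
        return {"predicted_tags_ner": tagged}

    def run_inference(self) -> Dict[str, Any]:
        return {"predictions": self.infer_batch(["Paris is nice"])}
```

A concrete class only has to fill in the abstract methods; the shared entry points defined on the base class then work unchanged.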

__init__(model: nn.Module, model_filepath: str, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, device: Union[str, torch.device, None] = ...)

Parameters:

model (nn.Module) – A pytorch module
model_filepath (str) – The path where the parameters of the best model are stored. This is usually best_model.pt inside an experiment directory
datasets_manager (DatasetsManager) – Any dataset that conforms to the pytorch Dataset specification
device (Optional[Union[str, torch.device]]) – Either a string such as cpu or cuda:0, or a torch.device object

get_misclassified_sentences(true_label_idx: int, pred_label_idx: int) → List[str]

get_true_label_indices_names(labels: List[sciwing.data.seq_label.SeqLabel]) → (Dict[str, List[int]], Dict[str, List[str]])

Given a list of labels, returns the indices and the names of the labels

Parameters:	labels (List[SeqLabel]) – The true labels to convert
Returns:

A mapping between a label namespace and a list of integers that represent the true class
A mapping between a label namespace and a list of strings that represent the true class

Return type:	(Dict[str, List[int]], Dict[str, List[str]])
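
The shape of the two returned mappings can be illustrated with a small standalone sketch. It does not use SciWING classes: plain dictionaries stand in for `SeqLabel` objects, `LABEL2IDX` is a hypothetical vocabulary, and flattening all sentences into one list per namespace is an assumption made to match the documented return type.

```python
from typing import Dict, List, Tuple

# Hypothetical label vocabulary, one mapping per namespace.
LABEL2IDX: Dict[str, Dict[str, int]] = {"ner": {"O": 0, "B-PER": 1, "I-PER": 2}}


def true_label_indices_names(
    labels: List[Dict[str, List[str]]],
) -> Tuple[Dict[str, List[int]], Dict[str, List[str]]]:
    """Collect true label indices and names per namespace."""
    indices: Dict[str, List[int]] = {}
    names: Dict[str, List[str]] = {}
    for label in labels:
        for namespace, tags in label.items():
            vocab = LABEL2IDX[namespace]
            # Assumption: sentences are flattened into one list per namespace.
            indices.setdefault(namespace, []).extend(vocab[tag] for tag in tags)
            names.setdefault(namespace, []).extend(tags)
    return indices, names
```

For example, two sentences labelled `["B-PER", "I-PER", "O"]` and `["O"]` in the `ner` namespace give one list of four indices and one list of four names under the key `ner`.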

infer_batch(lines: Union[List[sciwing.data.line.Line], List[str]]) → Dict[str, List[str]]

load_model()

Loads the best model from model_filepath.

model_forward_on_lines(lines: List[sciwing.data.line.Line])

Perform the model forward pass on a list of lines

Parameters:	lines (List[Line]) – The lines to pass through the model

model_output_dict_to_prediction_indices_names(model_output_dict: Dict[str, Any]) → (List[int], List[str])

Given a model_output_dict, returns the predicted class indices and names

Parameters:	model_output_dict (Dict[str, Any]) – output dictionary from a model
Returns:

List of integers that represent the predicted class
List of strings that represent the predicted class

Return type:	(List[int], List[str])
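
One common way to turn per-token scores into predictions is to take the highest-scoring class for each token. The sketch below illustrates that idea only. The `logits` key, the `IDX2LABEL` vocabulary and the function name are hypothetical, and a concrete SciWING model may decode differently (for example with a CRF).

```python
from typing import Any, Dict, List, Tuple

# Hypothetical index-to-label vocabulary.
IDX2LABEL: Dict[int, str] = {0: "O", 1: "B-PER", 2: "I-PER"}


def output_to_prediction_indices_names(
    model_output_dict: Dict[str, Any], scores_key: str = "logits"
) -> Tuple[List[int], List[str]]:
    """Pick the highest-scoring class for every token (argmax decoding)."""
    indices: List[int] = []
    for scores in model_output_dict[scores_key]:
        # Index of the largest score among the classes for this token.
        best = max(range(len(scores)), key=scores.__getitem__)
        indices.append(best)
    names = [IDX2LABEL[i] for i in indices]
    return indices, names
```

Here `model_output_dict[scores_key]` is assumed to hold one list of class scores per token.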

on_user_input(line: Union[sciwing.data.line.Line, str]) → Dict[str, List[str]]

print_confusion_matrix()

report_metrics()

Reports the metrics for the dataset

run_inference()

Should run inference on the test dataset

This method should run the model through the test dataset. It should perform inference and collect the metrics and data that are necessary for further use.

Returns:	The metrics and data collected during inference
Return type:	Dict[str, Any]

run_test()

CONLL Inference

class sciwing.infer.seq_label_inference.conll_inference.Conll2003Inference(model: nn.Module, model_filepath: str, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, device: Union[str, torch.device, None] = ..., predicted_tags_namespace_prefix: str = 'predicted_tags')

Bases: sciwing.infer.seq_label_inference.seq_label_inference.SequenceLabellingInference

generate_predictions_for(task: str, test_filename: str, output_filename: str)

Parameters:

task (str) – One of pos, dep or ner. The task for which predictions are made using the current model
test_filename (str) – The eng.testb file of the CoNLL 2003 dataset
output_filename (str) – The file where you want to store predictions

Returns:

None – Writes the predictions to output_filename.

The output file is meant to be used with the conlleval.perl script:

./conlleval < output_filename

The script expects the correct tag and the predicted tag in the last two columns, in that order. The first column is the token for which the prediction is made.
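
The column layout described above can be sketched as follows. This is an illustration of the file format only, not the implementation of generate_predictions_for: the function name and the input structure are hypothetical, and space-separated columns with a blank line between sentences are assumed, as in the usual conlleval input.

```python
from typing import List, Sequence, Tuple

Sentence = Tuple[Sequence[str], Sequence[str], Sequence[str]]


def format_conlleval_lines(sentences: List[Sentence]) -> str:
    """Render (tokens, true_tags, predicted_tags) triples as conlleval input."""
    lines: List[str] = []
    for tokens, true_tags, pred_tags in sentences:
        for token, true_tag, pred_tag in zip(tokens, true_tags, pred_tags):
            # Token first; correct tag and predicted tag are the last two columns.
            lines.append(f"{token} {true_tag} {pred_tag}")
        lines.append("")  # a blank line separates sentences
    return "\n".join(lines)


if __name__ == "__main__":
    example = [(["EU", "rejects"], ["B-ORG", "O"], ["B-ORG", "O"])]
    print(format_conlleval_lines(example))
```

Writing the returned string to output_filename gives a file that can be piped to the evaluation script.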

SeqLabel Inference

class sciwing.infer.seq_label_inference.seq_label_inference.SequenceLabellingInference(model: nn.Module, model_filepath: str, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, device: Union[str, torch.device, None] = ..., predicted_tags_namespace_prefix: str = 'predicted_tags')

Bases: sciwing.infer.seq_label_inference.BaseSeqLabelInference.BaseSeqLabelInference

generate_scienceie_prediction_folder(dev_folder: pathlib.Path, pred_folder: pathlib.Path)

Generates the predicted folder for the dataset in the test folder for ScienceIE. This is very specific to ScienceIE and is not meant to be used with other tasks.

ScienceIE is a SemEval task that needs the files to be written into a folder, and it reports metrics by reading files from that folder. This method generates the predicted folder given the dev folder.

Parameters:

dev_folder (pathlib.Path) – The path where the dev files are present
pred_folder (pathlib.Path) – The path where the predicted files will be written

get_misclassified_sentences(true_label_idx: int, pred_label_idx: int)