sciwing.infer.seq_label_inference¶
BaseSeqLabelInference¶
-
class
sciwing.infer.seq_label_inference.BaseSeqLabelInference.
BaseSeqLabelInference
(model: torch.nn.Module, model_filepath: str, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, device: Union[str, torch.device, None] = ...)¶ Bases:
object
Abstract base class for sequence labeling inference. BaseSeqLabelInference provides a skeleton for concrete classes that perform inference for a sequence labeling task.
-
__init__
(model: torch.nn.Module, model_filepath: str, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, device: Union[str, torch.device, None] = ...)¶
Parameters:
- model (nn.Module) – A PyTorch module
- model_filepath (str) – The path where the parameters of the best model are stored. This is usually the best_model.pt file in an experiment directory
- datasets_manager (DatasetsManager) – Any dataset that conforms to the PyTorch Dataset specification
- device (Optional[Union[str, torch.device]]) – Either a string like cpu or cuda:0, or a torch.device object
-
get_misclassified_sentences
(true_label_idx: int, pred_label_idx: int) → List[str]¶
-
get_true_label_indices_names
(labels: List[sciwing.data.seq_label.SeqLabel]) -> (typing.Dict[str, typing.List[int]], typing.Dict[str, typing.List[str]])¶ Given a list of labels, returns the indices and the names of the labels.
Parameters: labels (List[SeqLabel]) – The labels returned by a dataset
Returns:
- A mapping between a label namespace and a list of integers that represent the true classes
- A mapping between a label namespace and a list of strings that represent the true classes
Return type: (Dict[str, List[int]], Dict[str, List[str]])
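The two returned mappings can be illustrated with a minimal sketch. It does not use SciWING itself: `ToyLabel`, `vocab` and `true_indices_and_names` are hypothetical stand-ins, and flattening the tags of all labels into one list per namespace is an assumption made to match the documented return type.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ToyLabel:
    # Hypothetical stand-in for SeqLabel: tag names keyed by label namespace.
    tags: Dict[str, List[str]]


def true_indices_and_names(
    labels: List[ToyLabel], vocab: Dict[str, Dict[str, int]]
) -> Tuple[Dict[str, List[int]], Dict[str, List[str]]]:
    """Collect, per namespace, the tag names and their vocabulary indices."""
    indices: Dict[str, List[int]] = {}
    names: Dict[str, List[str]] = {}
    for label in labels:
        for namespace, tag_names in label.tags.items():
            names.setdefault(namespace, []).extend(tag_names)
            indices.setdefault(namespace, []).extend(
                vocab[namespace][tag] for tag in tag_names
            )
    return indices, names


# Hypothetical vocabulary: one name-to-index table per namespace.
vocab = {"ner": {"O": 0, "B-PER": 1, "I-PER": 2}}
labels = [ToyLabel({"ner": ["B-PER", "I-PER", "O"]}), ToyLabel({"ner": ["O"]})]
true_indices, true_names = true_indices_and_names(labels, vocab)
```

Keying both results by namespace keeps the index view and the name view aligned when a dataset carries more than one kind of tag.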
-
infer_batch
(lines: Union[List[sciwing.data.line.Line], List[str]]) → Dict[str, List[str]]¶
-
load_model
()¶ Loads the best_model from the model_filepath.
-
model_forward_on_lines
(lines: List[sciwing.data.line.Line])¶ Performs the model forward pass on a list of lines.
Parameters: lines (List[Line]) – The lines returned by a dataset
-
model_output_dict_to_prediction_indices_names
(model_output_dict: Dict[str, Any]) -> (typing.List[int], typing.List[str])¶ Given a model_output_dict, returns the predicted class indices and names.
Parameters: model_output_dict (Dict[str, Any]) – Output dictionary from a model
Returns:
- List of integers that represent the predicted classes
- List of strings that represent the predicted classes
Return type: (List[int], List[str])
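As a rough illustration of turning a model output into indices and names, the sketch below takes the highest-scoring class per token. The `"scores"` key, the `idx2name` table and the function name are assumptions for this example; the actual contents of `model_output_dict` depend on the model.

```python
from typing import Any, Dict, List, Tuple


def output_to_indices_and_names(
    model_output_dict: Dict[str, Any], idx2name: Dict[int, str]
) -> Tuple[List[int], List[str]]:
    """Pick the highest-scoring class for each token."""
    # "scores" is a hypothetical key: one row of per-class scores per token.
    scores: List[List[float]] = model_output_dict["scores"]
    indices = [max(range(len(row)), key=row.__getitem__) for row in scores]
    names = [idx2name[i] for i in indices]
    return indices, names


idx2name = {0: "O", 1: "B-PER", 2: "I-PER"}
output = {"scores": [[0.1, 0.8, 0.1], [0.2, 0.1, 0.7], [0.9, 0.05, 0.05]]}
pred_indices, pred_names = output_to_indices_and_names(output, idx2name)
```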
-
on_user_input
(line: Union[sciwing.data.line.Line, str]) → Dict[str, List[str]]¶
-
print_confusion_matrix
()¶
-
report_metrics
()¶ Reports the metrics for the dataset.
-
run_inference
()¶ Should run inference on the test dataset.
This method should run the model through the test dataset. It should perform inference and collect the metrics and data needed for further use.
Returns: The metrics and data collected during inference
Return type: Dict[str, Any]
-
run_test
()¶
-
CONLL Inference¶
-
class
sciwing.infer.seq_label_inference.conll_inference.
Conll2003Inference
(model: torch.nn.Module, model_filepath: str, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, device: Union[str, torch.device, None] = ..., predicted_tags_namespace_prefix: str = 'predicted_tags')¶ Bases:
sciwing.infer.seq_label_inference.seq_label_inference.SequenceLabellingInference
-
generate_predictions_for
(task: str, test_filename: str, output_filename: str)¶
Parameters:
- task (str) – The task for which predictions are made using the current model. Can be one of pos, dep or ner
- test_filename (str) – This is the eng.testb file of the CoNLL 2003 dataset
- output_filename (str) – The file where you want to store predictions
Returns: None. Writes the predictions to output_filename. The output file is meant to be used with the conlleval.perl script (./conlleval < output_filename). The script expects the correct tag and the predicted tag in the last two columns, in that order. The first column is the token for which the prediction is made.
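The column layout described above can be sketched as follows. This is not the library's implementation: `write_conlleval_file` is a hypothetical helper, and the space delimiter and the blank line between sentences are assumptions based on the usual conlleval input conventions.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple

# One sentence is a list of (token, true tag, predicted tag) triples.
Sentence = List[Tuple[str, str, str]]


def write_conlleval_file(sentences: List[Sentence], output_filename: str) -> None:
    """Write token first, then the true and predicted tags as the last two columns."""
    lines: List[str] = []
    for sentence in sentences:
        for token, true_tag, pred_tag in sentence:
            lines.append(f"{token} {true_tag} {pred_tag}")
        lines.append("")  # blank line marks a sentence boundary
    Path(output_filename).write_text("\n".join(lines), encoding="utf-8")


sentences = [
    [("John", "B-PER", "B-PER"), ("runs", "O", "O")],
    [("Paris", "B-LOC", "O")],
]
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / "predictions.txt"
    write_conlleval_file(sentences, str(out_path))
    written = out_path.read_text(encoding="utf-8")
```

A file written this way could then be passed to the evaluation script as shown in the description, for example `./conlleval < predictions.txt`.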
-
SeqLabel Inference¶
-
class
sciwing.infer.seq_label_inference.seq_label_inference.
SequenceLabellingInference
(model: torch.nn.Module, model_filepath: str, datasets_manager: sciwing.data.datasets_manager.DatasetsManager, device: Union[str, torch.device, None] = ..., predicted_tags_namespace_prefix: str = 'predicted_tags')¶ Bases:
sciwing.infer.seq_label_inference.BaseSeqLabelInference.BaseSeqLabelInference
-
generate_scienceie_prediction_folder
(dev_folder: pathlib.Path, pred_folder: pathlib.Path)¶ Generates the predicted folder for the dataset in the test folder for ScienceIE. This is very specific to ScienceIE and is not meant to be used with other tasks.
ScienceIE is a SemEval task that needs the files to be written into a folder, and it reports metrics by reading files from that folder. This method generates the predicted folder given the dev folder.
Parameters: - dev_folder (pathlib.Path) – The path where the dev files are present
- pred_folder (pathlib.Path) – The path where the predicted files will be written
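The folder-to-folder pattern can be sketched with pathlib alone. Everything here is an assumption for illustration: the helper name, the `predict` callable, and the choice of reading `.txt` files and writing same-named `.ann` files; the real method runs the trained model and decides the file contents.

```python
import tempfile
from pathlib import Path
from typing import Callable


def write_prediction_folder(
    dev_folder: Path, pred_folder: Path, predict: Callable[[str], str]
) -> None:
    """Write one prediction file into pred_folder for each text file in dev_folder."""
    pred_folder.mkdir(parents=True, exist_ok=True)
    for text_file in sorted(dev_folder.glob("*.txt")):
        annotation = predict(text_file.read_text(encoding="utf-8"))
        (pred_folder / f"{text_file.stem}.ann").write_text(annotation, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp_dir:
    dev = Path(tmp_dir) / "dev"
    pred = Path(tmp_dir) / "pred"
    dev.mkdir()
    (dev / "doc1.txt").write_text("Graphene is a material.", encoding="utf-8")
    # Placeholder predictor that returns an empty annotation.
    write_prediction_folder(dev, pred, lambda text: "")
    written_files = sorted(p.name for p in pred.iterdir())
```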
-
get_misclassified_sentences
(true_label_idx: int, pred_label_idx: int)¶
-
get_true_label_indices_names
(labels: List[sciwing.data.seq_label.SeqLabel]) -> (typing.Dict[str, typing.List[int]], typing.Dict[str, typing.List[str]])¶ Given a list of labels, returns the indices and the names of the labels.
Parameters: labels (List[SeqLabel]) – The labels returned by a dataset
Returns:
- A mapping between a label namespace and a list of integers that represent the true classes
- A mapping between a label namespace and a list of strings that represent the true classes
Return type: (Dict[str, List[int]], Dict[str, List[str]])
-
infer_batch
(lines: Union[List[sciwing.data.line.Line], List[str]]) → Dict[str, List[str]]¶
-
model_forward_on_lines
(lines: List[sciwing.data.line.Line])¶ Performs the model forward pass on a list of lines.
Parameters: lines (List[Line]) – The lines returned by a dataset
-
model_output_dict_to_prediction_indices_names
(model_output_dict: Dict[str, Any]) -> (typing.Dict[str, typing.List[int]], typing.Dict[str, typing.List[str]])¶ Given a model_output_dict, returns the predicted class indices and names.
Parameters: model_output_dict (Dict[str, Any]) – Output dictionary from a model
Returns:
- A mapping between a label namespace and a list of integers that represent the predicted classes
- A mapping between a label namespace and a list of strings that represent the predicted classes
Return type: (Dict[str, List[int]], Dict[str, List[str]])
-
on_user_input
(line: Union[sciwing.data.line.Line, str]) → Dict[str, List[str]]¶
-
print_confusion_matrix
()¶ Prints the confusion matrix for the entire dataset.
Return type: None
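For readers unfamiliar with the layout, a confusion matrix over tag indices can be built and printed as in the sketch below. It is a generic illustration, not the library's printing code; `confusion_matrix` and the sample tags are hypothetical. Rows are true classes and columns are predicted classes.

```python
from typing import List


def confusion_matrix(
    true_idx: List[int], pred_idx: List[int], num_classes: int
) -> List[List[int]]:
    """Count how often each true class (row) was predicted as each class (column)."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_idx, pred_idx):
        matrix[t][p] += 1
    return matrix


names = ["O", "B-PER", "I-PER"]
true_idx = [1, 2, 0, 0, 1]
pred_idx = [1, 0, 0, 0, 2]
matrix = confusion_matrix(true_idx, pred_idx, len(names))

# Print one row per true class, with right-aligned counts.
for name, row in zip(names, matrix):
    print(f"{name:>6} " + " ".join(f"{count:>3}" for count in row))
```

Off-diagonal cells count tokens whose predicted class differs from the true class.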
-
report_metrics
()¶ Reports the metrics for the dataset.
-
run_inference
()¶ Should run inference on the test dataset.
This method should run the model through the test dataset. It should perform inference and collect the metrics and data needed for further use.
Returns: The metrics and data collected during inference
Return type: Dict[str, Any]
-
run_test
()¶
-