sciwing.metrics

BaseMetric

class sciwing.metrics.BaseMetric.BaseMetric(datasets_manager: sciwing.data.datasets_manager.DatasetsManager)

Bases: object

calc_metric(lines: List[sciwing.data.line.Line], labels: List[sciwing.data.label.Label], model_forward_dict: Dict[str, Any]) → None

Calculates the metric using the lines and labels returned by a dataset and the model_forward_dict of a model. This is usually called once per batch of inputs and the corresponding forward pass. The metric should retain its state across an epoch; the reset method can then be called so that all metric-related data is cleared for a new epoch.

Parameters:
  • lines (List[Line]) – A list of lines from the dataset
  • labels (List[Label]) – A list of labels corresponding to the lines
  • model_forward_dict (Dict[str, Any]) – The dictionary returned by the forward pass of the model
get_metric() → Dict[str, Any]

Returns the value of different metrics being tracked

Returns everything that is being tracked by the metric as a dictionary, which calling code can use for reporting or further processing.

Returns:Metric/values being tracked by the metric
Return type:Dict[str, Any]
report_metrics(report_type: str = None) → Any

A method to report the tracked metrics in a suitable form

Parameters:report_type (str) – The type of report that will be returned by the method
Returns:Any format suitable for reporting. If the report is meant to be printed, return a suitable string; if it needs to be saved to a file, return a representation appropriate for that.
Return type:Any
reset()

Should reset all the metrics/values being tracked by this metric. This method is generally called at the end of a training/validation epoch to clear the values before another epoch starts.
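
The following is a minimal sketch (not part of SciWING) of how the BaseMetric lifecycle fits together. The metric name MeanLossMetric and the "loss" key read from model_forward_dict are assumptions for illustration; a real metric would read whatever its model actually returns:

    from typing import Any, Dict, List

    from sciwing.data.datasets_manager import DatasetsManager
    from sciwing.data.label import Label
    from sciwing.data.line import Line
    from sciwing.metrics.BaseMetric import BaseMetric


    class MeanLossMetric(BaseMetric):
        """Accumulates a per-batch loss and reports its mean over an epoch."""

        def __init__(self, datasets_manager: DatasetsManager):
            super().__init__(datasets_manager=datasets_manager)
            self.total_loss = 0.0
            self.num_batches = 0

        def calc_metric(
            self,
            lines: List[Line],
            labels: List[Label],
            model_forward_dict: Dict[str, Any],
        ) -> None:
            # Called once per batch; state is retained until reset() is called.
            # Assumption: the model's forward dict exposes a scalar under "loss".
            self.total_loss += float(model_forward_dict["loss"])
            self.num_batches += 1

        def get_metric(self) -> Dict[str, Any]:
            mean_loss = self.total_loss / self.num_batches if self.num_batches else 0.0
            return {"mean_loss": mean_loss}

        def report_metrics(self, report_type: str = None) -> Any:
            # Return a printable string; other report types could write to a file instead
            return f"mean loss over epoch: {self.get_metric()['mean_loss']:.4f}"

        def reset(self) -> None:
            self.total_loss = 0.0
            self.num_batches = 0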

classification_metrics_utils

class sciwing.metrics.classification_metrics_utils.ClassificationMetricsUtils

Bases: object

Classification metrics such as accuracy, precision, recall and fmeasure are commonly used in supervised learning. This class provides a few utilities that help in calculating them.

generate_table_report_from_counters(tp_counter: Dict[int, int], fp_counter: Dict[int, int], fn_counter: Dict[int, int], idx2labelname_mapping: Dict[int, str] = None) → str

Returns a table representation of Precision, Recall and FMeasure

Parameters:
  • tp_counter (Dict[int, int]) – The mapping between class index and true positive count
  • fp_counter (Dict[int, int]) – The mapping between class index and false positive count
  • fn_counter (Dict[int, int]) – The mapping between class index and false negative count
  • idx2labelname_mapping (Dict[int, str]) – The mapping between idx and label name
Returns:

Returns a string representing the table of precision, recall and fmeasure for every class in the dataset

Return type:

str

static get_confusion_matrix_and_labels(predicted_tag_indices: List[List[int]], true_tag_indices: List[List[int]], true_masked_label_indices: List[List[int]], pred_labels_mask: List[List[int]] = None) → (numpy.ndarray, List[int])

Gets the confusion matrix and the list of classes for which the confusion matrix is generated

Parameters:
  • predicted_tag_indices (List[List[int]]) – Predicted tag indices for a batch
  • true_tag_indices (List[List[int]]) – True tag indices for a batch
  • true_masked_label_indices (List[List[int]]) – Every value is either 0 or 1; a 1 indicates that the corresponding label in true_tag_indices will be ignored
  • pred_labels_mask (List[List[int]]) – Optional mask over the predicted tag indices; a 1 indicates that the corresponding predicted label will be ignored
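
A small usage sketch under assumed inputs: two padded tag sequences per batch, with the last position of the second sequence masked out (a 1 in true_masked_label_indices means the position is ignored). The orientation of the matrix follows the usual true-rows/predicted-columns convention, which is an assumption here:

    from sciwing.metrics.classification_metrics_utils import ClassificationMetricsUtils

    predicted_tag_indices = [[0, 1, 2], [1, 1, 0]]
    true_tag_indices = [[0, 1, 1], [1, 2, 0]]
    true_masked_label_indices = [[0, 0, 0], [0, 0, 1]]

    confusion_mtrx, classes = ClassificationMetricsUtils.get_confusion_matrix_and_labels(
        predicted_tag_indices=predicted_tag_indices,
        true_tag_indices=true_tag_indices,
        true_masked_label_indices=true_masked_label_indices,
    )
    print(classes)         # class indices covered by the matrix
    print(confusion_mtrx)  # confusion counts for those classes
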
static get_macro_prf_from_prf_dicts(precision_dict: Dict[int, int], recall_dict: Dict[int, int], fscore_dict: Dict[int, int]) → (int, int, int)

Calculates Macro Precision, Recall and FMeasure

Parameters:
  • precision_dict (Dict[int, int]) – Dictionary mapping between the class index and precision values
  • recall_dict (Dict[int, int]) – Dictionary mapping between the class index and recall values
  • fscore_dict (Dict[int, int]) – Dictionary mapping between the class index and fscore values
Returns:

The macro precision, macro recall and macro fscore measures

Return type:

int, int, int

get_micro_prf_from_counters(tp_counter: Dict[int, int], fp_counter: Dict[int, int], fn_counter: Dict[int, int]) → (int, int, int)

This calculates the micro precision, recall and fmeasure from the different counters. Each counter contains a mapping from a class index to the corresponding count.

Parameters:
  • tp_counter (Dict[int, int]) – Mapping from class index to true positive count
  • fp_counter (Dict[int, int]) – Mapping from class index to false positive count
  • fn_counter (Dict[int, int]) – Mapping from class index to false negative count
Returns:

Micro precision, Micro Recall and Micro fmeasure

Return type:

int, int, int

get_prf_from_counters(tp_counter: Dict[int, int], fp_counter: Dict[int, int], fn_counter: Dict[int, int])

This calculates the precision, recall and fmeasure from the different counters. Each counter contains a mapping from a class index to the corresponding count.

Parameters:
  • tp_counter (Dict[int, int]) – Mapping from class index to true positive count
  • fp_counter (Dict[int, int]) – Mapping from class index to false positive count
  • fn_counter (Dict[int, int]) – Mapping from class index to false negative count
Returns:

Three dictionaries representing the Precision, Recall and FMeasure for all the different classes

Return type:

Dict[int, int], Dict[int, int], Dict[int, int]
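
A self-contained sketch of the counter-based workflow follows. The counts and the label names below are made up for illustration; in SciWING the counters are accumulated by the metric classes during calc_metric:

    from sciwing.metrics.classification_metrics_utils import ClassificationMetricsUtils

    utils = ClassificationMetricsUtils()

    tp_counter = {0: 50, 1: 30}  # class index -> true positive count
    fp_counter = {0: 10, 1: 5}   # class index -> false positive count
    fn_counter = {0: 8, 1: 12}   # class index -> false negative count

    # Per-class precision, recall and fmeasure
    precision, recall, fscore = utils.get_prf_from_counters(
        tp_counter=tp_counter, fp_counter=fp_counter, fn_counter=fn_counter
    )

    # Macro averages (mean over classes) and micro averages (from pooled counts)
    macro_p, macro_r, macro_f = utils.get_macro_prf_from_prf_dicts(
        precision_dict=precision, recall_dict=recall, fscore_dict=fscore
    )
    micro_p, micro_r, micro_f = utils.get_micro_prf_from_counters(
        tp_counter=tp_counter, fp_counter=fp_counter, fn_counter=fn_counter
    )

    # Human-readable table of the per-class numbers
    table = utils.generate_table_report_from_counters(
        tp_counter=tp_counter,
        fp_counter=fp_counter,
        fn_counter=fn_counter,
        idx2labelname_mapping={0: "background", 1: "citation"},
    )
    print(table)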

precision_recall_fmeasure

class sciwing.metrics.precision_recall_fmeasure.PrecisionRecallFMeasure(datasets_manager: sciwing.data.datasets_manager.DatasetsManager)

Bases: sciwing.metrics.BaseMetric.BaseMetric, sciwing.utils.class_nursery.ClassNursery

__init__(datasets_manager: sciwing.data.datasets_manager.DatasetsManager)
Parameters:datasets_manager (DatasetsManager) – The dataset manager managing the labels and other information
calc_metric(lines: List[sciwing.data.line.Line], labels: List[sciwing.data.label.Label], model_forward_dict: Dict[str, Any]) → None

Updates the values being tracked for calculating the metric

For Precision Recall FMeasure we update the true positive, false positive and false negative of the different classes being tracked

Parameters:
  • lines (List[Line]) – A list of lines
  • labels (List[Label]) – A list of labels. These have to be the labels used for classification. Refer to the documentation of Label for more information
  • model_forward_dict (Dict[str, Any]) – The dictionary obtained after a forward pass. The model_forward_dict is expected to have normalized_probs, which usually is of the size [batch_size, num_classes]
get_metric() → Dict[str, Any]

Returns different values being tracked to calculate Precision Recall FMeasure

Returns:A dictionary with the following key-value pairs for every namespace
precision: Dict[str, float]
The precision for different classes
recall: Dict[str, float]
The recall values for different classes
fscore: Dict[str, float]
The fscore values for different classes
num_tp: Dict[str, int]
The number of true positives for different classes
num_fp: Dict[str, int]
The number of false positives for different classes
num_fn: Dict[str, int]
The number of false negatives for different classes
macro_precision: float
The macro precision value considering all different classes
macro_recall: float
The macro recall value considering all different classes
macro_fscore: float
The macro fscore value considering all different classes
micro_precision: float
The micro precision value considering all different classes
micro_recall: float
The micro recall value considering all different classes
micro_fscore: float
The micro fscore value considering all different classes
Return type:Dict[str, Any]
print_confusion_metrics(predicted_probs: torch.FloatTensor, labels: torch.FloatTensor, labels_mask: Optional[torch.ByteTensor] = None) → None

Prints confusion matrix

Parameters:
  • predicted_probs (torch.FloatTensor) – Predicted Probabilities [batch_size, num_classes]
  • labels (torch.FloatTensor) – True labels of the size [batch_size, 1]
  • labels_mask (Optional[torch.ByteTensor]) – Labels mask with 1 in those positions where the true label is ignored and 0 otherwise. It should be of the same size as labels
report_metrics(report_type='wasabi')

Reports metrics in a printable format

Parameters:report_type (str) – Select one of [wasabi, paper]. If wasabi, a printable table representing the precision, recall and fmeasure for the different classes is returned
reset() → None

Resets all the counters

Resets the tp_counter (true positive counter), the fp_counter (false positive counter), the fn_counter (false negative counter) and the tn_counter (true negative counter)
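
A sketch of the per-epoch lifecycle of PrecisionRecallFMeasure. The names data_manager, model and train_loader are assumed to exist in the surrounding training code; the forward pass is expected to return a dictionary containing normalized_probs:

    from sciwing.metrics.precision_recall_fmeasure import PrecisionRecallFMeasure

    metric = PrecisionRecallFMeasure(datasets_manager=data_manager)

    for lines, labels in train_loader:     # one batch of Line and Label objects
        model_forward_dict = model(lines)  # assumed to contain "normalized_probs"
        metric.calc_metric(lines=lines, labels=labels, model_forward_dict=model_forward_dict)

    epoch_values = metric.get_metric()                  # per-class and macro/micro precision, recall, fscore
    print(metric.report_metrics(report_type="wasabi"))  # printable per-class table
    metric.reset()                                      # clear the counters before the next epoch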

token_cls_accuracy

class sciwing.metrics.token_cls_accuracy.TokenClassificationAccuracy(datasets_manager: sciwing.data.datasets_manager.DatasetsManager = None, predicted_tags_namespace_prefix='predicted_tags')

Bases: sciwing.metrics.BaseMetric.BaseMetric, sciwing.utils.class_nursery.ClassNursery

calc_metric(lines: List[sciwing.data.line.Line], labels: List[sciwing.data.seq_label.SeqLabel], model_forward_dict: Dict[str, Any]) → None
Parameters:
  • lines (List[Line]) – The list of lines
  • labels (List[SeqLabel]) – The list of sequence labels
  • model_forward_dict (Dict[str, Any]) – The model_forward_dict should have the predicted tags for every namespace. The predicted tags are the best possible predicted tags for the batch; they are List[List[int]] of size [batch_size, time_steps]. They are expected to be stored under keys formed from the predicted_tags_namespace_prefix and the label namespace
get_metric() → Dict[str, Union[Dict[str, float], float]]

Returns different values being tracked to calculate Precision Recall FMeasure

Returns:A dictionary with the following key-value pairs for every namespace
precision: Dict[str, float]
The precision for different classes
recall: Dict[str, float]
The recall values for different classes
fscore: Dict[str, float]
The fscore values for different classes
num_tp: Dict[str, int]
The number of true positives for different classes
num_fp: Dict[str, int]
The number of false positives for different classes
num_fn: Dict[str, int]
The number of false negatives for different classes
macro_precision: float
The macro precision value considering all different classes
macro_recall: float
The macro recall value considering all different classes
macro_fscore: float
The macro fscore value considering all different classes
micro_precision: float
The micro precision value considering all different classes
micro_recall: float
The micro recall value considering all different classes
micro_fscore: float
The micro fscore value considering all different classes
Return type:Dict[str, Any]
print_confusion_metrics(predicted_tag_indices: List[List[int]], true_tag_indices: List[List[int]], labels_mask: Optional[torch.ByteTensor] = None) → None

Prints the confusion matrix for a batch of tag indices. It assumes that the batch is padded and every instance is of the same length

Parameters:
  • predicted_tag_indices (List[List[int]]) – Predicted tag indices for a batch of sentences
  • true_tag_indices (List[List[int]]) – True tag indices for a batch of sentences
  • labels_mask (Optional[torch.ByteTensor]) – The labels mask, which has the same size as true_tag_indices. A 0 in a position indicates no masking; a 1 indicates that the position is masked
report_metrics(report_type='wasabi') → Any

Reports metrics in a printable format

Parameters:report_type (str) – Select one of [wasabi, paper]. If wasabi, a printable table representing the precision, recall and fmeasure for the different classes is returned
reset()

Should reset all the metrics/values being tracked by this metric. This method is generally called at the end of a training/validation epoch to clear the values before another epoch starts.
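
A similar sketch for sequence labelling with TokenClassificationAccuracy. The names data_manager, model and train_loader are assumed to exist in the surrounding training code, and the key "predicted_tags_NER" is only an illustration of a key formed from predicted_tags_namespace_prefix and a label namespace:

    from sciwing.metrics.token_cls_accuracy import TokenClassificationAccuracy

    metric = TokenClassificationAccuracy(datasets_manager=data_manager)

    for lines, seq_labels in train_loader:  # one batch of Line and SeqLabel objects
        model_forward_dict = model(lines)   # e.g. {"predicted_tags_NER": [[3, 1, 0], [2, 2, 1]]}
        metric.calc_metric(lines=lines, labels=seq_labels, model_forward_dict=model_forward_dict)

    print(metric.report_metrics(report_type="wasabi"))
    metric.reset()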