sciwing.metrics

BaseMetric

class sciwing.metrics.BaseMetric.BaseMetric(datasets_manager: sciwing.data.datasets_manager.DatasetsManager)

Bases: object

calc_metric(lines: List[sciwing.data.line.Line], labels: List[sciwing.data.label.Label], model_forward_dict: Dict[str, Any]) → None

Calculates the metric using the lines and labels returned by any dataset and the model_forward_dict of a model. This is usually called for a batch of inputs and a forward pass. The metric should retain its state across batches within an epoch; the reset method can then be called to clear all metric-related data before a new epoch

Parameters:
  • lines (List[Line]) –
  • labels (List[Label]) –
  • model_forward_dict (Dict[str, Any]) –
get_metric() → Dict[str, Any]

Returns the value of different metrics being tracked

Return everything that is being tracked by the metric, as a dictionary that outside methods can consume for reporting or further processing

Returns:Metric/values being tracked by the metric
Return type:Dict[str, Any]
report_metrics(report_type: str = None) → Any

A method to report the tracked metrics in a suitable form

Parameters:report_type (str) – The type of report that will be returned by the method
Returns:This method can return any format suitable for reporting. If the report is to be printed, return a suitable string; if the report needs to be saved to a file, return content appropriate for writing.
Return type:Any
reset()

Should reset all the metrics/values being tracked by this metric. This method is generally used at the end of a training/validation epoch to reset the values before starting another epoch
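Concrete metrics implement this calc_metric/get_metric/reset lifecycle: accumulate state per batch, report it on demand, and clear it at epoch boundaries. Below is a minimal, hypothetical sketch of a batch accuracy metric following that pattern; the class and its simplified signatures are illustrative and not part of sciwing.

```python
from typing import Any, Dict, List


class SimpleAccuracy:
    """Hypothetical metric following the BaseMetric lifecycle."""

    def __init__(self) -> None:
        self.num_correct = 0
        self.num_total = 0

    def calc_metric(self, predictions: List[int], labels: List[int]) -> None:
        # Called once per batch; state accumulates across the epoch.
        for pred, true in zip(predictions, labels):
            self.num_correct += int(pred == true)
            self.num_total += 1

    def get_metric(self) -> Dict[str, Any]:
        # Report whatever is tracked as a dictionary.
        accuracy = self.num_correct / self.num_total if self.num_total else 0.0
        return {"accuracy": accuracy}

    def reset(self) -> None:
        # Clear all tracked values before a new epoch.
        self.num_correct = 0
        self.num_total = 0
```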

classification_metrics_utils

class sciwing.metrics.classification_metrics_utils.ClassificationMetricsUtils

Bases: object

Classification metrics like accuracy, precision, recall and fmeasure are often used in supervised learning. This class provides a few utilities that help in calculating them.

generate_table_report_from_counters(tp_counter: Dict[int, int], fp_counter: Dict[int, int], fn_counter: Dict[int, int], idx2labelname_mapping: Dict[int, str] = None) → str

Returns a table representation for Precision, Recall and FMeasure

Parameters:
  • tp_counter (Dict[int, int]) – The mapping between class index and true positive count
  • fp_counter (Dict[int, int]) – The mapping between class index and false positive count
  • fn_counter (Dict[int, int]) – The mapping between class index and false negative count
  • idx2labelname_mapping (Dict[int, str]) – The mapping between idx and label name
Returns:

Returns a string representing the table of precision, recall and fmeasure for every class in the dataset

Return type:

str

static get_confusion_matrix_and_labels(predicted_tag_indices: List[List[int]], true_tag_indices: List[List[int]], true_masked_label_indices: List[List[int]], pred_labels_mask: List[List[int]] = None) -> (numpy.ndarray, List[int])

Gets the confusion matrix and the list of classes for which the confusion matrix is generated

Parameters:
  • predicted_tag_indices (List[List[int]]) – Predicted tag indices for a batch
  • true_tag_indices (List[List[int]]) – True tag indices for a batch
  • true_masked_label_indices (List[List[int]]) – Every integer is either a 0 or 1, where 1 will indicate that the label in true_tag_indices will be ignored
static get_macro_prf_from_prf_dicts(precision_dict: Dict[int, float], recall_dict: Dict[int, float], fscore_dict: Dict[int, float]) -> (float, float, float)

Calculates Macro Precision, Recall and FMeasure

Parameters:
  • precision_dict (Dict[int, float]) – Dictionary mapping between the class index and precision values
  • recall_dict (Dict[int, float]) – Dictionary mapping between the class index and recall values
  • fscore_dict (Dict[int, float]) – Dictionary mapping between the class index and fscore values
Returns:

The macro precision, macro recall and macro fscore measures

Return type:

float, float, float
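Macro averaging is the unweighted mean of the per-class values, so every class counts equally regardless of how many examples it has. A self-contained sketch of that arithmetic (the function name is illustrative, not sciwing's):

```python
from statistics import mean
from typing import Dict, Tuple


def macro_prf(precision_dict: Dict[int, float],
              recall_dict: Dict[int, float],
              fscore_dict: Dict[int, float]) -> Tuple[float, float, float]:
    # Unweighted mean over classes: rare classes weigh as much as common ones.
    return (mean(precision_dict.values()),
            mean(recall_dict.values()),
            mean(fscore_dict.values()))
```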

get_micro_prf_from_counters(tp_counter: Dict[int, int], fp_counter: Dict[int, int], fn_counter: Dict[int, int]) -> (float, float, float)

This calculates the micro precision, recall and fmeasure from different counters. The counters contain a mapping from a class index to the corresponding count

Parameters:
  • tp_counter (Dict[int, int]) – Mapping from class index to true positive count
  • fp_counter (Dict[int, int]) – Mapping from class index to false positive count
  • fn_counter (Dict[int, int]) – Mapping from class index to false negative count
Returns:

Micro precision, Micro Recall and Micro fmeasure

Return type:

float, float, float
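Micro averaging pools the counts over all classes first, then computes a single precision/recall/fmeasure, so frequent classes dominate the result. A sketch of the underlying arithmetic (illustrative, not sciwing's exact implementation; divisions by zero are guarded to 0.0):

```python
from typing import Dict, Tuple


def micro_prf(tp_counter: Dict[int, int],
              fp_counter: Dict[int, int],
              fn_counter: Dict[int, int]) -> Tuple[float, float, float]:
    # Pool the counts across classes before computing the measures once.
    tp = sum(tp_counter.values())
    fp = sum(fp_counter.values())
    fn = sum(fn_counter.values())
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fscore = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
    return precision, recall, fscore
```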

get_prf_from_counters(tp_counter: Dict[int, int], fp_counter: Dict[int, int], fn_counter: Dict[int, int])

This calculates the precision, recall and f-measure from different counters. The counters contain a mapping from a class index to the corresponding count

Parameters:
  • tp_counter (Dict[int, int]) – Mapping from class index to true positive count
  • fp_counter (Dict[int, int]) – Mapping from class index to false positive count
  • fn_counter (Dict[int, int]) – Mapping from class index to false negative count
Returns:

Three dictionaries representing the Precision Recall and Fmeasure for all the different classes

Return type:

Dict[int, float], Dict[int, float], Dict[int, float]
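The per-class values follow the standard definitions precision = tp/(tp+fp), recall = tp/(tp+fn), and fscore = 2pr/(p+r), computed independently for each class. An illustrative sketch (zero divisions guarded to 0.0, a common convention; not sciwing's exact code):

```python
from typing import Dict, Tuple


def prf_from_counters(
    tp_counter: Dict[int, int],
    fp_counter: Dict[int, int],
    fn_counter: Dict[int, int],
) -> Tuple[Dict[int, float], Dict[int, float], Dict[int, float]]:
    # Compute precision/recall/fscore per class; missing keys count as 0.
    precision, recall, fscore = {}, {}, {}
    classes = set(tp_counter) | set(fp_counter) | set(fn_counter)
    for cls in classes:
        tp = tp_counter.get(cls, 0)
        fp = fp_counter.get(cls, 0)
        fn = fn_counter.get(cls, 0)
        p = tp / (tp + fp) if (tp + fp) else 0.0
        r = tp / (tp + fn) if (tp + fn) else 0.0
        precision[cls] = p
        recall[cls] = r
        fscore[cls] = 2 * p * r / (p + r) if (p + r) else 0.0
    return precision, recall, fscore
```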

precision_recall_fmeasure

class sciwing.metrics.precision_recall_fmeasure.PrecisionRecallFMeasure(datasets_manager: sciwing.data.datasets_manager.DatasetsManager)

Bases: sciwing.metrics.BaseMetric.BaseMetric, sciwing.utils.class_nursery.ClassNursery

__init__(datasets_manager: sciwing.data.datasets_manager.DatasetsManager)
Parameters:datasets_manager (DatasetsManager) – The dataset manager managing the labels and other information
calc_metric(lines: List[sciwing.data.line.Line], labels: List[sciwing.data.label.Label], model_forward_dict: Dict[str, Any]) → None

Updates the values being tracked for calculating the metric

For Precision Recall FMeasure we update the true positive, false positive and false negative of the different classes being tracked

Parameters:
  • lines (List[Line]) – A list of lines
  • labels (List[Label]) – A list of labels. This has to be the label used for classification. Refer to the documentation of Label for more information
  • model_forward_dict (Dict[str, Any]) – The dictionary obtained after a forward pass. The model_forward_dict is expected to have normalized_probs, which is usually of size [batch_size, num_classes]
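A metric like this typically takes the argmax of normalized_probs per instance and updates per-class true-positive/false-positive/false-negative counters. A plain-Python sketch of that update step (function and parameter names are hypothetical; sciwing's actual implementation operates on tensors):

```python
from collections import Counter
from typing import List


def update_counters(
    normalized_probs: List[List[float]],  # [batch_size, num_classes]
    true_labels: List[int],
    tp: Counter,
    fp: Counter,
    fn: Counter,
) -> None:
    # Predicted class = argmax of each probability row; counters are
    # updated in place, so one call per batch accumulates over an epoch.
    for probs, true_cls in zip(normalized_probs, true_labels):
        pred_cls = max(range(len(probs)), key=probs.__getitem__)
        if pred_cls == true_cls:
            tp[true_cls] += 1
        else:
            fp[pred_cls] += 1
            fn[true_cls] += 1
```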
get_metric() → Dict[str, Any]

Returns different values being tracked to calculate Precision Recall FMeasure

Returns:Returns a dictionary with the following key value pairs for every namespace
precision: Dict[str, float]
The precision for different classes
recall: Dict[str, float]
The recall values for different classes
fscore: Dict[str, float]
The fscore values for different classes
num_tp: Dict[str, int]
The number of true positives for different classes
num_fp: Dict[str, int]
The number of false positives for different classes
num_fn: Dict[str, int]
The number of false negatives for different classes
macro_precision: float
The macro precision value considering all different classes
macro_recall: float
The macro recall value considering all different classes
macro_fscore: float
The macro fscore value considering all different classes
micro_precision: float
The micro precision value considering all different classes
micro_recall: float
The micro recall value considering all different classes
micro_fscore: float
The micro fscore value considering all different classes
Return type:Dict[str, Any]
print_confusion_metrics(predicted_probs: torch.FloatTensor, labels: torch.FloatTensor, labels_mask: Optional[torch.ByteTensor] = None) → None

Prints confusion matrix

Parameters:
  • predicted_probs (torch.FloatTensor) – Predicted Probabilities [batch_size, num_classes]
  • labels (torch.FloatTensor) – True labels of the size [batch_size, 1]
  • labels_mask (Optional[torch.ByteTensor]) – Labels mask indicating 1 in those places where the true label is ignored, otherwise 0. It should be of the same size as labels
report_metrics(report_type='wasabi')

Reports metrics in a printable format

Parameters:report_type (str) – Select one of [wasabi, paper]. If wasabi, we return a printable table that represents the precision, recall and fmeasure for different classes
reset() → None

Resets all the counters

Resets the tp_counter (the true positive counter), the fp_counter (the false positive counter), the fn_counter (the false negative counter) and the tn_counter (the true negative counter)

token_cls_accuracy

class sciwing.metrics.token_cls_accuracy.TokenClassificationAccuracy(datasets_manager: sciwing.data.datasets_manager.DatasetsManager = None, predicted_tags_namespace_prefix='predicted_tags')

Bases: sciwing.metrics.BaseMetric.BaseMetric, sciwing.utils.class_nursery.ClassNursery

calc_metric(lines: List[sciwing.data.line.Line], labels: List[sciwing.data.seq_label.SeqLabel], model_forward_dict: Dict[str, Any]) → None
Parameters:
  • lines (List[Line]) – The list of lines
  • labels (List[SeqLabel]) – The list of sequence labels
  • model_forward_dict (Dict[str, Any]) – The model_forward_dict should have predicted tags for every namespace. The predicted_tags are the best possible predicted tags for the batch and are List[List[int]] of size [batch_size, time_steps]
get_metric() → Dict[str, Union[Dict[str, float], float]]

Returns different values being tracked to calculate Precision Recall FMeasure

Returns:Returns a dictionary with the following key value pairs for every namespace
precision: Dict[str, float]
The precision for different classes
recall: Dict[str, float]
The recall values for different classes
fscore: Dict[str, float]
The fscore values for different classes
num_tp: Dict[str, int]
The number of true positives for different classes
num_fp: Dict[str, int]
The number of false positives for different classes
num_fn: Dict[str, int]
The number of false negatives for different classes
macro_precision: float
The macro precision value considering all different classes
macro_recall: float
The macro recall value considering all different classes
macro_fscore: float
The macro fscore value considering all different classes
micro_precision: float
The micro precision value considering all different classes
micro_recall: float
The micro recall value considering all different classes
micro_fscore: float
The micro fscore value considering all different classes
Return type:Dict[str, Any]
print_confusion_metrics(predicted_tag_indices: List[List[int]], true_tag_indices: List[List[int]], labels_mask: Optional[torch.ByteTensor] = None) → None

Prints confusion metrics for a batch of tag indices. It assumes that the batch is padded and every instance is of similar length

Parameters:
  • predicted_tag_indices (List[List[int]]) – Predicted tag indices for a batch of sentences
  • true_tag_indices (List[List[int]]) – True tag indices for a batch of sentences
  • labels_mask (Optional[torch.ByteTensor]) – The labels mask, which has the same size as true_tag_indices. 0 in a position indicates no masking; 1 indicates that the position is masked
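The masking convention above can be sketched in plain Python: count (true tag, predicted tag) pairs over the padded batch, skipping positions where the mask is 1. This helper is illustrative only and not sciwing's implementation:

```python
from collections import Counter
from typing import List, Optional


def masked_confusion_counts(
    predicted_tag_indices: List[List[int]],
    true_tag_indices: List[List[int]],
    labels_mask: Optional[List[List[int]]] = None,
) -> Counter:
    # Each key is a (true_tag, predicted_tag) pair; masked positions
    # (mask value 1) are ignored, matching the documented convention.
    counts = Counter()
    for i, (pred_row, true_row) in enumerate(
        zip(predicted_tag_indices, true_tag_indices)
    ):
        for j, (pred, true) in enumerate(zip(pred_row, true_row)):
            if labels_mask is not None and labels_mask[i][j] == 1:
                continue
            counts[(true, pred)] += 1
    return counts
```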
report_metrics(report_type='wasabi') → Any

Reports metrics in a printable format

Parameters:report_type (str) – Select one of [wasabi, paper]. If wasabi, we return a printable table that represents the precision, recall and fmeasure for different classes
reset()

Should reset all the metrics/values being tracked by this metric. This method is generally used at the end of a training/validation epoch to reset the values before starting another epoch