sciwing.modules
bow_encoder
-
class sciwing.modules.bow_encoder.BOW_Encoder(embedder=None, dropout_value: float = 0, aggregation_type='sum', device: Union[torch.device, str] = torch.device("cpu"))
Bases: torch.nn.Module, sciwing.utils.class_nursery.ClassNursery
-
__init__(embedder=None, dropout_value: float = 0, aggregation_type='sum', device: Union[torch.device, str] = torch.device("cpu"))
Bag of Words Encoder
Parameters:
- embedder (nn.Module) – Any embedder that you want to use
- dropout_value (float) – The dropout value applied to the input
- aggregation_type (str) – The strategy for aggregating word embeddings. One of:
  - sum – Aggregate word embeddings by summing them
  - average – Aggregate word embeddings by averaging them
- device (Union[torch.device, str]) – The device where the embeddings are stored
-
forward(lines: List[sciwing.data.line.Line]) → torch.FloatTensor
Parameters: lines (List[Line]) – A list of lines
Returns: The bag-of-words encoded embedding, either averaged or summed. The size is [batch_size, embedding_dimension]
Return type: torch.FloatTensor
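The two aggregation strategies can be illustrated without any dependencies. The sketch below is not the sciwing implementation: `aggregate_bow` is a hypothetical helper, and plain Python lists stand in for the embedding tensors. It only shows how a variable number of token embeddings collapses into one vector per line, which is why the result has size [batch_size, embedding_dimension].

```python
from typing import List


def aggregate_bow(token_embeddings: List[List[float]],
                  aggregation_type: str = "sum") -> List[float]:
    """Collapse the token embeddings of one line into a single vector.

    Hypothetical helper for illustration only; not part of sciwing.
    """
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    if aggregation_type not in ("sum", "average"):
        raise ValueError(f"unknown aggregation_type: {aggregation_type}")

    # Sum each embedding dimension across all tokens of the line.
    summed = [sum(column) for column in zip(*token_embeddings)]
    if aggregation_type == "sum":
        return summed
    # "average": divide the summed vector by the number of tokens.
    return [value / len(token_embeddings) for value in summed]


# One line with three tokens and an embedding dimension of 2.
line = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
bow_sum = aggregate_bow(line, "sum")
bow_avg = aggregate_bow(line, "average")

# A batch of two lines of different lengths still yields one vector per line.
batch = [aggregate_bow(tokens) for tokens in (line, line[:2])]
```

Summing keeps information about line length in the magnitude of the vector, while averaging normalizes it away; which one is preferable depends on the task.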
-
charlstm_encoder
-
class sciwing.modules.charlstm_encoder.CharLSTMEncoder(char_embedder: nn.Module, char_emb_dim: int, hidden_dim: int = 1024, bidirectional: bool = False, combine_strategy: str = 'concat', device: torch.device = torch.device("cpu"))
Bases: torch.nn.Module, sciwing.utils.class_nursery.ClassNursery
-
__init__(char_embedder: nn.Module, char_emb_dim: int, hidden_dim: int = 1024, bidirectional: bool = False, combine_strategy: str = 'concat', device: torch.device = torch.device("cpu"))
Encodes character tokens using LSTMs
Parameters:
- char_embedder (nn.Module) – An embedder that embeds character tokens
- char_emb_dim (int) – The embedding dimension of the characters
- hidden_dim (int) – Hidden dimension of the LSTM
- bidirectional (bool) – Whether the LSTM is bidirectional
- combine_strategy (str) – Strategy for combining the LSTM hidden states
- device (torch.device) – The device on which the LSTM will run
-
forward(iter_dict: Dict[str, Any])
Parameters: iter_dict (Dict[str, Any]) – Expects char_tokens to be present in the iter_dict from any dataset
Returns: A tensor of size [batch_size, num_time_steps, hidden_dim]. The last dimension is the hidden dimension of the LSTM; if the LSTM is bidirectional and the combine strategy is concat, it is 2 * self.hidden_dim
Return type: torch.Tensor
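The rule for the last output dimension can be written down as a small shape calculation. This is a sketch of the documented rule only; `char_lstm_output_shape` is a hypothetical helper and does not exist in sciwing.

```python
from typing import Tuple


def char_lstm_output_shape(batch_size: int,
                           num_time_steps: int,
                           hidden_dim: int,
                           bidirectional: bool = False,
                           combine_strategy: str = "concat") -> Tuple[int, int, int]:
    """Return the expected [batch_size, num_time_steps, hidden_dim] shape.

    Hypothetical helper for illustration only; not part of sciwing.
    """
    # Only a bidirectional LSTM combined with "concat" doubles the last dimension.
    doubled = bidirectional and combine_strategy == "concat"
    output_dim = 2 * hidden_dim if doubled else hidden_dim
    return (batch_size, num_time_steps, output_dim)


uni_shape = char_lstm_output_shape(32, 10, 1024)
bi_shape = char_lstm_output_shape(32, 10, 1024, bidirectional=True)
```

Any layer that consumes the encoder output needs to be sized from this last dimension rather than from `hidden_dim` directly.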
-
lstm2seqencoder
-
class sciwing.modules.lstm2seqencoder.Lstm2SeqEncoder(embedder: nn.Module, dropout_value: float = 0.0, hidden_dim: int = 1024, bidirectional: bool = False, num_layers: int = 1, combine_strategy: str = 'concat', rnn_bias: bool = False, device: torch.device = torch.device("cpu"), add_projection_layer: bool = True, projection_activation: str = 'Tanh')
Bases: torch.nn.Module, sciwing.utils.class_nursery.ClassNursery
-
__init__(embedder: nn.Module, dropout_value: float = 0.0, hidden_dim: int = 1024, bidirectional: bool = False, num_layers: int = 1, combine_strategy: str = 'concat', rnn_bias: bool = False, device: torch.device = torch.device("cpu"), add_projection_layer: bool = True, projection_activation: str = 'Tanh')
Encodes a set of tokens to a set of hidden states.
Parameters:
- embedder (nn.Module) – Any embedder can be used for this purpose
- dropout_value (float) – The dropout value for the embedding
- hidden_dim (int) – The hidden dimension of the LSTM
- bidirectional (bool) – Whether the LSTM is bidirectional
- num_layers (int) – The number of layers of the LSTM
- combine_strategy (str) – The strategy to combine the different layers of the LSTM. This can be one of:
  - sum – Sum the different layers of the embedding
  - concat – Concatenate the layers of the embedding
- rnn_bias (bool) – Set this to false only for debugging purposes
- device (torch.device) – The device on which the model is run
- add_projection_layer (bool) – Adds a projection layer after the LSTM over the hidden activation
- projection_activation (str) – The class name of any torch.nn activation (for example Tanh), used in the projection layer
-
forward(lines: List[sciwing.data.line.Line], c0: torch.FloatTensor = None, h0: torch.FloatTensor = None) → torch.Tensor
Parameters:
- lines (List[Line]) – A list of lines
- c0 (torch.FloatTensor) – The initial cell state for the LSTM
- h0 (torch.FloatTensor) – The initial hidden state for the LSTM
Returns: The vector encoding of the set of instances: [batch_size, seq_len, hidden_dim] if single direction, [batch_size, seq_len, 2*hidden_dim] if bidirectional
Return type: torch.Tensor
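The difference between the sum and concat strategies is easiest to see on a tiny example. The sketch below assumes that the strategy is applied per time step to the forward and backward hidden states, which is what the documented return shapes suggest; `combine_directions` is a hypothetical helper, and plain lists stand in for tensors.

```python
from typing import List

Vector = List[float]


def combine_directions(forward: List[Vector],
                       backward: List[Vector],
                       combine_strategy: str = "concat") -> List[Vector]:
    """Combine per-time-step states of the two directions of one sequence.

    Hypothetical helper for illustration only; not part of sciwing.
    """
    if len(forward) != len(backward):
        raise ValueError("both directions must cover the same time steps")
    if combine_strategy == "concat":
        # Each step becomes a vector of size 2 * hidden_dim.
        return [f + b for f, b in zip(forward, backward)]
    if combine_strategy == "sum":
        # Element-wise addition keeps the size at hidden_dim.
        return [[x + y for x, y in zip(f, b)] for f, b in zip(forward, backward)]
    raise ValueError(f"unknown combine_strategy: {combine_strategy}")


# Two time steps with hidden_dim = 2 in each direction.
forward_states = [[1.0, 2.0], [3.0, 4.0]]
backward_states = [[10.0, 20.0], [30.0, 40.0]]

concatenated = combine_directions(forward_states, backward_states, "concat")
summed = combine_directions(forward_states, backward_states, "sum")
```

Concatenation preserves both directions separately at the cost of a wider output, while summation keeps the output at `hidden_dim`.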
-
lstm2vecencoder
-
class sciwing.modules.lstm2vecencoder.LSTM2VecEncoder(embedder, dropout_value: float = 0.0, hidden_dim: int = 1024, bidirectional: bool = False, combine_strategy: str = 'concat', rnn_bias: bool = True, device: Union[str, torch.device] = torch.device("cpu"))
Bases: torch.nn.Module, sciwing.utils.class_nursery.ClassNursery
-
__init__(embedder, dropout_value: float = 0.0, hidden_dim: int = 1024, bidirectional: bool = False, combine_strategy: str = 'concat', rnn_bias: bool = True, device: Union[str, torch.device] = torch.device("cpu"))
LSTM2Vec encoder that encodes a series of tokens to a single vector representation
Parameters:
- embedder (nn.Module) – Any embedder can be passed
- dropout_value (float) – The dropout value for input embeddings
- hidden_dim (int) – The hidden dimension of the LSTM
- bidirectional (bool) – Whether the LSTM is bidirectional or not
- combine_strategy (str) – Strategy to combine the vectors from the two directions
- rnn_bias (bool) – Whether to use the bias in the RNN. Should be set to false only for debugging purposes
- device (Union[str, torch.device]) – The device on which the model is run
-
forward(lines: List[sciwing.data.line.Line], c0: torch.FloatTensor = None, h0: torch.FloatTensor = None) → torch.Tensor
Parameters:
- lines (List[Line]) – A list of lines to be encoded
- c0 (torch.FloatTensor) – The initial cell state for the LSTM
- h0 (torch.FloatTensor) – The initial hidden state for the LSTM
Returns: The vector encoding of the set of instances: [batch_size, hidden_dim] if single direction, [batch_size, 2*hidden_dim] if bidirectional
Return type: torch.Tensor
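To see why a bidirectional encoder returns a single vector of size 2*hidden_dim, it helps to recall where each direction of a bidirectional LSTM finishes: the forward pass ends at the last time step and the backward pass ends at the first. The sketch below is an assumption about the general technique, not a copy of the sciwing code; `sequence_to_vector` is a hypothetical helper and plain lists stand in for tensors.

```python
from typing import List


def sequence_to_vector(forward: List[List[float]],
                       backward: List[List[float]]) -> List[float]:
    """Reduce per-time-step states of one sequence to a single vector.

    Both inputs are indexed by time step. Hypothetical helper for
    illustration only; not part of sciwing.
    """
    if not forward or len(forward) != len(backward):
        raise ValueError("both directions must cover the same, non-empty time steps")
    final_forward = forward[-1]   # forward direction finishes at the last step
    final_backward = backward[0]  # backward direction finishes at the first step
    # "concat" style combination: the result has size 2 * hidden_dim.
    return final_forward + final_backward


# Three time steps with hidden_dim = 2 in each direction.
forward_states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
backward_states = [[0.9, 0.8], [0.7, 0.6], [0.5, 0.4]]
vector = sequence_to_vector(forward_states, backward_states)
```

With a single direction only `final_forward` would be returned, giving a vector of size hidden_dim per line.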
Gets the initial hidden states of the LSTM2Vec encoder
Parameters: batch_size (int) – The batch size of the current forward pass
Return type: torch.Tensor, torch.Tensor
-