lore_sa.encoder_decoder.ColumnTransformerEnc
- class lore_sa.encoder_decoder.ColumnTransformerEnc(descriptor: dict)[source]
It provides an interface to one-hot encoding (https://en.wikipedia.org/wiki/One-hot) functions. It relies on the OneHotEncoder class from sklearn.
- __init__(descriptor: dict)[source]
Initialize the encoder/decoder.
- Parameters:
descriptor (dict) – Dictionary containing feature information, including ‘numeric’, ‘categorical’, and ‘ordinal’ feature descriptors
Methods
__init__(descriptor) – Initialize the encoder/decoder.
decode(Z) – Decode the array starting from the original descriptor.
decode_target_class(Z) – Decode the target class.
encode(X) – It applies the encoder to the input features.
Encode the target class (parameter X).
get_encoded_features() – Get a mapping of encoded feature indices to feature names.
get_encoded_intervals() – Get index intervals for each original feature in the encoded space.
- decode(Z: numpy.array)[source]
Decode the array starting from the original descriptor
- Parameters:
Z ([Numpy array]) – Array to decode
- Return [Numpy array]:
Decoded array
- decode_target_class(Z: numpy.array)[source]
Decode the target class
- Parameters:
Z ([Numpy array]) – Array containing the target class values to be decoded
- encode(X: numpy.array)[source]
It applies the encoder to the input features
- Parameters:
X ([Numpy array]) – Array to encode
- Return [Numpy array]:
Encoded array
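To make the encode/decode round trip concrete, here is a minimal hand-rolled sketch of one-hot encoding for a single row. It is not the code used by ColumnTransformerEnc (which relies on sklearn's OneHotEncoder); the helper names `one_hot_encode` and `one_hot_decode` and the `categories` list are assumptions made purely for illustration.

```python
# Illustrative only: a hand-rolled one-hot round trip, not lore_sa's code.
# `categories` is a hypothetical list of allowed values for one feature.
categories = ["red", "blue", "green"]


def one_hot_encode(value, categories):
    """Return a list with a single 1 at the position of `value`."""
    return [1 if value == c else 0 for c in categories]


def one_hot_decode(bits, categories):
    """Return the category whose position holds the largest entry."""
    return categories[max(range(len(bits)), key=bits.__getitem__)]


# A row with one numeric feature (age) and one categorical feature (color).
row = [42, "blue"]

# Numeric values pass through unchanged; the categorical value expands
# into one column per category.
encoded = [row[0]] + one_hot_encode(row[1], categories)

# Decoding collapses the three color columns back into one value.
decoded = [encoded[0], one_hot_decode(encoded[1:], categories)]

print(encoded)  # [42, 0, 1, 0]
print(decoded)  # [42, 'blue']
```

Decoding by "largest entry" rather than "exactly 1" is a design choice in this sketch: it still returns a category when the encoded values have been perturbed and are no longer strictly 0 or 1.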
- get_encoded_features()[source]
Get a mapping of encoded feature indices to feature names.
- Returns:
- Dictionary mapping encoded feature indices to descriptive names.
For one-hot encoded features, names include the category value (e.g., ‘color=red’, ‘color=blue’).
- Return type:
dict
Example
>>> features = encoder.get_encoded_features()
>>> # {0: 'age', 1: 'color=red', 2: 'color=blue', 3: 'color=green'}
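The shape of this mapping can be reproduced with a short sketch. The input format below (a list of feature names paired with either a category list or None for numeric features) and the function `build_encoded_names` are hypothetical; they are not part of lore_sa and only illustrate how names such as ‘color=red’ arise.

```python
# Hypothetical input: (feature name, list of categories or None if numeric).
features = [("age", None), ("color", ["red", "blue", "green"])]


def build_encoded_names(features):
    """Map each encoded column index to a descriptive name."""
    names = {}
    for name, cats in features:
        # A numeric feature keeps its name; a categorical feature yields
        # one "name=value" label per category.
        labels = [name] if cats is None else [f"{name}={c}" for c in cats]
        for label in labels:
            names[len(names)] = label
    return names


encoded_names = build_encoded_names(features)
print(encoded_names)
# {0: 'age', 1: 'color=red', 2: 'color=blue', 3: 'color=green'}
```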
- get_encoded_intervals()[source]
Get index intervals for each original feature in the encoded space.
This method returns a list of (start, end) tuples indicating the range of encoded indices that correspond to each original feature. This is useful when an original categorical feature is one-hot encoded into multiple columns.
- Returns:
- List of (start_idx, end_idx) tuples, one for each original feature.
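For numerical features, the interval covers a single encoded column. For one-hot encoded categorical features, the interval spans multiple indices. In the example below, end_idx is exclusive, so a single-column feature appears as (i, i + 1).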
- Return type:
list
Example
>>> intervals = encoder.get_encoded_intervals()
>>> # [(0, 1), (1, 4), (4, 5)]  # age (1 col), color (3 cols), income (1 col)
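As a rough sketch of how such intervals relate to the encoded feature names, the snippet below groups consecutive columns that share the same base name (the part before ‘=’) and then uses the intervals to slice an encoded row. The function `intervals_from_names` is an assumption for illustration, not the library's implementation. It assumes column indices are contiguous from 0 and treats the end index as exclusive, matching the example above.

```python
def intervals_from_names(encoded_names):
    """Group consecutive encoded columns that share a base feature name."""
    intervals = []
    start = 0
    previous = None
    for idx in sorted(encoded_names):
        base = encoded_names[idx].split("=")[0]
        if previous is not None and base != previous:
            # The base name changed: close the interval of the previous feature.
            intervals.append((start, idx))
            start = idx
        previous = base
    if previous is not None:
        # Close the interval of the last feature.
        intervals.append((start, max(encoded_names) + 1))
    return intervals


# Hypothetical mapping in the same format as get_encoded_features().
encoded_names = {
    0: "age",
    1: "color=red",
    2: "color=blue",
    3: "color=green",
    4: "income",
}
intervals = intervals_from_names(encoded_names)
print(intervals)  # [(0, 1), (1, 4), (4, 5)]

# Slice an encoded row into one block of columns per original feature.
encoded_row = [42, 0, 1, 0, 30000]
blocks = [encoded_row[start:end] for start, end in intervals]
print(blocks)  # [[42], [0, 1, 0], [30000]]
```

Slicing by interval is what makes it possible to treat all columns of a one-hot encoded feature as a single unit, for example when perturbing or decoding one original feature at a time.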