lore_sa.encoder_decoder.ColumnTransformerEnc

class lore_sa.encoder_decoder.ColumnTransformerEnc(descriptor: dict)[source]

Provides an interface to one-hot encoding (https://en.wikipedia.org/wiki/One-hot) functions. It relies on the OneHotEncoder class from sklearn.

__init__(descriptor: dict)[source]

Initialize the encoder/decoder.

Parameters: descriptor (dict) – Dictionary containing feature information, including the ‘numeric’, ‘categorical’, and ‘ordinal’ feature descriptors
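
As a rough illustration, a descriptor might be shaped like the dictionary below. Only the three top-level keys come from the parameter description above. The feature names and the inner keys (`index`, `distinct_values`) are assumptions made for this sketch, not a documented schema.

```python
# Hypothetical descriptor: only the top-level keys 'numeric', 'categorical'
# and 'ordinal' are taken from the documentation above.
descriptor = {
    "numeric": {
        "age": {"index": 0},      # assumed: column position in the raw array
        "income": {"index": 2},
    },
    "categorical": {
        # assumed: the category values the feature can take
        "color": {"index": 1, "distinct_values": ["red", "blue", "green"]},
    },
    "ordinal": {},                # no ordinal features in this toy example
}

# Group the feature names by kind to see what the encoder would have to handle.
feature_kinds = {kind: sorted(features) for kind, features in descriptor.items()}
print(feature_kinds)
```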

Methods

`__init__`(descriptor)	Initialize the encoder/decoder.
`decode`(Z)	Decode the array starting from the original descriptor
`decode_target_class`(Z)	Decode the target class
`encode`(X)	Apply the encoder to the input features
`encode_target_class`(X)	Encode the target class
`get_encoded_features`()	Get a mapping of encoded feature indices to feature names.
`get_encoded_intervals`()	Get index intervals for each original feature in the encoded space.

decode(Z: numpy.array)[source]

Decode the array starting from the original descriptor

Parameters: Z (numpy.array) – Array to decode
Returns: numpy.array – Decoded array
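
The idea behind decoding a one-hot block can be sketched in plain Python. This is a minimal illustration of the technique, not the class's implementation. The helper `one_hot_decode` and the category list are hypothetical names introduced here.

```python
def one_hot_decode(row, categories):
    """Return the category whose position holds the 1 in a one-hot row."""
    if len(row) != len(categories):
        raise ValueError("row and categories must have the same length")
    # The active category sits at the position of the largest value.
    active = max(range(len(row)), key=lambda i: row[i])
    return categories[active]


# Hypothetical category order for a 'color' feature.
categories = ["red", "blue", "green"]
decoded = [one_hot_decode(row, categories) for row in [[1, 0, 0], [0, 0, 1]]]
print(decoded)
```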

decode_target_class(Z: numpy.array)[source]

Decode the target class

Parameters: Z (numpy.array) – Array containing the target class values to be decoded

encode(X: numpy.array)[source]

Apply the encoder to the input features

Parameters: X (numpy.array) – Array to encode
Returns: numpy.array – Encoded array
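
To make the encoding step concrete, here is a small sketch of one-hot encoding a categorical column while passing a numeric column through unchanged. It illustrates the general technique only. The helper `one_hot_encode`, the sample rows, and the column order are assumptions for this example.

```python
def one_hot_encode(value, categories):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories]


# Hypothetical raw rows: (age, color).
rows = [[25, "red"], [40, "green"]]
categories = ["red", "blue", "green"]

# Keep the numeric value as-is and expand the categorical one into 3 columns.
encoded = [[age] + one_hot_encode(color, categories) for age, color in rows]
print(encoded)
```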

encode_target_class(X: numpy.array)[source]

Encode the target class

get_encoded_features()[source]

Get a mapping of encoded feature indices to feature names.

Returns:

Dictionary mapping encoded feature indices to descriptive names. For one-hot encoded features, names include the category value (e.g., ‘color=red’, ‘color=blue’).

Return type:

dict

Example

>>> features = encoder.get_encoded_features()
>>> # {0: 'age', 1: 'color=red', 2: 'color=blue', 3: 'color=green'}
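
A mapping of this shape can be built with a few lines of plain Python. The sketch below assumes numeric columns come first and that one-hot names follow the `feature=value` pattern shown above. The function `encoded_feature_names` is a hypothetical helper, not part of the class.

```python
def encoded_feature_names(numeric, categorical):
    """Map each encoded column index to a descriptive name.

    numeric: list of numeric feature names (one column each).
    categorical: dict of feature name -> list of category values.
    """
    names = list(numeric)
    for feature, values in categorical.items():
        # One encoded column per category value, named 'feature=value'.
        names.extend(f"{feature}={value}" for value in values)
    return dict(enumerate(names))


mapping = encoded_feature_names(["age"], {"color": ["red", "blue", "green"]})
print(mapping)
```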

get_encoded_intervals()[source]

Get index intervals for each original feature in the encoded space.

This method returns a list of (start, end) tuples indicating the range of encoded indices that correspond to each original feature. This is useful when an original categorical feature is one-hot encoded into multiple columns.

Returns:

List of (start_idx, end_idx) tuples, one for each original feature. For numerical features, the interval covers a single encoded column. For one-hot encoded categorical features, the interval spans multiple indices.

Return type:

list

Example

>>> intervals = encoder.get_encoded_intervals()
>>> # [(0, 1), (1, 4), (4, 5)]  # age (1 col), color (3 cols), income (1 col)
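
One way to derive such intervals from an index-to-name mapping is sketched below. It assumes half-open (start, end) intervals as in the example above, contiguous indices starting at 0, and `feature=value` names for one-hot columns. The function `intervals_from_names` is a hypothetical helper for illustration.

```python
def intervals_from_names(encoded_names):
    """Group consecutive encoded columns that share the same original feature."""
    intervals = []
    start = 0
    previous = None
    for idx in sorted(encoded_names):
        # 'color=red' belongs to 'color'; plain names like 'age' are unchanged.
        base = encoded_names[idx].split("=")[0]
        if previous is not None and base != previous:
            intervals.append((start, idx))  # close the previous feature's interval
            start = idx
        previous = base
    if previous is not None:
        intervals.append((start, max(encoded_names) + 1))  # close the last one
    return intervals


names = {0: "age", 1: "color=red", 2: "color=blue", 3: "color=green", 4: "income"}
intervals = intervals_from_names(names)
print(intervals)
```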