lore_sa.neighgen.GeneticGenerator

class lore_sa.neighgen.GeneticGenerator(bbox=None, dataset=None, encoder=None, ocr=0.1, alpha1=0.5, alpha2=0.5, metric=<function neuclidean>, ngen=30, mutpb=0.2, cxpb=0.5, tournsize=3, halloffame_ratio=0.1, random_seed=None)[source]

GeneticGenerator creates neighbor instances by generating random values starting from an input instance and pruning the generation with a fitness function based on proximity to the instance to explain.

__init__(bbox=None, dataset=None, encoder=None, ocr=0.1, alpha1=0.5, alpha2=0.5, metric=<function neuclidean>, ngen=30, mutpb=0.2, cxpb=0.5, tournsize=3, halloffame_ratio=0.1, random_seed=None)[source]

Parameters:

bbox – the Black Box model to explain
dataset – the dataset with the descriptor of the original dataset
encoder – an encoder to transform the data from/to the black box model
ocr – acronym for One Class Ratio, it is the ratio of the number of instances of the minority class
alpha1 – the weight of the similarity of the features from the given instance. The sum of alpha1 and alpha2 must be 1
alpha2 – the weight of the similarity of the target class from the given instance. The sum of alpha1 and alpha2 must be 1
metric – the distance metric to use to compute the distance between instances
ngen – the number of generations to run
mutpb – probability of mutation of a specific feature
cxpb – probability of mating two individuals
tournsize –
halloffame_ratio –
random_seed – initial seed for the random number generator

Methods

`__init__`([bbox, dataset, encoder, ocr, ...])	Initialize the generator; see the parameter list above.
`add_halloffame`(population, halloffame)
`balance_neigh`(z, Z, num_samples)
`check_generated`([filter_function, check_fuction])	It contains the logic to check the requirements for generated data
`clone`(x)
`eaSimple`(toolbox, cxpb, mutpb, ngen[, ...])	This algorithm reproduces the simplest evolutionary algorithm as presented in chapter 7 of [Back2000].
`fit`(toolbox, population_size)
`fitness_equal`(z, z1)
`fitness_notequal`(z, z1)
`generate`(z, num_instances, descriptor, encoder)	The generation is based on the strategy of generating a number of instances for the same class as the input instance and a number of instances for a different class.
`generate_synthetic_instance`([from_z, mutpb])	Generate a single synthetic instance.
`mate`(ind1, ind2)	Executes a two-point crossover on the input sequence individuals.
`mutate`(toolbox, x)
`population_fitness_equal`(z)	This fitness function evaluates the feature_similarity and the target_similarity of a population against a given instance z.
`population_fitness_notequal`(z)
`random_init`()
`record_init`(x)	This function is used to generate a random instance to start the evolutionary algorithm.
`setup_toolbox`(x, evaluate, population_size)
`setup_toolbox_noteq`(x, x1, evaluate, ...)

abstract check_generated(filter_function=None, check_fuction=None): It contains the logic to check the requirements for generated data

eaSimple(toolbox, cxpb, mutpb, ngen, stats=None, halloffame=None, verbose=True)[source]

This algorithm reproduces the simplest evolutionary algorithm as presented in chapter 7 of [Back2000].

Parameters:

population – A list of individuals.
toolbox – A Toolbox that contains the evolution operators.
cxpb – The probability of mating two individuals.
mutpb – The probability of mutating an individual.
ngen – The number of generations.
stats – A Statistics object that is updated inplace, optional.
halloffame – A HallOfFame object that will contain the best individuals, optional.
verbose – Whether or not to log the statistics.

Returns:

The final population and a deap.tools.Logbook with the statistics of the evolution

This implementation is an adaptation of the original algorithm implemented in the DEAP library.

[Back2000] Back, Fogel and Michalewicz, “Evolutionary Computation 1: Basic Algorithms and Operators”, 2000.
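As a rough illustration of the loop described for eaSimple, the following pure-Python sketch runs a simple generational algorithm: select, mate with probability cxpb, mutate with probability mutpb, re-evaluate, and repeat for ngen generations. It is neither this library's code nor DEAP's; the function name `simple_ea`, the tournament selection of size 3, the bit-flip mutation, and the toy fitness are all assumptions made for illustration.

```python
import random


def simple_ea(population, fitness, cxpb, mutpb, ngen, rng):
    """Hypothetical stand-in for a simple generational EA (maximizes fitness)."""
    logbook = []  # one record per generation, loosely mimicking a logbook
    for gen in range(ngen + 1):
        scores = [fitness(ind) for ind in population]
        logbook.append({"gen": gen, "max": max(scores)})
        if gen == ngen:
            break
        # Selection: tournaments of size 3 (an assumed choice), with cloning.
        offspring = []
        for _ in range(len(population)):
            contenders = rng.sample(range(len(population)), 3)
            winner = max(contenders, key=lambda i: scores[i])
            offspring.append(list(population[winner]))
        # Crossover: mate consecutive pairs with probability cxpb.
        for a, b in zip(offspring[::2], offspring[1::2]):
            if rng.random() < cxpb:
                cut = rng.randint(1, len(a) - 1)
                a[cut:], b[cut:] = b[cut:], a[cut:]
        # Mutation: flip one random bit of an individual with probability mutpb.
        for ind in offspring:
            if rng.random() < mutpb:
                i = rng.randrange(len(ind))
                ind[i] = 1 - ind[i]
        population = offspring
    return population, logbook


rng = random.Random(0)
start = [[rng.randint(0, 1) for _ in range(10)] for _ in range(20)]
final_pop, log = simple_ea(start, fitness=sum, cxpb=0.5, mutpb=0.2, ngen=30, rng=rng)
```

The sketch returns both the final population and the per-generation records, mirroring the two return values documented above.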

generate(z, num_instances, descriptor, encoder)[source]

The generation is based on the strategy of generating a number of instances for the same class as the input instance and a number of instances for a different class. The instances of each subgroup are generated by a genetic algorithm with its own fitness function: one for the same class and one for the different class.

Parameters:

z – the input instance, from which the generation starts
num_instances – how many elements to generate
descriptor – the descriptor of the dataset. This provides the metadata of each feature to guide the generation
encoder – the encoder to transform the data from/to the black box model

Returns:

A new set of instances generated from the input instance. The first element is the input instance

generate_synthetic_instance(from_z=None, mutpb=1.0)

Generate a single synthetic instance.

This method creates one synthetic instance by randomly sampling or mutating feature values. For categorical features, it randomly selects from valid values. For numerical features, it samples from the feature’s range.

Parameters:

from_z (np.array, optional) – Starting instance in encoded space to mutate. If None, generates a completely random instance. If provided, features are mutated with probability mutpb.
mutpb (float, optional) – Mutation probability for each feature (0 to 1). Only used when from_z is provided. Default is 1.0 (mutate all features).

Returns:

A single synthetic instance in encoded space, shape (n_encoded_features,)

Return type:

np.array

Note

The method respects feature types and valid ranges from the dataset descriptor. For categorical features, it ensures the one-hot encoding constraint (exactly one category is active).
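The sampling and mutation behaviour described for generate_synthetic_instance can be sketched in plain Python as follows. The `layout` structure, its keys, and the function name `synthetic_instance` are hypothetical and do not reflect the library's actual descriptor format; the sketch only shows how a per-feature mutation probability and the one-hot constraint could interact.

```python
import random


def synthetic_instance(layout, n_encoded, from_z=None, mutpb=1.0, rng=random):
    """Hypothetical sketch: sample or mutate one instance in encoded space."""
    z = [0.0] * n_encoded if from_z is None else list(from_z)
    for spec in layout:
        # With no starting instance every feature is sampled; otherwise
        # each feature is resampled with probability mutpb.
        if from_z is not None and rng.random() >= mutpb:
            continue
        if spec["type"] == "numeric":
            z[spec["index"]] = rng.uniform(spec["min"], spec["max"])
        else:
            # One-hot block: clear it, then activate exactly one category.
            start, end = spec["start"], spec["end"]
            for i in range(start, end):
                z[i] = 0.0
            z[rng.randrange(start, end)] = 1.0
    return z


# Assumed layout: one numeric feature followed by a three-category one-hot block.
layout = [
    {"type": "numeric", "index": 0, "min": 18.0, "max": 90.0},
    {"type": "categorical", "start": 1, "end": 4},
]
rng = random.Random(42)
fresh = synthetic_instance(layout, n_encoded=4, rng=rng)
unchanged = synthetic_instance(layout, 4, from_z=[30.0, 0.0, 1.0, 0.0], mutpb=0.0, rng=rng)
```

With `mutpb=0.0` the starting instance comes back untouched, while omitting `from_z` samples every feature from its valid range.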

mate(ind1, ind2)

Executes a two-point crossover on the input sequence individuals. The two individuals are modified in place and both keep their original length. This implementation uses the original implementation of the DEAP library. It adds a special case for the one-hot encoding, where the crossover is done taking into account the intervals of values imposed by the one-hot encoding.

Parameters:

ind1 – The first individual participating in the crossover.
ind2 – The second individual participating in the crossover.

Returns:

A tuple of two individuals.

This function uses the randint() function from the Python base random module.
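One way to respect the one-hot intervals during a two-point crossover is to allow cut points only at feature boundaries, so that no one-hot block is split between the two parents. The sketch below illustrates that idea under this assumption; the `boundaries` argument and the function name are hypothetical, and the library's actual handling may differ.

```python
import random


def two_point_crossover(ind1, ind2, boundaries, rng=random):
    """Hypothetical sketch: two-point crossover that cuts only at feature boundaries.

    boundaries lists the positions where a feature starts, plus len(ind1),
    so a one-hot block is never split between the two parents.
    """
    i = rng.randint(0, len(boundaries) - 1)
    j = rng.randint(0, len(boundaries) - 1)
    lo, hi = sorted((boundaries[i], boundaries[j]))
    # Swap the segment between the two cut points, in place.
    ind1[lo:hi], ind2[lo:hi] = ind2[lo:hi], ind1[lo:hi]
    return ind1, ind2


# Encoded layout (assumed): one numeric value, then one-hot blocks of size 3 and 2.
boundaries = [0, 1, 4, 6]
a = [25.0, 1, 0, 0, 0, 1]
b = [60.0, 0, 0, 1, 1, 0]
child1, child2 = two_point_crossover(a, b, boundaries, rng=random.Random(1))
```

Because whole blocks are exchanged, each child keeps exactly one active category per one-hot block, and both individuals keep their original length.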

population_fitness_equal(z)[source]: This fitness function evaluates the feature_similarity and the target_similarity of a population against a given instance z. The two similarities are computed using optimized functions of the numpy and scipy libraries, which improves the performance of the algorithm.
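To make the roles of alpha1 and alpha2 concrete, here is a minimal sketch of a fitness that combines the two similarities as a weighted sum. It uses plain Python rather than numpy or scipy, and the names `population_fitness` and `toy_black_box`, the mapping from distance to similarity, and the exact-match target similarity are all assumptions for illustration, not the library's formulas.

```python
import math


def population_fitness(z, population, predict, alpha1=0.5, alpha2=0.5):
    """Hypothetical sketch: weighted feature/target similarity to z."""
    assert abs(alpha1 + alpha2 - 1.0) < 1e-9, "alpha1 and alpha2 must sum to 1"
    target = predict(z)
    scores = []
    for cand in population:
        distance = math.dist(z, cand)
        # Assumed mapping from distance to a similarity in (0, 1].
        feature_similarity = 1.0 / (1.0 + distance)
        # Assumed target similarity: 1 when the predicted class matches z's.
        target_similarity = 1.0 if predict(cand) == target else 0.0
        scores.append(alpha1 * feature_similarity + alpha2 * target_similarity)
    return scores


def toy_black_box(x):
    # Stand-in classifier: class 1 when the first feature exceeds 0.5.
    return int(x[0] > 0.5)


z = [0.8, 0.2]
population = [[0.8, 0.2], [0.9, 0.2], [0.1, 0.2]]
scores = population_fitness(z, population, toy_black_box)
```

In this toy setup, candidates that are close to z and share its predicted class score highest, which is the behaviour the "equal" fitness is described as rewarding.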

record_init(x)

This function is used to generate a random instance to start the evolutionary algorithm. In this case, the input instance x is repeated across the whole initial population.

Returns:

A (not so) random instance
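The initialization strategy described for record_init, where every individual starts as the input instance, could look like the sketch below. The helper name `init_population` is hypothetical; the point is only that each individual should be an independent copy so that later mutations do not affect the others.

```python
def init_population(x, population_size):
    """Hypothetical sketch: start every individual as a copy of x."""
    return [list(x) for _ in range(population_size)]


population = init_population([0.8, 0.2, 1.0], population_size=4)
population[0][0] = 0.0  # mutating one individual leaves the others intact
```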