🧩 Layers API Reference
Welcome to the KerasFactory Layers documentation! All layers are designed to work exclusively with Keras 3 and provide specialized implementations for advanced tabular data processing, feature engineering, attention mechanisms, and time series forecasting.
What You'll Find Here
Each layer includes detailed documentation with:
- ✨ Complete parameter descriptions with types and defaults
- 🎯 Usage examples showing real-world applications
- ⚡ Best practices and performance considerations
- 🎨 When-to-use guidance for each layer
- 🔧 Implementation notes for developers
Modular & Composable
These layers can be combined to create complex neural network architectures tailored to your specific needs.
Keras 3 Compatible
All layers are built on top of Keras base classes and are fully compatible with Keras 3.
⏱️ Time Series & Forecasting
📍 PositionalEmbedding
Fixed sinusoidal positional encoding for transformers and sequence models.
kerasfactory.layers.PositionalEmbedding
Positional Embedding layer for transformer-based models.
Classes
PositionalEmbedding
Sinusoidal positional encoding layer.
Generates fixed positional encodings using sine and cosine functions with different frequencies. These are added to input embeddings to provide positional information to the model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `d_model` | `int` | Dimension of the positional embeddings. | required |
| `max_len` | `int` | Maximum length of sequences. | `5000` |
| `name` | `str \| None` | Optional name for the layer. | `None` |
Input shape
(batch_size, seq_len, ...)
Output shape
(1, seq_len, d_model)
Example
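A minimal usage sketch, assuming the documented constructor and the standard Keras call convention (shapes follow the Input/Output sections above):

```python
import keras
from kerasfactory.layers import PositionalEmbedding

layer = PositionalEmbedding(d_model=64, max_len=512)
x = keras.random.normal((8, 100, 64))  # (batch_size, seq_len, d_model)
pos = layer(x)   # (1, seq_len, d_model)
y = x + pos      # broadcast-add positional information to the embeddings
```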
Initialize the PositionalEmbedding layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `d_model` | `int` | Dimension of positional embeddings. | required |
| `max_len` | `int` | Maximum sequence length. | `5000` |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/PositionalEmbedding.py
🔧 FixedEmbedding
Non-trainable sinusoidal embeddings for discrete indices (months, days, hours, etc.).
kerasfactory.layers.FixedEmbedding
Fixed Embedding layer for temporal position encoding.
Classes
FixedEmbedding
Fixed sinusoidal embedding layer.
Provides fixed (non-trainable) sinusoidal embeddings for discrete indices, commonly used for encoding temporal features or positions.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `n_features` | `int` | Number of features / vocabulary size. | required |
| `d_model` | `int` | Dimension of the embedding vectors. | required |
| `name` | `str \| None` | Optional name for the layer. | `None` |
Input shape
(batch_size, seq_len) - integer indices
Output shape
(batch_size, seq_len, d_model)
Example
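A minimal sketch, assuming the documented constructor and integer-index inputs:

```python
import keras
from kerasfactory.layers import FixedEmbedding

# e.g. embed month indices 0-11 into 32-dimensional fixed vectors
layer = FixedEmbedding(n_features=12, d_model=32)
idx = keras.random.randint((4, 24), minval=0, maxval=12)  # (batch_size, seq_len)
emb = layer(idx)  # (4, 24, 32)
```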
Initialize the FixedEmbedding layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `n_features` | `int` | Number of discrete features/positions. | required |
| `d_model` | `int` | Dimension of embedding vectors. | required |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/FixedEmbedding.py
🎫 TokenEmbedding
1D convolution-based embedding layer for time series values.
kerasfactory.layers.TokenEmbedding
Token Embedding layer for time series using 1D convolution.
Classes
TokenEmbedding
Embeds time series values using 1D convolution.
Uses a conv1d layer with circular padding to create embeddings from raw values. Kaiming normal initialization is applied for proper training dynamics.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `c_in` | `int` | Number of input channels. | required |
| `d_model` | `int` | Dimension of output embeddings. | required |
| `name` | `str \| None` | Optional name for the layer. | `None` |
Input shape
(batch_size, time_steps, channels)
Output shape
(batch_size, time_steps, d_model)
Example
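A minimal sketch, following the documented shapes:

```python
import keras
from kerasfactory.layers import TokenEmbedding

layer = TokenEmbedding(c_in=7, d_model=64)
x = keras.random.normal((32, 96, 7))  # (batch_size, time_steps, channels)
emb = layer(x)                        # (32, 96, 64)
```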
Initialize the TokenEmbedding layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `c_in` | `int` | Number of input channels. | required |
| `d_model` | `int` | Dimension of output embeddings. | required |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/TokenEmbedding.py
⏰ TemporalEmbedding
Embedding layer for temporal features (month, day, weekday, hour, minute).
kerasfactory.layers.TemporalEmbedding
Temporal Embedding layer for time feature encoding.
Classes
TemporalEmbedding
Embeds temporal features (month, day, weekday, hour, minute).
Creates embeddings for calendar features to capture temporal patterns. Supports both fixed and trainable embedding modes.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `d_model` | `int` | Dimension of embeddings. | required |
| `embed_type` | `str` | Type of embedding: `'fixed'` or `'learned'`. | `'fixed'` |
| `freq` | `str` | Frequency: `'t'` (minute level) or `'h'` (hour level). | `'h'` |
| `name` | `str \| None` | Optional name for the layer. | `None` |
Input shape
(batch_size, seq_len, 5) - with encoded [month, day, weekday, hour, minute]
Output shape
(batch_size, seq_len, d_model)
Example
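A minimal sketch; the integer ranges of the calendar features below are illustrative only:

```python
import keras
from kerasfactory.layers import TemporalEmbedding

layer = TemporalEmbedding(d_model=64, embed_type="fixed", freq="h")
# (batch_size, seq_len, 5) with encoded [month, day, weekday, hour, minute]
x = keras.random.randint((8, 96, 5), minval=0, maxval=12)
emb = layer(x)  # (8, 96, 64)
```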
Initialize the TemporalEmbedding layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `d_model` | `int` | Dimension of embeddings. | required |
| `embed_type` | `str` | Type of embedding (`'fixed'` or `'learned'`). | `'fixed'` |
| `freq` | `str` | Frequency (`'t'` or `'h'`). | `'h'` |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/TemporalEmbedding.py
🎯 DataEmbeddingWithoutPosition
Combined token and temporal embedding layer for comprehensive feature representation.
kerasfactory.layers.DataEmbeddingWithoutPosition
Data Embedding layer combining value and temporal embeddings.
Classes
DataEmbeddingWithoutPosition
Combines token (value) and temporal embeddings.
Embeds time series values using token embedding and optionally adds temporal features. Applies dropout after combining embeddings.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `c_in` | `int` | Number of input channels. | required |
| `d_model` | `int` | Dimension of embeddings. | required |
| `embed_type` | `str` | Type of temporal embedding (`'fixed'` or `'learned'`). | `'fixed'` |
| `freq` | `str` | Frequency for temporal features (`'t'` or `'h'`). | `'h'` |
| `dropout` | `float` | Dropout rate. | `0.1` |
| `name` | `str \| None` | Optional name for the layer. | `None` |
Example
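A minimal sketch. Passing the value tensor and the calendar-feature tensor as two call arguments is an assumption; verify the call signature against the source:

```python
import keras
from kerasfactory.layers import DataEmbeddingWithoutPosition

layer = DataEmbeddingWithoutPosition(c_in=7, d_model=64, freq="h")
x = keras.random.normal((32, 96, 7))                             # raw values
x_mark = keras.random.randint((32, 96, 5), minval=0, maxval=12)  # calendar features
emb = layer(x, x_mark)  # assumed call convention; (32, 96, 64)
```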
Initialize the DataEmbeddingWithoutPosition layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `c_in` | `int` | Number of input channels. | required |
| `d_model` | `int` | Dimension of embeddings. | required |
| `embed_type` | `str` | Type of temporal embedding. | `'fixed'` |
| `freq` | `str` | Frequency for temporal embedding. | `'h'` |
| `dropout` | `float` | Dropout rate. | `0.1` |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/DataEmbeddingWithoutPosition.py
🏃 MovingAverage
Trend extraction layer using moving average filtering for time series.
kerasfactory.layers.MovingAverage
Moving Average layer for time series trend extraction.
Classes
MovingAverage
Extracts the trend component using moving average.
This layer computes a moving average over time series to extract the trend component. It applies padding at both ends to maintain the temporal dimension.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `kernel_size` | `int` | Size of the moving average window. | required |
| `name` | `str \| None` | Optional name for the layer. | `None` |
Input shape
(batch_size, time_steps, channels)
Output shape
(batch_size, time_steps, channels)
Example
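A minimal sketch, following the documented shapes:

```python
import keras
from kerasfactory.layers import MovingAverage

layer = MovingAverage(kernel_size=25)
x = keras.random.normal((32, 96, 7))  # (batch_size, time_steps, channels)
trend = layer(x)                      # (32, 96, 7), smoothed trend component
```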
Initialize the MovingAverage layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `kernel_size` | `int` | Size of the moving average kernel. | required |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/MovingAverage.py
🔀 SeriesDecomposition
Trend-seasonal decomposition using moving average.
kerasfactory.layers.SeriesDecomposition
Series Decomposition layer for time series trend-seasonal separation.
Classes
SeriesDecomposition
Decomposes time series into trend and seasonal components.
Uses moving average to extract the trend component, then computes seasonal as the residual (input - trend).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `kernel_size` | `int` | Size of the moving average window. | required |
| `name` | `str \| None` | Optional name for the layer. | `None` |
|
Input shape
(batch_size, time_steps, channels)
Output shape
- seasonal: (batch_size, time_steps, channels)
- trend: (batch_size, time_steps, channels)
Example
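A minimal sketch; the two outputs follow the documented output shapes:

```python
import keras
from kerasfactory.layers import SeriesDecomposition

layer = SeriesDecomposition(kernel_size=25)
x = keras.random.normal((32, 96, 7))
seasonal, trend = layer(x)  # each (32, 96, 7); seasonal = x - trend
```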
Initialize the SeriesDecomposition layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `kernel_size` | `int` | Size of the moving average window. | required |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/SeriesDecomposition.py
📊 DFTSeriesDecomposition
Frequency-based series decomposition using Discrete Fourier Transform.
kerasfactory.layers.DFTSeriesDecomposition
DFT-based Series Decomposition layer using frequency domain analysis.
Classes
DFTSeriesDecomposition
Decomposes time series using DFT (Discrete Fourier Transform).
Extracts seasonal components by selecting top-k frequencies in the frequency domain, then computes trend as the residual. This method captures periodic patterns more explicitly than moving average.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `top_k` | `int` | Number of top frequencies to keep as seasonal component. | required |
| `name` | `str \| None` | Optional name for the layer. | `None` |
|
Input shape
(batch_size, time_steps, channels)
Output shape
- seasonal: (batch_size, time_steps, channels)
- trend: (batch_size, time_steps, channels)
Example
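A minimal sketch, mirroring the moving-average decomposition above:

```python
import keras
from kerasfactory.layers import DFTSeriesDecomposition

layer = DFTSeriesDecomposition(top_k=5)
x = keras.random.normal((32, 96, 7))
seasonal, trend = layer(x)  # each (32, 96, 7)
```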
Initialize the DFTSeriesDecomposition layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `top_k` | `int` | Number of top frequencies to keep. | required |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/DFTSeriesDecomposition.py
🔄 ReversibleInstanceNorm
Reversible instance normalization with optional denormalization for time series.
kerasfactory.layers.ReversibleInstanceNorm
Reversible Instance Normalization layer for time series.
Classes
ReversibleInstanceNorm
Reversible Instance Normalization (RevIN) for time series.
Normalizes each series independently and enables reversible denormalization. This is useful for improving model performance by removing distributional shifts.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `num_features` | `int` | Number of features/channels. | required |
| `eps` | `float` | Small value for numerical stability. | `1e-05` |
| `affine` | `bool` | Whether to use learnable scale and shift. | `False` |
| `subtract_last` | `bool` | If `True`, normalize by last value instead of mean. | `False` |
| `non_norm` | `bool` | If `True`, no normalization is applied. | `False` |
| `name` | `str \| None` | Optional name for the layer. | `None` |
Example
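A minimal sketch. The `mode="norm"` / `mode="denorm"` flag follows the usual RevIN convention and is an assumption here; check the layer's call signature in the source:

```python
import keras
from kerasfactory.layers import ReversibleInstanceNorm

layer = ReversibleInstanceNorm(num_features=7)
x = keras.random.normal((32, 96, 7))
x_norm = layer(x, mode="norm")      # normalize each series (assumed signature)
y = x_norm                          # ... model forward pass ...
y_denorm = layer(y, mode="denorm")  # reverse the normalization
```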
Initialize the ReversibleInstanceNorm layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `num_features` | `int` | Number of features. | required |
| `eps` | `float` | Epsilon for numerical stability. | `1e-05` |
| `affine` | `bool` | Whether to use learnable affine transformation. | `False` |
| `subtract_last` | `bool` | Whether to normalize by last value. | `False` |
| `non_norm` | `bool` | Whether to skip normalization. | `False` |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/ReversibleInstanceNorm.py
🏗️ ReversibleInstanceNormMultivariate
Multivariate version of reversible instance normalization.
kerasfactory.layers.ReversibleInstanceNormMultivariate
Multivariate Reversible Instance Normalization layer.
Classes
ReversibleInstanceNormMultivariate
Reversible Instance Normalization for multivariate time series.
Normalizes each series independently across the time dimension, enabling reversible denormalization. Designed for multivariate data.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `num_features` | `int` | Number of features/channels. | required |
| `eps` | `float` | Small value for numerical stability. | `1e-05` |
| `affine` | `bool` | Whether to use learnable scale and shift. | `False` |
| `name` | `str \| None` | Optional name for the layer. | `None` |
|
Example
Initialize the ReversibleInstanceNormMultivariate layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `num_features` | `int` | Number of features. | required |
| `eps` | `float` | Epsilon for numerical stability. | `1e-05` |
| `affine` | `bool` | Whether to use learnable affine transformation. | `False` |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/ReversibleInstanceNormMultivariate.py
🌊 MultiScaleSeasonMixing
Bottom-up multi-scale seasonal pattern mixing.
kerasfactory.layers.MultiScaleSeasonMixing
Multi-Scale Season Mixing layer for hierarchical seasonal pattern mixing.
Classes
MultiScaleSeasonMixing
Mixes seasonal patterns across multiple scales bottom-up.
Processes seasonal components at different temporal resolutions, mixing information from coarse to fine scales.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `seq_len` | `int` | Input sequence length. | required |
| `down_sampling_window` | `int` | Window size for downsampling. | required |
| `down_sampling_layers` | `int` | Number of downsampling layers. | required |
| `name` | `str \| None` | Optional name for the layer. | `None` |
|
Example
Initialize the MultiScaleSeasonMixing layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `seq_len` | `int` | Sequence length. | required |
| `down_sampling_window` | `int` | Downsampling window size. | required |
| `down_sampling_layers` | `int` | Number of downsampling layers. | required |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/MultiScaleSeasonMixing.py
📈 MultiScaleTrendMixing
Top-down multi-scale trend pattern mixing.
kerasfactory.layers.MultiScaleTrendMixing
Multi-Scale Trend Mixing layer for hierarchical trend pattern mixing.
Classes
MultiScaleTrendMixing
Mixes trend patterns across multiple scales top-down.
Processes trend components at different temporal resolutions, mixing information from fine to coarse scales.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `seq_len` | `int` | Input sequence length. | required |
| `down_sampling_window` | `int` | Window size for downsampling. | required |
| `down_sampling_layers` | `int` | Number of downsampling layers. | required |
| `name` | `str \| None` | Optional name for the layer. | `None` |
|
Example
Initialize the MultiScaleTrendMixing layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `seq_len` | `int` | Sequence length. | required |
| `down_sampling_window` | `int` | Downsampling window size. | required |
| `down_sampling_layers` | `int` | Number of downsampling layers. | required |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/MultiScaleTrendMixing.py
🔀 PastDecomposableMixing
Past decomposable mixing encoder block combining decomposition and multi-scale mixing.
kerasfactory.layers.PastDecomposableMixing
Past Decomposable Mixing layer for time series encoder blocks.
Classes
PastDecomposableMixing
Past Decomposable Mixing block for TimeMixer encoder.
Decomposes time series, applies multi-scale mixing to trend and seasonal components, then reconstructs the signal.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `seq_len` | `int` | Sequence length. | required |
| `pred_len` | `int` | Prediction length. | required |
| `down_sampling_window` | `int` | Downsampling window size. | required |
| `down_sampling_layers` | `int` | Number of downsampling layers. | required |
| `d_model` | `int` | Model dimension. | required |
| `dropout` | `float` | Dropout rate. | required |
| `channel_independence` | `int` | Whether to use channel-independent processing. | required |
| `decomp_method` | `str` | Decomposition method (`'moving_avg'` or `'dft_decomp'`). | required |
| `d_ff` | `int` | Feed-forward dimension. | required |
| `moving_avg` | `int` | Window size for moving average. | required |
| `top_k` | `int` | Top-k frequencies for DFT. | required |
| `name` | `str \| None` | Optional name for the layer. | `None` |
|
Example
Initialize the PastDecomposableMixing layer.
Source code in kerasfactory/layers/PastDecomposableMixing.py
⏱️ TemporalMixing
MLP-based temporal mixing layer for TSMixer that applies transformations across the time dimension.
kerasfactory.layers.TemporalMixing
Temporal Mixing layer for TSMixer model.
Classes
TemporalMixing
Temporal mixing layer using MLP on time dimension.
Applies batch normalization and linear transformation across the time dimension to mix temporal information while preserving the multivariate structure.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `n_series` | `int` | Number of time series (channels/features). | required |
| `input_size` | `int` | Length of the time series (sequence length). | required |
| `dropout` | `float` | Dropout rate between 0 and 1. | required |
Input shape
(batch_size, input_size, n_series)
Output shape
(batch_size, input_size, n_series)
Example
```python
layer = TemporalMixing(n_series=7, input_size=96, dropout=0.1)
x = keras.random.normal((32, 96, 7))
output = layer(x)
output.shape  # (32, 96, 7)
```
Initialize the TemporalMixing layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `n_series` | `int` | Number of time series. | required |
| `input_size` | `int` | Length of time series. | required |
| `dropout` | `float` | Dropout rate. | required |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/TemporalMixing.py
🔀 FeatureMixing
Feed-forward network mixing layer for TSMixer that learns cross-series correlations across feature dimension.
kerasfactory.layers.FeatureMixing
Feature Mixing layer for TSMixer model.
Classes
FeatureMixing
Feature mixing layer using MLP on channel dimension.
Applies batch normalization and feed-forward network across the feature (channel) dimension to mix information between different time series while preserving temporal structure.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `n_series` | `int` | Number of time series (channels/features). | required |
| `input_size` | `int` | Length of the time series (sequence length). | required |
| `dropout` | `float` | Dropout rate between 0 and 1. | required |
| `ff_dim` | `int` | Dimension of the hidden layer in the feed-forward network. | required |
Input shape
(batch_size, input_size, n_series)
Output shape
(batch_size, input_size, n_series)
Example
```python
layer = FeatureMixing(n_series=7, input_size=96, dropout=0.1, ff_dim=64)
x = keras.random.normal((32, 96, 7))
output = layer(x)
output.shape  # (32, 96, 7)
```
Initialize the FeatureMixing layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `n_series` | `int` | Number of time series. | required |
| `input_size` | `int` | Length of time series. | required |
| `dropout` | `float` | Dropout rate. | required |
| `ff_dim` | `int` | Feed-forward hidden dimension. | required |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/FeatureMixing.py
🔀 MixingLayer
Core mixing block combining TemporalMixing and FeatureMixing for the TSMixer architecture.
kerasfactory.layers.MixingLayer
Mixing Layer combining temporal and feature mixing for TSMixer.
Classes
MixingLayer
Mixing layer combining temporal and feature mixing.
A mixing layer consists of sequential temporal and feature MLPs that jointly learn temporal and cross-sectional representations.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `n_series` | `int` | Number of time series (channels/features). | required |
| `input_size` | `int` | Length of the time series (sequence length). | required |
| `dropout` | `float` | Dropout rate between 0 and 1. | required |
| `ff_dim` | `int` | Dimension of the hidden layer in the feed-forward network. | required |
Input shape
(batch_size, input_size, n_series)
Output shape
(batch_size, input_size, n_series)
Example
```python
layer = MixingLayer(n_series=7, input_size=96, dropout=0.1, ff_dim=64)
x = keras.random.normal((32, 96, 7))
output = layer(x)
output.shape  # (32, 96, 7)
```
Initialize the MixingLayer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `n_series` | `int` | Number of time series. | required |
| `input_size` | `int` | Length of time series. | required |
| `dropout` | `float` | Dropout rate. | required |
| `ff_dim` | `int` | Feed-forward hidden dimension. | required |
| `name` | `str \| None` | Optional layer name. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/MixingLayer.py
🎯 Feature Selection & Gating
🔀 VariableSelection
Dynamic feature selection using gated residual networks with optional context conditioning.
kerasfactory.layers.VariableSelection
This module implements a VariableSelection layer that applies a gated residual network to each feature independently and learns feature weights through a softmax layer. It's particularly useful for dynamic feature selection in time series and tabular models.
Classes
VariableSelection
Layer for dynamic feature selection using gated residual networks.
This layer applies a gated residual network to each feature independently and learns feature weights through a softmax layer. It can optionally use a context vector to condition the feature selection.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `nr_features` | `int` | Number of input features. | required |
| `units` | `int` | Number of hidden units in the gated residual network. | required |
| `dropout_rate` | `float` | Dropout rate for regularization. | `0.1` |
| `use_context` | `bool` | Whether to use a context vector for conditioning. | `False` |
| `name` | `str` | Name for the layer. | `None` |
Input shape
If use_context is False:
- Single tensor with shape: (batch_size, nr_features, feature_dim)
If use_context is True:
- List of two tensors:
- Features tensor with shape: (batch_size, nr_features, feature_dim)
- Context tensor with shape: (batch_size, context_dim)
Output shape
Tuple of two tensors:
- Selected features: (batch_size, feature_dim)
- Feature weights: (batch_size, nr_features)
Example
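A minimal sketch without a context vector, following the documented shapes:

```python
import keras
from kerasfactory.layers import VariableSelection

layer = VariableSelection(nr_features=10, units=16, dropout_rate=0.1)
features = keras.random.normal((32, 10, 16))  # (batch, nr_features, feature_dim)
selected, weights = layer(features)
# selected: (32, 16), weights: (32, 10)
```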
Initialize the VariableSelection layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `nr_features` | `int` | Number of input features. | required |
| `units` | `int` | Number of units in the selection network. | required |
| `dropout_rate` | `float` | Dropout rate. | `0.1` |
| `use_context` | `bool` | Whether to use context for selection. | `False` |
| `name` | `str \| None` | Name of the layer. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/VariableSelection.py
Functions
compute_output_shape

Compute the output shape of the layer.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_shape` | `tuple[int, ...] \| list[tuple[int, ...]]` | Shape of the input tensor, or list of shapes if using context. | required |

Returns:

| Type | Description |
|---|---|
| `list[tuple[int, ...]]` | List of shapes for the output tensors. |
Source code in kerasfactory/layers/VariableSelection.py
🚪 GatedFeatureSelection
Feature selection layer using gating mechanisms for conditional feature routing.
kerasfactory.layers.GatedFeatureSelection
Gated feature selection layer with residual connection.
This layer implements a learnable feature selection mechanism using a gating network. Each feature is assigned a dynamic importance weight between 0 and 1 through a multi-layer gating network. The gating network includes batch normalization and ReLU activations for stable training. A small residual connection (0.1) is added to maintain gradient flow.
The layer is particularly useful for:
1. Dynamic feature importance learning
2. Feature selection in time-series data
3. Attention-like mechanisms for tabular data
4. Reducing noise in input features
Example:
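A minimal sketch, following the documented parameters:

```python
import keras
from kerasfactory.layers import GatedFeatureSelection

layer = GatedFeatureSelection(input_dim=32, reduction_ratio=4)
x = keras.random.normal((16, 32))
y = layer(x)  # (16, 32), features reweighted by learned gates
```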
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_dim` | `int` | Dimension of the input features. | required |
| `reduction_ratio` | `int` | Ratio to reduce the hidden dimension of the gating network. A higher ratio means fewer parameters but potentially less expressive gates. A ratio of 4 means the hidden dimension is `input_dim // 4`. | `4` |
Initialize the gated feature selection layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_dim` | `int` | Dimension of the input features. Must match the last dimension of the input tensor. | required |
| `reduction_ratio` | `int` | Ratio to reduce the hidden dimension of the gating network. The hidden dimension will be `max(input_dim // reduction_ratio, 1)`. | `4` |
| `**kwargs` | `dict[str, Any]` | Additional layer arguments passed to the parent `Layer` class. | `{}` |

Source code in kerasfactory/layers/GatedFeaturesSelection.py
Functions
from_config
classmethod
Create layer from configuration.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `dict[str, Any]` | Layer configuration dictionary. | required |

Returns:

| Type | Description |
|---|---|
| `GatedFeatureSelection` | `GatedFeatureSelection` instance. |
Source code in kerasfactory/layers/GatedFeaturesSelection.py
🌊 GatedFeatureFusion
Combines and fuses features using gated mechanisms for adaptive feature integration.
kerasfactory.layers.GatedFeatureFusion
This module implements a GatedFeatureFusion layer that combines two feature representations through a learned gating mechanism. It's particularly useful for tabular datasets with multiple representations (e.g., raw numeric features alongside embeddings).
Classes
GatedFeatureFusion
Gated feature fusion layer for combining two feature representations.
This layer takes two inputs (e.g., numerical features and their embeddings) and fuses them using a learned gate to balance their contributions. The gate is computed using a dense layer with sigmoid activation, applied to the concatenation of both inputs.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `activation` | `str` | Activation function to use for the gate. | `'sigmoid'` |
| `name` | `str \| None` | Optional name for the layer. | `None` |
Input shape
A list of 2 tensors with shape: [(batch_size, ..., features), (batch_size, ..., features)]
Both inputs must have the same shape.
Output shape
Tensor with shape: (batch_size, ..., features), same as each input.
Example
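A minimal sketch with two same-shape inputs, as the Input shape section requires:

```python
import keras
from kerasfactory.layers import GatedFeatureFusion

fusion = GatedFeatureFusion()
a = keras.random.normal((32, 16))  # e.g. raw numeric features
b = keras.random.normal((32, 16))  # e.g. their embeddings
fused = fusion([a, b])             # (32, 16)
```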
Initialize the GatedFeatureFusion layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `activation` | `str` | Activation function for the gate. | `'sigmoid'` |
| `name` | `str \| None` | Name of the layer. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/GatedFeatureFusion.py
📍 GatedLinearUnit
Gated linear transformation for controlling information flow in neural networks.
kerasfactory.layers.GatedLinearUnit
This module implements a GatedLinearUnit layer that applies a gated linear transformation to input tensors. It's particularly useful for controlling information flow in neural networks.
Classes
GatedLinearUnit
GatedLinearUnit is a custom Keras layer that implements a gated linear unit.
This layer applies a dense linear transformation to the input tensor and multiplies the result with the output of a dense sigmoid transformation. The result is a tensor where the input data is filtered based on the learned weights and biases of the layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `units` | `int` | Positive integer, dimensionality of the output space. | required |
| `name` | `str` | Name for the layer. | `None` |
Input shape
Tensor with shape: (batch_size, ..., input_dim)
Output shape
Tensor with shape: (batch_size, ..., units)
Example
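A minimal sketch, following the documented shapes:

```python
import keras
from kerasfactory.layers import GatedLinearUnit

glu = GatedLinearUnit(units=32)
x = keras.random.normal((16, 64))
y = glu(x)  # (16, 32)
```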
Initialize the GatedLinearUnit layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `units` | `int` | Number of units in the layer. | required |
| `name` | `str \| None` | Name of the layer. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/GatedLinearUnit.py
🔗 GatedResidualNetwork
Gated residual network architecture for feature processing with residual connections.
kerasfactory.layers.GatedResidualNetwork
This module implements a GatedResidualNetwork layer that combines residual connections with gated linear units for improved gradient flow and feature transformation.
Classes
GatedResidualNetwork
GatedResidualNetwork is a custom Keras layer that implements a gated residual network.
This layer applies a series of transformations to the input tensor and combines the result with the input using a residual connection. The transformations include a dense layer with ELU activation, a dense linear layer, a dropout layer, a gated linear unit layer, layer normalization, and a final dense layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `units` | `int` | Positive integer, dimensionality of the output space. | required |
| `dropout_rate` | `float` | Dropout rate for regularization. | `0.2` |
| `name` | `str` | Name for the layer. | `None` |
Input shape
Tensor with shape: (batch_size, ..., input_dim)
Output shape
Tensor with shape: (batch_size, ..., units)
Example
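A minimal sketch:

```python
import keras
from kerasfactory.layers import GatedResidualNetwork

grn = GatedResidualNetwork(units=32, dropout_rate=0.2)
x = keras.random.normal((16, 32))
y = grn(x)  # (16, 32)
```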
Initialize the GatedResidualNetwork.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `units` | `int` | Number of units in the network. | required |
| `dropout_rate` | `float` | Dropout rate. | `0.2` |
| `name` | `str \| None` | Name of the layer. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/GatedResidualNetwork.py
👁️ Attention Mechanisms
🎯 TabularAttention
Dual attention mechanism for tabular data with inter-feature and inter-sample attention.
kerasfactory.layers.TabularAttention
This module implements a TabularAttention layer that applies inter-feature and inter-sample attention mechanisms for tabular data. It's particularly useful for capturing complex relationships between features and samples in tabular datasets.
Classes
TabularAttention
Custom layer to apply inter-feature and inter-sample attention for tabular data.
This layer implements a dual attention mechanism:
1. Inter-feature attention: captures dependencies between features for each sample
2. Inter-sample attention: captures dependencies between samples for each feature
The layer uses MultiHeadAttention for both attention mechanisms and includes layer normalization, dropout, and a feed-forward network.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `num_heads` | `int` | Number of attention heads. | required |
| `d_model` | `int` | Dimensionality of the attention model. | required |
| `dropout_rate` | `float` | Dropout rate for regularization. | `0.1` |
| `name` | `str` | Name for the layer. | `None` |
|
Input shape
Tensor with shape: (batch_size, num_samples, num_features)
Output shape
Tensor with shape: (batch_size, num_samples, d_model)
Example
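A minimal sketch, following the documented shapes:

```python
import keras
from kerasfactory.layers import TabularAttention

attn = TabularAttention(num_heads=4, d_model=32, dropout_rate=0.1)
x = keras.random.normal((8, 100, 20))  # (batch, num_samples, num_features)
y = attn(x)                            # (8, 100, 32)
```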
Initialize the TabularAttention layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `num_heads` | `int` | Number of attention heads. | required |
| `d_model` | `int` | Model dimension. | required |
| `dropout_rate` | `float` | Dropout rate. | `0.1` |
| `name` | `str \| None` | Name of the layer. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/TabularAttention.py
Functions
compute_output_shape

Compute the output shape of the layer.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_shape` | `tuple[int, ...]` | Shape of the input tensor. | required |

Returns:

| Type | Description |
|---|---|
| `tuple[int, ...]` | Shape of the output tensor. |
Source code in kerasfactory/layers/TabularAttention.py
📊 MultiResolutionTabularAttention
Multi-resolution attention mechanism for capturing features at different scales.
kerasfactory.layers.MultiResolutionTabularAttention
This module implements a MultiResolutionTabularAttention layer that applies separate attention mechanisms for numerical and categorical features, along with cross-attention between them. It's particularly useful for mixed-type tabular data.
Classes
MultiResolutionTabularAttention
Custom layer to apply multi-resolution attention for mixed-type tabular data.
This layer implements separate attention mechanisms for numerical and categorical features, along with cross-attention between them. It's designed to handle the different characteristics of numerical and categorical features in tabular data.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `num_heads` | `int` | Number of attention heads. | required |
| `d_model` | `int` | Dimensionality of the attention model. | required |
| `dropout_rate` | `float` | Dropout rate for regularization. | `0.1` |
| `name` | `str` | Name for the layer. | `None` |
|
Input shape
List of two tensors:
- Numerical features: (batch_size, num_samples, num_numerical_features)
- Categorical features: (batch_size, num_samples, num_categorical_features)
Output shape
List of two tensors with shapes:
- (batch_size, num_samples, d_model) (numerical features)
- (batch_size, num_samples, d_model) (categorical features)
Example
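A minimal sketch with already-encoded numerical and categorical inputs, following the documented list-of-tensors convention:

```python
import keras
from kerasfactory.layers import MultiResolutionTabularAttention

attn = MultiResolutionTabularAttention(num_heads=4, d_model=32)
num = keras.random.normal((8, 100, 10))  # numerical features
cat = keras.random.normal((8, 100, 5))   # categorical features (encoded)
num_out, cat_out = attn([num, cat])      # each (8, 100, 32)
```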
Initialize the MultiResolutionTabularAttention.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `num_heads` | `int` | Number of attention heads. | required |
| `d_model` | `int` | Model dimension. | required |
| `dropout_rate` | `float` | Dropout rate. | `0.1` |
| `name` | `str \| None` | Name of the layer. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/MultiResolutionTabularAttention.py
Functions
compute_output_shape

Compute the output shape of the layer.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_shape` | `list[tuple[int, ...]]` | List of shapes of the input tensors. | required |

Returns:

| Type | Description |
|---|---|
| `list[tuple[int, ...]]` | List of shapes of the output tensors. |
Source code in kerasfactory/layers/MultiResolutionTabularAttention.py
🔍 InterpretableMultiHeadAttention
Interpretable multi-head attention layer with explainability features.
kerasfactory.layers.InterpretableMultiHeadAttention
Interpretable Multi-Head Attention layer implementation.
Classes
InterpretableMultiHeadAttention
Interpretable Multi-Head Attention layer.
This layer wraps Keras MultiHeadAttention and stores the attention scores
for interpretability purposes. The attention scores can be accessed via
the attention_scores attribute after calling the layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `d_model` | `int` | Size of each attention head for query, key, value. | required |
| `n_head` | `int` | Number of attention heads. | required |
| `dropout_rate` | `float` | Dropout probability. | `0.1` |
| `**kwargs` | `dict[str, Any]` | Additional arguments passed to `MultiHeadAttention`. Supported: `value_dim` (size of each attention head for value), `use_bias` (default `True`), `output_shape` (default `None`), `attention_axes` (default `None`), `kernel_initializer` (default `'glorot_uniform'`), `bias_initializer` (default `'zeros'`), `kernel_regularizer`, `bias_regularizer`, `activity_regularizer`, `kernel_constraint`, `bias_constraint` (all default `None`), and `seed` (default `None`). | `{}` |
Call Args
query: Query tensor of shape (B, S, E) where B is batch size,
S is sequence length, and E is the feature dimension.
key: Key tensor of shape (B, S, E).
value: Value tensor of shape (B, S, E).
training: Python boolean indicating whether the layer should behave in
training mode (applying dropout) or in inference mode (no dropout).
Returns:

| Name | Type | Description |
|---|---|---|
| `output` | | Attention output of shape `(B, S, E)`. |
Example
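A minimal self-attention sketch; the stored `attention_scores` attribute is documented above:

```python
import keras
from kerasfactory.layers import InterpretableMultiHeadAttention

mha = InterpretableMultiHeadAttention(d_model=64, n_head=4)
x = keras.random.normal((8, 20, 64))  # (B, S, E)
out = mha(x, x, x)                    # query = key = value
scores = mha.attention_scores         # inspect attention weights
```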
Initialize the layer.
Source code in kerasfactory/layers/InterpretableMultiHeadAttention.py
Functions
from_config (classmethod)
Create layer from configuration.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `dict[str, Any]` | Layer configuration dictionary. | required |

Returns:

| Type | Description |
|---|---|
| `InterpretableMultiHeadAttention` | Layer instance. |
Source code in kerasfactory/layers/InterpretableMultiHeadAttention.py
🧠 TransformerBlock
Complete transformer block combining self-attention and feed-forward networks.
kerasfactory.layers.TransformerBlock
This module implements a TransformerBlock layer that applies transformer-style self-attention and feed-forward processing to input tensors. It's particularly useful for capturing complex relationships in tabular data.
Classes
TransformerBlock
Transformer block with multi-head attention and feed-forward layers.
This layer implements a standard transformer block with multi-head self-attention followed by a feed-forward network, with residual connections and layer normalization.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `dim_model` | `int` | Dimensionality of the model. | `32` |
| `num_heads` | `int` | Number of attention heads. | `3` |
| `ff_units` | `int` | Number of units in the feed-forward network. | `16` |
| `dropout_rate` | `float` | Dropout rate for regularization. | `0.2` |
| `name` | `str` | Name for the layer. | `None` |
Input shape
Tensor with shape: (batch_size, sequence_length, dim_model) or
(batch_size, dim_model) which will be automatically reshaped.
Output shape
Tensor with shape: (batch_size, sequence_length, dim_model) or
(batch_size, dim_model) matching the input shape.
Example
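A minimal sketch, following the documented shapes:

```python
import keras
from kerasfactory.layers import TransformerBlock

block = TransformerBlock(dim_model=32, num_heads=4, ff_units=64, dropout_rate=0.1)
x = keras.random.normal((8, 10, 32))  # (batch, sequence_length, dim_model)
y = block(x)                          # (8, 10, 32)
```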
Initialize the TransformerBlock layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `dim_model` | `int` | Model dimension. | `32` |
| `num_heads` | `int` | Number of attention heads. | `3` |
| `ff_units` | `int` | Feed-forward units. | `16` |
| `dropout_rate` | `float` | Dropout rate. | `0.2` |
| `name` | `str \| None` | Name of the layer. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/TransformerBlock.py
Functions
compute_output_shape

Compute the output shape of the layer.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_shape` | `tuple[int, ...]` | Shape of the input tensor. | required |

Returns:

| Type | Description |
|---|---|
| `tuple[int, ...]` | Shape of the output tensor. |
Source code in kerasfactory/layers/TransformerBlock.py
📌 ColumnAttention
Attention mechanism focused on inter-column (feature) relationships.
kerasfactory.layers.ColumnAttention
Column attention mechanism for weighting features dynamically.
Classes
ColumnAttention
Column attention mechanism to weight features dynamically.
This layer applies attention weights to each feature (column) in the input tensor. The attention weights are computed using a two-layer neural network that takes the input features and outputs attention weights for each feature.
Example
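A minimal sketch, assuming inputs of shape `(batch_size, input_dim)`:

```python
import keras
from kerasfactory.layers import ColumnAttention

attn = ColumnAttention(input_dim=16)
x = keras.random.normal((32, 16))
y = attn(x)  # (32, 16), features reweighted by attention
```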
Initialize column attention.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_dim` | `int` | Input dimension. | required |
| `hidden_dim` | `int \| None` | Hidden layer dimension. If `None`, uses `input_dim // 2`. | `None` |
| `**kwargs` | `dict[str, Any]` | Additional layer arguments. | `{}` |

Source code in kerasfactory/layers/ColumnAttention.py
Functions
from_config (classmethod)
Create layer from configuration.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `dict[str, Any]` | Layer configuration dictionary. | required |

Returns:

| Type | Description |
|---|---|
| `ColumnAttention` | `ColumnAttention` instance. |
Source code in kerasfactory/layers/ColumnAttention.py
📍 RowAttention
Attention mechanism focused on inter-row (sample) relationships.
kerasfactory.layers.RowAttention
Row attention mechanism for weighting samples in a batch.
Classes
RowAttention
Row attention mechanism to weight samples dynamically.
This layer applies attention weights to each sample (row) in the input tensor. The attention weights are computed using a two-layer neural network that takes each sample as input and outputs a scalar attention weight.
Example
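A minimal sketch, assuming inputs of shape `(batch_size, feature_dim)`:

```python
import keras
from kerasfactory.layers import RowAttention

attn = RowAttention(feature_dim=16)
x = keras.random.normal((32, 16))
y = attn(x)  # (32, 16), samples reweighted by attention
```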
Initialize row attention.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `feature_dim` | `int` | Number of input features. | required |
| `hidden_dim` | `int \| None` | Hidden layer dimension. If `None`, uses `feature_dim // 2`. | `None` |
| `**kwargs` | `dict[str, Any]` | Additional layer arguments. | `{}` |

Source code in kerasfactory/layers/RowAttention.py
Functions
from_config (classmethod)
Create layer from configuration.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `dict[str, Any]` | Layer configuration dictionary. | required |

Returns:

| Type | Description |
|---|---|
| `RowAttention` | `RowAttention` instance. |
Source code in kerasfactory/layers/RowAttention.py
📊 Data Preprocessing & Transformation
🔄 DistributionTransformLayer
Transforms data distributions (log, Box-Cox, Yeo-Johnson, etc.) for improved analysis.
kerasfactory.layers.DistributionTransformLayer
This module implements a DistributionTransformLayer that applies various transformations to make data more normally distributed or to handle specific distribution types better. It's particularly useful for preprocessing data before anomaly detection or other statistical analyses.
Classes
DistributionTransformLayer
Layer for transforming data distributions to improve anomaly detection.
This layer applies various transformations to make data more normally distributed or to handle specific distribution types better. Supported transformations include log, square root, Box-Cox, Yeo-Johnson, arcsinh, cube-root, logit, quantile, robust-scale, and min-max.
When transform_type is set to 'auto', the layer automatically selects the most appropriate transformation based on the data characteristics during training.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `transform_type` | `str` | Type of transformation to apply. Options are `'none'`, `'log'`, `'sqrt'`, `'box-cox'`, `'yeo-johnson'`, `'arcsinh'`, `'cube-root'`, `'logit'`, `'quantile'`, `'robust-scale'`, `'min-max'`, or `'auto'`. | `'none'` |
| `lambda_param` | `float` | Parameter for parameterized transformations like Box-Cox and Yeo-Johnson. | `0.0` |
| `epsilon` | `float` | Small value added to prevent numerical issues like `log(0)`. | `1e-10` |
| `min_value` | `float` | Minimum value for min-max scaling. | `0.0` |
| `max_value` | `float` | Maximum value for min-max scaling. | `1.0` |
| `clip_values` | `bool` | Whether to clip values to the specified range in min-max scaling. | `True` |
| `auto_candidates` | `list[str] \| None` | List of transformation types to consider when `transform_type` is `'auto'`. If `None`, all available transformations will be considered. | `None` |
| `name` | `str \| None` | Optional name for the layer. | `None` |
|
Input shape
N-D tensor with shape: (batch_size, ..., features)
Output shape
Same shape as input: (batch_size, ..., features)
Example
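A minimal sketch showing an explicit transform and the `'auto'` mode:

```python
import keras
from kerasfactory.layers import DistributionTransformLayer

transform = DistributionTransformLayer(transform_type="yeo-johnson")
x = keras.random.normal((100, 5))
y = transform(x)  # same shape, values transformed toward normality

# 'auto' selects a transformation from the candidates during training
auto = DistributionTransformLayer(transform_type="auto")
```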
Initialize the DistributionTransformLayer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `transform_type` | `str` | Type of transformation to apply. | `'none'` |
| `lambda_param` | `float` | Lambda parameter for Box-Cox transformation. | `0.0` |
| `epsilon` | `float` | Small value to avoid division by zero. | `1e-10` |
| `min_value` | `float` | Minimum value for clipping. | `0.0` |
| `max_value` | `float` | Maximum value for clipping. | `1.0` |
| `clip_values` | `bool` | Whether to clip values. | `True` |
| `auto_candidates` | `list[str] \| None` | List of candidate transformations for auto mode. | `None` |
| `name` | `str \| None` | Name of the layer. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/DistributionTransformLayer.py
🎓 DistributionAwareEncoder
Encodes features while accounting for their underlying distributions.
kerasfactory.layers.DistributionAwareEncoder
This module implements a DistributionAwareEncoder layer that automatically detects the distribution type of input data and applies appropriate transformations and encodings. It builds upon the DistributionTransformLayer but adds more sophisticated distribution detection and specialized encoding for different distribution types.
Classes
DistributionAwareEncoder
Layer that automatically detects and encodes data based on its distribution.
This layer first detects the distribution type of the input data and then applies appropriate transformations and encodings. It builds upon the DistributionTransformLayer but adds more sophisticated distribution detection and specialized encoding for different distribution types.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `embedding_dim` | `int \| None` | Dimension of the output embedding. If `None`, the output will have the same dimension as the input. | `None` |
| `auto_detect` | `bool` | Whether to automatically detect the distribution type. If `False`, the layer will use the specified `distribution_type`. | `True` |
| `distribution_type` | `str` | The distribution type to use if `auto_detect` is `False`. Options are `'normal'`, `'exponential'`, `'lognormal'`, `'uniform'`, `'beta'`, `'bimodal'`, `'heavy_tailed'`, `'mixed'`, `'bounded'`, `'unknown'`. | `'unknown'` |
| `transform_type` | `str` | The transformation type to use. If `'auto'`, the layer automatically selects the best transformation based on the detected distribution. See `DistributionTransformLayer` for available options. | `'auto'` |
| `add_distribution_embedding` | `bool` | Whether to add a learned embedding of the distribution type to the output. | `False` |
| `name` | `str \| None` | Optional name for the layer. | `None` |
|
Input shape
N-D tensor with shape: (batch_size, ..., features).
Output shape
If embedding_dim is None, same shape as input: (batch_size, ..., features).
If embedding_dim is specified: (batch_size, ..., embedding_dim).
If add_distribution_embedding is True, the output will have an additional
dimension for the distribution embedding.
Example
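A minimal sketch with automatic distribution detection:

```python
import keras
from kerasfactory.layers import DistributionAwareEncoder

encoder = DistributionAwareEncoder(embedding_dim=16, auto_detect=True)
x = keras.random.normal((100, 8))
z = encoder(x)  # (100, 16)
```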
Initialize the DistributionAwareEncoder.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `embedding_dim` | `int \| None` | Embedding dimension. | `None` |
| `auto_detect` | `bool` | Whether to auto-detect distribution type. | `True` |
| `distribution_type` | `str` | Type of distribution. | `'unknown'` |
| `transform_type` | `str` | Type of transformation to apply. | `'auto'` |
| `add_distribution_embedding` | `bool` | Whether to add distribution embedding. | `False` |
| `name` | `str \| None` | Name of the layer. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/DistributionAwareEncoder.py
📈 AdvancedNumericalEmbedding
Advanced numerical embedding layer for rich feature representations.
kerasfactory.layers.AdvancedNumericalEmbedding
This module implements an AdvancedNumericalEmbedding layer that embeds continuous numerical features into a higher-dimensional space using a combination of continuous and discrete branches.
Classes
AdvancedNumericalEmbedding
Advanced numerical embedding layer for continuous features.
This layer embeds each continuous numerical feature into a higher-dimensional space by combining two branches:
- Continuous Branch: Each feature is processed via a small MLP.
- Discrete Branch: Each feature is discretized into bins using learnable min/max boundaries and then an embedding is looked up for its bin.
A learnable gate combines the two branch outputs per feature and per embedding dimension. Additionally, the continuous branch uses a residual connection and optional batch normalization to improve training stability.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `embedding_dim` | `int` | Output embedding dimension per feature. | `8` |
| `mlp_hidden_units` | `int` | Hidden units for the continuous branch MLP. | `16` |
| `num_bins` | `int` | Number of bins for discretization. | `10` |
| `init_min` | `float` or `list` | Initial minimum values for discretization boundaries. If a scalar is provided, it is applied to all features. | `-3.0` |
| `init_max` | `float` or `list` | Initial maximum values for discretization boundaries. | `3.0` |
| `dropout_rate` | `float` | Dropout rate applied to the continuous branch. | `0.1` |
| `use_batch_norm` | `bool` | Whether to apply batch normalization to the continuous branch. | `True` |
| `name` | `str` | Name for the layer. | `None` |
|
Input shape
Tensor with shape: (batch_size, num_features)
Output shape
Tensor with shape: (batch_size, num_features, embedding_dim) or
(batch_size, embedding_dim) if num_features=1
Example
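A minimal sketch, following the documented shapes:

```python
import keras
from kerasfactory.layers import AdvancedNumericalEmbedding

layer = AdvancedNumericalEmbedding(embedding_dim=8, num_bins=10)
x = keras.random.normal((32, 5))  # (batch_size, num_features)
emb = layer(x)                    # (32, 5, 8)
```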
Initialize the AdvancedNumericalEmbedding layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `embedding_dim` | `int` | Embedding dimension. | `8` |
| `mlp_hidden_units` | `int` | Hidden units in MLP. | `16` |
| `num_bins` | `int` | Number of bins for discretization. | `10` |
| `init_min` | `float \| list[float]` | Minimum initialization value. | `-3.0` |
| `init_max` | `float \| list[float]` | Maximum initialization value. | `3.0` |
| `dropout_rate` | `float` | Dropout rate. | `0.1` |
| `use_batch_norm` | `bool` | Whether to use batch normalization. | `True` |
| `name` | `str \| None` | Name of the layer. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |

Source code in kerasfactory/layers/AdvancedNumericalEmbedding.py
Functions
compute_output_shape

Compute the output shape of the layer.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_shape` | `tuple[int, ...]` | Shape of the input tensor. | required |

Returns:

| Type | Description |
|---|---|
| `tuple[int, ...]` | Shape of the output tensor. |
Source code in kerasfactory/layers/AdvancedNumericalEmbedding.py
📅 DateParsingLayer
Parses and processes date/time features.
kerasfactory.layers.DateParsingLayer
Date Parsing Layer for Keras 3.
This module provides a layer for parsing date strings into numerical components.
Classes
DateParsingLayer
Layer for parsing date strings into numerical components.
This layer takes date strings in a specified format and returns a tensor containing the year, month, day of the month, and day of the week.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `date_format` | `str` | Format of the date strings. Currently supports `'YYYY-MM-DD'` and `'YYYY/MM/DD'`. | `'YYYY-MM-DD'` |
| `**kwargs` | | Additional keyword arguments to pass to the base layer. | `{}` |
Input shape
String tensor of any shape.
Output shape
Same as input shape with an additional dimension of size 4 appended. For example, if input shape is [batch_size], output shape will be [batch_size, 4].
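Example

A minimal sketch; string-tensor inputs generally require a backend with string support (e.g. TensorFlow), so treat this as an assumption:

```python
import keras
from kerasfactory.layers import DateParsingLayer

layer = DateParsingLayer(date_format="YYYY-MM-DD")
dates = keras.ops.convert_to_tensor(["2024-01-15", "2024-06-30"])
parsed = layer(dates)  # (2, 4): [year, month, day, day_of_week]
```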
Initialize the layer.
Source code in kerasfactory/layers/DateParsingLayer.py
Functions
compute_output_shape
Compute the output shape of the layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_shape | | Shape of the input tensor. | required |
Returns:
| Type | Description |
|---|---|
| tuple[int, ...] | Shape of the output tensor. |
Source code in kerasfactory/layers/DateParsingLayer.py
🕐 DateEncodingLayer
Encodes dates into learnable embeddings for temporal features.
kerasfactory.layers.DateEncodingLayer
DateEncodingLayer for encoding date components into cyclical features.
This layer takes date components (year, month, day, day of week) and encodes them into cyclical features using sine and cosine transformations.
Classes
DateEncodingLayer
Layer for encoding date components into cyclical features.
This layer takes date components (year, month, day, day of week) and encodes them into cyclical features using sine and cosine transformations. The year is normalized to a range between 0 and 1 based on min_year and max_year.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| min_year | int | Minimum year for normalization (default: 1900) | 1900 |
| max_year | int | Maximum year for normalization (default: 2100) | 2100 |
| **kwargs | | Additional layer arguments | {} |
Input shape
Tensor with shape: (..., 4) containing [year, month, day, day_of_week]
Output shape
Tensor with shape: (..., 8) containing cyclical encodings:
[year_sin, year_cos, month_sin, month_cos, day_sin, day_cos, dow_sin, dow_cos]
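Example

A short usage sketch (assuming the import path above; the input follows the [year, month, day, day_of_week] layout):

```python
import numpy as np
from kerasfactory.layers import DateEncodingLayer

# Two dates as [year, month, day, day_of_week] rows.
components = np.array([[2024, 1, 15, 0], [1999, 12, 31, 4]], dtype="float32")
layer = DateEncodingLayer(min_year=1900, max_year=2100)
encoded = layer(components)
print(encoded.shape)  # expected: (2, 8) -> four sin/cos pairs
```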
Initialize the layer.
Source code in kerasfactory/layers/DateEncodingLayer.py
Functions
compute_output_shape
Compute the output shape of the layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_shape | | Shape of the input tensor | required |
Returns:
| Type | Description |
|---|---|
| tuple[int, ...] | Output shape |
Source code in kerasfactory/layers/DateEncodingLayer.py
🌙 SeasonLayer
Extracts and processes seasonal patterns from temporal data.
kerasfactory.layers.SeasonLayer
SeasonLayer for adding seasonal information based on month.
This layer adds seasonal information based on the month, encoding it as a one-hot vector for the four seasons: Winter, Spring, Summer, and Fall.
Classes
SeasonLayer
Layer for adding seasonal information based on month.
This layer adds seasonal information based on the month, encoding it as a one-hot vector for the four seasons: Winter, Spring, Summer, and Fall.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| **kwargs | | Additional layer arguments | {} |
Input shape
Tensor with shape: (..., 4) containing [year, month, day, day_of_week]
Output shape
Tensor with shape: (..., 8) containing the original 4 components plus
4 one-hot encoded season values
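Example

A short usage sketch (assuming the import path above):

```python
import numpy as np
from kerasfactory.layers import SeasonLayer

components = np.array([[2024, 1, 15, 0], [2024, 7, 4, 3]], dtype="float32")
layer = SeasonLayer()
with_season = layer(components)
print(with_season.shape)  # expected: (2, 8) -> 4 components + 4 one-hot seasons
```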
Initialize the layer.
Source code in kerasfactory/layers/SeasonLayer.py
Functions
compute_output_shape
Compute the output shape of the layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_shape | | Shape of the input tensor | required |
Returns:
| Type | Description |
|---|---|
| tuple[tuple[int, ...], tuple[int, ...]] | Output shape |
Source code in kerasfactory/layers/SeasonLayer.py
🔀 DifferentialPreprocessingLayer
Applies differential preprocessing transformations to features.
kerasfactory.layers.DifferentialPreprocessingLayer
This module implements a DifferentialPreprocessingLayer that applies multiple candidate transformations to tabular data and learns to combine them optimally. It also handles missing values with learnable imputation. This approach is useful for tabular data where the optimal preprocessing strategy is not known in advance.
Classes
DifferentialPreprocessingLayer
Differentiable preprocessing layer for numeric tabular data with multiple candidate transformations.
This layer:
- Imputes missing values using a learnable imputation vector.
- Applies several candidate transformations:
- Identity (pass-through)
- Affine transformation (learnable scaling and bias)
- Nonlinear transformation via a small MLP
- Log transformation (using a softplus to ensure positivity)
- Learns softmax combination weights to aggregate the candidates.
The entire preprocessing pipeline is differentiable, so the network learns the optimal imputation and transformation jointly with downstream tasks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_features | int | Number of numeric features in the input. | required |
| mlp_hidden_units | int | Number of hidden units in the nonlinear branch. Default is 4. | 4 |
| name | str \| None | Optional name for the layer. | None |
Input shape
2D tensor with shape: (batch_size, num_features)
Output shape
2D tensor with shape: (batch_size, num_features) (same as input)
Example
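A minimal sketch (assuming the import path above; NaNs mark missing values):

```python
import numpy as np
from kerasfactory.layers import DifferentialPreprocessingLayer

x = np.random.randn(32, 5).astype("float32")
x[0, 2] = np.nan  # handled by the learnable imputation vector

layer = DifferentialPreprocessingLayer(num_features=5, mlp_hidden_units=4)
y = layer(x)
print(y.shape)  # expected: (32, 5), same as input
```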
Initialize the DifferentialPreprocessingLayer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_features | int | Number of input features. | required |
| mlp_hidden_units | int | Number of hidden units in MLP. | 4 |
| name | str \| None | Name of the layer. | None |
| **kwargs | Any | Additional keyword arguments. | {} |
Source code in kerasfactory/layers/DifferentialPreprocessingLayer.py
🔧 DifferentiableTabularPreprocessor
Differentiable preprocessing layer for tabular data end-to-end training.
kerasfactory.layers.DifferentiableTabularPreprocessor
This module implements a DifferentiableTabularPreprocessor layer that integrates preprocessing into the model so that the optimal imputation and normalization parameters are learned end-to-end. This approach is useful for tabular data with missing values and features that need normalization.
Classes
DifferentiableTabularPreprocessor
A differentiable preprocessing layer for numeric tabular data.
This layer:
- Replaces missing values (NaNs) with a learnable imputation vector.
- Applies a learned affine transformation (scaling and shifting) to each feature.
The idea is to integrate preprocessing into the model so that the optimal imputation and normalization parameters are learned end-to-end.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_features | int | Number of numeric features in the input. | required |
| name | str \| None | Optional name for the layer. | None |
Input shape
2D tensor with shape: (batch_size, num_features)
Output shape
2D tensor with shape: (batch_size, num_features) (same as input)
Example
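A minimal sketch (same import-path assumption as above):

```python
import numpy as np
from kerasfactory.layers import DifferentiableTabularPreprocessor

x = np.random.randn(32, 5).astype("float32")
x[1, 0] = np.nan  # replaced by the learnable imputation vector

layer = DifferentiableTabularPreprocessor(num_features=5)
y = layer(x)
print(y.shape)  # expected: (32, 5)
```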
Initialize the DifferentiableTabularPreprocessor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_features | int | Number of input features. | required |
| name | str \| None | Name of the layer. | None |
| **kwargs | Any | Additional keyword arguments. | {} |
Source code in kerasfactory/layers/DifferentiableTabularPreprocessor.py
🎨 CastToFloat32Layer
Type casting layer for ensuring float32 precision.
kerasfactory.layers.CastToFloat32Layer
This module implements a CastToFloat32Layer that casts input tensors to float32 data type.
Classes
CastToFloat32Layer
Layer that casts input tensors to float32 data type.
This layer is useful for ensuring consistent data types in a model, especially when working with mixed precision or when receiving inputs of various data types.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str \| None | Optional name for the layer. | None |
Input shape
Tensor of any shape and numeric data type.
Output shape
Same as input shape, but with float32 data type.
Example
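A minimal sketch (assuming the import path above):

```python
import numpy as np
from kerasfactory.layers import CastToFloat32Layer

x = np.arange(6, dtype="int64").reshape(2, 3)
layer = CastToFloat32Layer()
y = layer(x)
print(y.dtype)  # expected: float32
```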
Initialize the CastToFloat32Layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str \| None | Name of the layer. | None |
| **kwargs | Any | Additional keyword arguments. | {} |
Source code in kerasfactory/layers/CastToFloat32Layer.py
Functions
compute_output_shape
Compute the output shape of the layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_shape | tuple[int, ...] | Shape of the input tensor. | required |
Returns:
| Type | Description |
|---|---|
| tuple[int, ...] | Same shape as input. |
Source code in kerasfactory/layers/CastToFloat32Layer.py
🌐 Graph & Ensemble Methods
📊 GraphFeatureAggregation
Aggregates features from graph structures for relational learning.
kerasfactory.layers.GraphFeatureAggregation
This module implements a GraphFeatureAggregation layer that treats features as nodes in a graph and uses attention mechanisms to learn relationships between features. This approach is useful for tabular data where features have inherent relationships.
Classes
GraphFeatureAggregation
Graph-based feature aggregation layer with self-attention for tabular data.
This layer treats each input feature as a node and projects it into an embedding space. It then computes pairwise attention scores between features and aggregates feature information based on these scores. Finally, it projects the aggregated features back to the original feature space and adds a residual connection.
The process involves:
- Projecting each scalar feature to an embedding (shape: [batch, num_features, embed_dim]).
- Computing pairwise concatenated embeddings and scoring them via a learnable attention vector.
- Normalizing the scores with softmax to yield a dynamic adjacency (attention) matrix.
- Aggregating neighboring features via weighted sum.
- Projecting back to a vector of original dimension, then adding a residual connection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| embed_dim | int | Dimensionality of the projected feature embeddings. Default is 8. | 8 |
| dropout_rate | float | Dropout rate to apply on attention weights. Default is 0.0. | 0.0 |
| leaky_relu_alpha | float | Alpha parameter for the LeakyReLU activation. Default is 0.2. | 0.2 |
| name | str \| None | Optional name for the layer. | None |
Input shape
2D tensor with shape: (batch_size, num_features)
Output shape
2D tensor with shape: (batch_size, num_features) (same as input)
Example
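A minimal sketch (assuming the import path and constructor parameters documented above):

```python
import numpy as np
from kerasfactory.layers import GraphFeatureAggregation

x = np.random.randn(32, 10).astype("float32")
layer = GraphFeatureAggregation(embed_dim=8, dropout_rate=0.1)
y = layer(x, training=True)  # attention dropout active during training
print(y.shape)  # expected: (32, 10), same as input (residual connection)
```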
Initialize the GraphFeatureAggregation layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| embed_dim | int | Embedding dimension. | 8 |
| dropout_rate | float | Dropout rate. | 0.0 |
| leaky_relu_alpha | float | Alpha parameter for LeakyReLU. | 0.2 |
| name | str \| None | Name of the layer. | None |
| **kwargs | Any | Additional keyword arguments. | {} |
Source code in kerasfactory/layers/GraphFeatureAggregation.py
🧬 AdvancedGraphFeatureLayer
Advanced graph feature processing with multi-hop aggregation.
kerasfactory.layers.AdvancedGraphFeatureLayer
Advanced graph-based feature layer for tabular data.
This layer projects scalar features into an embedding space and then applies multi-head self-attention to compute data-dependent dynamic adjacencies between features. It learns edge attributes by considering both the raw embeddings and their differences. Optionally, a hierarchical aggregation is applied, where features are grouped via a learned soft-assignment and then re-expanded back to the original feature space. A residual connection and layer normalization are applied before the final projection back to the original feature space.
The layer is highly configurable, allowing for control over the embedding dimension, number of attention heads, dropout rate, and hierarchical aggregation.
Notes
When to Use This Layer:

- When working with tabular data where feature interactions are important
- For complex feature engineering tasks where manual feature crosses are insufficient
- When dealing with heterogeneous features that require dynamic, learned relationships
- In scenarios where feature importance varies across different samples
- When hierarchical feature relationships exist in your data

Best Practices:

- Start with a small embed_dim (e.g., 16 or 32) and increase if needed
- Use num_heads=4 or 8 for most applications
- Enable hierarchical=True when you have many features (>20) or known grouping structure
- Set dropout_rate=0.1 or 0.2 for regularization during training
- Use layer normalization (enabled by default) to stabilize training

Performance Considerations:

- Memory usage scales quadratically with the number of features
- Consider using hierarchical mode for large feature sets to reduce complexity
- The layer works best with normalized input features
- For very large feature sets (>100), consider feature pre-selection
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| embed_dim | int | Dimensionality of the projected feature embeddings. Determines the size of the learned feature representations. | required |
| num_heads | int | Number of attention heads. Must divide embed_dim evenly. Each head learns different aspects of feature relationships. | required |
| dropout_rate | float | Dropout rate applied to attention weights during training. Helps prevent overfitting. Defaults to 0.0. | 0.0 |
| hierarchical | bool | Whether to apply hierarchical aggregation. If True, features are grouped into clusters, and aggregation is performed at the cluster level. Defaults to False. | False |
| num_groups | int | Number of groups to cluster features into when hierarchical is True. Must be provided if hierarchical is True. Controls the granularity of hierarchical aggregation. | None |
Raises:
| Type | Description |
|---|---|
| ValueError | If embed_dim is not divisible by num_heads. Ensures that the embedding dimension can be evenly split across attention heads. |
| ValueError | If hierarchical is True but num_groups is not provided. The number of groups must be specified when hierarchical aggregation is enabled. |
Examples:
Basic Usage:
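A minimal sketch (assuming the layer is exported as `kerasfactory.layers.AdvancedGraphFeatureLayer`):

```python
import numpy as np
from kerasfactory.layers import AdvancedGraphFeatureLayer

x = np.random.randn(32, 10).astype("float32")
layer = AdvancedGraphFeatureLayer(embed_dim=16, num_heads=4, dropout_rate=0.1)
y = layer(x)
print(y.shape)  # expected: (32, 10), same as input
```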
With Hierarchical Aggregation:
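A sketch with hierarchical aggregation enabled (num_groups is required in this mode):

```python
import numpy as np
from kerasfactory.layers import AdvancedGraphFeatureLayer

x = np.random.randn(32, 24).astype("float32")
layer = AdvancedGraphFeatureLayer(
    embed_dim=16, num_heads=4, hierarchical=True, num_groups=4
)
y = layer(x)
print(y.shape)  # expected: (32, 24)
```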
Without Training:
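A sketch of inference-mode behavior (attention dropout is disabled when training=False):

```python
import numpy as np
from kerasfactory.layers import AdvancedGraphFeatureLayer

x = np.random.randn(32, 10).astype("float32")
layer = AdvancedGraphFeatureLayer(embed_dim=16, num_heads=4, dropout_rate=0.2)
y = layer(x, training=False)  # deterministic: no attention dropout
```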
Initialize the AdvancedGraphFeature layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| embed_dim | int | Embedding dimension. | required |
| num_heads | int | Number of attention heads. | required |
| dropout_rate | float | Dropout rate. | 0.0 |
| hierarchical | bool | Whether to use hierarchical attention. | False |
| num_groups | int \| None | Number of groups for hierarchical attention. | None |
| **kwargs | | Additional keyword arguments. | {} |
Source code in kerasfactory/layers/AdvancedGraphFeature.py
Functions
compute_output_shape
Compute the output shape of the layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_shape | | Shape tuple (batch_size, num_features) | required |
Returns:
| Type | Description |
|---|---|
| tuple[int, ...] | Output shape tuple (batch_size, num_features) |
Source code in kerasfactory/layers/AdvancedGraphFeature.py
👥 MultiHeadGraphFeaturePreprocessor
Multi-head preprocessing for graph features with parallel aggregation.
kerasfactory.layers.MultiHeadGraphFeaturePreprocessor
This module implements a MultiHeadGraphFeaturePreprocessor layer that treats features as nodes in a graph and learns multiple "views" (heads) of the feature interactions via self-attention. This approach is useful for tabular data where complex feature relationships need to be captured.
Classes
MultiHeadGraphFeaturePreprocessor
Multi-head graph-based feature preprocessor for tabular data.
This layer treats each feature as a node and applies multi-head self-attention to capture and aggregate complex interactions among features. The process is:
- Project each scalar input into an embedding of dimension `embed_dim`.
- Split the embedding into `num_heads` heads.
- For each head, compute queries, keys, and values and calculate scaled dot-product attention across the feature dimension.
- Concatenate the head outputs, project back to the original feature dimension, and add a residual connection.
This mechanism allows the network to learn multiple relational views among features, which can significantly boost performance on tabular data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| embed_dim | int | Dimension of the feature embeddings. Default is 16. | 16 |
| num_heads | int | Number of attention heads. Default is 4. | 4 |
| dropout_rate | float | Dropout rate applied to attention weights. Default is 0.0. | 0.0 |
| name | str \| None | Optional name for the layer. | None |
Input shape
2D tensor with shape: (batch_size, num_features)
Output shape
2D tensor with shape: (batch_size, num_features) (same as input)
Example
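A minimal sketch (assuming the import path and parameters documented above):

```python
import numpy as np
from kerasfactory.layers import MultiHeadGraphFeaturePreprocessor

x = np.random.randn(32, 10).astype("float32")
layer = MultiHeadGraphFeaturePreprocessor(embed_dim=16, num_heads=4)
y = layer(x)
print(y.shape)  # expected: (32, 10), same as input
```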
Initialize the MultiHeadGraphFeaturePreprocessor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| embed_dim | int | Embedding dimension. | 16 |
| num_heads | int | Number of attention heads. | 4 |
| dropout_rate | float | Dropout rate. | 0.0 |
| name | str \| None | Name of the layer. | None |
| **kwargs | Any | Additional keyword arguments. | {} |
Source code in kerasfactory/layers/MultiHeadGraphFeaturePreprocessor.py
Functions
Split the last dimension into (num_heads, depth) and transpose.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | KerasTensor | Input tensor with shape (batch_size, num_features, embed_dim). | required |
| batch_size | KerasTensor | Batch size tensor. | required |
Returns:
| Type | Description |
|---|---|
| KerasTensor | Tensor with shape (batch_size, num_heads, num_features, depth). |
Source code in kerasfactory/layers/MultiHeadGraphFeaturePreprocessor.py
📈 BoostingBlock
Boosting ensemble block for combining weak learners.
kerasfactory.layers.BoostingBlock
This module implements a BoostingBlock layer that simulates gradient boosting behavior in a neural network. The layer computes a correction term via a configurable MLP and adds a scaled version to the input.
Classes
BoostingBlock
A neural network layer that simulates gradient boosting behavior.
This layer implements a weak learner that computes a correction term via a configurable MLP and adds a scaled version of this correction to the input. Stacking several such blocks can mimic the iterative residual-correction process of gradient boosting.
The output is computed as:
output = inputs + gamma * f(inputs)
where:

- `f` is a configurable MLP (default: two-layer network)
- `gamma` is a learnable or fixed scaling factor
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| hidden_units | int \| list[int] | Number of units in the hidden layer(s). Can be an int for single hidden layer or a list of ints for multiple hidden layers. Default is 64. | 64 |
| hidden_activation | str | Activation function for hidden layers. Default is 'relu'. | 'relu' |
| output_activation | str \| None | Activation function for the output layer. Default is None. | None |
| gamma_trainable | bool | Whether the scaling factor gamma is trainable. Default is True. | True |
| gamma_initializer | str \| Initializer | Initializer for the gamma scaling factor. Default is 'ones'. | 'ones' |
| use_bias | bool | Whether to include bias terms in the dense layers. Default is True. | True |
| kernel_initializer | str \| Initializer | Initializer for the dense layer kernels. Default is 'glorot_uniform'. | 'glorot_uniform' |
| bias_initializer | str \| Initializer | Initializer for the dense layer biases. Default is 'zeros'. | 'zeros' |
| dropout_rate | float \| None | Optional dropout rate to apply after hidden layers. Default is None. | None |
| name | str \| None | Optional name for the layer. | None |
Input shape
N-D tensor with shape: (batch_size, ..., input_dim)
Output shape
Same shape as input: (batch_size, ..., input_dim)
Example
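A minimal sketch (assuming the import path above); stacking several blocks mimics the iterative residual correction of gradient boosting:

```python
import numpy as np
from kerasfactory.layers import BoostingBlock

x = np.random.randn(32, 16).astype("float32")

# One weak learner: output = inputs + gamma * f(inputs)
y = BoostingBlock(hidden_units=64)(x)
print(y.shape)  # expected: (32, 16)

# Stacked blocks, each correcting the previous output.
h = x
for _ in range(3):
    h = BoostingBlock(hidden_units=[32, 32], dropout_rate=0.1)(h)
```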
Initialize the BoostingBlock layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| hidden_units | int \| list[int] | Number of hidden units or list of units per layer. | 64 |
| hidden_activation | str | Activation function for hidden layers. | 'relu' |
| output_activation | str \| None | Activation function for output layer. | None |
| gamma_trainable | bool | Whether gamma parameter is trainable. | True |
| gamma_initializer | str \| Initializer | Initializer for gamma parameter. | 'ones' |
| use_bias | bool | Whether to use bias. | True |
| kernel_initializer | str \| Initializer | Initializer for kernel weights. | 'glorot_uniform' |
| bias_initializer | str \| Initializer | Initializer for bias weights. | 'zeros' |
| dropout_rate | float \| None | Dropout rate. | None |
| name | str \| None | Name of the layer. | None |
| **kwargs | Any | Additional keyword arguments. | {} |
Source code in kerasfactory/layers/BoostingBlock.py
🎯 BoostingEnsembleLayer
Ensemble layer implementing gradient boosting mechanisms.
kerasfactory.layers.BoostingEnsembleLayer
This module implements a BoostingEnsembleLayer that aggregates multiple BoostingBlocks in parallel. Their outputs are combined via learnable weights to form an ensemble prediction. This is similar in spirit to boosting ensembles but implemented in a differentiable, end-to-end manner.
Classes
BoostingEnsembleLayer
Ensemble layer of boosting blocks for tabular data.
This layer aggregates multiple boosting blocks (weak learners) in parallel. Each learner produces a correction to the input. A gating mechanism (via learnable weights) then computes a weighted sum of the learners' outputs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_learners | int | Number of boosting blocks in the ensemble. Default is 3. | 3 |
| learner_units | int \| list[int] | Number of hidden units in each boosting block. Can be an int for single hidden layer or a list of ints for multiple hidden layers. Default is 64. | 64 |
| hidden_activation | str | Activation function for hidden layers in boosting blocks. Default is 'relu'. | 'relu' |
| output_activation | str \| None | Activation function for the output layer in boosting blocks. Default is None. | None |
| gamma_trainable | bool | Whether the scaling factor gamma in boosting blocks is trainable. Default is True. | True |
| dropout_rate | float \| None | Optional dropout rate to apply in boosting blocks. Default is None. | None |
| name | str \| None | Optional name for the layer. | None |
Input shape
N-D tensor with shape: (batch_size, ..., input_dim)
Output shape
Same shape as input: (batch_size, ..., input_dim)
Example
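A minimal sketch (same import-path assumption):

```python
import numpy as np
from kerasfactory.layers import BoostingEnsembleLayer

x = np.random.randn(32, 16).astype("float32")
layer = BoostingEnsembleLayer(num_learners=3, learner_units=64)
y = layer(x)  # weighted sum of the learners' corrections
print(y.shape)  # expected: (32, 16), same as input
```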
Initialize the BoostingEnsembleLayer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_learners | int | Number of boosting learners. | 3 |
| learner_units | int \| list[int] | Number of units per learner or list of units. | 64 |
| hidden_activation | str | Activation function for hidden layers. | 'relu' |
| output_activation | str \| None | Activation function for output layer. | None |
| gamma_trainable | bool | Whether gamma parameter is trainable. | True |
| dropout_rate | float \| None | Dropout rate. | None |
| name | str \| None | Name of the layer. | None |
| **kwargs | Any | Additional keyword arguments. | {} |
Source code in kerasfactory/layers/BoostingEnsembleLayer.py
📊 TabularMoELayer
Mixture of Experts layer optimized for tabular data.
kerasfactory.layers.TabularMoELayer
This module implements a TabularMoELayer (Mixture-of-Experts) that routes input features through multiple expert sub-networks and aggregates their outputs via a learnable gating mechanism. This approach is useful for tabular data where different experts can specialize in different feature patterns.
Classes
TabularMoELayer
Mixture-of-Experts layer for tabular data.
This layer routes input features through multiple expert sub-networks and aggregates their outputs via a learnable gating mechanism. Each expert is a small MLP, and the gate learns to weight their contributions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_experts | int | Number of expert networks. Default is 4. | 4 |
| expert_units | int | Number of hidden units in each expert network. Default is 16. | 16 |
| name | str \| None | Optional name for the layer. | None |
Input shape
2D tensor with shape: (batch_size, num_features)
Output shape
2D tensor with shape: (batch_size, num_features) (same as input)
Example
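A minimal sketch (same import-path assumption):

```python
import numpy as np
from kerasfactory.layers import TabularMoELayer

x = np.random.randn(32, 8).astype("float32")
layer = TabularMoELayer(num_experts=4, expert_units=16)
y = layer(x)  # gate-weighted combination of expert outputs
print(y.shape)  # expected: (32, 8), same as input
```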
Initialize the TabularMoELayer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_experts | int | Number of expert networks. | 4 |
| expert_units | int | Number of units in each expert. | 16 |
| name | str \| None | Name of the layer. | None |
| **kwargs | Any | Additional keyword arguments. | {} |
Source code in kerasfactory/layers/TabularMoELayer.py
🏗️ BusinessRulesLayer
Layer for integrating domain-specific business rules into model.
kerasfactory.layers.BusinessRulesLayer
This module implements a BusinessRulesLayer that allows applying configurable business rules to neural network outputs. This enables combining learned patterns with explicit domain knowledge.
Classes
BusinessRulesLayer
Evaluates business-defined rules for anomaly detection.
This layer applies user-defined business rules to detect anomalies. Rules can be defined for both numerical and categorical features.
For numerical features:
- Comparison operators: '>' and '<'
- Example: [(">", 0), ("<", 100)] for range validation
For categorical features:
- Set operators: '==', 'in', '!=', 'not in'
- Example: [("in", ["red", "green", "blue"])] for valid categories
Attributes:
| Name | Type | Description |
|---|---|---|
| rules | | List of rule tuples (operator, value). |
| feature_type | | Type of feature ('numerical' or 'categorical'). |
Example
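A minimal sketch (assuming the import path above; the exact keys of the returned output dictionary are not documented here, so they are left unnamed):

```python
import numpy as np
from kerasfactory.layers import BusinessRulesLayer

# Numerical range rule: valid values satisfy 0 < x < 100.
layer = BusinessRulesLayer(rules=[(">", 0), ("<", 100)], feature_type="numerical")

x = np.array([[50.0], [150.0]], dtype="float32")
outputs = layer(x)  # per compute_output_shape below, a dict of named outputs
```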
Initializes the layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rules | list[Rule] | List of rule tuples (operator, value). | required |
| feature_type | str | Type of feature ('numerical' or 'categorical'). | required |
| trainable_weights | bool | Whether to use trainable weights for soft rule enforcement. Default is True. | True |
| weight_initializer | str \| Initializer | Initializer for rule weights. Default is 'ones'. | 'ones' |
| name | str \| None | Optional name for the layer. | None |
| **kwargs | Any | Additional layer arguments. | {} |
Raises:
| Type | Description |
|---|---|
| ValueError | If feature_type is invalid or rules have invalid operators. |
Source code in kerasfactory/layers/BusinessRulesLayer.py
Functions
compute_output_shape
Compute the output shape of the layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_shape | tuple[int \| None, int] | Input shape tuple. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, tuple[int \| None, int]] | Dictionary mapping output names to their shapes. |
Source code in kerasfactory/layers/BusinessRulesLayer.py
🛡️ Regularization & Robustness
🎲 StochasticDepth
Stochastic depth regularization for improved generalization.
kerasfactory.layers.StochasticDepth
Stochastic depth layer for neural networks.
Classes
StochasticDepth
Stochastic depth layer for regularization.
This layer randomly drops entire residual branches with a specified probability during training. During inference, all branches are kept and scaled appropriately. This technique helps reduce overfitting and training time in deep networks.
Reference

- Huang et al., "Deep Networks with Stochastic Depth" (ECCV 2016).
Example
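A minimal sketch (assuming the layer is called on a [shortcut, residual] pair, consistent with compute_output_shape below taking a list of input shapes):

```python
import numpy as np
from kerasfactory.layers import StochasticDepth

shortcut = np.random.randn(32, 64).astype("float32")
residual = np.random.randn(32, 64).astype("float32")

layer = StochasticDepth(survival_prob=0.8)
# Training: residual branch dropped with probability 0.2.
# Inference: branch kept and scaled by survival_prob.
y = layer([shortcut, residual], training=True)
print(y.shape)  # expected: (32, 64)
```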
Initialize stochastic depth.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| survival_prob | float | Probability of keeping the residual branch (default: 0.5) | 0.5 |
| seed | int \| None | Random seed for reproducibility | None |
| **kwargs | dict[str, Any] | Additional layer arguments | {} |
Raises:
| Type | Description |
|---|---|
| ValueError | If survival_prob is not in [0, 1] |
Source code in kerasfactory/layers/StochasticDepth.py
Functions
compute_output_shape
Compute output shape.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_shape | list[tuple[int, ...]] | List of input shape tuples | required |
Returns:
| Type | Description |
|---|---|
| tuple[int, ...] | Output shape tuple |
Source code in kerasfactory/layers/StochasticDepth.py
classmethod
from_config
Create layer from configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | dict[str, Any] | Layer configuration dictionary | required |
Returns:
| Type | Description |
|---|---|
| StochasticDepth | StochasticDepth instance |
Source code in kerasfactory/layers/StochasticDepth.py
🗑️ FeatureCutout
Feature cutout regularization for dropout-like effects on features.
kerasfactory.layers.FeatureCutout
Feature cutout regularization layer for neural networks.
Classes
FeatureCutout
Feature cutout regularization layer.
This layer randomly masks out (sets to zero) a specified fraction of features during training to improve model robustness and prevent overfitting. During inference, all features are kept intact.
Example
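A minimal sketch (assuming the import path above):

```python
import numpy as np
from kerasfactory.layers import FeatureCutout

x = np.random.randn(32, 20).astype("float32")
layer = FeatureCutout(cutout_prob=0.2, noise_value=0.0)

y_train = layer(x, training=True)   # ~20% of features masked to 0.0
y_infer = layer(x, training=False)  # features pass through unchanged
```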
Initialize feature cutout.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cutout_prob | float | Probability of masking each feature | 0.1 |
| noise_value | float | Value to use for masked features (default: 0.0) | 0.0 |
| seed | int \| None | Random seed for reproducibility | None |
| **kwargs | dict[str, Any] | Additional layer arguments | {} |
Raises:
| Type | Description |
|---|---|
| ValueError | If cutout_prob is not in [0, 1] |
Source code in kerasfactory/layers/FeatureCutout.py
Functions
compute_output_shape
Compute output shape.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_shape | tuple[int, ...] | Input shape tuple | required |
Returns:
| Type | Description |
|---|---|
| tuple[int, ...] | Output shape tuple |
Source code in kerasfactory/layers/FeatureCutout.py
classmethod
from_config
Create layer from configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | dict[str, Any] | Layer configuration dictionary | required |
Returns:
| Type | Description |
|---|---|
| FeatureCutout | FeatureCutout instance |
Source code in kerasfactory/layers/FeatureCutout.py
🎯 SparseAttentionWeighting
Sparse attention weighting for computational efficiency.
kerasfactory.layers.SparseAttentionWeighting
Classes
SparseAttentionWeighting
Sparse attention mechanism with temperature scaling for module outputs combination.
This layer implements a learnable attention mechanism that combines outputs from multiple modules using temperature-scaled attention weights. The attention weights are learned during training and can be made more or less sparse by adjusting the temperature parameter. A higher temperature leads to more uniform weights, while a lower temperature makes the weights more concentrated on specific modules.
Key features:

1. Learnable module importance weights
2. Temperature-controlled sparsity
3. Softmax-based attention mechanism
4. Support for variable number of input features per module
Example:
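A minimal sketch (assuming the layer is called on a list of module outputs, one tensor per module):

```python
import numpy as np
from kerasfactory.layers import SparseAttentionWeighting

outputs = [np.random.randn(32, 16).astype("float32") for _ in range(3)]

# temperature < 1.0 concentrates the weights on fewer modules.
layer = SparseAttentionWeighting(num_modules=3, temperature=0.5)
combined = layer(outputs)
print(combined.shape)  # expected: (32, 16)
```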
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_modules | int | Number of input modules whose outputs will be combined. | required |
| temperature | float | Temperature parameter for softmax scaling. Default is 1.0. temperature > 1.0: more uniform attention weights; temperature < 1.0: more sparse attention weights; temperature = 1.0: standard softmax behavior. | 1.0 |
Initialize sparse attention weighting layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_modules | int | Number of input modules to weight. Must be positive. | required |
| temperature | float | Temperature parameter for softmax scaling. Must be positive. Controls the sparsity of attention weights: higher values (>1.0) lead to more uniform weights; lower values (<1.0) lead to more concentrated weights. | 1.0 |
| **kwargs | dict[str, Any] | Additional layer arguments passed to the parent Layer class. | {} |
Raises:
| Type | Description |
|---|---|
| ValueError | If num_modules <= 0 or temperature <= 0 |
Source code in kerasfactory/layers/SparseAttentionWeighting.py
Functions
classmethod
from_config
Create layer from configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | dict[str, Any] | Layer configuration dictionary | required |
Returns:
| Type | Description |
|---|---|
| SparseAttentionWeighting | SparseAttentionWeighting instance |
Source code in kerasfactory/layers/SparseAttentionWeighting.py
🔧 Specialized Processing
🐢 SlowNetwork
Multi-layer dense network with configurable depth and width, used as a component in larger architectures.
kerasfactory.layers.SlowNetwork
This module implements a SlowNetwork layer that processes features through multiple dense layers. It's designed to be used as a component in more complex architectures.
Classes
SlowNetwork
A multi-layer network with configurable depth and width.
This layer processes input features through multiple dense layers with ReLU activations, and projects the output back to the original feature dimension.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_dim | int | Dimension of the input features. | required |
| num_layers | int | Number of hidden layers. Default is 3. | 3 |
| units | int | Number of units per hidden layer. Default is 128. | 128 |
| name | str \| None | Optional name for the layer. | None |
Input shape
2D tensor with shape: (batch_size, input_dim)
Output shape
2D tensor with shape: (batch_size, input_dim) (same as input)
Example
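A minimal sketch (same import-path assumption):

```python
import numpy as np
from kerasfactory.layers import SlowNetwork

x = np.random.randn(32, 16).astype("float32")
layer = SlowNetwork(input_dim=16, num_layers=3, units=128)
y = layer(x)
print(y.shape)  # expected: (32, 16), projected back to input_dim
```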
Initialize the SlowNetwork layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_dim | int | Input dimension. | required |
| num_layers | int | Number of hidden layers. | 3 |
| units | int | Number of units in each layer. | 128 |
| name | str \| None | Name of the layer. | None |
| **kwargs | Any | Additional keyword arguments. | {} |
Source code in kerasfactory/layers/SlowNetwork.py
⚡ HyperZZWOperator
Specialized hyperparameter operator for advanced transformations.
kerasfactory.layers.HyperZZWOperator
This module implements a HyperZZWOperator layer that computes context-dependent weights by multiplying inputs with hyper-kernels. This is a specialized layer for the Terminator model.
Classes
HyperZZWOperator
A layer that computes context-dependent weights by multiplying inputs with hyper-kernels.
This layer takes two inputs: the original input tensor and a context tensor. It generates hyper-kernels from the context and performs a context-dependent transformation of the input.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_dim | int | Dimension of the input features. | required |
| context_dim | int \| None | Optional dimension of the context features. If not provided, it will be inferred. | None |
| name | str \| None | Optional name for the layer. | None |
Input
A list of two tensors:

- inputs[0]: Input tensor with shape (batch_size, input_dim).
- inputs[1]: Context tensor with shape (batch_size, context_dim).
Output shape
2D tensor with shape: (batch_size, input_dim) (same as input)
Example
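A minimal sketch (the layer takes [inputs, context] per the Input section above):

```python
import numpy as np
from kerasfactory.layers import HyperZZWOperator

x = np.random.randn(32, 16).astype("float32")
context = np.random.randn(32, 8).astype("float32")

layer = HyperZZWOperator(input_dim=16, context_dim=8)
y = layer([x, context])  # context-dependent transformation of x
print(y.shape)  # expected: (32, 16)
```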
Initialize the HyperZZWOperator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_dim | int | Input dimension. | required |
| context_dim | int \| None | Context dimension. | None |
| name | str \| None | Name of the layer. | None |
| **kwargs | Any | Additional keyword arguments. | {} |
Source code in kerasfactory/layers/HyperZZWOperator.py
🚨 Anomaly Detection
📉 NumericalAnomalyDetection
Detects anomalies in numerical features using statistical methods.
kerasfactory.layers.NumericalAnomalyDetection
Classes
NumericalAnomalyDetection
Numerical anomaly detection layer for identifying outliers in numerical features.
This layer learns a distribution for each numerical feature and outputs an anomaly score for each feature based on how far it deviates from the learned distribution. The layer uses a combination of mean, variance, and autoencoder reconstruction error to detect anomalies.
Example
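A minimal sketch (assuming the import path and constructor parameters documented above):

```python
import numpy as np
from kerasfactory.layers import NumericalAnomalyDetection

x = np.random.randn(32, 5).astype("float32")
layer = NumericalAnomalyDetection(
    hidden_dims=[8, 4],         # autoencoder hidden dimensions
    reconstruction_weight=0.5,  # weight of reconstruction error
    distribution_weight=0.5,    # weight of distribution-based error
)
scores = layer(x)  # per-feature anomaly scores
```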
Initialize the layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| hidden_dims | list[int] | List of hidden dimensions for the autoencoder. | required |
| reconstruction_weight | float | Weight for reconstruction error in anomaly score. | 0.5 |
| distribution_weight | float | Weight for distribution-based error in anomaly score. | 0.5 |
| **kwargs | dict[str, Any] | Additional keyword arguments. | {} |
Source code in kerasfactory/layers/NumericalAnomalyDetection.py
Functions
compute_output_shape
Compute output shape.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_shape | tuple[int, ...] | Input shape tuple. | required |
Returns:
| Type | Description |
|---|---|
| tuple[int, ...] | Output shape tuple. |
Source code in kerasfactory/layers/NumericalAnomalyDetection.py
📊 CategoricalAnomalyDetectionLayer
Detects anomalies in categorical features.
kerasfactory.layers.CategoricalAnomalyDetectionLayer
Classes
CategoricalAnomalyDetectionLayer
Backend-agnostic anomaly detection for categorical features.
This layer detects anomalies in categorical features by checking if values belong to a predefined set of valid categories. Values not in this set are considered anomalous.
The layer uses a Keras StringLookup or IntegerLookup layer internally to efficiently map input values to indices, which are then used to determine if a value is valid.
Attributes:
| Name | Type | Description |
|---|---|---|
| dtype | Any | The data type of input values ('string' or 'int32'). |
| lookup | StringLookup \| IntegerLookup \| None | A Keras lookup layer for mapping values to indices. |
| vocabulary | list[str \| int] \| None | List of valid categorical values. |
Example
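A minimal sketch (assuming the import path above; `set_vocabulary` is a hypothetical name for the vocabulary-initialization method documented below):

```python
import numpy as np
from kerasfactory.layers import CategoricalAnomalyDetectionLayer

layer = CategoricalAnomalyDetectionLayer(dtype="string")
layer.set_vocabulary(["red", "green", "blue"])  # hypothetical method name

x = np.array([["red"], ["purple"]])  # "purple" is out-of-vocabulary
outputs = layer(x)  # out-of-vocabulary values are flagged as anomalous
```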
Initializes the layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dtype | str | Data type of input values ('string' or 'int32'). Defaults to 'string'. | 'string' |
| **kwargs | | Additional layer arguments. | {} |
Raises:
| Type | Description |
|---|---|
| ValueError | If dtype is not 'string' or 'int32'. |
Source code in kerasfactory/layers/CategoricalAnomalyDetectionLayer.py
Attributes
property
dtype
Get the dtype of the layer.
Functions
Set the dtype and initialize the appropriate lookup layer.
Source code in kerasfactory/layers/CategoricalAnomalyDetectionLayer.py
Initializes the layer with a vocabulary of valid values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| vocabulary | list[str \| int] | List of valid categorical values. | required |
Source code in kerasfactory/layers/CategoricalAnomalyDetectionLayer.py
compute_output_shape
Compute the output shape of the layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_shape | tuple[int \| None, int] | Input shape tuple. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, tuple[int \| None, int]] | Dictionary mapping output names to their shapes. |
Source code in kerasfactory/layers/CategoricalAnomalyDetectionLayer.py
classmethod
from_config
Create layer from configuration.
Source code in kerasfactory/layers/CategoricalAnomalyDetectionLayer.py