# 📊 DistributionTransformLayer

✅ Stable
🟢 Beginner

## 🎯 Overview

The DistributionTransformLayer automatically transforms numerical features to improve their distribution characteristics, making them more suitable for neural network processing. This layer supports multiple transformation types including log, square root, Box-Cox, Yeo-Johnson, and more, with an intelligent 'auto' mode that selects the best transformation based on data characteristics.

This layer is particularly valuable for preprocessing numerical data where the original distribution may not be optimal for neural network training, such as skewed distributions, heavy-tailed data, or features with varying scales.

πŸ” How It Works

The DistributionTransformLayer processes numerical features through intelligent transformation:

  1. Distribution Analysis: Analyzes input data characteristics (skewness, kurtosis, etc.)
  2. Transformation Selection: Chooses the optimal transformation based on data properties (see the selection sketch after the diagram below)
  3. Parameter Learning: Learns transformation parameters during training
  4. Data Transformation: Applies the selected transformation to normalize the data
  5. Output Generation: Returns transformed features with improved distribution
```mermaid
graph TD
    A[Input Features] --> B[Distribution Analysis]
    B --> C{Transform Type}

    C -->|Auto| D[Best Fit Selection]
    C -->|Manual| E[Specified Transform]

    D --> F[Log Transform]
    D --> G[Box-Cox Transform]
    D --> H[Yeo-Johnson Transform]
    D --> I[Other Transforms]

    E --> F
    E --> G
    E --> H
    E --> I

    F --> J[Transformed Features]
    G --> J
    H --> J
    I --> J

    style A fill:#e6f3ff,stroke:#4a86e8
    style J fill:#e8f5e9,stroke:#66bb6a
    style B fill:#fff9e6,stroke:#ffb74d
    style D fill:#f3e5f5,stroke:#9c27b0
```
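
To make the selection step concrete, here is a minimal, illustrative sketch of how an 'auto' mode can pick a transform from simple moment statistics. This is NumPy-only pseudologic for intuition; the layer's actual selection criteria live in its implementation and may differ.

```python
import numpy as np

def pick_transform(x: np.ndarray) -> str:
    """Illustrative only: choose a transform from simple moment statistics.

    The layer's real 'auto' selection logic may use different criteria.
    """
    x = np.asarray(x, dtype=np.float64)
    mu, sigma = x.mean(), x.std()
    skew = ((x - mu) ** 3).mean() / (sigma**3 + 1e-12)  # sample skewness
    if skew > 1.0:  # strong right skew
        return "log" if x.min() > 0 else "yeo-johnson"
    if skew < -1.0:  # strong left skew
        return "yeo-johnson"
    return "none"

x = np.random.exponential(size=(1000,))
print(pick_transform(x))  # typically 'log' for exponential data
```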

## 💡 Why Use This Layer?

| Challenge | Traditional Approach | DistributionTransformLayer's Solution |
|-----------|----------------------|---------------------------------------|
| Skewed Data | Manual transformation or ignore | 🎯 Automatic detection and transformation of skewed distributions |
| Scale Differences | Manual normalization | ⚡ Intelligent scaling based on data characteristics |
| Distribution Types | One-size-fits-all approach | 🧠 Adaptive transformation for different distribution types |
| Preprocessing Complexity | Manual feature engineering | 🔗 Automated preprocessing with learned parameters |

## 📊 Use Cases

- Financial Data: Transforming skewed financial metrics and ratios
- Medical Data: Normalizing lab values and health measurements
- Sensor Data: Preprocessing IoT and sensor readings
- Survey Data: Transforming rating scales and response distributions
- Time Series: Preprocessing numerical time series features

## 🚀 Quick Start

### Basic Usage

```python
import numpy as np
from kerasfactory.layers import DistributionTransformLayer

# Create sample data with a skewed (exponential) distribution
batch_size, num_features = 32, 10
x = np.random.exponential(size=(batch_size, num_features)).astype("float32")

# Apply automatic transformation
transformer = DistributionTransformLayer(transform_type='auto')
transformed = transformer(x)

print(f"Input shape: {x.shape}")             # (32, 10)
print(f"Output shape: {transformed.shape}")  # (32, 10)
```

### Manual Transformation

```python
# Apply a specific transformation (reusing x from the basic example)
log_transformer = DistributionTransformLayer(transform_type='log')
log_transformed = log_transformer(x)

# Box-Cox transformation
box_cox_transformer = DistributionTransformLayer(
    transform_type='box-cox',
    lambda_param=0.5
)
box_cox_transformed = box_cox_transformer(x)
```

### In a Sequential Model

```python
import keras
from kerasfactory.layers import DistributionTransformLayer

model = keras.Sequential([
    DistributionTransformLayer(transform_type='auto'),  # Preprocess data
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(32, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
```

### In a Functional Model

```python
import keras
from kerasfactory.layers import DistributionTransformLayer

# Define inputs
inputs = keras.Input(shape=(20,))  # 20 numerical features

# Apply distribution transformation
x = DistributionTransformLayer(transform_type='yeo-johnson')(inputs)

# Continue processing
x = keras.layers.Dense(64, activation='relu')(x)
x = keras.layers.Dropout(0.2)(x)
x = keras.layers.Dense(32, activation='relu')(x)
outputs = keras.layers.Dense(1, activation='sigmoid')(x)

model = keras.Model(inputs, outputs)
```

### Advanced Configuration

```python
import keras
from kerasfactory.layers import DistributionTransformLayer

# Advanced configuration with custom parameters
transformer = DistributionTransformLayer(
    transform_type='auto',
    epsilon=1e-8,                    # Custom epsilon for numerical stability
    auto_candidates=['log', 'sqrt', 'box-cox', 'yeo-johnson'],  # Limited candidates
    name="custom_distribution_transform"
)

# Use in a complex preprocessing pipeline
inputs = keras.Input(shape=(50,))

# Multiple transformation strategies
x1 = DistributionTransformLayer(transform_type='log')(inputs)
x2 = DistributionTransformLayer(transform_type='yeo-johnson')(inputs)

# Combine different transformations
x = keras.layers.Concatenate()([x1, x2])
x = keras.layers.Dense(128, activation='relu')(x)
x = keras.layers.Dropout(0.3)(x)
outputs = keras.layers.Dense(5, activation='softmax')(x)

model = keras.Model(inputs, outputs)
```

## 📖 API Reference

### kerasfactory.layers.DistributionTransformLayer

This module implements a DistributionTransformLayer that applies various transformations to make data more normally distributed or to handle specific distribution types better. It's particularly useful for preprocessing data before anomaly detection or other statistical analyses.

Classes

DistributionTransformLayer
```python
DistributionTransformLayer(
    transform_type: str = "none",
    lambda_param: float = 0.0,
    epsilon: float = 1e-10,
    min_value: float = 0.0,
    max_value: float = 1.0,
    clip_values: bool = True,
    auto_candidates: list[str] | None = None,
    name: str | None = None,
    **kwargs: Any
)
```

Layer for transforming data distributions to improve anomaly detection.

This layer applies various transformations to make data more normally distributed or to handle specific distribution types better. Supported transformations include log, square root, Box-Cox, Yeo-Johnson, arcsinh, cube-root, logit, quantile, robust-scale, and min-max.

When transform_type is set to 'auto', the layer automatically selects the most appropriate transformation based on the data characteristics during training.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `transform_type` | `str` | Type of transformation to apply. Options are 'none', 'log', 'sqrt', 'box-cox', 'yeo-johnson', 'arcsinh', 'cube-root', 'logit', 'quantile', 'robust-scale', 'min-max', or 'auto'. | `'none'` |
| `lambda_param` | `float` | Parameter for parameterized transformations like Box-Cox and Yeo-Johnson. | `0.0` |
| `epsilon` | `float` | Small value added to prevent numerical issues like log(0). | `1e-10` |
| `min_value` | `float` | Minimum value for min-max scaling. | `0.0` |
| `max_value` | `float` | Maximum value for min-max scaling. | `1.0` |
| `clip_values` | `bool` | Whether to clip values to the specified range in min-max scaling. | `True` |
| `auto_candidates` | `list[str] \| None` | List of transformation types to consider when transform_type is 'auto'. If None, all available transformations are considered. | `None` |
| `name` | `str \| None` | Optional name for the layer. | `None` |
Input shape

N-D tensor with shape: (batch_size, ..., features)

Output shape

Same shape as input: (batch_size, ..., features)

Example
```python
import numpy as np
from kerasfactory.layers import DistributionTransformLayer

# Create sample input data with a skewed distribution
x = np.random.exponential(size=(32, 10)).astype("float32")  # 32 samples, 10 features

# Apply log transformation
log_transform = DistributionTransformLayer(transform_type="log")
y = log_transform(x)
print("Transformed output shape:", y.shape)  # (32, 10)

# Apply Box-Cox transformation with lambda=0.5
box_cox = DistributionTransformLayer(transform_type="box-cox", lambda_param=0.5)
z = box_cox(x)

# Apply arcsinh transformation (handles both positive and negative values)
arcsinh_transform = DistributionTransformLayer(transform_type="arcsinh")
a = arcsinh_transform(x)

# Apply min-max scaling to range [0, 1]
min_max = DistributionTransformLayer(transform_type="min-max", min_value=0.0, max_value=1.0)
b = min_max(x)

# Use automatic transformation selection
auto_transform = DistributionTransformLayer(transform_type="auto")
c = auto_transform(x)  # Will select the best transformation during training
```

Initialize the DistributionTransformLayer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `transform_type` | `str` | Type of transformation to apply. | `'none'` |
| `lambda_param` | `float` | Lambda parameter for Box-Cox transformation. | `0.0` |
| `epsilon` | `float` | Small value to avoid division by zero. | `1e-10` |
| `min_value` | `float` | Minimum value for clipping. | `0.0` |
| `max_value` | `float` | Maximum value for clipping. | `1.0` |
| `clip_values` | `bool` | Whether to clip values. | `True` |
| `auto_candidates` | `list[str] \| None` | List of candidate transformations for auto mode. | `None` |
| `name` | `str \| None` | Name of the layer. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |
Source code in kerasfactory/layers/DistributionTransformLayer.py
```python
def __init__(
    self,
    transform_type: str = "none",
    lambda_param: float = 0.0,
    epsilon: float = 1e-10,
    min_value: float = 0.0,
    max_value: float = 1.0,
    clip_values: bool = True,
    auto_candidates: list[str] | None = None,
    name: str | None = None,
    **kwargs: Any,
) -> None:
    """Initialize the DistributionTransformLayer.

    Args:
        transform_type: Type of transformation to apply.
        lambda_param: Lambda parameter for Box-Cox transformation.
        epsilon: Small value to avoid division by zero.
        min_value: Minimum value for clipping.
        max_value: Maximum value for clipping.
        clip_values: Whether to clip values.
        auto_candidates: List of candidate transformations for auto mode.
        name: Name of the layer.
        **kwargs: Additional keyword arguments.
    """
    # Set private attributes first
    self._transform_type = transform_type
    self._lambda_param = lambda_param
    self._epsilon = epsilon
    self._min_value = min_value
    self._max_value = max_value
    self._clip_values = clip_values
    self._auto_candidates = auto_candidates

    # Set public attributes BEFORE calling parent's __init__
    self.transform_type = self._transform_type
    self.lambda_param = self._lambda_param
    self.epsilon = self._epsilon
    self.min_value = self._min_value
    self.max_value = self._max_value
    self.clip_values = self._clip_values
    self.auto_candidates = self._auto_candidates

    # Define valid transformations
    self._valid_transforms = [
        "none",
        "log",
        "sqrt",
        "box-cox",
        "yeo-johnson",
        "arcsinh",
        "cube-root",
        "logit",
        "quantile",
        "robust-scale",
        "min-max",
        "auto",
    ]

    # Set default auto candidates if not provided
    if self.auto_candidates is None and self.transform_type == "auto":
        # Exclude 'none' and 'auto' from candidates
        self.auto_candidates = [
            t for t in self._valid_transforms if t not in ["none", "auto"]
        ]

    # Validate parameters
    self._validate_params()

    # Initialize auto-mode variables
    self._selected_transform = None
    self._is_initialized = False

    # Call parent's __init__
    super().__init__(name=name, **kwargs)
```

## 🔧 Parameters Deep Dive

### transform_type (str)

- Purpose: Type of transformation to apply
- Options: 'none', 'log', 'sqrt', 'box-cox', 'yeo-johnson', 'arcsinh', 'cube-root', 'logit', 'quantile', 'robust-scale', 'min-max', 'auto'
- Default: 'none'
- Impact: Determines how the data is transformed
- Recommendation: Use 'auto' for automatic selection and a specific type for known distributions (a quick comparison sketch follows below)
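
For a quick feel of what the options do, the snippet below runs several of the documented transforms on the same positive-valued batch and compares output ranges. It assumes the layer accepts a NumPy batch like any Keras layer; 'box-cox', 'logit', 'quantile', and 'robust-scale' are omitted here because they carry extra parameter or input-range requirements.

```python
import numpy as np
import keras
from kerasfactory.layers import DistributionTransformLayer

# Strictly positive sample so log/sqrt-style transforms are valid
x = (np.abs(np.random.randn(8, 4)) + 0.1).astype("float32")

for t in ["none", "log", "sqrt", "arcsinh", "cube-root", "min-max"]:
    y = keras.ops.convert_to_numpy(DistributionTransformLayer(transform_type=t)(x))
    print(f"{t:10s} min={y.min():+.3f} max={y.max():+.3f}")
```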

### lambda_param (float)

- Purpose: Parameter for the Box-Cox and Yeo-Johnson transformations
- Range: -2.0 to 2.0 (typically 0.0 to 1.0)
- Impact: Controls the strength of the transformation
- Recommendation: Use 0.5 for a moderate transformation and 0.0 for log-like behavior (see the reference formula below)
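
For reference, this is the textbook Box-Cox definition that lambda_param parameterizes, as a NumPy sketch independent of the layer's internals: y = (x^λ − 1)/λ for λ ≠ 0, and y = log(x) at λ = 0.

```python
import numpy as np

def box_cox(x: np.ndarray, lam: float, eps: float = 1e-10) -> np.ndarray:
    """Textbook Box-Cox: (x**lam - 1) / lam for lam != 0, log(x) for lam == 0."""
    x = np.asarray(x, dtype=np.float64) + eps  # Box-Cox requires positive inputs
    if lam == 0.0:
        return np.log(x)
    return (x**lam - 1.0) / lam

x = np.random.exponential(size=(1000,))
print(np.allclose(box_cox(x, 0.0), np.log(x + 1e-10)))  # True: lambda=0 is exactly log
```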

### epsilon (float)

- Purpose: Small value to prevent numerical issues
- Range: 1e-10 to 1e-6
- Impact: Prevents log(0) and division-by-zero errors
- Recommendation: Use 1e-8 for most cases and 1e-10 when features contain very small values (see the demonstration below)
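
The effect of epsilon is easy to see on data containing zeros:

```python
import numpy as np

x = np.array([0.0, 1.0, 10.0])
with np.errstate(divide="ignore"):
    print(np.log(x))       # [-inf 0. 2.303...]: the failure epsilon guards against
print(np.log(x + 1e-8))    # finite everywhere; larger values barely change
```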

## 📈 Performance Characteristics

- Speed: ⚡⚡⚡⚡ Very fast; simple element-wise mathematical transformations
- Memory: 💾💾 Low memory usage; minimal additional parameters
- Accuracy: 🎯🎯🎯🎯 Excellent for improving data distribution characteristics
- Best For: Numerical data with skewed or non-normal distributions

## 🎨 Examples

### Example 1: Financial Data Preprocessing

```python
import keras
import numpy as np
from kerasfactory.layers import DistributionTransformLayer

# Simulate financial data with different distributions
batch_size = 1000

# Income data (log-normal distribution)
income = np.random.lognormal(mean=10, sigma=1, size=(batch_size, 1))

# Age data (normal distribution)
age = np.random.normal(50, 15, size=(batch_size, 1))

# Debt ratio (beta distribution)
debt_ratio = np.random.beta(2, 5, size=(batch_size, 1))

# Combine features
financial_data = np.concatenate([income, age, debt_ratio], axis=1)

# Build preprocessing model
inputs = keras.Input(shape=(3,))

# Apply different transformations for different features
income_transformed = DistributionTransformLayer(transform_type='log')(inputs[:, :1])
age_transformed = DistributionTransformLayer(transform_type='none')(inputs[:, 1:2])
debt_transformed = DistributionTransformLayer(transform_type='logit')(inputs[:, 2:3])

# Combine transformed features
x = keras.layers.Concatenate()([income_transformed, age_transformed, debt_transformed])
x = keras.layers.Dense(32, activation='relu')(x)
x = keras.layers.Dropout(0.2)(x)
output = keras.layers.Dense(1, activation='sigmoid')(x)

model = keras.Model(inputs, output)
model.compile(optimizer='adam', loss='binary_crossentropy')
```

### Example 2: Sensor Data Preprocessing

```python
import keras
from kerasfactory.layers import DistributionTransformLayer

# Preprocess IoT sensor data with automatic transformation
def create_sensor_model():
    inputs = keras.Input(shape=(10,))  # 10 sensor readings

    # Automatic transformation selection
    x = DistributionTransformLayer(transform_type='auto')(inputs)

    # Additional preprocessing
    x = keras.layers.BatchNormalization()(x)
    x = keras.layers.Dense(64, activation='relu')(x)
    x = keras.layers.Dropout(0.3)(x)

    # Multiple outputs
    anomaly_score = keras.layers.Dense(1, activation='sigmoid', name='anomaly')(x)
    sensor_health = keras.layers.Dense(3, activation='softmax', name='health')(x)

    return keras.Model(inputs, [anomaly_score, sensor_health])

model = create_sensor_model()
model.compile(
    optimizer='adam',
    loss={'anomaly': 'binary_crossentropy', 'health': 'categorical_crossentropy'},
    loss_weights={'anomaly': 1.0, 'health': 0.5}
)
```

### Example 3: Survey Data Analysis

```python
import keras
from kerasfactory.layers import DistributionTransformLayer

# Process survey data with different response scales
def create_survey_model():
    inputs = keras.Input(shape=(15,))  # 15 survey questions

    # Different transformations for different question types
    # Likert scale (1-5) - no transformation needed
    likert_questions = inputs[:, :5]

    # Rating scale (0-10) - min-max scaling
    rating_questions = DistributionTransformLayer(transform_type='min-max')(inputs[:, 5:10])

    # Open-ended numerical - log transformation
    numerical_questions = DistributionTransformLayer(transform_type='log')(inputs[:, 10:15])

    # Combine all features
    x = keras.layers.Concatenate()([likert_questions, rating_questions, numerical_questions])
    x = keras.layers.Dense(64, activation='relu')(x)
    x = keras.layers.Dropout(0.2)(x)
    x = keras.layers.Dense(32, activation='relu')(x)

    # Survey analysis outputs
    satisfaction = keras.layers.Dense(1, activation='sigmoid', name='satisfaction')(x)
    category = keras.layers.Dense(5, activation='softmax', name='category')(x)

    return keras.Model(inputs, [satisfaction, category])

model = create_survey_model()
model.compile(
    optimizer='adam',
    loss={'satisfaction': 'binary_crossentropy', 'category': 'categorical_crossentropy'},
    loss_weights={'satisfaction': 1.0, 'category': 0.3}
)
```

## 💡 Tips & Best Practices

- Auto Mode: Use 'auto' for unknown distributions and specific types for known patterns
- Data Validation: Check for negative values before applying log transformations
- Epsilon Tuning: Adjust epsilon based on your data's numerical precision
- Feature-Specific: Apply different transformations to different feature types
- Monitoring: Track transformation effects on model performance
- Inverse Transform: Consider whether you need to inverse transform predictions back to the original scale (see the sketch below)
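
If you do need predictions back on the original scale, you generally have to invert the transform yourself; nothing here states that the layer exposes an inverse. A manual inverse for the simple min-max case might look like the hypothetical helper below.

```python
import numpy as np

def inverse_min_max(y, data_min, data_max, min_value=0.0, max_value=1.0):
    # Hypothetical helper (not part of the layer's API): undo a min-max
    # scaling to [min_value, max_value], given the original feature's range.
    scale = (data_max - data_min) / (max_value - min_value)
    return (y - min_value) * scale + data_min

x = np.random.uniform(5.0, 20.0, size=100)
y = (x - x.min()) / (x.max() - x.min())  # forward min-max to [0, 1]
print(np.allclose(inverse_min_max(y, x.min(), x.max()), x))  # True
```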

## ⚠️ Common Pitfalls

- Negative Values: Log and sqrt transformations require non-negative values (demonstrated below)
- Zero Values: Use an appropriate epsilon to handle zero values
- Overfitting: Don't over-transform; sometimes the original distribution is fine
- Interpretability: Transformed features may be harder to interpret
- Inverse Transform: Remember to inverse transform predictions if you report them on the original scale
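
To illustrate the negative-value pitfall: log is undefined on non-positive inputs, while arcsinh is defined on the whole real line and behaves log-like in the tails, which makes it a safer default for signed data.

```python
import numpy as np

x = np.array([-5.0, -0.5, 0.0, 0.5, 5.0])

with np.errstate(invalid="ignore", divide="ignore"):
    print(np.log(x))     # [nan nan -inf -0.693 1.609]: log breaks on x <= 0

print(np.arcsinh(x))     # defined everywhere, symmetric around zero
```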

## 📚 Further Reading