# DistributionTransformLayer
## Overview
The DistributionTransformLayer automatically transforms numerical features to improve their distribution characteristics, making them more suitable for neural network processing. This layer supports multiple transformation types including log, square root, Box-Cox, Yeo-Johnson, and more, with an intelligent 'auto' mode that selects the best transformation based on data characteristics.
This layer is particularly valuable for preprocessing numerical data where the original distribution may not be optimal for neural network training, such as skewed distributions, heavy-tailed data, or features with varying scales.
## How It Works
The DistributionTransformLayer processes numerical features through intelligent transformation:
- Distribution Analysis: Analyzes input data characteristics (skewness, kurtosis, etc.)
- Transformation Selection: Chooses optimal transformation based on data properties
- Parameter Learning: Learns transformation parameters during training
- Data Transformation: Applies the selected transformation to normalize the data
- Output Generation: Returns transformed features with improved distribution
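The selection step above can be sketched in plain Python. The skewness statistic is standard, but the thresholds and the fallback to Yeo-Johnson for non-positive data are illustrative assumptions about how an 'auto' mode might choose, not the layer's actual internals:

```python
import math

def skewness(xs: list[float]) -> float:
    """Sample skewness: E[(x - mean)^3] / std^3."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    if std == 0.0:
        return 0.0
    return sum((x - mean) ** 3 for x in xs) / (n * std ** 3)

def select_transform(xs: list[float]) -> str:
    """Illustrative 'auto' selection: pick a transform from the data's shape."""
    if any(x <= 0 for x in xs):
        return "yeo-johnson"  # the only family here that accepts x <= 0
    skew = skewness(xs)
    if skew > 1.0:
        return "log"   # strong right skew: compress the tail hard
    if skew > 0.5:
        return "sqrt"  # moderate right skew: compress gently
    return "none"

# A strongly right-skewed positive feature selects 'log'
print(select_transform([1.0, 1.2, 1.1, 0.9, 1.3, 50.0, 80.0]))
```

A real implementation would compare candidate transforms on the data rather than hard-code thresholds, but the flow (analyze, then branch) is the same.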
```mermaid
graph TD
    A[Input Features] --> B[Distribution Analysis]
    B --> C{Transform Type}
    C -->|Auto| D[Best Fit Selection]
    C -->|Manual| E[Specified Transform]
    D --> F[Log Transform]
    D --> G[Box-Cox Transform]
    D --> H[Yeo-Johnson Transform]
    D --> I[Other Transforms]
    E --> F
    E --> G
    E --> H
    E --> I
    F --> J[Transformed Features]
    G --> J
    H --> J
    I --> J

    style A fill:#e6f3ff,stroke:#4a86e8
    style J fill:#e8f5e9,stroke:#66bb6a
    style B fill:#fff9e6,stroke:#ffb74d
    style D fill:#f3e5f5,stroke:#9c27b0
```
## Why Use This Layer?
| Challenge | Traditional Approach | DistributionTransformLayer's Solution |
|---|---|---|
| Skewed data | Manual transformation, or ignoring the skew | Automatic detection and transformation of skewed distributions |
| Scale differences | Manual normalization | Intelligent scaling based on data characteristics |
| Distribution types | One-size-fits-all approach | Adaptive transformation for different distribution types |
| Preprocessing complexity | Manual feature engineering | Automated preprocessing with learned parameters |
## Use Cases
- Financial Data: Transforming skewed financial metrics and ratios
- Medical Data: Normalizing lab values and health measurements
- Sensor Data: Preprocessing IoT and sensor readings
- Survey Data: Transforming rating scales and response distributions
- Time Series: Preprocessing numerical time series features
## Quick Start

### Basic Usage
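As a stand-in for a basic-usage snippet, here is a pure-Python sketch of the arithmetic behind `transform_type='log'`, using the layer's documented `epsilon` default to guard `log(0)`. The function name is invented for illustration; the layer applies the same operation element-wise to tensors.

```python
import math

EPSILON = 1e-10  # the layer's documented default

def log_transform(xs: list[float], epsilon: float = EPSILON) -> list[float]:
    """log(x + epsilon); epsilon keeps x == 0 from producing -inf."""
    return [math.log(x + epsilon) for x in xs]

# Skewed positive values become far more evenly spaced after the transform
print(log_transform([0.0, 1.0, 10.0, 100.0]))
```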
### Manual Transformation
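When choosing a transformation manually, it helps to see the formula that `lambda_param` controls. This is a scalar sketch of the textbook Box-Cox transform, valid only for x > 0; the layer's exact implementation is not reproduced here.

```python
import math

def box_cox(x: float, lam: float, epsilon: float = 1e-10) -> float:
    """Box-Cox for x > 0: (x^lam - 1) / lam, reducing to log(x) as lam -> 0."""
    if abs(lam) < epsilon:
        return math.log(x)
    return (x ** lam - 1.0) / lam

# lam = 0.5 behaves like a shifted sqrt; lam = 0 reduces to the log
print(box_cox(4.0, 0.5))      # (2 - 1) / 0.5 = 2.0
print(box_cox(math.e, 0.0))   # log(e) = 1.0
```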
### In a Sequential Model
### In a Functional Model
### Advanced Configuration
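One configurable piece worth seeing in isolation is min-max scaling, since four parameters (`min_value`, `max_value`, `clip_values`, `epsilon`) interact there. This plain-Python sketch assumes the layer rescales by the observed minimum and maximum; it is illustrative, not the layer's code.

```python
def min_max_scale(
    xs: list[float],
    min_value: float = 0.0,
    max_value: float = 1.0,
    clip_values: bool = True,
    epsilon: float = 1e-10,
) -> list[float]:
    """Rescale xs into [min_value, max_value]; optionally clip to that range."""
    lo, hi = min(xs), max(xs)
    span = (hi - lo) + epsilon  # epsilon guards against a constant feature
    scaled = [min_value + (x - lo) / span * (max_value - min_value) for x in xs]
    if clip_values:
        scaled = [min(max(s, min_value), max_value) for s in scaled]
    return scaled

print(min_max_scale([10.0, 20.0, 30.0]))
```

Clipping matters at inference time: values outside the range seen during training would otherwise land outside `[min_value, max_value]`.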
## API Reference

### `kerasfactory.layers.DistributionTransformLayer`
This module implements a DistributionTransformLayer that applies various transformations to make data more normally distributed or to handle specific distribution types better. It's particularly useful for preprocessing data before anomaly detection or other statistical analyses.
#### Classes

##### `DistributionTransformLayer`
Layer for transforming data distributions to improve anomaly detection.
This layer applies various transformations to make data more normally distributed or to handle specific distribution types better. Supported transformations include log, square root, Box-Cox, Yeo-Johnson, arcsinh, cube-root, logit, quantile, robust-scale, and min-max.
When transform_type is set to 'auto', the layer automatically selects the most appropriate transformation based on the data characteristics during training.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `transform_type` | `str` | Type of transformation to apply. Options are `'none'`, `'log'`, `'sqrt'`, `'box-cox'`, `'yeo-johnson'`, `'arcsinh'`, `'cube-root'`, `'logit'`, `'quantile'`, `'robust-scale'`, `'min-max'`, or `'auto'`. | `'none'` |
| `lambda_param` | `float` | Parameter for parameterized transformations such as Box-Cox and Yeo-Johnson. | `0.0` |
| `epsilon` | `float` | Small value added to prevent numerical issues such as `log(0)`. | `1e-10` |
| `min_value` | `float` | Minimum value for min-max scaling. | `0.0` |
| `max_value` | `float` | Maximum value for min-max scaling. | `1.0` |
| `clip_values` | `bool` | Whether to clip values to the specified range in min-max scaling. | `True` |
| `auto_candidates` | `list[str]` or `None` | List of transformation types to consider when `transform_type` is `'auto'`. If `None`, all available transformations are considered. | `None` |
| `name` | `str` or `None` | Optional name for the layer. | `None` |
Input shape: N-D tensor with shape `(batch_size, ..., features)`

Output shape: same shape as input, `(batch_size, ..., features)`
Initialize the DistributionTransformLayer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `transform_type` | `str` | Type of transformation to apply. | `'none'` |
| `lambda_param` | `float` | Lambda parameter for Box-Cox transformation. | `0.0` |
| `epsilon` | `float` | Small value to avoid division by zero. | `1e-10` |
| `min_value` | `float` | Minimum value for clipping. | `0.0` |
| `max_value` | `float` | Maximum value for clipping. | `1.0` |
| `clip_values` | `bool` | Whether to clip values. | `True` |
| `auto_candidates` | `list[str]` or `None` | List of candidate transformations for auto mode. | `None` |
| `name` | `str` or `None` | Name of the layer. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |
Source code in `kerasfactory/layers/DistributionTransformLayer.py`
## Parameters Deep Dive
### `transform_type` (str)
- Purpose: Type of transformation to apply
- Options: 'none', 'log', 'sqrt', 'box-cox', 'yeo-johnson', 'arcsinh', 'cube-root', 'logit', 'quantile', 'robust-scale', 'min-max', 'auto'
- Default: 'none'
- Impact: Determines how the data is transformed
- Recommendation: Use 'auto' for automatic selection, specific types for known distributions
### `lambda_param` (float)
- Purpose: Parameter for Box-Cox and Yeo-Johnson transformations
- Range: -2.0 to 2.0 (typically 0.0 to 1.0)
- Impact: Controls the strength of the transformation
- Recommendation: Use 0.5 for moderate transformation, 0.0 for log-like behavior
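`lambda_param` also drives the Yeo-Johnson transform, which, unlike Box-Cox, is defined for zero and negative inputs. This is a scalar sketch of the textbook formula; the layer's own code may differ in details:

```python
import math

def yeo_johnson(x: float, lam: float) -> float:
    """Yeo-Johnson transform; accepts negative x, unlike Box-Cox."""
    if x >= 0:
        if abs(lam) < 1e-10:
            return math.log1p(x)             # lam = 0 branch
        return ((x + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-10:
        return -math.log1p(-x)               # lam = 2 branch for negatives
    return -(((-x + 1.0) ** (2.0 - lam) - 1.0) / (2.0 - lam))

# lam = 1 leaves values unchanged on both branches
print(yeo_johnson(3.0, 1.0))    # -> 3.0
print(yeo_johnson(-3.0, 1.0))   # -> -3.0
```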
### `epsilon` (float)
- Purpose: Small value to prevent numerical issues
- Range: 1e-10 to 1e-6
- Impact: Prevents log(0) and division by zero errors
- Recommendation: Use 1e-8 for most cases, 1e-10 for very small values
## Performance Characteristics

- Speed: Very fast; simple element-wise mathematical transformations
- Memory: Low; the layer adds minimal parameters
- Accuracy: Excellent at improving data distribution characteristics
- Best for: Numerical data with skewed or non-normal distributions
## Examples
### Example 1: Financial Data Preprocessing
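A plain-Python sketch of the idea behind `robust-scale`, a natural fit for heavy-tailed financial features: center on the median and divide by the interquartile range, so a single extreme value barely moves the bulk of the data. The quartile estimates below are deliberately crude and purely illustrative.

```python
def robust_scale(xs: list[float], epsilon: float = 1e-10) -> list[float]:
    """(x - median) / IQR, with crude order-statistic quartiles."""
    s = sorted(xs)
    n = len(s)
    median = s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2.0
    iqr = (s[(3 * n) // 4] - s[n // 4]) + epsilon  # epsilon guards constant data
    return [(x - median) / iqr for x in xs]

# One extreme revenue figure: the bulk still lands near zero after scaling
revenues = [1.0, 1.1, 0.9, 1.2, 1.0, 250.0]
print(robust_scale(revenues))
```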
### Example 2: Sensor Data Preprocessing
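A plain-Python sketch of why `arcsinh` suits sensor streams: it compresses large magnitudes like a log but, unlike `log` or `sqrt`, accepts zero and negative readings while preserving their sign.

```python
import math

def arcsinh_transform(xs: list[float]) -> list[float]:
    """asinh(x) = log(x + sqrt(x*x + 1)); log-like for |x| >> 1, ~linear near 0."""
    return [math.asinh(x) for x in xs]

# Signed sensor deltas: large spikes are tamed, sign and order are preserved
print(arcsinh_transform([-1000.0, -1.0, 0.0, 1.0, 1000.0]))
```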
### Example 3: Survey Data Analysis
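A plain-Python sketch of the `logit` option for bounded survey responses: ratings are first rescaled into [0, 1], then mapped onto the unbounded real line, with `epsilon` keeping the endpoints away from exact 0 and 1. Helper names are illustrative, not part of the kerasfactory API.

```python
import math

def logit_transform(ps: list[float], epsilon: float = 1e-10) -> list[float]:
    """log(p / (1 - p)) after nudging p into the open interval (0, 1)."""
    out = []
    for p in ps:
        p = min(max(p, epsilon), 1.0 - epsilon)  # avoid log(0) at the endpoints
        out.append(math.log(p / (1.0 - p)))
    return out

# 1-5 star ratings rescaled to [0, 1] before applying the logit
proportions = [(r - 1) / 4 for r in [1, 2, 3, 4, 5]]
print(logit_transform(proportions))
```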
## Tips & Best Practices
- Auto Mode: Use 'auto' for unknown distributions, specific types for known patterns
- Data Validation: Check for negative values before applying log transformations
- Epsilon Tuning: Adjust epsilon based on your data's numerical precision
- Feature-Specific: Apply different transformations to different feature types
- Monitoring: Track transformation effects on model performance
- Inverse Transform: Consider if you need to inverse transform predictions
## Common Pitfalls
- Negative Values: Log and sqrt transformations require non-negative values
- Zero Values: Use appropriate epsilon to handle zero values
- Overfitting: Don't over-transform - sometimes original distributions are fine
- Interpretability: Transformed features may be harder to interpret
- Inverse Transform: Remember to inverse transform if needed for predictions
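The inverse-transform pitfall in concrete form: if a target was log-transformed before training, predictions must be mapped back with the exact inverse, including the same `epsilon`. The helper names here are invented for illustration.

```python
import math

EPSILON = 1e-10

def log_transform(x: float, epsilon: float = EPSILON) -> float:
    return math.log(x + epsilon)

def inverse_log_transform(y: float, epsilon: float = EPSILON) -> float:
    """exp(y) - epsilon undoes log_transform, restoring the original scale."""
    return math.exp(y) - epsilon

# Round-trip: a transformed target comes back at its original scale
price = 129.99
restored = inverse_log_transform(log_transform(price))
print(restored)
```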
## Related Layers
- DistributionAwareEncoder - Distribution-aware feature encoding
- AdvancedNumericalEmbedding - Advanced numerical embeddings
- DifferentiableTabularPreprocessor - End-to-end preprocessing
- CastToFloat32Layer - Type casting utility
## Further Reading
- Box-Cox Transformation - Box-Cox transformation details
- Yeo-Johnson Transformation - Yeo-Johnson transformation
- Data Preprocessing in Machine Learning - Data preprocessing concepts
- KerasFactory Layer Explorer - Browse all available layers
- Data Preprocessing Tutorial - Complete guide to data preprocessing