🧩 Layers - Complete Reference & Explorer
Build sophisticated tabular models with advanced attention, feature engineering, and preprocessing layers.
🎯 Why Use KerasFactory Layers?
| Challenge | Traditional Approach | KerasFactory's Solution |
|---|---|---|
| 🔗 Feature Interactions | Manual feature crosses | 👁️ Tabular Attention - Automatic relationship discovery |
| 🏷️ Mixed Feature Types | Uniform processing | 🧩 Feature-wise Layers - Specialized processing per feature |
| 📊 Complex Distributions | Fixed strategies | 📊 Distribution-Aware Encoding - Adaptive transformations |
| ⚡ Performance Optimization | Post-hoc analysis | 🎯 Built-in Selection - Learned during training |
| 🔒 Production Readiness | Extra tooling needed | ✅ Battle-Tested - Used in production models |
✨ Key Features
Attention Mechanisms
Automatically discover feature relationships and sample importance with advanced attention layers.
Feature-wise Processing
Each feature receives specialized processing through mixture-of-experts routing and dedicated per-feature layers.
Distribution-Aware
Automatically adapt to different distributions with intelligent encoding and transformations.
Performance Ready
Optimized for production with built-in regularization and efficient memory usage.
Built-in Optimization
Learn which features matter during training, not afterward, with integrated feature selection.
Production Proven
Battle-tested in real-world ML pipelines with comprehensive testing and documentation.
📚 All Layers by Category
⏱️ Time Series & Forecasting (16 layers)
Specialized layers for time series forecasting, decomposition, and feature extraction with multi-scale pattern recognition.
- PositionalEmbedding - Sinusoidal positional encoding for sequence models
- FixedEmbedding - Non-trainable embeddings for temporal indices (months, days, hours)
- TokenEmbedding - 1D convolution-based embedding for time series values
- TemporalEmbedding - Embedding layer for temporal features (month, day, weekday, hour, minute)
- DataEmbeddingWithoutPosition - Combined token and temporal embedding for comprehensive features
- MovingAverage - Trend extraction using moving average filtering
- SeriesDecomposition - Trend-seasonal decomposition using moving average
- DFTSeriesDecomposition - Frequency-based decomposition using Discrete Fourier Transform
- ReversibleInstanceNorm - Reversible instance normalization with optional denormalization
- ReversibleInstanceNormMultivariate - Multivariate reversible instance normalization
- MultiScaleSeasonMixing - Bottom-up multi-scale seasonal pattern mixing
- MultiScaleTrendMixing - Top-down multi-scale trend pattern mixing
- PastDecomposableMixing - Decomposable mixing encoder combining decomposition and multi-scale mixing
- TemporalMixing - MLP-based temporal mixing for TSMixer architecture
- FeatureMixing - Feed-forward feature mixing for cross-series correlations
- MixingLayer - Core mixing block combining temporal and feature mixing
🧠 Attention Mechanisms (6 layers)
Advanced attention layers for capturing complex feature relationships and dependencies in tabular data.
- TabularAttention - Dual attention mechanism for inter-feature and inter-sample relationships
- MultiResolutionTabularAttention - Multi-resolution attention for different feature scales
- InterpretableMultiHeadAttention - Multi-head attention with explainability features
- TransformerBlock - Standard transformer block with self-attention and feed-forward
- ColumnAttention - Column-wise attention for feature relationships
- RowAttention - Row-wise attention for sample relationships
🔧 Data Preprocessing & Transformation (9 layers)
Essential preprocessing layers for data cleaning, transformation, and preparation for optimal model performance.
- DifferentiableTabularPreprocessor - End-to-end differentiable preprocessing with learnable imputation
- DifferentialPreprocessingLayer - Multiple candidate transformations with learnable combination
- DateParsingLayer - Flexible date parsing from various formats
- DateEncodingLayer - Cyclical date feature encoding
- SeasonLayer - Seasonal feature extraction
- DistributionTransformLayer - Automatic distribution transformation
- DistributionAwareEncoder - Distribution-aware feature encoding
- CastToFloat32Layer - Type casting utility
- AdvancedNumericalEmbedding - Advanced numerical embedding with dual-branch architecture
⚙️ Feature Engineering & Selection (5 layers)
Intelligent feature engineering and selection layers for identifying important features and creating powerful representations.
- VariableSelection - Intelligent variable selection using gated residual networks
- GatedFeatureSelection - Learnable feature selection with gating
- GatedFeatureFusion - Gated mechanism for feature fusion
- SparseAttentionWeighting - Sparse attention for efficient computation
- FeatureCutout - Feature cutout for data augmentation and regularization
🏗️ Specialized Architectures (8 layers)
Advanced specialized layers for specific use cases including gated networks, boosting, business rules, and ensemble methods.
- GatedResidualNetwork - Gated residual network with improved gradient flow
- GatedLinearUnit - Gated linear transformation
- TabularMoELayer - Mixture of Experts for adaptive expert selection
- BoostingBlock - Gradient boosting inspired neural block
- BoostingEnsembleLayer - Ensemble of boosting blocks
- BusinessRulesLayer - Domain-specific business rules integration
- StochasticDepth - Stochastic depth regularization
- SlowNetwork - Careful feature processing with controlled information flow
🛠️ Utility & Graph Processing (6 layers)
Essential utility layers for data processing, graph operations, and anomaly detection.
- GraphFeatureAggregation - Graph feature aggregation for relational learning
- AdvancedGraphFeatureLayer - Advanced graph feature processing
- MultiHeadGraphFeaturePreprocessor - Multi-head graph preprocessing
- NumericalAnomalyDetection - Statistical anomaly detection for numerical features
- CategoricalAnomalyDetectionLayer - Pattern-based anomaly detection for categorical features
- HyperZZWOperator - Hyperparameter-aware operator for adaptive behavior
📋 Complete API Reference
⏱️ Time Series & Forecasting (16 layers)
Specialized layers for time series analysis, forecasting, and pattern recognition with advanced decomposition and mixing strategies.
📍 PositionalEmbedding
Sinusoidal positional encoding for sequence models and transformers.
Use when: You need position information in transformer models
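For reference, the classic sinusoidal scheme behind this kind of layer can be sketched in a few lines of NumPy. This shows the standard formulation (Vaswani et al., 2017), not necessarily KerasFactory's exact implementation:

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Classic sinusoidal encoding; assumes an even d_model."""
    positions = np.arange(seq_len)[:, None]                        # (seq_len, 1)
    div_terms = np.exp(np.arange(0, d_model, 2) * (-np.log(10000.0) / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(positions * div_terms)                    # even dimensions
    pe[:, 1::2] = np.cos(positions * div_terms)                    # odd dimensions
    return pe
```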
🔧 FixedEmbedding
Non-trainable sinusoidal embeddings for temporal indices (months, days, hours).
Use when: You want fixed cyclical embeddings for temporal features
🎫 TokenEmbedding
1D convolution-based embedding for time series values.
Use when: You need learnable embeddings for raw time series values
⏰ TemporalEmbedding
Embedding layer for temporal features like month, day, weekday, hour, minute.
Use when: You have temporal feature information to encode
🎯 DataEmbeddingWithoutPosition
Combined token and temporal embedding for comprehensive feature representation.
Use when: You want unified embeddings for both values and temporal features
🏃 MovingAverage
Trend extraction using moving average filtering for time series.
Use when: You need to separate trends from seasonal components
🔀 SeriesDecomposition
Trend-seasonal decomposition using moving average filtering.
Use when: You want explicit decomposition of time series components
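The core idea: smooth the series with a moving average to obtain the trend, and treat the residual as the seasonal part. A minimal Keras sketch (the padding scheme, default kernel size, and function signature here are illustrative, not the layer's actual API):

```python
import keras
from keras import ops

def moving_average_decompose(series, kernel_size=25):
    """series: (batch, time, channels). Returns (seasonal, trend)."""
    # Replicate edge values so the smoothed trend keeps the original length.
    pad_front = (kernel_size - 1) // 2
    pad_back = kernel_size - 1 - pad_front
    front = ops.repeat(series[:, :1, :], pad_front, axis=1)
    back = ops.repeat(series[:, -1:, :], pad_back, axis=1)
    padded = ops.concatenate([front, series, back], axis=1)
    trend = keras.layers.AveragePooling1D(kernel_size, strides=1)(padded)
    seasonal = series - trend          # residual = seasonal component
    return seasonal, trend
```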
📊 DFTSeriesDecomposition
Frequency-based series decomposition using Discrete Fourier Transform.
Use when: You prefer frequency-domain decomposition
🔄 ReversibleInstanceNorm
Reversible instance normalization with optional denormalization for time series.
Use when: You need reversible normalization for stable training
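Conceptually, each series is normalized by its own statistics on the way in, and those same statistics invert the transform on the way out. A sketch of the core mechanism (omitting any learnable affine parameters the real layer may carry):

```python
from keras import ops

def rev_in_normalize(x, eps=1e-5):
    """x: (batch, time, channels). Normalize each instance over time and
    return the statistics needed to invert the transform later."""
    mean = ops.mean(x, axis=1, keepdims=True)
    std = ops.sqrt(ops.var(x, axis=1, keepdims=True) + eps)
    return (x - mean) / std, mean, std

def rev_in_denormalize(y, mean, std):
    """Map model outputs (e.g. forecasts) back to the original scale."""
    return y * std + mean
```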
🏗️ ReversibleInstanceNormMultivariate
Multivariate version of reversible instance normalization.
Use when: You have multivariate time series data
🌊 MultiScaleSeasonMixing
Bottom-up multi-scale seasonal pattern mixing with hierarchical aggregation.
Use when: You want to capture seasonal patterns at multiple scales
📈 MultiScaleTrendMixing
Top-down multi-scale trend pattern mixing with hierarchical decomposition.
Use when: You want to capture trend patterns at multiple scales
🔀 PastDecomposableMixing
Past decomposable mixing encoder combining decomposition and multi-scale mixing.
Use when: You need comprehensive decomposition with multi-scale mixing
⏱️ TemporalMixing
MLP-based temporal mixing for TSMixer that applies transformations across time.
Use when: You want lightweight temporal pattern learning
🔀 FeatureMixing
Feed-forward feature mixing that learns cross-series correlations.
Use when: You want to capture dependencies between time series
🔀 MixingLayer
Core mixing block combining TemporalMixing and FeatureMixing for TSMixer.
Use when: You need dual-perspective temporal and feature learning
🎯 Feature Selection & Gating (5 layers)
Layers for dynamic feature selection, gating mechanisms, and feature fusion.
🔀 VariableSelection
Dynamic feature selection using gated residual networks with optional context conditioning.
Use when: You need automatic feature importance learning during training
🚪 GatedFeatureSelection
Feature selection layer using gating mechanisms for conditional feature routing.
Use when: You want learnable adaptive feature importance
🌊 GatedFeatureFusion
Combines and fuses features using gated mechanisms for adaptive integration.
Use when: You need to intelligently combine multiple feature representations
📍 GatedLinearUnit
Gated linear transformation for controlling information flow.
Use when: You need selective information flow in your model
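The standard mechanism is GLU(x) = (xW + b) * sigmoid(xV + c): a sigmoid gate scales a linear projection element-wise. A minimal sketch of that mechanism, not KerasFactory's exact API:

```python
import keras

class SimpleGLU(keras.layers.Layer):
    """Gated linear unit: output = linear(x) * sigmoid(gate(x))."""
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.linear = keras.layers.Dense(units)                      # value path
        self.gate = keras.layers.Dense(units, activation="sigmoid")  # gate path

    def call(self, inputs):
        # The gate decides, per unit, how much information passes through.
        return self.linear(inputs) * self.gate(inputs)
```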
🔗 GatedResidualNetwork
Gated residual network architecture with improved gradient flow.
Use when: You need robust feature processing with residual connections
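A sketch in the spirit of the Temporal Fusion Transformer's gated residual network (Lim et al.); KerasFactory's layer may differ in detail:

```python
import keras

class SimpleGRN(keras.layers.Layer):
    """Dense -> dense -> gated output, added to a projected skip connection."""
    def __init__(self, units, dropout=0.1, **kwargs):
        super().__init__(**kwargs)
        self.hidden = keras.layers.Dense(units, activation="elu")
        self.transform = keras.layers.Dense(units)
        self.dropout = keras.layers.Dropout(dropout)
        self.gate = keras.layers.Dense(units, activation="sigmoid")
        self.skip = keras.layers.Dense(units)  # project residual to `units`
        self.norm = keras.layers.LayerNormalization()

    def call(self, inputs, training=None):
        h = self.dropout(self.transform(self.hidden(inputs)), training=training)
        # The gate controls how much of the transformed path is kept;
        # the skip connection preserves gradient flow.
        return self.norm(self.skip(inputs) + self.gate(h) * h)
```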
👁️ Attention Mechanisms (6 layers)
Advanced attention layers for capturing complex feature and sample relationships.
🎯 TabularAttention
Dual attention mechanism for inter-feature and inter-sample relationships.
Use when: You have complex feature interactions to discover
📊 MultiResolutionTabularAttention
Multi-resolution attention for numerical and categorical features.
Use when: You have mixed feature types needing different processing
🔍 InterpretableMultiHeadAttention
Multi-head attention with explainability features.
Use when: You need to understand attention patterns
🧠 TransformerBlock
Complete transformer block with self-attention and feed-forward.
Use when: You want standard transformer architecture for tabular data
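The usual pattern is self-attention followed by a feed-forward network, each wrapped in a residual connection and layer normalization. A sketch built from stock Keras layers (the sizes are illustrative, not the layer's defaults):

```python
import keras

def transformer_block(x, num_heads=4, key_dim=16, ff_dim=64, dropout=0.1):
    """x: (batch, seq, features). Returns a tensor of the same shape."""
    attn = keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=key_dim)(x, x)
    attn = keras.layers.Dropout(dropout)(attn)
    x = keras.layers.LayerNormalization()(x + attn)     # residual + norm
    ff = keras.layers.Dense(ff_dim, activation="relu")(x)
    ff = keras.layers.Dense(x.shape[-1])(ff)            # project back to input width
    return keras.layers.LayerNormalization()(x + ff)    # residual + norm
```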
📌 ColumnAttention
Column-wise attention for feature relationships.
Use when: You want to focus on feature-level interactions
📍 RowAttention
Row-wise attention for sample relationships.
Use when: You want to capture sample-level patterns
📊 Data Preprocessing & Transformation (9 layers)
Essential preprocessing layers for data preparation and transformation.
🔄 DistributionTransformLayer
Automatic distribution transformation for improved analysis.
Use when: You have skewed distributions that need normalization
🎓 DistributionAwareEncoder
Distribution-aware feature encoding with auto-detection.
Use when: You need adaptive encoding based on data distributions
📈 AdvancedNumericalEmbedding
Advanced numerical embedding with dual-branch architecture.
Use when: You want rich numerical feature representations
📅 DateParsingLayer
Flexible date parsing from various formats.
Use when: You have date/time features to parse
🕐 DateEncodingLayer
Cyclical date feature encoding.
Use when: You want cyclical representations of temporal features
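Cyclical encoding maps a periodic value onto the unit circle so period boundaries stay adjacent: December sits next to January, hour 23 next to hour 0. A sketch of the idea:

```python
import numpy as np

def cyclical_encode(value, period):
    """Encode a periodic feature (month 1-12, hour 0-23, ...) as (sin, cos)."""
    angle = 2.0 * np.pi * value / period
    return np.sin(angle), np.cos(angle)

# Hours 23 and 0 end up close together in (sin, cos) space:
sin_h, cos_h = cyclical_encode(np.array([0, 6, 12, 23]), period=24)
```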
🌙 SeasonLayer
Seasonal feature extraction for temporal patterns.
Use when: Your data has seasonal patterns
🔀 DifferentialPreprocessingLayer
Multiple transformations with learnable combination.
Use when: You want the model to learn optimal preprocessing
🔧 DifferentiableTabularPreprocessor
End-to-end differentiable preprocessing.
Use when: You want learnable imputation and normalization
🎨 CastToFloat32Layer
Type casting utility for float32 precision.
Use when: You need to ensure consistent data types
⚙️ Feature Engineering & Selection (3 layers)
Advanced feature engineering and selection layers.
🧬 GraphFeatureAggregation
Graph feature aggregation for relational learning.
Use when: You have feature relationships to model
🎯 SparseAttentionWeighting
Sparse attention for efficient computation.
Use when: You need memory-efficient attention
🗑️ FeatureCutout
Feature cutout for data augmentation.
Use when: You want to improve model robustness through augmentation
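The idea is to randomly zero out whole features during training so the model cannot over-rely on any single column. A minimal sketch; the real layer's rate handling and options may differ:

```python
import keras
from keras import ops

class SimpleFeatureCutout(keras.layers.Layer):
    """Randomly drops features at train time; identity at inference."""
    def __init__(self, rate=0.1, **kwargs):
        super().__init__(**kwargs)
        self.rate = rate

    def call(self, inputs, training=None):
        if not training:
            return inputs
        # Per-sample binary mask: each feature survives with prob 1 - rate.
        mask = keras.random.uniform(ops.shape(inputs)) >= self.rate
        return inputs * ops.cast(mask, inputs.dtype)
```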
🏗️ Specialized Architectures (7 layers)
Advanced specialized layers for specific use cases.
📈 BoostingBlock
Gradient boosting inspired neural block.
Use when: You want boosting-like behavior in neural networks
🎯 BoostingEnsembleLayer
Ensemble of boosting blocks.
Use when: You want ensemble-based learning
🏗️ BusinessRulesLayer
Domain-specific business rules integration.
Use when: You need to enforce domain constraints
🐢 SlowNetwork
Careful feature processing with controlled flow.
Use when: You want deliberate, well-controlled processing
⚡ HyperZZWOperator
Hyperparameter-aware operator for adaptive behavior.
Use when: You want dynamic hyperparameter adjustment
📊 TabularMoELayer
Mixture of Experts for tabular data.
Use when: You have diverse data requiring different expert processing
🎲 StochasticDepth
Stochastic depth regularization.
Use when: You want improved generalization in deep networks
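Stochastic depth randomly skips a residual branch during training and scales it by its survival probability at inference (Huang et al., 2016). A sketch of the technique, not the layer's exact API:

```python
import keras
from keras import ops

class SimpleStochasticDepth(keras.layers.Layer):
    """Drops the residual branch with prob 1 - survival_prob at train time."""
    def __init__(self, survival_prob=0.9, **kwargs):
        super().__init__(**kwargs)
        self.survival_prob = survival_prob

    def call(self, shortcut, residual, training=None):
        if not training:
            # Use the branch's expected value at inference.
            return shortcut + self.survival_prob * residual
        keep = keras.random.uniform(()) < self.survival_prob
        return shortcut + ops.cast(keep, residual.dtype) * residual
```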
🛠️ Utility & Graph Processing (4 layers)
Utility layers for data processing, graph operations, and anomaly detection.
🧬 AdvancedGraphFeatureLayer
Advanced graph feature processing with dynamic learning.
Use when: You have complex feature relationships
👥 MultiHeadGraphFeaturePreprocessor
Multi-head graph preprocessing.
Use when: You want parallel feature processing
📉 NumericalAnomalyDetection
Statistical anomaly detection for numerical features.
Use when: You need to detect numerical outliers
📊 CategoricalAnomalyDetectionLayer
Pattern-based anomaly detection for categorical features.
Use when: You need to detect categorical anomalies
🚀 Quick Start Guide
Getting Started with KerasFactory Layers
**Step 1: Choose Your Base Layers**

- Start with `DifferentiableTabularPreprocessor` for data preparation
- Add `VariableSelection` for feature importance

**Step 2: Add Attention**

- Use `TabularAttention` to capture feature relationships

**Step 3: Build Your Model**

- Stack the layers into a single architecture, as in the sketch below
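A minimal sketch of this stacking pattern. The import path and constructor arguments are assumptions for illustration; consult the API Reference for the exact signatures:

```python
import keras
# Import path assumed for illustration; see the API Reference.
from kerasfactory.layers import (
    DifferentiableTabularPreprocessor,
    VariableSelection,
    TabularAttention,
)

# Hypothetical tabular input: 16 numerical features per sample.
inputs = keras.Input(shape=(16,))

# Step 1: learnable preprocessing, then feature importance.
x = DifferentiableTabularPreprocessor()(inputs)   # constructor args assumed
x = VariableSelection()(x)                        # constructor args assumed

# Step 2: attention over feature relationships.
x = TabularAttention()(x)                         # constructor args assumed

# Step 3: task head for, e.g., binary classification.
outputs = keras.layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")
```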
📖 For More Information
- API Reference - Detailed API documentation with autodoc references
- Contributing - How to contribute new layers
- Examples - Real-world usage examples
- Tutorials - Step-by-step guides