📍 PositionalEmbedding
🟢 Beginner
✅ Stable
⏱️ Time Series
🎯 Overview
The PositionalEmbedding layer generates fixed sinusoidal positional encodings for time series and sequence data. Unlike learnable positional embeddings, this layer uses mathematically defined sinusoidal patterns that encode absolute position information, allowing transformer-based models to understand temporal relationships without training positional parameters.
Positional embeddings are essential for transformer architectures as they provide the model with information about the order and position of elements in sequences.
🔍 How It Works
The PositionalEmbedding generates sinusoidal encodings based on the mathematical formula:
PE(pos, 2i) = sin(pos / 10000^(2i/d_model))
PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))
Where:
- pos is the position in the sequence
- i is the dimension index
- d_model is the model dimension
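To make the pattern concrete, the full encoding table can be computed directly from this formula. The NumPy sketch below is purely illustrative and is not the layer's internal implementation (it assumes an even d_model):

```python
import numpy as np

def sinusoidal_encoding(max_len: int, d_model: int) -> np.ndarray:
    """Build a (max_len, d_model) table of sinusoidal encodings from the formula above."""
    positions = np.arange(max_len)[:, np.newaxis]             # shape (max_len, 1)
    two_i = np.arange(0, d_model, 2)[np.newaxis, :]           # the "2i" terms, shape (1, d_model // 2)
    angles = positions / np.power(10000.0, two_i / d_model)   # pos / 10000^(2i/d_model)

    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions use sin
    pe[:, 1::2] = np.cos(angles)  # odd dimensions use cos
    return pe

pe_table = sinusoidal_encoding(max_len=100, d_model=64)
print(pe_table.shape)  # (100, 64)
```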
💡 Why Use This Layer?
| Challenge | Traditional Approach | PositionalEmbedding's Solution |
|-----------|----------------------|--------------------------------|
| Position Information | No position awareness | 🎯 Fixed sinusoidal encodings |
| Scalability | Learnable embeddings limited | ∞ Extrapolates to any length |
| Interpretability | Black-box embeddings | 🔍 Mathematically interpretable patterns |
| Computational Cost | Learnable parameters | ⚡ Zero-cost fixed computation |
| Generalization | Poor on unseen lengths | 🌍 Works on any sequence length |
📊 Use Cases
**Transformer Models**: Providing position information to attention mechanisms
**Time Series Forecasting**: Encoding temporal positions
**Language Models**: Position awareness in NLP tasks
**Sequence-to-Sequence Models**: Maintaining order information
**Any Sequential Model**: When you need fixed, interpretable positional information
🚀 Quick Start
Basic Usage
```python
import keras
from kerasfactory.layers import PositionalEmbedding

# Create sample sequence
batch_size, seq_len, d_model = 32, 100, 64
x = keras.random.normal((batch_size, seq_len, d_model))

# Apply positional embedding
pos_emb = PositionalEmbedding(max_len=100, d_model=d_model)
pe = pos_emb(x)

print(f"Input shape: {x.shape}")       # (32, 100, 64)
print(f"Embedding shape: {pe.shape}")  # (32, 100, 64)
```
In a Sequential Model
```python
import keras
from kerasfactory.layers import PositionalEmbedding, TokenEmbedding

model = keras.Sequential([
    keras.layers.Input(shape=(100, 1)),
    TokenEmbedding(c_in=1, d_model=64),            # Embed raw values
    PositionalEmbedding(max_len=100, d_model=64),  # Add positional info
    keras.layers.Dense(32, activation='relu'),
    keras.layers.Dense(1),
])

model.compile(optimizer='adam', loss='mse')
```
🔧 API Reference
PositionalEmbedding
```python
kerasfactory.layers.PositionalEmbedding(
    max_len: int = 5000,
    d_model: int = 512,
    name: str | None = None,
    **kwargs
)
```
Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `max_len` | `int` | `5000` | Maximum sequence length to support |
| `d_model` | `int` | `512` | Model dimension (embedding dimension) |
| `name` | `str \| None` | `None` | Optional layer name |
**Input Shape**: `(batch_size, seq_len, ...)`
**Output Shape / Returns**: Fixed positional encodings for the sequence
**Time Complexity**: O(seq_len × d_model) for generation (one-time during build)
**Space Complexity**: O(seq_len × d_model) for storage
**Computational Cost**: Minimal (no learnable parameters)
**Training Efficiency**: No gradient computation needed
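Because the encodings are fixed, the layer should expose no trainable weights. A quick check using the standard Keras weights API (sketch):

```python
import keras
from kerasfactory.layers import PositionalEmbedding

pos_emb = PositionalEmbedding(max_len=100, d_model=64)
_ = pos_emb(keras.random.normal((1, 100, 64)))  # call once so the layer is built

# The encodings are fixed, so nothing should be updated during training.
print(len(pos_emb.trainable_weights))  # expected: 0
```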
🎨 Advanced Usage
With Different Sequence Lengths
```python
import keras
from kerasfactory.layers import PositionalEmbedding

# Create layer for max length 512
pos_emb = PositionalEmbedding(max_len=512, d_model=64)

# Can handle any length up to max_len
x_short = keras.random.normal((32, 100, 64))
x_medium = keras.random.normal((32, 256, 64))
x_long = keras.random.normal((32, 512, 64))

pe_short = pos_emb(x_short)    # Works fine
pe_medium = pos_emb(x_medium)  # Works fine
pe_long = pos_emb(x_long)      # Works fine
```
Combining with Multiple Embeddings
```python
import keras
from kerasfactory.layers import PositionalEmbedding, TokenEmbedding

# Create embeddings
token_emb = TokenEmbedding(c_in=1, d_model=64)
pos_emb = PositionalEmbedding(max_len=100, d_model=64)

# Process sequence
x = keras.random.normal((32, 100, 1))
x_embedded = token_emb(x)    # (32, 100, 64)
x_pos = pos_emb(x_embedded)  # (32, 100, 64)

# Combine embeddings
output = x_embedded + x_pos  # Element-wise addition
print(output.shape)          # (32, 100, 64)
```
🔍 Visual Representation
```
┌──────────────────────────────────────────┐
│ Input Sequence (seq_len)                 │
│ Shape: (batch, seq_len, d_model)         │
└─────────────┬────────────────────────────┘
              │
              ▼
┌──────────────────────────────────────────┐
│ Generate Positional Encodings            │
│ - For each position: 0 to seq_len-1      │
│ - Apply sin/cos patterns                 │
└─────────────┬────────────────────────────┘
              │
              ▼
┌──────────────────────────────────────────┐
│ Positional Embeddings (fixed)            │
│ Shape: (1, seq_len, d_model)             │
│ - Extrapolates to any length             │
│ - No learnable parameters                │
└──────────────────────────────────────────┘
```
💡 Best Practices
**Choose an Appropriate max_len**: Set it to the maximum sequence length you expect
**Use the Same d_model**: Ensure d_model matches your embedding dimension
**Add to Embeddings**: Typically added to token/value embeddings via element-wise addition
**Placement**: Usually placed after the initial embeddings and before attention layers (see the sketch after this list)
**Multiple Scales**: The layer naturally captures patterns at multiple frequency scales
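A minimal functional-API sketch of that placement, reusing the TokenEmbedding layer from the earlier examples; the attention and normalization details here are illustrative assumptions, not part of this layer's API:

```python
import keras
from kerasfactory.layers import PositionalEmbedding, TokenEmbedding

# Hypothetical encoder stub: embeddings first, positional info added, then attention.
inputs = keras.layers.Input(shape=(100, 1))
tokens = TokenEmbedding(c_in=1, d_model=64)(inputs)         # value embeddings
pos = PositionalEmbedding(max_len=100, d_model=64)(tokens)  # fixed positional encodings
x = tokens + pos                                            # add to, don't replace, the embeddings

attn = keras.layers.MultiHeadAttention(num_heads=4, key_dim=16)(x, x)
x = keras.layers.LayerNormalization()(x + attn)
outputs = keras.layers.Dense(1)(x)

model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
```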
⚠️ Common Pitfalls
❌ **max_len too small**: Sequence lengths beyond max_len won't be handled correctly
❌ **d_model mismatch**: Using a different d_model than your embeddings causes shape errors (see the sketch below)
❌ **Treating as learnable**: The encodings are fixed; don't expect them to train
❌ **Using alone**: Usually combined with token embeddings, not used standalone
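A hedged sketch of the d_model mismatch (the exact exception depends on the backend):

```python
import keras
from kerasfactory.layers import PositionalEmbedding

x_embedded = keras.random.normal((32, 100, 64))  # embeddings with d_model=64

# ❌ d_model=32 does not match the embedding dimension; combining the two
# tensors fails with a shape/broadcast error.
# bad_pos = PositionalEmbedding(max_len=100, d_model=32)
# output = x_embedded + bad_pos(x_embedded)

# ✅ Keep d_model consistent with the embedding dimension.
pos_emb = PositionalEmbedding(max_len=100, d_model=64)
output = x_embedded + pos_emb(x_embedded)
print(output.shape)  # (32, 100, 64)
```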
📚 References
Vaswani et al. (2017). "Attention Is All You Need." The original transformer paper, which introduced the sinusoidal positional encoding used here.
✅ Serialization
```python
# Get configuration
config = pos_emb.get_config()

# Recreate the layer from its configuration
pos_emb_new = PositionalEmbedding.from_config(config)
```
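Whole-model saving should also round-trip. A sketch, assuming a model that uses the layer; whether `custom_objects` is needed depends on whether kerasfactory registers the layer as a Keras serializable:

```python
import keras
from kerasfactory.layers import PositionalEmbedding

# Small illustrative model using the layer.
inputs = keras.layers.Input(shape=(100, 64))
pe = PositionalEmbedding(max_len=100, d_model=64)(inputs)
model = keras.Model(inputs, inputs + pe)

model.save("pos_emb_model.keras")

# If the layer is not registered as a Keras serializable, pass it explicitly.
restored = keras.models.load_model(
    "pos_emb_model.keras",
    custom_objects={"PositionalEmbedding": PositionalEmbedding},
)
```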
**Last Updated**: 2025-11-04
**Version**: 1.0
**Keras**: 3.0+