Cyclical features, on the circle
Raw integers treat hour=23 and hour=0 as 23 apart, even though on a clock they’re neighbours. Map the value to (cos θ, sin θ) with θ = 2π·x/period and the wraparound is preserved. Drag the value below and watch raw-numeric and sin/cos encodings race on the same task. Linear and distance-based models usually do better with sin/cos; tree models are largely unaffected.
Theory & exercises · the unit-circle map, when to skip it, harmonics
The math, in three moves
1. Map the scalar to an angle.
$$ \theta = 2\pi \cdot \frac{x}{P} \qquad (x \in [0, P)) $$
$P$ is the period: 24 for hour, 12 for month, 7 for day-of-week, 360 for angle. The map is bijective on one period and identifies $P$ with $0$.
2. Encode as a point on the unit circle.
$$ \phi(x) = \big(\cos\theta,\; \sin\theta\big) $$
Two features replace one. The Euclidean distance between $\phi(x_1)$ and $\phi(x_2)$ is monotone in the arc distance: close on the cycle ⇒ close in feature space.
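Steps 1 and 2 can be sketched in a few lines of plain Python. This is a minimal illustration using only the standard library; the helper names `encode_cyclical` and `euclidean` are made up for this example and do not come from any particular package.

```python
import math

def encode_cyclical(x, period):
    """Map a scalar x with the given period to (cos θ, sin θ)."""
    theta = 2.0 * math.pi * x / period
    return (math.cos(theta), math.sin(theta))

def euclidean(p, q):
    """Straight-line distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Hours 23 and 0 are one step apart on the clock; hours 0 and 12 are opposite.
d_seam = euclidean(encode_cyclical(23, 24), encode_cyclical(0, 24))
d_opposite = euclidean(encode_cyclical(0, 24), encode_cyclical(12, 24))

print(abs(23 - 0), round(d_seam, 3))      # raw gap vs encoded gap
print(abs(0 - 12), round(d_opposite, 3))  # raw gap vs encoded gap
```

The encoded distance is the chord length $2\sin(\pi d / P)$, where $d$ is the arc distance in the units of $x$. Since $d \le P/2$, the chord grows with $d$, which is the monotonicity claimed above.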
3. (Optional) Add harmonics.
$$ \phi_k(x) = \big(\cos(k\theta),\; \sin(k\theta)\big),\quad k=1,2,\dots,K $$
Two features per harmonic. As $K$ grows you approach a full Fourier basis: a linear model with an intercept on these features can approximate any reasonably smooth periodic target.
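A rough sketch of how harmonics help, again in standard-library Python. The target below is hypothetical (a fundamental plus a third harmonic), and the fit uses a shortcut that is valid only under the stated assumption: the inputs lie on an even grid over one period with $K$ below half the number of points, so the least-squares coefficients reduce to simple projections. The function names are illustrative.

```python
import math

def fourier_features(x, period, n_harmonics):
    """Return [cos θ, sin θ, cos 2θ, sin 2θ, ...] for K = n_harmonics."""
    theta = 2.0 * math.pi * x / period
    feats = []
    for k in range(1, n_harmonics + 1):
        feats.extend((math.cos(k * theta), math.sin(k * theta)))
    return feats

def fit_and_rmse(xs, ys, period, n_harmonics):
    """Least-squares fit of intercept + harmonics; returns the training RMSE.

    Assumes xs is an even grid over one period and K < len(xs) / 2, so the
    basis is orthogonal and each coefficient is a plain projection.
    """
    n = len(xs)
    rows = [fourier_features(x, period, n_harmonics) for x in xs]
    intercept = sum(ys) / n
    coefs = [2.0 / n * sum(r[j] * y for r, y in zip(rows, ys))
             for j in range(2 * n_harmonics)]
    preds = [intercept + sum(c * f for c, f in zip(coefs, r)) for r in rows]
    return math.sqrt(sum((p - y) ** 2 for p, y in zip(preds, ys)) / n)

period = 24
hours = list(range(period))
# Hypothetical target: fundamental plus a third harmonic.
target = [math.sin(2 * math.pi * h / period)
          + 0.5 * math.sin(6 * math.pi * h / period) for h in hours]

rmse_k1 = fit_and_rmse(hours, target, period, 1)
rmse_k3 = fit_and_rmse(hours, target, period, 3)
print(rmse_k1, rmse_k3)  # K=1 misses the third harmonic; K=3 captures it
```

For unevenly sampled or noisy data the projection shortcut no longer applies and a general least-squares solver would be the safer choice.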
Try this
The seam test
Pick hour, model = Linear, task = Regression. Look at the prediction curves at the seam (x near 23 → 0). Raw breaks; sin/cos sweeps through smoothly. That single fact is the whole pitch for cyclical encoding.
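The seam can also be checked numerically. The sketch below assumes a noise-free target sin(θ) on the 24 integer hours and fits a straight line to the raw integer with ordinary least squares; it then compares x = 0 with x = 24, which are the same position on the clock. All names are illustrative.

```python
import math

period = 24
hours = list(range(period))
target = [math.sin(2 * math.pi * h / period) for h in hours]

# Ordinary least squares on the raw integer: y ≈ intercept + slope * x.
mean_x = sum(hours) / period
mean_y = sum(target) / period
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, target))
         / sum((x - mean_x) ** 2 for x in hours))
intercept = mean_y - slope * mean_x

def predict_raw(x):
    """Prediction of the raw-integer linear model."""
    return intercept + slope * x

def encode(x):
    """Sin/cos encoding of x on the chosen period."""
    theta = 2 * math.pi * x / period
    return (math.cos(theta), math.sin(theta))

# x = 0 and x = 24 are the same clock position.
raw_jump = predict_raw(period) - predict_raw(0)
feature_gap = math.dist(encode(period), encode(0))
print(raw_jump, feature_gap)
```

The raw model predicts clearly different values for the same clock position, while the encoded features coincide up to floating-point error, so any model built on them must give matching predictions on both sides of the seam.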
Trees don’t need it
Switch to Tree (binning). The raw and sin/cos curves match exactly. Trees split on thresholds; they don’t care if your feature is cyclic. Stick with raw integers for tree-only pipelines.
kNN benefits from cyclic distance
kNN here uses an arc distance, not Euclidean. That alone fixes the wraparound problem. If your kNN implementation defaults to Euclidean distance (most do) and you would rather not supply a custom metric, encode with sin/cos first.
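To make the two options concrete, here is a small sketch comparing neighbour rankings under arc distance, under Euclidean distance on the sin/cos encoding, and under the raw integer gap. The candidate hours are invented for illustration, and the function names are not from any library.

```python
import math

def arc_distance(a, b, period):
    """Shortest distance between a and b going either way round the cycle."""
    diff = abs(a - b) % period
    return min(diff, period - diff)

def chord_distance(a, b, period):
    """Euclidean distance between the sin/cos encodings of a and b."""
    ta, tb = (2 * math.pi * v / period for v in (a, b))
    return math.hypot(math.cos(ta) - math.cos(tb), math.sin(ta) - math.sin(tb))

query, period = 23, 24
candidates = [0, 3, 12, 20, 21]  # hypothetical training hours

by_arc = sorted(candidates, key=lambda h: arc_distance(query, h, period))
by_chord = sorted(candidates, key=lambda h: chord_distance(query, h, period))
by_raw = sorted(candidates, key=lambda h: abs(query - h))
print(by_arc)    # nearest first, measured round the clock
print(by_chord)  # same ordering from the sin/cos encoding
print(by_raw)    # raw integers push hour 0 to the back
```

Because the chord length is monotone in the arc distance, Euclidean kNN on the encoded features picks the same neighbours as an arc-distance kNN, which is why encoding first is a reasonable substitute for a custom metric.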
One-hot vs sin/cos size
For angle (period=360), one-hot would mean 360 features; sin/cos needs 2. That is over 99% fewer features while keeping the cyclic structure. The size chart on the right shows the same lesson at a glance.
Noise drowns small periods
Push noise σ to 0.9 on day-of-week (period=7). The R² for both encodings collapses. Encoding can’t rescue a target that’s mostly noise.
Importance bars hint at phase
The target here is sin(θ). The sin importance should dominate. Now imagine a target $\cos(\theta)$ — the cos bar would lead. Skewed bars hint at the dominant phase in your data.
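A short worked identity shows why the bars track phase. It assumes a linear model on $(\cos\theta, \sin\theta)$ and a single-harmonic target with amplitude $A$ and phase shift $\delta$; both symbols, and the weight names $w_{\sin}$ and $w_{\cos}$, are introduced here for illustration only.

$$ A\sin(\theta + \delta) = \underbrace{A\cos\delta}_{w_{\sin}}\,\sin\theta \;+\; \underbrace{A\sin\delta}_{w_{\cos}}\,\cos\theta $$

With $\delta = 0$ all the weight sits on the sin feature; with $\delta = \pi/2$ the target is $\cos\theta$ and all the weight moves to the cos feature. In between, the fitted weights recover the phase and amplitude as $\delta = \operatorname{atan2}(w_{\cos}, w_{\sin})$ and $A = \sqrt{w_{\sin}^2 + w_{\cos}^2}$. Importance scores from non-linear models need not follow this relation exactly.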
Frequently asked
Raw integers treat 23 and 0 as 23 apart, but on the hour clock they’re neighbours. Mapping the value to (cos θ, sin θ) with θ = 2π·x/period preserves cyclic geometry: close points on the cycle stay close in the encoded space. Linear models and distance-based learners (kNN, k-means, SVMs with RBF) benefit immediately.
One-hot encoding has no notion that 0 and 1 are adjacent. Sin/cos compresses the same information into 2 features while keeping the cyclic structure. Use one-hot only when the period is very small (e.g., 7 weekdays) AND ordering really doesn’t matter.
Add higher harmonics such as sin(2θ), cos(2θ). This is exactly what Fourier features do. In practice though, if your target needs many harmonics, you probably want a smoother model (GAM, GP, NN) rather than more hand-crafted features.