OPEN-SOURCE SCRIPT
Static K-means Clustering | InvestorUnknown

Static K-Means Clustering is a machine-learning-driven market regime classifier designed for traders who want a data-driven structure instead of subjective indicators or manually drawn zones.
This script performs offline (static) K-means training on your chosen historical window. Using four engineered features:
It groups past market conditions into K distinct clusters (regimes). After training, every new bar is assigned to the nearest cluster via Euclidean distance in 4-dimensional standardized feature space.
This allows you to create models like:
Note:
Core Idea
K-means clustering takes raw, unlabeled market observations and attempts to discover structure by grouping similar bars together.
Pine Script®
Your chosen historical window becomes the static training dataset, and after fitting, the centroids never change again.
This makes the model:
Static Training Window
You select a period with:
Only bars inside this range are used to fit the K-means model. This window defines:

Feature Engineering (4D Input Vector)
Every bar during training becomes a 4-dimensional point: [rsi, cci, cmf, macd_histogram]
This combination balances: momentum, volatility, mean-reversion, trend acceleration giving the algorithm a richer "market fingerprint" per bar.
Standardization
To prevent any feature from dominating due to scale differences (e.g., CMF near zero vs CCI ±200), all features are standardized:
Pine Script®
Centroid Initialization
Centroids start at diverse coordinates using various curves:
Pine Script®
This makes initial cluster spread “random” even though true randomness is hardly achieved in pinescript.
K-Means Iterative Refinement
The algorithm repeats these steps:
(A) Assignment Step, Each bar is assigned to the nearest centroid via Euclidean distance in 4D:
(B) Update Step, Centroids update to the mean of points assigned to them. This repeats iterations times (configurable).
LIVE REGIME CLASSIFICATION
After training, each new bar is:
No re-training occurs. This ensures:
CLUSTER BEHAVIOR & TRADING LOGIC
Clusters (0, 1, 2, 3…) hold no inherent meaning. The user defines what each cluster does.
Example of custom actions:

This flexibility means:

Example on ETHUSD
Important Note:
PERFORMANCE METRICS & ROC TABLE
The indicator computes average 1-bar ROC for each cluster in:
This helps measure:

EQUITY SIMULATION & FEES
Designed for close-to-close realistic backtesting.
Position = cluster of previous bar
Fees applied only on regime switches. Meaning:
Fee input is percentage, but script already converts internally.

Disclaimers
⚠️ This indicator uses machine-learning but does not predict the future. It classifies similarity to past regimes, nothing more.
⚠️ Backtest results are not indicative of future performance.
⚠️ Clusters have no inherent “bullish” or “bearish” meaning. You must interpret them based on your testing and your own feature engineering.
This script performs offline (static) K-means training on your chosen historical window. Using four engineered features:
- RSI (Momentum)
- CCI (Price deviation / Mean reversion)
- CMF (Money flow / Strength)
- MACD Histogram (Trend acceleration)
It groups past market conditions into K distinct clusters (regimes). After training, every new bar is assigned to the nearest cluster via Euclidean distance in 4-dimensional standardized feature space.
This allows you to create models like:
- Regime-based long/short filters
- Volatility phase detectors
- Trend vs. chop separation
- Mean-reversion vs. breakout classification
- Volume-enhanced money-flow regime shifts
- Full machine-learning trading systems based solely on regimes
Note:
- * This script is not a universal ML strategy out of the box.
* The user must engineer the feature set to match their trading style and target market.
* K-means is a tool, not a ready made system, this script provides the framework.
Core Idea
K-means clustering takes raw, unlabeled market observations and attempts to discover structure by grouping similar bars together.
// STEP 1 — DATA POINTS ON A COORDINATE PLANE
// We start with raw, unlabeled data scattered in 2D space (x/y).
// At this point, nothing is grouped—these are just observations.
// K-means will try to discover structure by grouping nearby points.
//
// y ↑
// |
// 12 | •
// | •
// 10 | •
// | •
// 8 | • •
// |
// 6 | •
// |
// 4 | •
// |
// 2 |______________________________________________→ x
// 2 4 6 8 10 12 14
//
//
//
// STEP 2 — RANDOMLY PLACE INITIAL CENTROIDS
// The algorithm begins by placing K centroids at random positions.
// These centroids act as the temporary “representatives” of clusters.
// Their starting positions heavily influence the first assignment step.
//
// y ↑
// |
// 12 | •
// | •
// 10 | • C2 ×
// | •
// 8 | • •
// |
// 6 | C1 × •
// |
// 4 | •
// |
// 2 |______________________________________________→ x
// 2 4 6 8 10 12 14
//
//
//
// STEP 3 — ASSIGN POINTS TO NEAREST CENTROID
// Each point is compared to all centroids.
// Using simple Euclidean distance, each point joins the cluster
// of the centroid it is closest to.
// This creates a temporary grouping of the data.
//
// (Coloring concept shown using labels)
//
// - Points closer to C1 → Cluster 1
// - Points closer to C2 → Cluster 2
//
// y ↑
// |
// 12 | 2
// | 1
// 10 | 1 C2 ×
// | 2
// 8 | 1 2
// |
// 6 | C1 × 2
// |
// 4 | 1
// |
// 2 |______________________________________________→ x
// 2 4 6 8 10 12 14
//
// (1 = assigned to Cluster 1, 2 = assigned to Cluster 2)
// At this stage, clusters are formed purely by distance.
Your chosen historical window becomes the static training dataset, and after fitting, the centroids never change again.
This makes the model:
- Predictable
- Repeatable
- Consistent across backtests
- Fast for live use (no recalculation of centroids every bar)
Static Training Window
You select a period with:
- Training Start
- Training End
Only bars inside this range are used to fit the K-means model. This window defines:
- the market regime examples
- the statistical distributions (means/std) for each feature
- how the centroids will be positioned post-trainin
- Bars before training = fully transparent
- Training bars = gray
- Post-training bars = full colored regimes
Feature Engineering (4D Input Vector)
Every bar during training becomes a 4-dimensional point: [rsi, cci, cmf, macd_histogram]
This combination balances: momentum, volatility, mean-reversion, trend acceleration giving the algorithm a richer "market fingerprint" per bar.
Standardization
To prevent any feature from dominating due to scale differences (e.g., CMF near zero vs CCI ±200), all features are standardized:
standardize(value, mean, std) =>
(value - mean) / std
Centroid Initialization
Centroids start at diverse coordinates using various curves:
- linear
- sinusoidal
- sign-preserving quadratic
- tanh compression
init_centroids() =>
// Spread centroids across [-1, 1] using different shapes per feature
for c = 0 to k_clusters - 1
frac = k_clusters == 1 ? 0.0 : c / (k_clusters - 1.0) // 0 → 1
v = frac * 2 - 1 // -1 → +1
array.set(cent_rsi, c, v) // linear
array.set(cent_cci, c, math.sin(v)) // sinusoidal
array.set(cent_cmf, c, v * v * (v < 0 ? -1 : 1)) // quadratic sign-preserving
array.set(cent_mac, c, tanh(v)) // compressed
This makes initial cluster spread “random” even though true randomness is hardly achieved in pinescript.
K-Means Iterative Refinement
The algorithm repeats these steps:
(A) Assignment Step, Each bar is assigned to the nearest centroid via Euclidean distance in 4D:
- distance = sqrt(dx² + dy² + dz² + dw²)
(B) Update Step, Centroids update to the mean of points assigned to them. This repeats iterations times (configurable).
LIVE REGIME CLASSIFICATION
After training, each new bar is:
- Standardized using the training mean/std
- Compared to all centroids
- Assigned to the nearest cluster
- Bar color updates based on cluster
No re-training occurs. This ensures:
- No lookahead bias
- Clean historical testing
- Stable regimes over time
CLUSTER BEHAVIOR & TRADING LOGIC
Clusters (0, 1, 2, 3…) hold no inherent meaning. The user defines what each cluster does.
Example of custom actions:
- Cluster 0 → Cash
- Cluster 1 → Long
- Cluster 2 → Short
- Cluster 3+ → Cash (noise regime)
This flexibility means:
- One trader might have cluster 0 as consolidation.
- Another might repurpose it as a breakout-loading zone.
- A third might ignore 3 clusters entirely.
Example on ETHUSD
Important Note:
- Any change of parameters or chart timeframe or ticker can cause the “order” of clusters to change
- The script does NOT assume any cluster equals any actionable bias, user decides.
PERFORMANCE METRICS & ROC TABLE
The indicator computes average 1-bar ROC for each cluster in:
- Training set
- Test (live) set
This helps measure:
- Cluster profitability consistency
- Regime forward predictability
- Whether a regime is noise, trend, or reversion-biased
EQUITY SIMULATION & FEES
Designed for close-to-close realistic backtesting.
Position = cluster of previous bar
Fees applied only on regime switches. Meaning:
- Staying long → no fee
- Switching long→short → fee applied
- Switching any→cash → fee applied
Fee input is percentage, but script already converts internally.
Disclaimers
⚠️ This indicator uses machine-learning but does not predict the future. It classifies similarity to past regimes, nothing more.
⚠️ Backtest results are not indicative of future performance.
⚠️ Clusters have no inherent “bullish” or “bearish” meaning. You must interpret them based on your testing and your own feature engineering.
Skrypt open-source
W zgodzie z duchem TradingView twórca tego skryptu udostępnił go jako open-source, aby użytkownicy mogli przejrzeć i zweryfikować jego działanie. Ukłony dla autora. Korzystanie jest bezpłatne, jednak ponowna publikacja kodu podlega naszym Zasadom serwisu.
Wyłączenie odpowiedzialności
Informacje i publikacje nie stanowią i nie powinny być traktowane jako porady finansowe, inwestycyjne, tradingowe ani jakiekolwiek inne rekomendacje dostarczane lub zatwierdzone przez TradingView. Więcej informacji znajduje się w Warunkach użytkowania.
Skrypt open-source
W zgodzie z duchem TradingView twórca tego skryptu udostępnił go jako open-source, aby użytkownicy mogli przejrzeć i zweryfikować jego działanie. Ukłony dla autora. Korzystanie jest bezpłatne, jednak ponowna publikacja kodu podlega naszym Zasadom serwisu.
Wyłączenie odpowiedzialności
Informacje i publikacje nie stanowią i nie powinny być traktowane jako porady finansowe, inwestycyjne, tradingowe ani jakiekolwiek inne rekomendacje dostarczane lub zatwierdzone przez TradingView. Więcej informacji znajduje się w Warunkach użytkowania.