Key Takeaways

“Unlimited experiments” sounds powerful, but the real bottleneck for scaling e-commerce isn’t test count — it’s test design, traffic per test, and team capacity to interpret results.
Unlimited testing only delivers if you can also run mutually exclusive groups, keep tests statistically isolated, and send enough traffic to each variant to reach significance.
Among major platforms, Varify (flat-rate), Convert, and Kameleoon offer effectively unlimited testing within their tier; Optimizely and AB Tasty set limits by contract or tier.
If your store does over €5M ARR with 200k+ monthly visitors, unlimited testing genuinely changes what’s possible. Below that, you’ll likely max out at 12-20 quality tests per year regardless of the platform’s ceiling.

The pitch is everywhere: unlimited experiments, unlimited variants, unlimited goals. For a scaling e-commerce brand it sounds like exactly what you need — remove the constraints, let the team run wild, watch the wins compound. The reality is more nuanced. Unlimited testing has real value, but only at certain traffic levels, with certain team structures, and only when the testing platform handles the side effects (test pollution, statistical isolation, performance load) properly.

This article unpacks what “unlimited” actually means in practice for e-commerce teams scaling from €1M to €50M ARR — including the traffic math, the platform shortlist, and where unlimited testing genuinely changes what’s possible vs. where it’s a marketing line that won’t affect your reality.

What “unlimited experiments” actually means

Different platforms mean different things by “unlimited.” The four flavors:

1. Unlimited active tests

You can have any number of tests running simultaneously. The platform doesn’t cap concurrent experiments. Most modern platforms now offer this on their middle and upper tiers.

2. Unlimited variants per test

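You can test 10 variants instead of being capped at 4. Useful for design exploration, but be careful: with 10 variants, each gets 10% of test traffic instead of the 50% it would get in a two-variant A/B test, so you need roughly 5x the traffic to give every variant the same sample, and more still if you correct for the extra comparisons. Often more powerful in theory than in practice.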
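
To make the dilution concrete, here is a minimal Python sketch. It assumes an even traffic split and a fixed per-variant sample target; the function name and the example numbers are illustrative and not taken from any platform.

```python
import math

def weeks_to_fill(weekly_test_traffic: int, variants: int, needed_per_variant: int) -> int:
    """Weeks until every variant reaches its sample target, assuming an even split."""
    total_needed = needed_per_variant * variants
    return math.ceil(total_needed / weekly_test_traffic)

# Hypothetical inputs: 10,000 weekly visitors on the tested page,
# 20,000 visitors needed in each variant.
for k in (2, 4, 10):
    print(f"{k} variants: {weeks_to_fill(10_000, k, 20_000)} weeks")
```

With these assumed inputs, the two-variant test fills in 4 weeks and the ten-variant test in 20, which is the same fivefold gap described above.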

3. Unlimited goals/metrics tracked per test

You can track revenue, AOV, add-to-cart rate, time on page, and 50 other metrics per test. Almost always overkill — tracking too many secondary metrics increases false positive risk and makes interpretation harder, not easier.
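
The false positive risk can be sketched in a few lines of Python. This assumes every metric is tested at the same threshold and that the metrics are independent, which real e-commerce metrics are not, so treat the output as a rough guide rather than an exact figure. Both function names are illustrative.

```python
def family_wise_error_rate(alpha: float, metrics: int) -> float:
    """Chance of at least one false positive across `metrics` independent checks."""
    return 1 - (1 - alpha) ** metrics

def bonferroni_alpha(alpha: float, metrics: int) -> float:
    """Stricter per-metric threshold that keeps the overall rate near `alpha`."""
    return alpha / metrics

for m in (1, 5, 50):
    print(f"{m} metrics: {family_wise_error_rate(0.05, m):.0%} chance of a false positive, "
          f"corrected threshold {bonferroni_alpha(0.05, m):.4f}")
```

Under these assumptions, five metrics at a 5% threshold already carry a chance above 20% that at least one shows a spurious win, which is why a single primary metric is the safer default.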

4. Unlimited tested users (no traffic-based pricing)

The most genuinely valuable kind of “unlimited” for scaling e-commerce. Your bill doesn’t change when traffic grows or spikes during seasonal peaks. Flat-rate pricing is the e-commerce-friendly version of this.

When unlimited testing actually changes what’s possible

It changes things when:

You have 200k+ monthly visitors. Below this, you don’t have enough traffic to run more than 4-6 simultaneous tests with statistical power. The cap doesn’t bind.
You have a dedicated experimentation team (2+ people). Below this, you can’t generate enough quality test ideas to fill an unlimited program. The bottleneck shifts from platform to team.
You operate in multiple categories or pages. A multi-category retailer can run home, category, PDP, and checkout tests simultaneously without overlap. A single-product brand can’t.
You have seasonal traffic spikes. Unlimited tested users (i.e., flat-rate pricing) is genuinely valuable on Black Friday, Cyber Monday, and major sale events.

It doesn’t change things when:

You’re under 100k monthly visitors. You’re running 6-12 tests a year either way; the cap doesn’t bind.
Your team is one person doing CRO part-time. Test ideation is the bottleneck, not platform capacity.
You don’t have a clear test prioritization framework. Unlimited bandwidth without a hypothesis backlog produces unfocused testing and noise.

The traffic math for scaling e-commerce

Statistical significance requires a minimum sample size per variant. Here’s the rough math:

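5,000 visitors per variant per week is a rough working floor on a 2% conversion rate (a typical e-commerce baseline). Detecting a 10% relative lift at that baseline with a standard two-sided test (95% confidence, 80% power) takes roughly 80,000 visitors per variant in total, so at this rate a test runs for months rather than weeks.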
For an A/B test with 50/50 split, that’s 10,000 visitors per test per week.
For 5 simultaneous tests on different pages, you need around 50,000 weekly visitors going through those pages combined.
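For 20 simultaneous tests (the kind of program where unlimited testing matters), the same arithmetic gives 200,000 test entries per week. Because one visitor can enter several tests on different pages, 200,000+ monthly visitors is a realistic entry point, but only with enough page diversity to avoid overlap.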
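
For readers who want to check the numbers for their own store, here is a minimal sketch of the textbook sample size estimate for comparing two conversion rates (normal approximation, two-sided test). The function name, the default 95% confidence and 80% power, and the example inputs are assumptions for illustration; testing platforms may use different statistical methods.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline: float, relative_lift: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate visitors needed in EACH variant to detect the given lift."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

# 2% baseline conversion rate, 10% relative lift (2.0% -> 2.2%)
n = sample_size_per_variant(0.02, 0.10)
weeks = math.ceil(n / 5_000)  # duration at 5,000 visitors per variant per week
print(f"{n:,} visitors per variant, about {weeks} weeks at 5,000 per week")
```

The estimate is a total per variant, not a weekly rate, so dividing it by the weekly traffic each variant receives gives the expected test duration. Larger expected lifts or higher baseline rates shrink the requirement quickly.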

Most scaling e-commerce brands hit a practical ceiling around 6-15 quality tests per year, regardless of platform. At that volume, the platform’s test limit is rarely the constraint that binds.

Which platforms actually offer unlimited testing

Varify — effectively unlimited within tier, flat-rate

Flat monthly pricing means tested-user count doesn’t cap. No limit on simultaneous active tests on the standard tier. Single-page focus means simultaneous tests across pages don’t conflict. Strong fit for European e-commerce that needs predictable pricing as traffic grows.

Convert — unlimited tests on most plans

Most plans include unlimited tests and goals. Pricing scales with monthly tested users, so “unlimited” only applies to test count, not to traffic.

Kameleoon — unlimited within enterprise tier

Standard tier offers unlimited tests; pricing is enterprise. Strong on AI personalization, which can drive efficiency at scale.

VWO — tier-dependent, traffic-based

Higher VWO tiers include unlimited tests, but pricing scales with monthly tested users. A high-traffic store can see its invoice climb quickly.

Optimizely — contractually defined

Test counts and tested users are negotiated in the enterprise contract. “Unlimited” depends on what you’ve signed.

AB Tasty — tier-dependent

Premium tiers include unlimited tests. Pricing is typically enterprise-level.

What actually helps scaling e-commerce more than unlimited testing

If you’re scaling from €1M to €50M ARR, in our experience these factors matter more than the test ceiling:

A clear test prioritization framework. ICE (Impact / Confidence / Ease) or PIE (Potential / Importance / Ease). Without one, your team runs ad-hoc tests on whatever feels urgent that week.
A test design discipline. Hypothesis statement, single primary metric, defined sample size, pre-registered analysis plan. Sloppy test design wastes more uplift than any platform’s test cap.
Statistical maturity. Stop tests when they reach the planned sample size, not on the first day they look like a winner. Use a confidence threshold appropriate to your risk tolerance (typically 90-95% one-sided).
A learning library. Document every test — winners and losers — with the hypothesis, variant, result, and what you learned. This compounds over years.
Predictable pricing. If you’re scaling, traffic-based pricing surprises hurt. Flat-rate avoids this.
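
As an illustration of the prioritization point, here is a minimal ICE scoring sketch in Python. It assumes each idea is scored 1-10 on Impact, Confidence, and Ease and that the three scores are averaged (some teams multiply them instead). The class name and the backlog entries are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Idea:
    name: str
    impact: int      # expected effect on the primary metric, 1-10
    confidence: int  # strength of the supporting evidence, 1-10
    ease: int        # how cheap it is to build and run, 1-10

    @property
    def ice(self) -> float:
        # Assumption: plain average of the three scores
        return (self.impact + self.confidence + self.ease) / 3

# Hypothetical backlog entries, for illustration only
backlog = [
    Idea("Checkout trust badges", impact=5, confidence=4, ease=9),
    Idea("New category page layout", impact=8, confidence=5, ease=3),
    Idea("Sticky add-to-cart on PDP", impact=7, confidence=6, ease=8),
]

# Highest score first: this is the order in which ideas get test slots
ranked = sorted(backlog, key=lambda idea: idea.ice, reverse=True)
for idea in ranked:
    print(f"{idea.ice:.1f}  {idea.name}")
```

The value is less in the exact scores than in forcing every idea through the same three questions before it takes up traffic.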

Thomas Kraus

CEO at Varify.io

Unlimited Experiments for Scaling E-Commerce: Are They Actually Worth It?