Blue Apron's new website variant lifts the add-to-cart rate by 18.8%
Replica ran 990 simulated user sessions across Blue Apron’s control and treatment website variants and predicted the A/B test outcome in minutes, not weeks.
Hypothesis
Hiding the calorie filter buttons will increase the added_to_cart metric for users evaluating meals without calorie-based filtering.
Explain every metric change
Every Blue Apron simulated user includes a full session video, action transcript, and interview-style Q&A. Replica clusters patterns across sessions so you can see why users converted, dropped off, hesitated, or behaved differently.
Full session videos
See every click, scroll, pause, and typed input like a real session replay — plus the verbatim thoughts and actions behind each moment, something real session recordings can’t show.
Q&A with Blue Apron users
Ask a single simulated user why they converted, dropped off, or hesitated — or ask aggregate questions across all runs to see the broader pattern of answers.
Auto-clustered themes
Replica analyzes every transcript and groups recurring themes across the simulation, prioritizing patterns by frequency, relevance, and importance.
Searchable transcripts
Search across all sessions, then click any line to jump directly to that moment in the replay.
Use your existing data stack to simulate Blue Apron's real users
Replica uses your existing product and session data to create simulated users, finetune their behavior, and run thousands of browser sessions across control and treatment variants of blueapron.com. In minutes, you get predicted lift, confidence intervals, session replays, transcripts, and behavioral themes before launching the test.



Replica connects to Blue Apron's analytics, experimentation, session replay, and warehouse tools to create simulated users matched to Blue Apron's real audience. We use user attributes and traffic patterns to define each simulated user, then finetune their behavior on session recordings and action transcripts.
Replica uses these simulated users to run thousands of web sessions across control and treatment variants of blueapron.com in minutes. Each simulated user views, thinks, scrolls, clicks, and types like a real user.
Predicted lift and 95% confidence intervals show what changed. Session replays, transcripts, and clustered behavioral themes show why. Ship or skip with quantitative signal and qualitative evidence.
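To make "predicted lift and 95% confidence intervals" concrete, here is a minimal sketch of one common way to compute a relative lift and an approximate interval from conversion counts. It is not Replica's method, which this page does not describe. The function name `relative_lift_ci` and the session counts are hypothetical, chosen only for illustration, and the interval uses the standard log-ratio (delta method) approximation for a ratio of two proportions.

```python
import math


def relative_lift_ci(conv_c, n_c, conv_t, n_t, z=1.96):
    """Relative lift of treatment over control with an approximate CI.

    conv_c, conv_t: sessions that added to cart (control, treatment)
    n_c, n_t:       total sessions per variant
    z:              normal quantile; 1.96 corresponds to a 95% interval

    Assumption: the log-ratio (delta method) interval, a textbook
    approximation, not necessarily what any particular tool uses.
    """
    if conv_c <= 0 or conv_t <= 0:
        raise ValueError("both variants need at least one conversion")
    if conv_c > n_c or conv_t > n_t:
        raise ValueError("conversions cannot exceed sessions")

    rate_c = conv_c / n_c
    rate_t = conv_t / n_t
    ratio = rate_t / rate_c

    # Standard error of log(rate_t / rate_c) for two independent samples.
    se_log = math.sqrt(1 / conv_t - 1 / n_t + 1 / conv_c - 1 / n_c)

    low = math.exp(math.log(ratio) - z * se_log)
    high = math.exp(math.log(ratio) + z * se_log)

    # Convert ratios to relative lift (0.20 means +20%).
    return ratio - 1, low - 1, high - 1


# Illustrative counts only; these are NOT the figures behind this page.
lift, low, high = relative_lift_ci(conv_c=100, n_c=500, conv_t=120, n_t=500)
print(f"lift={lift:+.1%}, 95% CI=({low:+.1%}, {high:+.1%})")
```

An interval that spans zero, as it does with these made-up counts, means the data alone does not settle the direction of the effect. That is where the replays, transcripts, and themes help explain what users actually did.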
Statsig
Amplitude
Optimizely
Validated on real experiments and real user behavior
Replica’s case studies show forecasts matching live A/B test outcomes. Our research shows that simulations become more accurate when grounded in real user attributes, behavior data, and finetuning — the core methodology behind Replica.
“The fact that Replica can accurately predict whether A or B is better means we can test extremely rapidly.”
See how Replica performs on Blue Apron's past A/B tests
Share past website A/B tests where you already know the outcomes. Replica calibrates its simulations against your experiment history and compares predicted lift to actual lift, so it is ready for production use on future tests.
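As a rough illustration of what comparing predicted lift to actual lift can look like, the sketch below scores a set of past experiments on two simple measures: how often the prediction got the direction right, and how far off it was on average. The `PastExperiment` record, the `backtest` function, and the sample numbers are all hypothetical. They are not Replica's calibration procedure or real Blue Apron results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PastExperiment:
    """One finished A/B test with a known outcome (hypothetical schema)."""
    name: str
    predicted_lift: float  # e.g. 0.10 means a predicted +10%
    actual_lift: float     # lift measured in the live test


def backtest(experiments):
    """Summarize how well predicted lifts matched actual lifts."""
    if not experiments:
        raise ValueError("need at least one past experiment")

    n = len(experiments)

    # Direction is "right" when both lifts fall on the same side of zero.
    same_direction = sum(
        1 for e in experiments
        if (e.predicted_lift > 0) == (e.actual_lift > 0)
    )

    # Mean absolute error between predicted and actual lift.
    mean_abs_error = sum(
        abs(e.predicted_lift - e.actual_lift) for e in experiments
    ) / n

    return {
        "directional_accuracy": same_direction / n,
        "mean_abs_error": mean_abs_error,
    }


# Made-up history for illustration only.
history = [
    PastExperiment("checkout-copy", predicted_lift=0.10, actual_lift=0.08),
    PastExperiment("hero-image", predicted_lift=-0.05, actual_lift=0.02),
    PastExperiment("plan-picker", predicted_lift=0.03, actual_lift=0.04),
]
print(backtest(history))
```

Directional accuracy answers the ship-or-skip question, while mean absolute error shows how much to trust the size of a predicted lift.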
