Biomarker Treadmill
Biomarker Treadmill turns measurement into momentum: every new number creates pressure for another test, another retest, or another intervention before anyone has written the decision rule.
Also known as: testing treadmill, dashboard escalation, metric creep, screening cascade, data-driven overtesting
Picture the treadmill part literally: the person is working, the belt is moving, but the room is not changing. Biomarker Treadmill does the same thing with health data. Panels, scans, dashboards, retests, and wearable scores keep producing motion, while the next number is not tied to a decision that changes risk, function, symptoms, or care. Measurement stays valuable when it answers a question. It becomes a treadmill when each result mainly creates the next measurement.
Context
Longevity medicine has made measurement feel like progress. A reader can now buy or request broad blood panels, continuous glucose monitoring (CGM), coronary calcium scoring, coronary CT angiography, full-body MRI, multi-cancer early detection (MCED) blood tests, and DEXA body-composition scans. Wearable recovery scores, biological-age reports, microbiome profiles, and hormone panels extend the same measurement menu. Some of those tests answer strong clinical questions. Some answer weak ones. Some are closer to research tools than to clinical tests.
Biomarker Treadmill begins when measurement becomes self-justifying. A person starts with one sensible baseline. A low-level abnormality appears. The follow-up panel gets wider. A dashboard adds new scores. A clinic recommends annual repetition because “tracking trends” sounds responsible. Soon the plan is organized around the next measurement event rather than around decisions that improve risk, function, or quality of life.
This is not an argument against measurement. It is an argument against measurement without governance. A useful biomarker changes a decision. A treadmill biomarker creates motion.
Problem
The trap is the inference that more data means better health. In prevention, that is often false. More data can mean earlier detection of an important risk. It can also mean false positives, incidental findings, ambiguous abnormalities, anxiety, repeat testing, specialist visits, procedures, and a thicker chart that does not change outcomes.
The longevity audience is especially vulnerable because it is numerate, self-directed, and willing to pay. Those traits can be assets. They also make it easy to treat uncertainty as a purchasing problem. If one panel leaves doubt, order a larger panel. If a scan finds an indeterminate lesion, scan again. If a biological-age report moves the wrong way, add more interventions and retest. The dashboard starts asking questions the evidence cannot yet answer.
The treadmill is easiest to see when no one can complete this sentence before the test is ordered: “If the result is X, we will do Y; if it is Z, we will do nothing.”
Forces
- Early detection can help, but low-prevalence screening creates many ambiguous findings.
- A broader panel feels thorough, but every added measure needs its own false-positive and actionability rule.
- Trend tracking can be useful, but normal variation can look like a mandate when measured too often.
- Expensive testing changes psychology. After paying, restraint can feel like waste.
- Clinics need data to guide care, but commercial dashboards can reward more measurement rather than better decisions.
- The reader wants agency, while weakly governed testing can turn agency into anxiety.
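The first two forces are arithmetic at heart, and a minimal sketch can make them concrete. The sensitivity, specificity, and prevalence values below are hypothetical, chosen only to illustrate the calculation. The second function also assumes the measures are statistically independent, which real panels rarely are.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of positive results that are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def chance_of_any_false_positive(n_measures, specificity):
    """Probability that at least one of n independent measures
    flags a person who has none of the conditions."""
    return 1.0 - specificity ** n_measures


# Hypothetical inputs, not taken from any specific test or study.
ppv_low = positive_predictive_value(prevalence=0.01, sensitivity=0.90, specificity=0.95)
ppv_high = positive_predictive_value(prevalence=0.20, sensitivity=0.90, specificity=0.95)
any_fp = chance_of_any_false_positive(n_measures=20, specificity=0.95)

print(f"PPV at 1% prevalence:  {ppv_low:.1%}")
print(f"PPV at 20% prevalence: {ppv_high:.1%}")
print(f"Chance of at least one false flag across 20 measures: {any_fp:.1%}")
```

Under these assumed numbers, the same test is right about roughly 15% of the time it flags someone at 1% prevalence and about 82% of the time at 20% prevalence. A 20-measure panel flags a healthy person at least once about 64% of the time.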
Solution
Attach every biomarker to a decision rule before ordering it. The rule names the question, the evidence behind it, the action threshold, the follow-up owner, and the conditions for not acting or for stopping. Without that rule, the test may still be interesting. It is not yet governed care.
Use a five-part test:
| Gate | Good answer | Treadmill answer |
|---|---|---|
| Question | What clinical, functional, or behavioral question does this measure answer? | “It is useful to know.” |
| Evidence | What evidence tier supports using this result for that question? | “Clinics track it.” |
| Threshold | What result would change the plan? | “We will see what it says.” |
| Follow-up | Who owns abnormal, borderline, or incidental findings? | “The dashboard will flag it.” |
| Stop rule | When do we stop repeating it? | “Annual tracking.” |
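One way to make the five gates hard to skip is to write them down as a record before the order is placed. The sketch below is a hypothetical schema, not a clinical tool: the class name, field names, and example entries are all assumptions made for illustration. It only detects gates left blank. It cannot tell a good answer from a treadmill answer such as “It is useful to know.”

```python
from dataclasses import dataclass, fields


@dataclass
class DecisionRule:
    """One rule per test, written before ordering (hypothetical schema)."""
    test_name: str
    question: str = ""
    evidence: str = ""
    threshold: str = ""
    follow_up_owner: str = ""
    stop_rule: str = ""


def missing_gates(rule: DecisionRule) -> list[str]:
    """Return the names of gates left blank; an empty list means all five are filled."""
    gate_names = [f.name for f in fields(rule) if f.name != "test_name"]
    return [name for name in gate_names if not getattr(rule, name).strip()]


# Placeholder entries only; real answers come from the clinician and the evidence.
apob = DecisionRule(
    test_name="apoB",
    question="Does this result change cardiovascular-risk management?",
    evidence="Placeholder: the guideline or trial the clinician relies on",
    threshold="Placeholder: the value agreed with the clinician in advance",
    follow_up_owner="Placeholder: named clinician",
    stop_rule="Placeholder: when repetition ends",
)
mri = DecisionRule(test_name="full-body MRI", question="It is useful to know.")

print(missing_gates(apob))  # []
print(missing_gates(mri))   # ['evidence', 'threshold', 'follow_up_owner', 'stop_rule']
```

The second example shows the limit of the check: the question gate counts as filled even though the answer is the treadmill one, so the record supports a human review rather than replacing it.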
The best testing plan has fewer mysteries after the test than before it. A lipid panel should clarify cardiovascular-risk management. A CGM should answer a bounded behavior question. A full-body MRI should have an incidental-finding policy before the scan. A biological-age test should state what model it uses, what change exceeds noise, and why the result would change anything.
The most dangerous word in preventive testing is “baseline.” A baseline is useful only when it anchors a future decision. Otherwise it is a souvenir from a moment of uncertainty.
The correction is usually subtraction, not abstinence. Keep high-yield measurements. Drop weakly actionable ones. Lengthen retest intervals when variation is likely to be noise. Put clinical findings back under clinician ownership. Let established risks, symptoms, function, and durable behaviors outrank dashboard novelty.
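Whether a change “exceeds noise” can be estimated rather than guessed. One approach used in laboratory medicine is the reference change value, which combines analytical imprecision with within-person biological variation. This entry does not prescribe that method, so treat the sketch below as one possible illustration. The coefficients of variation in the example are hypothetical and would need to come from the assay and the published variation data for a real marker.

```python
import math


def reference_change_value(cv_analytical, cv_within_person, z=1.96):
    """Smallest percent change between two results unlikely to be noise alone.

    Both CVs are in percent. z=1.96 corresponds to a two-sided 95% level.
    """
    return z * math.sqrt(2.0) * math.sqrt(cv_analytical**2 + cv_within_person**2)


def exceeds_noise(previous, current, cv_analytical, cv_within_person):
    """True when the percent change between two results is larger than the RCV."""
    percent_change = abs(current - previous) / previous * 100.0
    return percent_change > reference_change_value(cv_analytical, cv_within_person)


# Hypothetical marker: 3% analytical CV, 8% within-person CV.
rcv = reference_change_value(cv_analytical=3.0, cv_within_person=8.0)
print(f"Reference change value: {rcv:.1f}%")
print(exceeds_noise(100.0, 115.0, 3.0, 8.0))  # a 15% change
print(exceeds_noise(100.0, 130.0, 3.0, 8.0))  # a 30% change
```

With these assumed values the threshold is about 23.7%, so a 15% shift between two results is indistinguishable from noise while a 30% shift is not. A marker whose plausible changes sit well inside that band is a candidate for a longer retest interval.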
Evidence
Evidence tier: Practitioner consensus. Biomarker Treadmill is not a formal diagnosis. It is a named pattern assembled from overdiagnosis research, cascade-effects literature, screening evidence, imaging guidance, consumer-sensor cautions, and the observable economics of longevity clinics.
The overdiagnosis literature supplies the base. Moynihan, Doust, and Henry argued in BMJ that modern medicine can harm healthy people by widening disease definitions, expanding screening, and finding abnormalities that would never have caused symptoms. Welch and Black made the cancer-screening version explicit: early detection can find disease that was never destined to matter clinically, and treatment can then create harm.
The cascade literature explains how a treadmill starts. Deyo described medical cascades as chains of tests and treatments triggered by an unnecessary test, unexpected finding, or anxiety. The initiating event can look small. The downstream path can become expensive, invasive, and hard to stop because each new result creates the next obligation.
General health checks are a useful warning because they look so sensible. The 2019 Cochrane review of general health checks in adults found little or no effect on all-cause mortality or cancer mortality, and probably little or no effect on cardiovascular mortality, while health checks increased new diagnoses. That does not prove every component is useless. It shows that broad measurement programs need outcome evidence, not only plausibility.
Imaging makes the tradeoff visible. The American College of Radiology stated in 2023 that evidence was insufficient to recommend total-body MRI screening for people without symptoms, risk factors, or relevant family history. Its concern was not only cost. It was the identification of non-specific findings that can lead to follow-up testing and procedures without improving health.
Consumer sensors add the psychological version. The orthosomnia report by Baron and colleagues described patients whose sleep trackers made them more preoccupied with sleep scores, sometimes worsening insomnia behavior. Reports of anxiety around CGM readings suggest a similar pattern for food. The lesson is not that wearables are useless. The lesson is that high-frequency feedback can create its own problem when interpretation rules are weak.
How It Plays Out
A 46-year-old starts with a reasonable bloodwork expansion: apoB, Lp(a), fasting insulin, hs-CRP, thyroid markers, vitamin D, and ferritin. One value is mildly abnormal. A second panel adds hormones, micronutrients, inflammatory markers, methylation age, and a microbiome test. The person now has ten small questions and no hierarchy. The stronger move would have been to decide which abnormality changes care and which should be rechecked later.
A clinic sells an annual diagnostic day. The bundle includes Full-Body MRI Screening, coronary imaging, MCED testing, DEXA, broad labs, and biological-age reporting. The risk is not that every component is weak. The risk is that each component gets repeated because it exists in the package. A stable cyst, a new borderline marker, or a 0.8-year biological-age change becomes the next project.
A reader wears a CGM for a planned two-week experiment and learns that late alcohol worsens overnight glucose. That is useful. Biomarker Treadmill starts when the same reader keeps wearing sensors because being unmeasured now feels negligent. The original question was answered. The device remains because surveillance has become emotionally reassuring.
A person taking off-label rapamycin widens the lab panel to make the experiment feel controlled. More labs do not solve the endpoint problem. If no validated healthy-longevity endpoint exists for the intervention, broader measurement can create precision theater. That is where Rapamycin Cargo-Culting and Biomarker Treadmill reinforce each other.
Consequences
Benefits. Naming the antipattern protects serious measurement. ApoB Screening, Lp(a) Screening, coronary calcium scoring, DEXA, and selected labs can be high-value when they answer focused questions. A disciplined clinic can use measurement to find missed risk, track response, and prevent vague wellness advice from replacing medical judgment.
The corrective frame also improves purchasing decisions. The reader can ask a clinic, laboratory, or physician what each test is allowed to decide. If the answer is clear, the test may belong. If the answer is mostly “more information,” the reader has probably found the treadmill.
Liabilities. The correction can be misread as anti-screening. That would be wrong. Some abnormalities deserve prompt evaluation. Some tests are underused, especially in ordinary care. A very high Lp(a), high apoB, suspicious imaging finding, persistent abnormal glucose pattern, unexplained anemia, or concerning symptom should not be dismissed as data excess.
The harder liability is emotional. Testing can provide relief because it makes uncertainty feel active. Not testing can feel passive even when it is the better medical decision. That is why the stop rule matters. A test plan should say not only what to measure, but what uncertainty the reader is willing to leave unmeasured.
The practical rule is blunt: order the test when the result can change a defensible decision. Otherwise, wait, watch the higher-yield risks, or decline the measurement. More numbers are not the same as more care.
Sources
- American College of Radiology. “ACR Statement on Screening Total Body MRI.” April 17, 2023. https://www.acr.org/News-and-Publications/Media-Center/2023/ACR-Statement-on-Screening-Total-Body-MRI
- Baron, Kelly Glazer, Sabra Abbott, Nancy Jao, Natalie Manalo, and Rebecca Mullen. “Orthosomnia: Are Some Patients Taking the Quantified Self Too Far?” Journal of Clinical Sleep Medicine 13, no. 2 (2017): 351-354. https://doi.org/10.5664/jcsm.6472
- Deyo, Richard A. “Cascade Effects of Medical Technology.” Annual Review of Public Health 23 (2002): 23-44. https://doi.org/10.1146/annurev.publhealth.23.092101.134534
- Krogsbøll, Lasse T., Karsten Juhl Jørgensen, and Peter C. Gøtzsche. “General Health Checks in Adults for Reducing Morbidity and Mortality from Disease.” Cochrane Database of Systematic Reviews 2019, no. 1: CD009009. https://doi.org/10.1002/14651858.CD009009.pub3
- Moynihan, Ray, Jenny Doust, and David Henry. “Preventing Overdiagnosis: How to Stop Harming the Healthy.” BMJ 344 (2012): e3502. https://doi.org/10.1136/bmj.e3502
- Welch, H. Gilbert, and William C. Black. “Overdiagnosis in Cancer.” Journal of the National Cancer Institute 102, no. 9 (2010): 605-613. https://doi.org/10.1093/jnci/djq099
- Zugni, Fabio, Anwar Roshanali Padhani, Dow-Mu Koh, Paul Eugene Summers, Massimo Bellomi, and Giuseppe Petralia. “Whole-body Magnetic Resonance Imaging (WB-MRI) for Cancer Screening in Asymptomatic Subjects of the General Population: Review and Recommendations.” Cancer Imaging 20 (2020): 34. https://doi.org/10.1186/s40644-020-00315-0
Medical and Legal Boundary
This entry is a reference, not medical advice. It describes published evidence, diagnostic-risk concepts, regulatory status where relevant, and common interpretation patterns. It does not diagnose, prescribe, or replace a clinician’s judgment for a specific person.
Diagnostic testing, imaging, bloodwork, wearable interpretation, and follow-up decisions should be discussed with qualified clinicians when results are abnormal, persistent, symptomatic, tied to a diagnosed condition, likely to change medical care, or likely to worsen health anxiety. Do not start, stop, dose, or combine medications, supplements, fasting, imaging, or clinical interventions because one marker moved without appropriate clinical context.