Are Wearables Making You Healthier or More Anxious?
What the evidence actually says about health wearables — the metrics worth trusting, the ones that aren’t, and the psychological trap nobody talks about.
Roughly half of American adults now own a health-tracking wearable. Smartwatches, fitness rings, continuous glucose monitors, sleep trackers… The market has exploded so fast that the science studying these devices can barely keep pace with the products being sold.
That gap between what these devices promise and what the evidence supports is where most people get lost. And it’s costing some of them their health, because bad data acted on confidently can be worse than no data at all.
This post is about how to tell the difference.
The Fundamental Problem With Consumer Wearables
Technology moves on a one-to-two-year product cycle. Clinical validation research moves on a five-to-ten-year cycle. The device on your wrist almost certainly reached the market before any independent scientist had rigorously tested whether it measures what it claims to measure.
A 2025 scoping review published in PLOS Digital Health, analyzing 80 studies on wearable devices used for remote health monitoring outside of hospital settings, found that clinical evidence of effectiveness remains scant.¹ Of all the studies reviewed, only 8% were randomized controlled trials. The rest were observational, feasibility, or descriptive studies that tell us these devices can collect data, not that acting on that data improves health outcomes.¹
This doesn’t mean wearables are worthless. Several of them are genuinely impressive for specific applications. But it does mean that the blanket claim that wearing one will help you optimize your health is still mostly a marketing hypothesis rather than a proven medical fact.
There’s also a structural problem with the validation literature that exists.
A 2025 analysis in npj Digital Medicine on Apple Watch accuracy documented that a substantial portion of wearable validation studies are funded by the companies manufacturing the devices.² Funded studies tend to test the device under the conditions where it performs best, against the metrics it was designed to capture, in populations where it will look good. Independent studies, when they exist, consistently show lower accuracy than manufacturer-sponsored ones. One analysis of sleep staging accuracy found that an Oura-funded study rated the Oura Ring at 79.5%, while an independent study of the same device category produced results in the 69% range at best.³
The practical upshot: treat any claim that a wearable is “clinical grade” with some skepticism unless the validation was done by an independent group with no financial relationship to the manufacturer.
What Wearables Actually Do Well
With those caveats in place, there are areas where consumer wearables have genuine and growing evidence behind them.
Heart Rhythm Detection
This is where the evidence is strongest, and where the clinical implications are most significant.
Atrial fibrillation (AFib), the most common cardiac arrhythmia, is a major risk factor for stroke and heart disease. It’s also frequently asymptomatic, meaning millions of people have it and don’t know. The traditional detection method requires a 12-lead ECG in a clinical setting, which captures only a few seconds of rhythm in a controlled environment. Paroxysmal AFib, the kind that comes and goes, can be completely invisible in a standard ECG if it happens to resolve before the appointment.
Smartwatch ECG is genuinely changing this.
A 2025 meta-analysis in JACC: Advances analyzed studies through January 2025 and found that smartwatch ECG performs remarkably well for AFib detection.⁴ To put the numbers in plain terms: out of every 100 people who actually have AFib, the Apple Watch correctly identifies 94 of them. And out of every 100 people who don’t have it, it correctly gives the all-clear to 97 — meaning only 3 people get a false alarm. Samsung devices perform comparably. For a device that costs a few hundred euros and sits on your wrist, those are serious diagnostic numbers.⁴
A separate 2025 meta-analysis in the Journal of Arrhythmia found that when trained clinicians manually read the smartwatch ECG trace rather than leaving it to the algorithm alone, detection improves further: 96 out of every 100 people with AFib are correctly identified, while 95 out of every 100 healthy people are correctly cleared.⁵
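To make those percentages concrete, here is a minimal Python sketch that turns them into head counts. The function name `screening_counts` and the cohort sizes of 100 are illustrative assumptions; the sensitivity and specificity values are the ones quoted above.

```python
def screening_counts(sensitivity, specificity, n_with_afib=100, n_without_afib=100):
    """Expected outcomes for a hypothetical cohort (cohort sizes are illustrative)."""
    detected = round(sensitivity * n_with_afib)       # people with AFib who are flagged
    cleared = round(specificity * n_without_afib)     # healthy people given the all-clear
    return {
        "detected": detected,
        "missed": n_with_afib - detected,
        "cleared": cleared,
        "false_alarms": n_without_afib - cleared,
    }

# Figures quoted above: algorithm-only reading vs. clinician-read trace
algorithm = screening_counts(0.94, 0.97)
clinician = screening_counts(0.96, 0.95)
print(algorithm)
print(clinician)
```

With these inputs, the algorithm alone misses 6 of 100 people with AFib and raises 3 false alarms, while clinician review misses 4 and raises 5. The two sets of figures come from different meta-analyses, so the comparison is only indicative.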
These are genuinely impressive numbers for a device someone buys at a consumer electronics store. They are not equivalent to a 12-lead ECG — important arrhythmias other than AFib can be missed, and inconclusive readings are common — but for AFib specifically, the evidence is strong enough that several cardiology societies now consider smartwatch ECG a reasonable screening tool.
If your device flags an irregular rhythm and prompts you to see a cardiologist, take it seriously. That notification has real sensitivity behind it.
Resting Heart Rate and HRV During Sleep
Consumer wearables measure heart rate using photoplethysmography (PPG) — an optical sensor that detects blood flow through the skin. At rest and during sleep, when movement artifact is minimal, this method is reliable.
A 2025 independent validation study tracking 13 participants across 536 nights compared the WHOOP 4.0 directly against a gold-standard ECG chest strap.⁶ For resting heart rate, the two agreed almost perfectly. For HRV, they tracked within a few milliseconds of each other on average. Multiple devices in the same category perform comparably at rest.
Heart rate variability during sleep, the beat-to-beat variation in heart rate that reflects autonomic nervous system function and recovery status, is a legitimate physiological marker. Tracked as a trend over weeks and months, it provides real information about how well your body is recovering from physical and psychological stress. A downward trend in baseline HRV, sustained across two or more weeks, warrants attention. A single low reading does not.
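The difference between a sustained decline and a single bad night can be sketched in a few lines of Python. Everything specific here is an assumption chosen for illustration, not a clinical cutoff: the function name, the four-week baseline, the two-week comparison window, and the 10% threshold.

```python
from statistics import mean

def sustained_hrv_drop(nightly_hrv_ms, baseline_nights=28, recent_nights=14,
                       drop_fraction=0.10):
    """Return True when the recent average HRV sits well below the earlier baseline.

    Window lengths and the 10% threshold are illustrative assumptions.
    """
    needed = baseline_nights + recent_nights
    if len(nightly_hrv_ms) < needed:
        return False  # not enough history to judge a trend
    baseline = mean(nightly_hrv_ms[-needed:-recent_nights])
    recent = mean(nightly_hrv_ms[-recent_nights:])
    return recent < baseline * (1 - drop_fraction)

# Hypothetical nightly HRV values in milliseconds
one_bad_night = [60] * 41 + [35]          # a single low reading
two_low_weeks = [60] * 28 + [50] * 14     # a sustained two-week decline

print(sustained_hrv_drop(one_bad_night))
print(sustained_hrv_drop(two_low_weeks))
```

Averaging over a two-week window is what keeps one poor night from triggering the flag, while a decline that persists across the whole window does.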
Behavior Change Through Biofeedback
Perhaps the most underappreciated evidence behind wearables is not about the accuracy of any specific metric, but about the mechanism of behavior change.
Real-time biofeedback — seeing a glucose spike after a meal, watching your heart rate decline during a breathing exercise, noticing that your HRV drops after a night of poor sleep — creates a feedback loop that abstract health advice cannot replicate.
A 2025 systematic review on CGM as a behavior change tool found that glucose biofeedback measurably improves dietary choices and physical activity engagement in both diabetic and non-diabetic populations.⁷
The mechanism is straightforward: people change behavior when they can see the consequence of that behavior in real time. Wearables, whatever their accuracy limitations, deliver that.
Where the Data Falls Apart
Sleep Staging
This is where consumer wearables most consistently oversell their capabilities, and where the marketing language is most detached from the evidence.
Every major wearable claims to tell you how much deep sleep and REM sleep you got. The problem is that accurately classifying sleep stages requires measuring brain activity, which requires electrodes on the scalp.
Polysomnography (PSG), the clinical gold standard, is what sleep medicine is based on. Wrist-worn devices do not measure brain activity. They infer sleep stages from heart rate, movement, and skin temperature.
A meta-analysis of 798 patients across 24 studies found that wrist-worn devices underestimate total sleep time by about 17 minutes on average and misjudge sleep efficiency by around 5%.⁸ The bigger problem is REM sleep: these devices underestimate it by 50 to 70% in some cases, with errors exceeding two hours in a single night. And when it comes to classifying the four sleep stages — light, deep, REM, and wake — wrist-worn devices get it right roughly 60 to 72% of the time compared to the clinical gold standard.⁸ Flip that around: on any given night, your device is classifying your sleep stages incorrectly somewhere between one in four and two in five times.
The best-performing consumer wearable for sleep staging in independent studies, the WHOOP 4.0, achieves approximately 70% accuracy for sleep stage classification.³ The Apple Watch is around 50%.³ In other words, the stage labels behind the nightly breakdown in your app are wrong roughly three times in ten at best, and about half the time at worst.
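For a rough sense of scale, the accuracy figures above can be converted into minutes per night. This Python sketch assumes the figures describe the share of scoring epochs labeled correctly (studies report agreement in different ways), uses the 30-second epoch that is standard in sleep scoring, and assumes an 8-hour recording. The function name is hypothetical.

```python
def mislabeled_minutes(accuracy, hours_recorded=8.0, epoch_seconds=30):
    """Minutes of a recording whose stage label would be wrong at a given accuracy.

    Assumes accuracy is per-epoch agreement; the 8-hour night is illustrative.
    """
    epochs = int(hours_recorded * 3600 / epoch_seconds)
    wrong_epochs = round(epochs * (1 - accuracy))
    return wrong_epochs * epoch_seconds / 60

print(mislabeled_minutes(0.70))  # best case quoted above
print(mislabeled_minutes(0.50))  # worst case quoted above
```

Under those assumptions, 70% accuracy leaves about 144 minutes of an 8-hour night mislabeled, and 50% accuracy leaves about 240 minutes.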
What wearables can tell you reliably is whether you slept or not, and approximately how long you slept. Total sleep time estimates are more accurate than sleep stage estimates. Trends across weeks are more meaningful than nightly fluctuations.
What they cannot tell you reliably is that you got 45 minutes of deep sleep last Tuesday. The number is algorithmically estimated from signals that do not directly measure what the app is claiming to measure.
Calorie and Energy Expenditure
Consumer wearables are consistently poor at estimating calorie burn.
A systematic review and meta-analysis in JMIR mHealth and uHealth found that wearable energy expenditure estimates contain large and variable errors, with some devices overestimating caloric burn by 40 to 80%.⁹ The errors are larger during exercise than at rest, larger in people with higher body fat percentages, and larger during activities that don’t involve consistent arm movement.
Using your smartwatch calorie data to calibrate your food intake is a recipe for significant error. Treat energy expenditure numbers as directional indicators at most.
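A quick back-of-the-envelope calculation shows how much that matters. The Python sketch below assumes a device in the 40 to 80% overestimation range quoted above and a hypothetical displayed value of 500 kcal; the function name is illustrative.

```python
def plausible_true_burn(displayed_kcal, overestimate_low=0.40, overestimate_high=0.80):
    """Back out true expenditure if displayed = true * (1 + overestimate)."""
    lowest = displayed_kcal / (1 + overestimate_high)
    highest = displayed_kcal / (1 + overestimate_low)
    return lowest, highest

# Hypothetical workout shown as 500 kcal on the watch
low, high = plausible_true_burn(500)
print(f"Actual burn could be roughly {low:.0f} to {high:.0f} kcal")
```

For a displayed 500 kcal, the real figure under these assumptions would be roughly 278 to 357 kcal, so eating back the number on the screen could mean a surplus of 140 to 220 kcal from a single workout.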
Continuous Glucose Monitors in Healthy People
CGMs are one of the most powerful tools in medicine for people with diabetes. The evidence for their benefit in diabetic populations is unambiguous and strong.
Whether they offer meaningful value to metabolically healthy adults without diabetes or prediabetes is a different question, and the 2025 evidence is more skeptical than the wellness industry would like.
A 2025 research announcement from Mass General Brigham found no association between HbA1c, the standard clinical marker of average blood glucose, and CGM-derived glucose metrics in people with prediabetes and normal glycemia.¹⁰
In other words, for people without diabetes, the CGM is measuring something real (interstitial glucose fluctuations) but that measurement may not correlate cleanly with the metabolic risk markers that actually predict long-term outcomes.
A 2025 systematic review in Sensors on CGM use in non-diabetic individuals for cardiovascular prevention found that while CGM shows promise for personalizing lifestyle interventions and improving motivation, direct evidence of impact on hard cardiovascular outcomes remains limited.¹¹
The research on this question is still early. It’s not that CGMs are useless for non-diabetics (the behavior change mechanism is real), but the interpretation framework that applies to diabetic patients does not automatically translate.
A healthy person seeing a postprandial glucose spike to 140 mg/dL after a meal of white rice is not necessarily at risk. Normal non-diabetic glucose can spike and recover rapidly in ways that look alarming on a continuous trace but are within the range of typical physiology. Interpreting those numbers without clinical context can drive unnecessary dietary restriction or anxiety around foods that are nutritionally neutral.
The Psychological Trap
Here’s the side of wearables that gets almost no attention in wellness culture, and that I’ve started seeing in patients with increasing frequency.
Continuous health monitoring can make healthy people sick, not physiologically, but psychologically.
The phenomenon has a name in the research literature: cyberchondria. It describes health anxiety driven by excessive engagement with health data and information, where the monitoring itself becomes a source of distress rather than reassurance.
A 2025 paper in npj Digital Medicine on the consequences of health data overload noted that excess monitoring contributes to “heightened anxiety from misinterpreting normal physiological fluctuations” and “maladaptive tracking behaviors” that ultimately worsen wellbeing.¹²
A 2024 study on the psychological impact of health-monitoring smartwatches found that some participants reported increased perceived stress after just one week of wearing the device.
The irony is that this effect is strongest in the people most motivated to use wearables: health-conscious, analytically minded individuals who engage deeply with the data and take every fluctuation as a signal to investigate.
People who wear a smartwatch casually and check it twice a week are unlikely to develop this problem. People who wake up every morning and analyze their HRV trends, sleep debt scores, and readiness ratings before deciding how to approach their day are at higher risk.
There’s a well-established medical concept, the nocebo effect, where the expectation of harm generates real physiological harm. A person who reads a low recovery score and decides they feel unwell, or who sees a flagged irregular heartbeat notification and spends three days in anxiety waiting for a cardiology appointment, is experiencing a real cost that the wellness marketing for that device did not account for.
Wearables are tools. Like any tool, their effect depends on how they’re used and on the person using them.
The Question Worth Asking
Before I get into the practical framework, I want to leave you with the question that I think cuts through all of this more cleanly than any accuracy statistic.
Is this data changing your behavior for the better, or is it just changing your feelings about yourself?
A wearable that motivates you to move more, sleep more consistently, or recognize that alcohol is wrecking your recovery is working. A wearable that gives you a number every morning that determines how good or bad you feel about yourself, regardless of whether that number is accurate or whether it’s actually changing anything, is working against you.
Below, I’ll give you the framework I use with patients to answer that question honestly: which metrics are worth tracking, how often to look, what to do when the data conflicts with how you actually feel, how to use CGM data in a way that’s genuinely informative rather than anxiety-provoking, and the red flags that tell you your relationship with your wearable has become the problem.


