The rhetoric of individualization through wearables data, given the current focus on estimates, doesn't hold water.
We build generic models using large-scale population data that shows certain relationships.
We then provide them to the individual claiming that these models can track health and / or performance at the individual level.
When faced with the argument that the absolute values of such estimates are often incorrect at the individual level, we speak about our saviour, "the trend".
However, there is no evidence that when a factor (behaviour, environment) changes the predictors, the predicted parameters also change in consistent ways, so that they can in fact track meaningful changes at the individual level. All models are built on cross-sectional data exploiting differences between individuals, not longitudinal changes within individuals.
You are marketed "an extra 1%" while being sold generic models that treat you as the average of the population.
If you have worked in health / performance, you know very well how it turns out when you generalize from the population and hope that the same applies to every individual.
This happens every time you rely on something that is estimated from somewhat related parameters, and not measured.
I'll give you a trivial example:
There is an inverse relationship between heart rate and HRV.
If I don't have the required technology to measure HRV, I could estimate it from heart rate. I can use a large population dataset which shows me the inverse relationship between the two, and then build a model that, given heart rate, can estimate HRV. Then I provide it to you as the latest feature of my gadget.
There is a difference between the HRV estimate and your actual HRV, and for some people it can be a very large difference. Maybe that's okay, what matters is the trend after all.
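To make the gap in absolute values concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the population numbers, the choice of rMSSD as the HRV metric, the simple linear model, and the two hypothetical people. It is not how any real device works. It only shows that a model fitted across individuals returns the same estimate for everyone with the same heart rate.

```python
# Hypothetical population data: (resting heart rate in bpm, rMSSD in ms).
# These values are invented for illustration only.
population = [
    (45, 95), (50, 80), (52, 100), (55, 60), (58, 75), (60, 50),
    (62, 68), (65, 40), (68, 55), (70, 30), (72, 45), (75, 25),
]


def fit_line(points):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# The "generic model": built from differences between individuals.
intercept, slope = fit_line(population)


def estimate_hrv(heart_rate):
    """Estimate rMSSD from heart rate using the population-level fit."""
    return intercept + slope * heart_rate


# Two hypothetical people with the same heart rate but different measured HRV.
people = {"person_a": (60, 35), "person_b": (60, 90)}

# Error = estimate minus measurement, in ms.
errors = {}
for name, (hr, measured) in people.items():
    errors[name] = estimate_hrv(hr) - measured
    print(f"{name}: measured {measured} ms, "
          f"estimated {estimate_hrv(hr):.1f} ms, error {errors[name]:+.1f} ms")
```

Both people get the same estimate because the model only sees heart rate. One is overestimated and the other underestimated, each by a wide margin.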
How about tracking the trend? Sometimes the trends in the predicted and predictor will agree, when they change in ways consistent with the population relationship. Sometimes they will not, for example if HRV is saturated, or if fatigue leads to a suppression in both, or if you are doing deep breathing exercises (same HR, much higher HRV), etc. In all of these circumstances our estimate will not work at the individual level. The trend is incorrect because we are not measuring the required data point; we are estimating it from parameters that show a strong relationship at the population level, but not at the individual level.
By definition, all the situations in which the estimate fails are the situations where it would actually be interesting to look at the data (otherwise we'd have all the information required in the predictor, in this case heart rate, but we clearly don't).
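The same point about trends can be sketched in a few lines of Python. Again, this is a toy under stated assumptions: the model coefficients, the five-day series, and the scenario names are all hypothetical. The trend is computed here as a least-squares slope over the day index, which is just one simple way to define it.

```python
# Hypothetical population-level model: rMSSD (ms) estimated from heart rate (bpm).
# The coefficients are invented for illustration, not taken from any real device.
INTERCEPT, SLOPE = 200.0, -2.3


def estimate_hrv(heart_rate):
    """Estimate rMSSD from heart rate with the generic model."""
    return INTERCEPT + SLOPE * heart_rate


def trend(values):
    """Least-squares slope of the values against the day index (units per day)."""
    n = len(values)
    mean_day = (n - 1) / 2
    mean_val = sum(values) / n
    num = sum((d - mean_day) * (v - mean_val) for d, v in enumerate(values))
    den = sum((d - mean_day) ** 2 for d in range(n))
    return num / den


def direction(values, tolerance=0.5):
    """Label a series as 'up', 'down', or 'flat' based on its slope."""
    s = trend(values)
    if s > tolerance:
        return "up"
    if s < -tolerance:
        return "down"
    return "flat"


# Made-up five-day series for one person: (heart rate, measured rMSSD).
scenarios = {
    # HR drops and HRV rises, as the population relationship would suggest.
    "typical": ([60, 59, 58, 57, 56], [60, 63, 66, 70, 73]),
    # Fatigue suppresses both HR and HRV.
    "fatigue": ([60, 59, 58, 57, 56], [60, 57, 53, 50, 46]),
    # Deep breathing: same HR, much higher HRV.
    "deep_breathing": ([60, 60, 60, 60, 60], [60, 75, 90, 100, 110]),
}

# For each scenario: (trend of the estimate, trend of the measurement).
results = {}
for name, (hr_series, measured) in scenarios.items():
    estimated = [estimate_hrv(hr) for hr in hr_series]
    results[name] = (direction(estimated), direction(measured))
    print(f"{name}: estimated trend {results[name][0]}, "
          f"measured trend {results[name][1]}")
```

Only the first scenario shows agreement. In the other two, the estimate can only repeat what heart rate already says, so it misses or inverts the change in measured HRV.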
Does this example sound stupid?
This is what you pay for when using HRV-derived sleep stages, PPG-derived blood pressure, and soon enough, optical methods for glucose monitoring, lactate estimation, and more.
They might work at times, but often, they are inaccurate in absolute values and do not guarantee meaningful trends.
Unless new sensors are developed to measure actual changes in the parameters of interest, we are just being fooled by population-level models that have little to do with what is happening in our body in terms of such parameters.
What are we doing?
This is why I have lost much of the excitement for the field.
15 years ago it was all about sensors, and measuring.
Now, it's mostly about making up things, estimating.
This is also why I keep looking only at resting heart rate and HRV:
I measure them, at rest, intentionally.
Marco holds a PhD cum laude in applied machine learning, an M.Sc. cum laude in computer science engineering, and an M.Sc. cum laude in human movement sciences and high-performance coaching.
He has published more than 50 papers and patents at the intersection between physiology, health, technology, and human performance.
He is co-founder of HRV4Training, advisor at Oura, guest lecturer at VU Amsterdam, and editor for IEEE Pervasive Computing Magazine. He loves running.
Social:
Twitter: @altini_marco.
Personal Substack.
Thanks for posting this, I needed to hear this today. As a middle-aged medium-fit woman with some chronic diseases and menopause symptoms, I have worn a Garmin watch day and night for over a year now and it was hard to ignore the inaccurate assumptions it arrives at based on its generic estimates. It sucks to be shown a red or pink training status that says you are 'strained' or 'unproductive' for weeks, and for the HRV to be shown as 'low' or 'unbalanced' when you feel normal and your morning measurement shows it is just fine and you are good to go again. I try to stick to a sound training plan that builds muscle, incorporates 80/20 endurance training and contains enough rest days for me individually so that I can keep going. Being told over and over again by my device that this is not so good for me and I should simultaneously add more high intensity training and rest more, is frustrating. I don't feel that these devices are for me. I'm clearly not the target group so I end up turning 'Training status' off a lot. Many women post in the forums that they have problems with the estimates because the devices do not properly account for their hormonal cycle. That's like half the population that gets fed generic estimates that don't fit their bodies and training goals - and we cannot adjust the settings either. I'd love to be able to just adjust the settings. It would be a game changer.
Marco - interesting as always. I've been thinking about these issues a little bit recently. On a pure physiological basis I agree with you - much of the "physiology + model to infer some value" approach seems to be mostly marketing or a strong commitment to a very particular philosophy of science, both of which are equally present in academic research.
On the flip side I've been beginning to think that estimates and scores serve a useful psychological purpose. My hunch is that for many people (not elite athletes) these estimates/scores have enough accuracy, enough of the time, to be useful ways to create psychological meaning (essentially gamification) that helps one prioritize taking care of their body in a world full of noise.
The caveat to that view is: why not just use actual meaningful physiology (HRV measured sitting in the morning) to do this? To which my feeling is that scores/estimates may be helpful because we have a world full of noise and marketing :/