Ciao Marco
Very interesting post - I have been following you for a while now, and I love your "no BS" approach.
I wanted to ask you: if wearable metrics are derived from population-level estimates, which often miss the mark on individual accuracy, does this make the search for clinical biomarkers based on wearable data less meaningful? I ask because a lot of the current research on this relies on cross-sectional data.
In your opinion, do we need more longitudinal studies that focus on validating within-individual changes over time? Is this a necessary step before wearable-based biomarkers can genuinely support clinical outcomes?
I hope I'm not asking too much :) I look forward to hearing your thoughts, thanks a lot!
Marcello
Ciao Marcello, many thanks. I agree with what you are saying, and indeed I think we need to validate the ability of the various estimates to track longitudinal changes in different contexts: normal day-to-day variation, variation in response to a given intervention, long-term seasonal variation, etc. - as this is how people actually use these tools. Validating sleep staging in a cross-sectional study over a large population is a good first step, but given that there are clearly large errors for many individuals (as shown, for example, in Bland-Altman plots), we then need to run more studies looking at within-individual variation (relative changes over time) in relation to outcomes like the ones above, to assess whether these changes can be tracked, and if not, to determine why that is the case and what, if anything, we can do about it. This would at least make certain parameters usable, namely the ones that have no meaning in absolute terms (which is exactly how I recommend using HRV), but it would still not solve the issue of over-reliance on estimates where absolute values do matter (e.g. blood pressure).
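To make the distinction concrete, here is a quick sketch of the two analyses, on toy data (all numbers are invented for illustration, not taken from any study):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: nightly deep-sleep minutes for 50 people over 60 nights,
# from a reference (e.g. PSG) and a wearable with a per-person offset.
n_people, n_nights = 50, 60
reference = rng.normal(90, 15, size=(n_people, n_nights))
person_offset = rng.normal(0, 20, size=(n_people, 1))  # large individual errors
wearable = reference + person_offset + rng.normal(0, 8, size=(n_people, n_nights))

# 1) Cross-sectional Bland-Altman: mean bias and 95% limits of agreement,
#    pooling all people and nights together.
diff = (wearable - reference).ravel()
print(f"bias = {diff.mean():.1f} min, 95% LoA = +/- {1.96 * diff.std():.1f} min")

# 2) Within-individual tracking: does the wearable follow each person's
#    relative changes over time, regardless of its absolute offset?
corrs = [np.corrcoef(wearable[i], reference[i])[0, 1] for i in range(n_people)]
print(f"median within-person correlation = {np.median(corrs):.2f}")
```

In this toy setup the pooled limits of agreement look poor because of the per-person offsets, yet relative changes are tracked well within each person: exactly the case of a parameter that is unusable in absolute terms but potentially still informative in relative terms.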
Thanks again for your thoughts - always a nice detox from the tech bros!
I think an easy marker of whether you can rely on a machine-learned metric is whether it has been approved by health regulators. I was surprised to learn that the Apple Watch's sleep apnea detection takes 30 days. Why? Because it doesn't actually use pulse oximetry but wrist movements, combined with other sensor data. That fits the definition of "not measuring directly," but the reality is it works. Will it have false positives and negatives? Of course - but it's akin to public health: these interventions can have a dramatic impact on people's lives because of the scale at which they are deployed.
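There is also a plausible statistical reason for the 30-day window: aggregating a noisy per-night signal over many nights washes out single-night errors. A toy simulation of that idea (the rates and threshold are invented; this is not Apple's actual algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model: a per-night detector flags breathing disturbances with 70%
# sensitivity for true apnea and a 20% false-positive rate for healthy
# sleepers. (Invented rates; NOT Apple's actual algorithm.)
nights, n_sim, threshold = 30, 100_000, 15  # flag if >= 15 of 30 nights
for status, p in {"apnea": 0.70, "healthy": 0.20}.items():
    positive_nights = rng.binomial(nights, p, size=n_sim)
    rate = (positive_nights >= threshold).mean()
    print(f"{status}: flagged after 30 nights in {rate:.2%} of simulations")

# Nearly all simulated apnea cases cross the threshold, while almost no
# healthy sleepers do: single-night errors wash out over a month.
```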
A health regulator is never going to approve HRV or sleep-staging measurements!
All in all, if companies like Apple wanted to truly improve public health, they'd fund vaccine and sanitation campaigns in the third world, or literally tell people to eat an apple a day: the standard American diet is so poor that a single apple can have a dramatic impact on public health (https://nutritionfacts.org/video/does-an-apple-a-day-really-keep-the-doctor-away/).