Q&A: What are your current thoughts on DFA (HRV during exercise), e.g. in Suunto's ZoneSense?
I have received this question a few times since Suunto released their new feature using HRV (and detrended fluctuation analysis, or DFA) to determine exercise intensity, hence this blog post.
EDIT: at the beginning of December I had a call with Suunto and MoniCardi, who developed the technology behind ZoneSense using DDFA. I have since updated this blog to reflect our conversation (changes start with an EDIT in front, and are reported in italics), and in particular how ZoneSense might address some of the limitations of the standard DFA approach in terms of individualization. Special thanks to Janne and everyone who took the time to chat about this.
A brief history of the past 4 years
For context, 3-4 years ago (time does fly), I got more interested in HRV analysis during exercise (which in this blog I will call simply “DFA” as most research and applications rely on this specific method of HRV analysis).
At that time I developed the first app that could provide DFA analysis in (semi) real-time using a commercially available chest strap, and released the code for others to do the same (code here, app here). Two early claims about DFA motivated this work:
No need for calibration, which would make DFA better than heart rate, as for heart rate, you need to know your maximal heart rate or lactate threshold heart rate, to derive training intensity zones (or even better, you can use training and racing data to define race-specific zones, as I describe here).
Threshold values are independent of your fitness level, meaning that a DFA alpha 1 of 0.75 would indicate the first threshold and a value of 0.50 the second threshold, for everyone.
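To make the second claim concrete, a fixed-threshold scheme boils down to something like the sketch below. The function name and the domain labels are made up for illustration; only the 0.75 and 0.50 cut-offs come from the claims above, and how to treat a value sitting exactly on a cut-off is my own assumption.

```python
def intensity_domain(alpha1, first_cutoff=0.75, second_cutoff=0.50):
    """Map a DFA alpha 1 value to an intensity domain using fixed cut-offs.

    The cut-offs default to the universal values discussed in the text.
    Treating a value exactly on a cut-off as having crossed it is an
    assumption of this sketch.
    """
    if alpha1 > first_cutoff:
        return "below first threshold"
    if alpha1 > second_cutoff:
        return "between thresholds"
    return "above second threshold"


# Hypothetical alpha 1 values, purely for illustration
for value in (0.95, 0.70, 0.45):
    print(value, "->", intensity_domain(value))
```

Exposing the cut-offs as parameters is deliberate: the whole question is whether the same two numbers can be used for everyone.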
After a while, as more validations claimed that DFA was an effective way to e.g. estimate your LT1 (first lactate threshold), I started saying the opposite.
Why?
Because I had looked at the individual data (see examples here) and unfortunately it was clear that it wasn’t possible to use these fixed values to assess exercise intensity at the individual level, for everyone.
What has changed since then?
Recent research
Please note that this is not a comprehensive overview but just a short outline of what I see has been done recently. Keep in mind also that my main trusted source here is Thomas Gronwald, who has done much of the research in the field. Recently, another scientist I follow with great interest added useful insights that I cover below (Olli‑Pekka Nuuttila, who has also worked on morning and night HRV, training periodization, and more).
Here are the main trends I see:
More validation work on predetermined thresholds, e.g. the same 0.75 and 0.50 that I have criticized in the past. When individual data is reported, the same issues arise (in Thomas’ words, “for some individuals, the present approach does not lead to an adequate separation of exercise intensities”). Nothing new here: absolute values cannot work for everyone, unfortunately. To me, this means we lose one of the main advantages of this approach, as I could otherwise just use heart rate, which also requires that I calibrate it for an individual.
Slightly different algorithms such as the DDFA used by Suunto have been developed. Given the results that have been published (see image below), at the moment there doesn’t seem to be any meaningful improvement on the shortcomings previously reported (e.g. if an individual can have their heart rate 20-30 beats per minute away from their lactate threshold - see paper here - you cannot claim that you track changes in metabolic state, sorry). Keep in mind also that despite the claimed higher resolution, the algorithm is still unable to capture short high-intensity efforts, as shown in Suunto’s examples. At the moment, the upgraded math doesn’t seem to solve the outstanding issues we had, unfortunately. EDIT: despite the differences shown here with respect to lactate, DDFA uses an approach that is somewhat more similar to what we would use to detect LT1 in a lactate test - i.e. a deflection point after an initial phase of stability - which is different from the original DFA algorithm, which would use a universal threshold of 0.75. As such, this method does individualize the data, at least in terms of the boundary between easy and not easy (or green and yellow). This sounds like an interesting approach - something I had suggested originally for the standard DFA as well - as it doesn’t rely on universal thresholds but tries to spot a change in the cardiac response of an individual. I am looking forward to testing this myself in the coming months (in terms of the differences with lactate, see also my discussion below, i.e. we are looking at different things, internal stress and metabolic demands, and these do not necessarily need to be aligned).
Investigations in related applications, e.g. fatigue monitoring. This is what I consider interesting and a proper use case for this technology. Absolute values that work for everyone don’t exist for heart rate, lactate, oxygen consumption, etc. - and considering that HRV is even more individual, even when measured at rest - it is hard to imagine we can use it that way. Let’s focus on what HRV can do best then: tracking individual changes in relation to stress, which during exercise we could call ‘fatigue’ and could be associated with a number of physiological aspects (e.g. durability). In Olli‑Pekka’s recent paper (see here) changes in DFA-a1 during prolonged exercise (90 minutes of low-intensity running) were found to correlate with changes in the lactate threshold speed (LT1v). This suggests that DFA-a1 could serve as a potential marker for monitoring fatigue during endurance sessions, reflecting how an athlete's ability to sustain performance diminishes over time. This could be a useful application. EDIT: additional data from Suunto and MoniCardi confirms the utility in this application, possibly the most interesting feature of using HRV during exercise.
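The "deflection point after an initial phase of stability" idea mentioned for DDFA above can be illustrated with a very simple sketch. To be clear, this is not the Suunto / MoniCardi algorithm, whose details are not covered in this post. It is just one naive way to individualize a boundary: take the person's own early values as a baseline, then look for the first sustained departure from it. The function name, the baseline length, the 2 SD band and the 3-sample hold are all assumptions.

```python
from statistics import mean, stdev


def find_deflection(values, baseline_n=5, n_sd=2.0, hold=3):
    """Return the index where values first drop out of the baseline band.

    values: a per-window HRV index (e.g. alpha 1) over an incremental effort.
    The baseline band is built from the first `baseline_n` values of the
    same individual, so no universal threshold is involved.
    Returns None if no sustained drop is found.
    """
    if len(values) < baseline_n + hold:
        return None
    baseline = values[:baseline_n]
    # Floor the SD so a perfectly flat baseline does not flag tiny changes
    spread = max(stdev(baseline), 1e-6)
    lower_limit = mean(baseline) - n_sd * spread

    run = 0  # consecutive samples below the band
    for i in range(baseline_n, len(values)):
        if values[i] < lower_limit:
            run += 1
            if run == hold:
                return i - hold + 1  # first sample of the sustained drop
        else:
            run = 0
    return None


# Hypothetical series: stable at first, then a steady decline
series = [1.0, 1.02, 0.98, 1.01, 0.99, 1.0, 0.99, 0.9, 0.85, 0.8, 0.7]
print(find_deflection(series))
```

Requiring a few consecutive samples below the band is a crude guard against a single noisy window being mistaken for a change in the cardiac response.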
Considerations
What is the use?
Given what I wrote above, I would consider DFA a potentially useful marker of internal load during exercise, both between and within sessions (just like heart rate, lactate, or VO2). However, I would not consider it prescriptive without individual calibration: because of high individual variability, the absolute values (or proposed thresholds) do not apply to all individuals and therefore cannot be used that way (again, just like heart rate, lactate, or VO2). As noted in Thomas’ paper, variability is particularly notable during prolonged exercise in the heavy-to-severe intensity domains, where some individuals may experience premature exhaustion, leading to a mismatch between exercise prescription and actual performance capacity.
I would also move away from calling this an estimate of metabolic zones (as on Suunto’s website). If you want to track lactate, measure lactate, do not measure cardiac activity (heart rate or HRV / DFA). It makes no sense to do so, as the interesting part is how lactate and heart rate (or HRV) change in different ways based on how we train (and detrain), and therefore it is meaningless to use one to guess the other.
How would I know that my heart rate at LT1 is now higher, if I wasn’t measuring both my heart rate and lactate?
Finally, given the difficulties in doing the math in a reliable way, I would want to be really sure that there is something to gain by looking at this parameter as opposed to just looking at heart rate. Some papers do show that HRV / DFA might be more sensitive to e.g. fatigue, just like resting measurements, and therefore it seems a promising marker from this point of view.
Data quality and custom scores
In Olli‑Pekka’s study, participants with more than 3% artifacts in their HRV data were excluded, emphasizing the need for clean heart rate data for reliable DFA-a1 assessment. This sensitivity might challenge its use outside controlled settings, where artifacts are even more common. HRV is really prone to issues, while heart rate isn’t, hence I want to reiterate that we should look at HRV only if heart rate isn’t good enough for the job.
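As a rough illustration of what a data quality gate like this could look like, here is a minimal sketch. The 3% limit is the one reported above; everything else is an assumption on my side (the function names, and the artifact rule itself, which simply flags any RR interval deviating more than 20% from the median of its neighbours). The study's own artifact detection method is not described in this post and is likely different.

```python
from statistics import median


def artifact_percentage(rr_ms, window=5, tolerance=0.20):
    """Percentage of RR intervals that look like artifacts.

    An interval is flagged when it deviates by more than `tolerance`
    (as a fraction) from the median of up to `window` neighbours on each
    side. This rule is a simple assumption, not a validated method.
    """
    if not rr_ms:
        raise ValueError("rr_ms must not be empty")
    flagged = 0
    for i, rr in enumerate(rr_ms):
        lo = max(0, i - window)
        hi = min(len(rr_ms), i + window + 1)
        neighbours = rr_ms[lo:i] + rr_ms[i + 1:hi]
        if not neighbours:
            continue  # a single-sample series has nothing to compare against
        reference = median(neighbours)
        if abs(rr - reference) > tolerance * reference:
            flagged += 1
    return 100.0 * flagged / len(rr_ms)


def is_usable(rr_ms, max_artifact_pct=3.0):
    """True when the recording stays within the artifact limit."""
    return artifact_percentage(rr_ms) <= max_artifact_pct


# Hypothetical recording: steady 800 ms beats with one missed beat at the end
recording = [800] * 99 + [1600]
print(artifact_percentage(recording), is_usable(recording))
```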
On top of data quality issues, there is also variability in how different tools and software process DFA-alpha 1 data, and with DDFA, even more variability is added, with new black box implementations that remind me of other made-up scores (e.g. readiness, recovery, stress, etc.). While it is important to translate the data and methods to something useful for the average user, for the sake of comparison, validation and interoperability, I think the actual HRV data should also be reported.
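To show where some of that variability can come from, below is a minimal sketch of the textbook DFA alpha 1 computation: integrate the mean-centered RR series, detrend it in boxes of increasing size, and fit the slope of log fluctuation against log box size. This is not the code from my app, nor any vendor's implementation. The 4 to 16 beat range, the first-order detrending and the non-overlapping boxes are common choices that I am assuming here, and each of them (together with artifact correction and the length of the analysis window) is a place where tools can differ.

```python
import math


def _linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_alpha1(rr_ms, min_box=4, max_box=16):
    """Short-term DFA scaling exponent of an RR-interval series."""
    mean_rr = sum(rr_ms) / len(rr_ms)
    # Step 1: integrated, mean-centered profile
    profile, total = [], 0.0
    for rr in rr_ms:
        total += rr - mean_rr
        profile.append(total)

    log_sizes, log_fluct = [], []
    for n in range(min_box, max_box + 1):
        n_boxes = len(profile) // n  # non-overlapping boxes, remainder dropped
        if n_boxes == 0:
            continue
        xs = list(range(n))
        squared_residuals = 0.0
        for b in range(n_boxes):
            segment = profile[b * n:(b + 1) * n]
            # Step 2: remove the local linear trend inside each box
            slope, intercept = _linear_fit(xs, segment)
            squared_residuals += sum(
                (y - (slope * x + intercept)) ** 2 for x, y in zip(xs, segment)
            )
        # Step 3: root-mean-square fluctuation at this box size
        fluctuation = math.sqrt(squared_residuals / (n_boxes * n))
        if fluctuation > 0:
            log_sizes.append(math.log(n))
            log_fluct.append(math.log(fluctuation))

    if len(log_sizes) < 2:
        raise ValueError("not enough data to fit a scaling exponent")
    # Step 4: alpha 1 is the slope of the log-log relationship
    alpha1, _ = _linear_fit(log_sizes, log_fluct)
    return alpha1
```

As a sanity check, an uncorrelated (white noise) series should land roughly around 0.5 and a random walk roughly around 1.5, with some bias expected at such short box sizes. Changing any of the assumed parameters shifts the output, which is part of why reporting the underlying data matters.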
Practical use
To me, DFA is still about experimenting more than prescribing.
We need to keep in mind that while the concept of using HRV to monitor fatigue or stress during exercise sounds appealing, evidence supporting its practical utility in guiding real-time adjustments during exercise is lacking. The implications of making adjustments are also unknown (would that lead to better performance outcomes? or not?). Maybe we can use (D)DFA to track progress in durability over time, more than to implement changes as we go.
Personally, at the moment, I feel like I have what I need in terms of exercise data, looking at metabolic load once in a while by measuring lactate at different - prolonged - intensities, and then using heart rate on a daily basis to assess load (see a discussion of these aspects here). Heart rate gives me a clear signal (that I have calibrated over years of training and trial and error in races) which allows me to perform at my limit over different distances (e.g. in a marathon, as I discuss here).
Can DFA capture some changes in my cardiac activity that are not captured by heart rate? Is there something in DFA in relation to durability that is not visible in cardiac drift? Possibly, and in that case, it could be a nice addition to look at during certain training blocks. Maybe the latest changes brought by DDFA and Suunto / MoniCardi can indeed address these aspects.
At the moment, I see DFA as something to explore out of curiosity and not to be used to prescribe or adjust training - which is how we should probably take all new markers currently investigated in research. And by all means, I also sometimes track or look at data just because I find it interesting, not necessarily useful, and that’s totally fine if you are into these things.
My recommendation if you go that way is simply to look at the data in relative terms, and not necessarily in absolute values (therefore ignoring simple interpretations). How does it change for workouts of different intensities? How does it change within a workout as you fatigue? How do these values change over long timeframes as your fitness changes? How does the data relate to other markers, e.g. lactate and heart rate?
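As an example of the "relative terms" idea for the within-workout question, here is a minimal sketch that expresses the end of a session as a percent change from its own start, instead of comparing against a universal threshold. The function name and the choice of comparing the first and last quarter of the session are assumptions for illustration.

```python
from statistics import mean


def within_session_change(values, fraction=0.25):
    """Percent change between the first and last `fraction` of a session.

    values: a per-window marker (e.g. alpha 1 or heart rate) in time order.
    The result is relative to the individual's own starting level.
    """
    k = max(1, int(len(values) * fraction))
    if len(values) < 2 * k:
        raise ValueError("session too short for the requested fraction")
    start = mean(values[:k])
    end = mean(values[-k:])
    return 100.0 * (end - start) / start


# Hypothetical alpha 1 values over a long easy run
session = [1.0] * 4 + [0.9] * 4 + [0.8] * 4 + [0.75] * 4
print(within_session_change(session))
```

The same function can be applied to heart rate over the same session, which makes it easy to check whether DFA is telling you something that cardiac drift is not.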
If you have used DFA and learned something new from it, please feel free to share your experience in the comments below, always happy to learn more.
Marco holds a PhD cum laude in applied machine learning, an M.Sc. cum laude in computer science engineering, and an M.Sc. cum laude in human movement sciences and high-performance coaching.
He has published more than 50 papers and patents at the intersection between physiology, health, technology, and human performance.
He is co-founder of HRV4Training, advisor at Oura, guest lecturer at VU Amsterdam, and editor for IEEE Pervasive Computing Magazine. He loves running.
Social:
Twitter: @altini_marco (currently inactive)
Personal Substack
Also, AlphaHRV has a great Connect IQ app, where you can even configure the cut-offs for the thresholds (e.g. 0.75), which opens up the possibility to test and adjust them for future training.
This is awesome - thank you! I've been reading about DFA from AI Endurance, who seem to have similar perspectives to you :)