I’ve been working with heart rate variability (HRV) for many years, and one thing I’ve learned is that things can get quite difficult to interpret in the long term, beyond rather simple acute stressors (or major lifestyle or health-related changes).
While HRV can be a useful tool, there are many challenges when it comes to research and its practical use. A recent study comparing morning and night HRV measurements gives us a good opportunity to explore one of the main difficulties, the lack of reference data.
Lacking a reference
The biggest challenge with HRV is that we don’t have a reference system to compare against in order to assess its utility. In other applications, we typically do have a reference. For example, when looking at sleep stages estimated by a wearable, we can compare them against polysomnography, a less practical method that is nonetheless considered the gold standard (in other words, an accurate reference) for determining sleep stages. The same applies to other markers, e.g. you can check the VO2max estimated by Garmin against an actual lab test where your VO2max is measured using indirect calorimetry. This allows us to determine if our more practical methods (e.g. sleep stages in a wearable or fitness estimations from a GPS watch) are valid.
When it comes to HRV, we don’t have a reference.
As I’ve discussed in greater detail in my guide here, HRV is a marker of physiological stress because heart rhythm is modulated by the autonomic nervous system, and in particular the parasympathetic nervous system, in response to stress. This is true only when HRV is measured using accurate tools and best practices (i.e. in the morning or the night, far from stressors and limiting other potential artifacts). Under these circumstances, HRV can be helpful in assessing how we are responding to various stressors, both acutely and chronically.
The issue here is not only that HRV is an indirect assessment of parasympathetic activity, but that we cannot directly measure any of the things HRV serves as a proxy for. For example, we cannot measure parasympathetic activity non-invasively in humans. We cannot measure stress or readiness or recovery in any way that is not a subjective questionnaire.
Hence, while the mechanisms are clear, we don’t really have another way to capture recovery or readiness and compare it against HRV, which has several implications.
Without a reference, it is difficult to determine if HRV monitoring is effective in tracking recovery or readiness (although a few months of tracking will make it obvious that the data is a useful marker of physiological stress). It is also difficult to tackle the more nuanced issues that are more interesting to me, e.g. which protocol is best to assess recovery and guide training.
A brief recap of how to use (or misuse) HRV
Recovery and readiness are complex and multifactorial, and we already know that certain aspects of these constructs cannot be captured by HRV alone (e.g. muscle soreness), which also explains why I argue for looking at the actual (HRV) data, and trying to understand what it means, as opposed to looking at made-up scores (e.g. readiness or recovery scores in wearables) that cannot possibly provide what they claim.
For example, an HRV within your normal range the morning after a hard workout is an ideal response, highlighting that the training stimulus was appropriate, but does not mean that we should go hard again (because e.g. we might be terribly sore). On the other hand, a suppression in HRV after a hard session does not mean that “you went hard and that was good training” but it means that either the stimulus did not match your fitness, i.e. it was too much, or non-training-related stressors were present. Both cases are informative and allow us to learn from the data and adjust (or not) our following training sessions.
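To make “within your normal range” concrete, here is a minimal sketch in Python. It assumes the normal range is the mean plus or minus one standard deviation of your recent morning measurements; the window length, the width of the range, the function names, and the example values are all illustrative assumptions, not a definition taken from this article or from any specific app.

```python
from statistics import mean, stdev


def normal_range(history, width=1.0):
    """Return (lower, upper) as mean +/- width * SD of past daily HRV values.

    Assumption: the "normal range" is modeled as a simple band around the
    recent average. The width of 1 SD is an illustrative choice.
    """
    if len(history) < 2:
        raise ValueError("need at least two past measurements")
    center = mean(history)
    spread = stdev(history)
    return center - width * spread, center + width * spread


def classify(today, history, width=1.0):
    """Label today's value relative to the normal range built from history."""
    lower, upper = normal_range(history, width)
    if today < lower:
        return "suppressed"
    if today > upper:
        return "above range"
    return "within range"


# Hypothetical morning HRV values (arbitrary units) for the previous week.
baseline = [62, 65, 60, 63, 66, 61, 64]

print(classify(63, baseline))  # morning after a well-absorbed hard session
print(classify(52, baseline))  # morning after a session that was too much
```

The label only describes the physiological response. A “within range” result says nothing about soreness, so it still needs to be read together with subjective feel.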
In the same scenario, subjective feel would be unlikely to be as informative: after a hard session we might be terribly sore regardless of our physiological response, as soreness is just a different aspect of recovery. If we understand what HRV is about, how stability highlights an ideal response, and how things can decouple from subjective feel but still provide useful information for our decision-making, we are on the right path.
HRV is not there to override your subjective feel but to complement it with information that can help you make useful adjustments. Keep in mind that none of this can be done by looking at made-up scores (e.g. readiness, recovery) combining physiology and behavior (e.g. activity, sleep) and making it harder - not easier - to capture our physiological response to a stressor. Even if you use a wearable instead of a morning measurement, make an effort to look at the actual data and ignore made-up self-fulfilling scores that penalize you for sleeping less or for being more active, regardless of your actual physiology. For athletes of any level, this is the best way to make use of the data.
Now, back to the original problem, i.e. lacking a reference.
As a result of the lack of a reference, most studies focus on how HRV reacts to an acute stressor, like a workout. In this context, researchers often claim that the HRV measurement closest to the stressor is the most “useful” because it shows the strongest coupling (e.g. the strongest suppression in HRV). But this is not what we use HRV for in practice. The goal isn’t just to see the body’s immediate reaction to stress (during or immediately after a stressor). The goal is to understand how well the body has recovered from the stressor.
To do that, we need to measure HRV at rest, ideally several hours after the stressor, e.g. in the morning after a good night’s sleep.
Morning vs. night: what we learn from a new study
The study Morning versus Nocturnal Heart Rate and Heart Rate Variability Responses to Intensified Training in Recreational Runners (see here) compares HRV measurements taken at night and in the morning.
It gives us some interesting insights but also shows the limits of HRV research when lacking a reference. For example, during a baseline period with normal training, morning and night HRV measurements are highly correlated. This confirms that both methods can reflect overall trends when the body is under no unusual stress, as also shown in previous research (or anecdotally in much of the data I shared in the past). This tends to be true, especially in the medium-long term (weekly averages more than daily measurements). When the training load increases, the correlation between morning and night HRV weakens. In my opinion, morning HRV better reflects how the body is recovering over time and its ability to take additional stress on a given day, as it’s measured further away from the stressor, and after the restorative effect of sleep. In this context, we can see sleep as an acute, positive stressor, and therefore it makes sense to measure after it, as opposed to during it.
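If you collect both morning and night data, you can check how closely they track each other, day by day and as weekly averages. The following is a minimal sketch, assuming two equally long lists of daily values aligned by date; the function names are hypothetical and this is not the analysis used in the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


def weekly_means(values, days=7):
    """Average consecutive, non-overlapping blocks of `days` values.

    A trailing incomplete week is dropped.
    """
    return [
        sum(values[i:i + days]) / days
        for i in range(0, len(values) - days + 1, days)
    ]


def compare(morning, night):
    """Return (daily correlation, weekly-average correlation).

    Assumption: both lists hold one value per day, in the same order,
    and cover at least two full weeks.
    """
    daily = pearson(morning, night)
    weekly = pearson(weekly_means(morning), weekly_means(night))
    return daily, weekly
```

Running `compare` separately on a baseline block and on an intensified training block is one simple way to see whether the two measurements drift apart when load increases. With only a handful of weeks, the weekly correlation is based on very few points and should be read with caution.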
Now let’s move to acute stressors. Night HRV reflects the body’s immediate reaction to stress more effectively since it’s measured closer to the workout (a late-in-the-day, hard session). This is often interpreted as “more useful” since the data shows the suppression better than a measurement taken later on. In the context of this study, the first half of sleep was the ideal moment to capture the stressor, in my opinion not because it is more useful, but simply because it is closer in time, and therefore we see a rather normal suppression in HRV. I would argue that in terms of practical usefulness, this approach (i.e. measuring closer to the stressor) makes the data less useful for understanding recovery and our stress response. While it is perfectly normal for HRV to be suppressed during or after a stressor (one of the many reasons why continuous HRV measurements are problematic), what we can learn from is how HRV changes a few hours later: if we have a positive response, HRV will be within our normal range. If we do not have a positive response, HRV will still be suppressed many hours after the stressor (e.g. during a morning measurement).
Why morning HRV remains my preference
While the study shows that night HRV can reflect acute stress, given the considerations above, I would not consider a more tightly coupled response to be an indication of the data being practically more useful, but simply an indication of the data being collected too soon, and a typical issue of not having a better reference for our readiness or recovery.
This is rather different from e.g. looking at differences in body position when measuring HRV at the same time (or at a very similar time). For example, if we measure HRV in the morning, we can measure in positions other than lying down (something not feasible when we sleep). If we see an HRV suppression with respect to our normal range when we measure sitting up, but not when lying down, this is an indication that the data collected while sitting up is indeed more useful: it is more sensitive to stress, something I have shown in an example here.
However, in the morning vs night study, we are not assessing sensitivity, we are just assessing timing in relation to an acute stressor and showing something obvious: if you measure close to the stressor, the data is impacted more.
For these reasons (time elapsed between the stressor and the measurement and the possibility to measure while in a different body position), I prefer morning measurements, especially for athletes who want to use the data to make useful adjustments to their training. Keep in mind that training is a much smaller stressor with respect to e.g. alcohol intake or sickness (as shown in our paper here), and this is likely to be an even smaller stressor for an athlete that is accustomed to it. Hence, a morning measurement while sitting up is the ideal measurement protocol in this specific context (despite the $$$ poured into marketing campaigns aiming at convincing you of the opposite by various wearable manufacturers).
By the time you wake up, your body has had time to recover, and the data reflects your ability to handle and adapt to more training. Measuring HRV too close to a stressor, like during the first hours of the night, might show a stronger reaction, but it doesn’t tell us much about recovery.
Studies like this are well-intentioned but eventually highlight the limits of sports science research when lacking a reference, more than the usefulness of the data as collected with different protocols, in my opinion.
What about health?
While in this blog I have stressed how HRV lacks a reference in terms of assessing its utility for sports-related studies, the issue is also evident outside of sports science.
For example, HRV is often claimed to be a health indicator, with people ending up optimizing HRV (instead of health!). Before you go down that road, consider that one of the quickest ways to dramatically increase your HRV is to stop eating, and that this also applies to chronic health conditions such as anorexia nervosa. Chronic malnutrition and behaviors associated with anorexia can lead to long-term complications, such as osteoporosis, cardiovascular issues, hormonal imbalances, and cognitive impairments (let alone the psychological challenges associated with all of this). And yet, people with anorexia nervosa are routinely reported to have higher HRV than controls.
In light of this evidence, it is impossible to claim that higher HRV is always better, in my opinion.
Wrap up
A common limitation in HRV research is that it focuses too much on short-term acute responses, i.e. how HRV reacts during or shortly after an acute stressor. This limitation stems from the difficulty of doing anything else when we don’t have a valid reference. While looking at acute stressors is valuable, it doesn’t tell us how HRV can guide long-term decisions or which protocol is best for the practical use we make of HRV data (i.e. determining the best course of action on any given day).
To make HRV more actionable, we need studies that look at how training adjustments based on HRV (maybe collected using different protocols) affect long-term health and performance over time. For example, one group could adjust their training based on morning HRV trends, while another group sticks to a fixed plan. The two groups could then be compared in terms of performance, fatigue, and injury rates. These types of studies show whether HRV-guided training actually helps (spoiler: it does), even though we can’t measure “recovery” or “readiness” directly. We might want to get into more nuanced differences with similar studies, e.g. comparing different HRV measurement protocols in different types of athletes or lifestyles in relation to health and performance outcomes.
Until then, this is what we have learned so far:
Morning HRV is more actionable and better reflects recovery, as well as our ability to assimilate additional stress because it’s measured after the restorative effect of sleep and farther away from previous stressors (training, food intake, socializing, etc.). Measuring while sitting up can also make your measurement more sensitive to stress, which can be particularly useful in the context of training (or overtraining).
Night HRV might be easier to capture for some, but it’s more influenced by the previous day’s stress and less reflective of overall recovery or of our ability to adapt to additional stress. In periods without much stress, morning and night HRV track rather similar trends. In the context of training, night HRV tends to be less useful.
HRV is a tool that we can use to make small adjustments, not to make up plans entirely (which is unfortunately what made-up scores aim to do, despite not knowing anything about muscular soreness or additional key context). This is why HRV4Training can only say “proceed as planned” when your HRV looks good. Any other statement (e.g. “go hard”) would be an incorrect use of this data.
HRV is a powerful tool, but it has clear limitations and is often misused. It’s normal for HRV to drop during or after stress, and that’s not a bad thing - it’s part of the process. What matters in terms of actionability is how your HRV responds the next day or once enough time has passed from the acute stressor. These aspects are more difficult to study, but more useful in practice, as they allow us to adjust our plans and improve long-term performance.
I hope this was informative, thank you for reading.
Marco holds a PhD cum laude in applied machine learning, a M.Sc. cum laude in computer science engineering, and a M.Sc. cum laude in human movement sciences and high-performance coaching.
He has published more than 50 papers and patents at the intersection between physiology, health, technology, and human performance.
He is co-founder of HRV4Training, advisor at Oura, guest lecturer at VU Amsterdam, and editor for IEEE Pervasive Computing Magazine. He loves running.