[CoachCorner] Should We Test?

Useful and not-so-useful physiological tests for endurance athletes.

May 10, 2025

A few days ago, I received a great question from one of the athletes I coach. They were wondering whether we’d be including any specific tests, like a lactate threshold test, VO₂max assessment, or a critical power test, in the current training block, as we recently started working together.

It's a great question, and it taps into something many athletes wonder about: do we need lab tests or structured assessments to identify limiters and guide training? Or can we get the insights we need simply by training and analyzing workouts?

In this blog, I’ll answer these questions and provide an overview of what I consider useful and not-so-useful physiological tests for endurance athletes.

Glossary

Before we start, here are a few terms that are good to know, as I will be using them repeatedly below:

LT1 (First Lactate Threshold):
The intensity at which lactate starts to rise above resting levels. Often considered the boundary between easy and not-easy running. This is the top end of Zone 2, in my 5-zone system.

LT2 (Second Lactate Threshold):
Also called the anaerobic threshold, this marks the point where lactate accumulation becomes rapid. LT1 and LT2 are thought to be actual physiological markers that we can identify with physiological testing (debatable, as I cover below).

Critical Pace (CP):
The fastest pace that can be maintained without a continual rise in fatigue. It’s a practical, field-based measure often aligned with performance at durations of 30–60 minutes, or the time it takes to run a 10 km. For the purpose of my system, critical pace and LT2 are the same thing, and what I call “Threshold”.

VO₂max (Maximal Oxygen Uptake):
A lab-derived measure of the maximum volume of oxygen your body can use during intense exercise. It represents aerobic capacity but does not directly determine performance.

Running Economy:
How efficiently a runner uses oxygen at a given pace. A lower oxygen cost (better economy) at a given speed means greater efficiency, which often correlates better with performance than VO₂max.

Metabolic Flexibility:
The ability to switch efficiently between fuel sources (carbohydrates and fat) depending on intensity. High metabolic flexibility supports long-duration performance and better fuel management.

VO₂ (Oxygen Consumption):
The volume of oxygen used by the body per minute during exercise. Collected during lab tests to analyze aerobic demand and substrate use.

VCO₂ (Carbon Dioxide Production):
The volume of CO₂ exhaled per minute. Used alongside VO₂ to estimate which fuel sources (carbs vs fat) are being used, and to determine the respiratory exchange ratio (RER).

Alright, let’s get started.

Do We Need to Test to Train Well?

For the vast majority of athletes and situations, I believe nearly everything we need to know can be derived directly from training data.

If we structure training well, build in a variety of sessions (from short VO2max intervals to longer threshold workouts and marathon pace efforts), and observe how athletes perform, we can do most of what testing should be about:

We can determine an athlete’s limiters and adjust training accordingly. For example, we can identify LT1 and LT2 / critical pace, and assess whether an athlete is lacking top end, fatigue resistance, or a good aerobic base, and then plan accordingly, e.g., focusing on weaknesses far from their A event, and on strengths, together with specificity, near their A event. There are a few exceptions to this, e.g., metabolic flexibility, discussed below.
We can monitor progress. How are the various physiological capacities and limiters of an athlete progressing? Is this VO2max block leading to the expected changes? Did the athlete progress very quickly, and can we therefore switch to the next phase earlier than originally planned?

We typically don’t need a separate test to do any of the above. In fact, I would normally oppose the idea of the single-day test. Such a test introduces variability, potential misinterpretation, and false precision. I believe it’s more robust (and therefore useful!) to evaluate limiters and progress based on trends from multiple workouts, especially when interpreted in the context of an athlete’s life, health, and training history.

If your goal is an ultramarathon sometime in the future, we are certainly working on developing your aerobic metabolism as much as possible over the very long term. A lab test telling us that, e.g., LT1 occurs at a low intensity and requires some work, would not change the constraints we face in getting there. Said another way, nearly every athlete running fewer than, e.g., 70-90 km/week can benefit from running more, but we might be unable to do so because of training age, time available to train, a history of injury, etc. Hence, the test does not lead to a meaningful change in the plan. What we need in this case is patience and training, more than testing.

My main point here is that knowing what you are doing, or working with someone who knows what they are doing (in terms of training prescription, planning, and most importantly, analysis of your training to assess limiters and progress), is way more important than testing.

Physiological Testing: The Good, the Bad, and the Ugly

Now, I’ve done plenty of testing on myself, and also tested many people with indirect calorimetry during my PhD studies, and I do believe testing can, at times, be very valuable. I’ve just turned my “life” around because of a test after all, and it did work wonderfully, but as usual, here I want to try to provide the right level of nuance to help you navigate this topic.

I’ll add that some athletes enjoy testing and find value in having a number to track. That’s totally fine. But if we test, we should test the right things in the right way. We should also keep in mind that this approach can backfire, as no test is able to capture the entirety of how our training and performance are evolving. We might be progressing despite no signs of such progress in a certain test (e.g., your lactate curve can stay the same while you improve your running economy and durability). This lack of progress in a test is something that can get in our heads (or our athletes’) and will need to be managed.

Below, I first cover the tests that tend not to be particularly useful, before moving towards the more useful ones.

Tests That Add Little Value

VO2max tests

Maximal tests, typically done with a ramp test (a test in which intensity is increased every few minutes, optionally while also sampling lactate), give you a number (your VO2max), but that number rarely informs training. VO2max is not trainable for many athletes, and performance doesn't depend on it directly. If you are curious about your VO2max, please do test it; nothing wrong with that. But the utility, beyond satisfying our curiosity, is very limited. Using training data, we can assess your top-end vs your threshold vs your aerobic pace without any maximal test and in a way that relates much better to performance. Additionally, maximal tests often use short stages (1–3 minutes), which are too short to establish a stable metabolic state. This makes the data of no use for assessing your metabolism (e.g., to determine where LT1 is, or to determine substrate oxidation rates). These tests are more about reaching max than about understanding what’s happening before that point.

You can find some additional considerations on using maximal tests to compare lactate curves over time, here.

Guilty of being curious about my VO2max!

LT2/Critical Power/Pace tests

I keep these under the same umbrella, as they provide similar information from a practical standpoint. However, the test required to derive LT2 is a lab test with lactate sampling, while critical pace or power tests are typically done by measuring external load (i.e., pace or power) over a pre-defined protocol (e.g., run for X minutes, rest, run for Y minutes, etc.). I would not prescribe either of the two tests, but for different reasons.

LT2 in theory represents the point where lactate accumulation becomes exponential and unsustainable, but this point is difficult to detect reliably and can only be approximated with different mathematical models that hardly ever agree with each other. While it is easy to spend long periods of time at low intensities, we cannot do the same at these higher intensities in which lactate supposedly behaves differently. As such, we need to extrapolate and make many assumptions, eventually defeating the purpose of measuring our physiology (in my opinion).

Critical power/pace: these might be the type of tests I like the least, as they do not even look at our physiology, and the same information can be obtained by looking at any set of regular hard workouts done in a training block (here is a paper I published a few years ago to do just that). No need to compromise a workout to derive them. Athletes might also deal poorly with the idea of “the test”, or get too attached to “the test numbers”, despite the fact that doing the same test any other day would give quite different numbers (in terms of critical power/pace, but also in terms of the associated heart rate). In my view, it is much better to look at hard workouts over a few weeks to derive more reliable conclusions on the athlete’s abilities in terms of critical power/pace (see below).
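To make the idea of deriving critical pace from regular efforts more concrete, here is a minimal sketch of the classical two-parameter critical speed model (distance = CS × time + D′), fitted with ordinary least squares. This is an illustration only and not the model from the paper mentioned above. The function names and the effort data are hypothetical, and the classical model assumes near-maximal efforts, which regular workouts only approximate.

```python
def fit_critical_speed(durations_s, distances_m):
    """Fit distance = CS * time + D' by ordinary least squares.

    Returns (critical_speed_m_per_s, d_prime_m).
    """
    if len(durations_s) != len(distances_m) or len(set(durations_s)) < 2:
        raise ValueError("need matching lists with at least two distinct durations")
    n = len(durations_s)
    mean_t = sum(durations_s) / n
    mean_d = sum(distances_m) / n
    # Slope of the regression line is the critical speed (m/s).
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(durations_s, distances_m))
    var = sum((t - mean_t) ** 2 for t in durations_s)
    critical_speed = cov / var
    # Intercept is D', the finite distance available above critical speed (m).
    d_prime = mean_d - critical_speed * mean_t
    return critical_speed, d_prime


def speed_to_pace_min_per_km(speed_m_per_s):
    """Convert a speed in m/s to a pace in minutes per kilometer."""
    return 1000.0 / speed_m_per_s / 60.0


# Hypothetical hard efforts collected over a training block (made-up numbers).
durations_s = [180, 360, 720, 1200]
distances_m = [920, 1640, 3080, 5000]

critical_speed, d_prime = fit_critical_speed(durations_s, distances_m)
critical_pace = speed_to_pace_min_per_km(critical_speed)
print(f"Critical speed: {critical_speed:.2f} m/s, pace: {critical_pace:.2f} min/km")
```

With more efforts spread over several weeks, the fit becomes less sensitive to any single good or bad day, which is the main argument against relying on a one-off test.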

Using Training Data to Answer The Same Questions

Personally, I spend some time every day analyzing each athlete’s workouts and archiving them in a file that I can access later in Python to assess limiters and progress, as well as to estimate performance.

High-end

To analyze the high end of things, I use an adaptation of the model from the paper linked above, so that I can determine my athlete’s critical pace (or what I called “estimated 10 km running time” in the paper), and analyze a number of other variables in relation to their training progression and performance, without requiring any specific testing; the athlete is simply doing their training.

For example, if I look at my workouts over the past few months, I can see that there is a large overlap between VO2max and Threshold work in terms of pace:

Analysis of pace vs average interval time at intensity for different workouts.

I can also see that cumulative time at intensity doesn’t necessarily drive the difference in pacing:

Analysis of pace vs cumulative time spent at intensity for different workouts.

What do we derive from this? The data above tells me that my Threshold (critical pace or LT2) is well developed, and hitting its limit (nearing VO2max). This has implications; the positive is that I’m doing well when it comes to sustaining a hard pace for longer, and the negative is that I might need to raise the roof if I want to further improve around threshold pace (e.g., for a half marathon or 10 km race). Again: no tests, just training data.

I can then use the same workout data to estimate critical pace over time for different sessions, which shows my progress over a few months in which I was rebuilding after dietary changes (normally, several workouts are used to get to a robust estimate, but here I’m showing individual estimates from a single workout, for the purpose of tracking quick progress):

Estimated 10km running time (in minutes) based on data from a single VO2max or Threshold session.
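Single-workout estimates are noisy, which is why several workouts are normally combined. One simple way to do that is a rolling median, sketched below. The window size and the estimates are hypothetical, and this is only an assumption about how such smoothing could be done, not a description of the analysis behind the figure.

```python
from statistics import median


def rolling_median(values, window=3):
    """Median of the most recent `window` values at each position.

    Early positions use however many values are available so far.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        smoothed.append(median(values[start:i + 1]))
    return smoothed


# Hypothetical per-workout estimates of 10 km time, in minutes (made-up numbers).
estimates_min = [41.8, 41.5, 43.9, 41.2, 40.9, 40.7, 42.6, 40.4]
smoothed_min = rolling_median(estimates_min, window=3)

for raw, smooth in zip(estimates_min, smoothed_min):
    print(f"raw {raw:5.1f}  smoothed {smooth:5.2f}")
```

A median is used rather than a mean so that one unusually bad session (heat, poor sleep, a hilly route) does not drag the trend.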

The data above with the estimated critical pace can be used, together with heart rate data, to estimate Heart Rate Zones (which are quite well aligned with the ones that I manually set over the years):

See this blog for an overview of training types and zones in my system.

Low-end

Looking at data for lower intensity sessions, I can also estimate my LT1 and see that the gap between LT1 and LT2 is also small. I normally extend this analysis to the entire roster of athletes I coach so that I can look at typical % differences in pace and heart rate between LT1 and LT2. If there is a large gap between the two, or a much slower pace at LT1 relative to LT2 or critical pace, then we know what we need to work on. For example, here is my data:

Many years of consistent high-volume and high-intensity training have led to a smaller gap between LT1 and LT2.

Below, instead, is the data of an athlete with a much younger training age and lower training volume. We have a similar heart rate profile, but our ability to sustain a certain external load at that heart rate is very different:

A large drop in pace between thresholds here highlights a poor aerobic base.

That gap can be closed with a few years of well-planned endurance training. Keep in mind that above, I am deriving LT1 and LT2 paces and heart rate, all based on training data, not based on any actual lactate test (or any field test either).
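The comparison between the two profiles boils down to a percentage gap in speed and in heart rate between LT1 and LT2. A minimal sketch of that arithmetic follows. The function names and all numbers are hypothetical and are not taken from the figures.

```python
def pace_to_speed_kmh(pace_min_per_km):
    """Convert a pace in minutes per kilometer to a speed in km/h."""
    return 60.0 / pace_min_per_km


def threshold_gap(lt1_pace, lt2_pace, lt1_hr, lt2_hr):
    """Percentage gaps between LT1 and LT2, both relative to the LT2 value.

    speed_gap_pct: how much slower LT1 speed is than LT2 speed.
    hr_gap_pct: how much lower LT1 heart rate is than LT2 heart rate.
    """
    lt1_speed = pace_to_speed_kmh(lt1_pace)
    lt2_speed = pace_to_speed_kmh(lt2_pace)
    return {
        "speed_gap_pct": 100.0 * (lt2_speed - lt1_speed) / lt2_speed,
        "hr_gap_pct": 100.0 * (lt2_hr - lt1_hr) / lt2_hr,
    }


# Two made-up athletes with the same heart rates at LT1 and LT2
# but very different paces (min/km) at LT1.
experienced = threshold_gap(lt1_pace=4.5, lt2_pace=4.0, lt1_hr=150, lt2_hr=168)
developing = threshold_gap(lt1_pace=6.0, lt2_pace=4.8, lt1_hr=150, lt2_hr=168)

print(experienced)
print(developing)
```

In this made-up example, the heart rate gap is identical for both athletes, while the speed gap is almost twice as large for the second one, which is the pattern described above as a less developed aerobic base.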

Plenty more can be analyzed then, in terms of changes in aerobic efficiency, estimated times for longer-duration events, etc. - all without requiring any laboratory tests.

These are the things I spend most of my time thinking about in the context of helping the athletes I coach, but I’ve digressed enough already!

The Top End vs The Low End

So far, I’ve been quite critical of testing. The common theme, though, is that I was focusing on the top end of things. The top/high end of things (moderate to hard intensities) is much more precisely estimated or assessed in training, with short hard sessions, threshold sessions, and marathon pace sessions, than it is in the lab. With these training sessions, we can understand an athlete’s ability to sustain a certain intensity and tolerate fatigue at that intensity, in ways that the lab cannot really capture (the only way to see what you can do in a marathon is to run a long marathon pace session; no lab test can get that intensity right).

On the other hand, what happens at lower intensities is more interesting (and stable!) but hard to capture from low-intensity external load data, as external load can decouple from internal load (e.g., we think we are running at low intensity, but we are not, physiologically speaking). Hence the utility of sub-maximal testing for LT1, or the “aerobic threshold”, as well as for metabolic flexibility.

The more important question for me is always the same: would we change training (or nutrition!) based on the results of this test? On the low-intensity side of the spectrum, the answer is sometimes yes. For the top-end, it is highly unlikely that we would learn something that is not already obvious in a careful analysis of an athlete’s workouts.

Tests That Can Add Value

Let’s look at the types of tests that tend to provide more actionable information.

True LT1

I call this a True LT1, meaning that we are actually looking at lactate now and not just deriving that point from training or heart rate data. This is the point where lactate starts to rise from baseline, which I have discussed in more detail here. Identifying LT1 helps define "true easy" running, separating zone 1 from moderate effort, which has implications for recovery and, therefore, training volume.
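As an illustration of what "starts to rise from baseline" can look like in practice, here is a minimal sketch that flags LT1 on a submaximal step test as the last stage before lactate exceeds baseline by a fixed amount. The 0.5 mmol/L rise is an assumed convention for this example, not necessarily the criterion used here, and the function name and test data are hypothetical.

```python
def find_lt1_stage(stages, rise_mmol=0.5):
    """Return the last stage before lactate exceeds baseline + rise_mmol.

    stages: list of (speed_kmh, lactate_mmol) in order of increasing speed.
    Baseline is the lowest lactate value measured, since lactate often
    dips slightly during the first easy stages.
    Returns None if lactate never exceeds the cutoff.
    """
    if not stages:
        raise ValueError("need at least one stage")
    lactates = [lactate for _, lactate in stages]
    baseline = min(lactates)
    baseline_index = lactates.index(baseline)
    cutoff = baseline + rise_mmol
    # Only look at stages after the baseline stage.
    for i in range(baseline_index + 1, len(stages)):
        if lactates[i] > cutoff:
            return stages[i - 1]
    return None


# Made-up step test: (speed in km/h, lactate in mmol/L).
step_test = [(9.0, 1.1), (10.0, 1.0), (11.0, 1.1), (12.0, 1.3), (13.0, 1.8), (14.0, 2.9)]
lt1_stage = find_lt1_stage(step_test)
print(lt1_stage)
```

The resolution of the result is limited by the step size of the test, so smaller speed increments around the expected LT1 give a more useful answer than a wide range with coarse steps.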

Determining LT1 can be useful, especially for beginners or those training near their LT1 often, which is typical when training frequency is rather low. If the goal is to maximize our potential as endurance athletes, over the years, we’ll have to work on increasing training volume. In this process, slowing down and making sure our easy running is truly easy can be an important step.

For others, already training high volume, we usually know when we are creeping above LT1 based on feel, performance, and trends (e.g., when hard sessions start to suffer). Additionally, when training at high volume, it is natural to reduce the intensity of much of our training to well below LT1, as it otherwise becomes mechanically and metabolically unsustainable (i.e., for these athletes, LT1 already corresponds to a high power output, which compromises recovery and other training sessions when done too frequently).

Other cases in which we might want to get a more precise assessment of LT1 are in the context of long-distance events, e.g., an ultramarathon, so that we can try to assess which intensities are metabolically unsustainable for too long, and which ones are not, if we plan to race (less important if we plan to finish). This is, for example, why I tested my lactate before my last 50 km.

Submaximal lactate testing shows improvements in LT1 over a few years.

Metabolic Flexibility (Fat vs Carbohydrate Use) and Running Economy

Metabolic flexibility cannot be assessed without a lab test. We need to measure VO2 and VCO2 with a good indirect calorimeter to assess substrate utilization and be able to answer these questions:

Can you preserve (limited) glycogen stores?
How much fat can you burn at marathon pace or at the intensity of your target event?
Are you metabolically limited?
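To show how VO2 and VCO2 translate into answers to these questions, here is a minimal sketch using Frayn's stoichiometric equations with protein oxidation ignored. This is one commonly used set of equations, chosen as an assumption for this example, and the gas exchange values are made up. Results are only meaningful under steady-state conditions.

```python
def substrate_oxidation(vo2_l_min, vco2_l_min):
    """Estimate RER and substrate oxidation rates from steady-state gas exchange.

    Uses Frayn's equations, ignoring protein oxidation:
      fat (g/min)          = 1.67 * VO2 - 1.67 * VCO2
      carbohydrate (g/min) = 4.55 * VCO2 - 3.21 * VO2
    with VO2 and VCO2 in L/min.
    """
    if vo2_l_min <= 0:
        raise ValueError("VO2 must be positive")
    rer = vco2_l_min / vo2_l_min
    # Clamp at zero: outside RER 0.7-1.0 the equations return negative rates.
    fat_g_min = max(0.0, 1.67 * vo2_l_min - 1.67 * vco2_l_min)
    cho_g_min = max(0.0, 4.55 * vco2_l_min - 3.21 * vo2_l_min)
    return {
        "rer": rer,
        "fat_g_per_hour": fat_g_min * 60.0,
        "cho_g_per_hour": cho_g_min * 60.0,
    }


# Made-up steady-state values at a hypothetical marathon pace.
result = substrate_oxidation(vo2_l_min=3.0, vco2_l_min=2.55)
print(result)
```

Comparing the carbohydrate rate per hour against what an athlete can realistically ingest during a race gives a first indication of whether glycogen stores are likely to become a limiter.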

This is also a helpful test to track progress when using dietary interventions (e.g., periodized carbs or low-carb strategies), from both angles, i.e., flexibility is not only about burning plenty of fat at easy or moderate intensities, but also about being able to burn carbohydrates in large amounts at very high intensities.

Similarly, running economy requires measuring VO2 (and I’d argue also VCO2, in the context of ultramarathons), and therefore the same test can be used. Once we know substrate utilization and running economy, we can better understand an athlete’s fueling requirement for an (ultra)marathon, which can be dramatically different based on aspects other than training and race intensity (i.e., depending on an athlete’s nutrition and diet outside of training), as I showed in this example.
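Running economy is often expressed as the oxygen cost of covering one kilometer. A minimal sketch of that calculation follows. The function name and the values are hypothetical, and VO2 is assumed to be a steady-state measurement relative to body mass.

```python
def oxygen_cost_ml_per_kg_per_km(vo2_ml_kg_min, speed_kmh):
    """Oxygen cost of running, in ml of O2 per kg of body mass per km.

    vo2_ml_kg_min: steady-state oxygen consumption (ml/kg/min).
    speed_kmh: running speed (km/h).
    """
    if speed_kmh <= 0:
        raise ValueError("speed must be positive")
    minutes_per_km = 60.0 / speed_kmh
    return vo2_ml_kg_min * minutes_per_km


# Two made-up runners at the same speed: lower cost means better economy.
runner_a = oxygen_cost_ml_per_kg_per_km(vo2_ml_kg_min=42.0, speed_kmh=12.0)
runner_b = oxygen_cost_ml_per_kg_per_km(vo2_ml_kg_min=45.0, speed_kmh=12.0)
print(runner_a, runner_b)
```

Oxygen cost alone ignores which fuel is being burned, which is why VCO2 also matters when the goal is to estimate energy and fueling needs for long events.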

For these tests, ideally, we want to use long steps to reach metabolically stable conditions, e.g., 6-10 minutes for each stage, and therefore we tend to limit the test to sub-maximal intensities, up to just above LT1. For a deep dive into the topic and more details about testing protocols for metabolic flexibility and running economy, check out my blog here.
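One practical consequence of long stages is that only the final part of each stage is used, once values have settled. A minimal sketch of that averaging step follows, assuming 8-minute stages and a 2-minute averaging window at the end of each stage. Both values, the function name, and the data are assumptions for illustration.

```python
def stage_steady_state_means(samples, stage_s=480, tail_s=120):
    """Average a signal over the last `tail_s` seconds of each stage.

    samples: iterable of (time_s, value), with time measured from test start
    and stages of equal length `stage_s` starting at time 0.
    Returns one mean per stage, in stage order.
    """
    if not 0 < tail_s <= stage_s:
        raise ValueError("tail_s must be positive and no longer than stage_s")
    tails = {}
    for time_s, value in samples:
        stage_index = int(time_s // stage_s)
        time_in_stage = time_s % stage_s
        # Keep only samples from the final part of the stage.
        if time_in_stage >= stage_s - tail_s:
            tails.setdefault(stage_index, []).append(value)
    return [sum(values) / len(values) for _, values in sorted(tails.items())]


# Made-up VO2 samples (ml/kg/min), one per minute, over two 8-minute stages.
vo2_values = [20, 25, 28, 30, 30, 30, 30, 30, 32, 35, 37, 38, 38, 38, 38, 38]
samples = [(i * 60, v) for i, v in enumerate(vo2_values)]
stage_means = stage_steady_state_means(samples)
print(stage_means)
```

The first minutes of each stage, where VO2 is still climbing, are discarded, which is exactly what short 1–3 minute stages do not allow.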

Sometimes what we need is to eat differently, more than to train differently.

Summary and Practical Takeaways

Training is dynamic. Thoughtful planning and consistent feedback, as well as a careful analysis of your training, can be more valuable than sporadic testing. Using training data, we can assess limiters and training progress without specific lab or field tests, as I’ve shown with some examples above. Then, we can take into account this information to make adjustments as we keep training over the years. For this purpose, a good coach is probably a better use of your money than most lab testing.

Yet, the occasional test can nudge us in the right direction (if we plan it and interpret the data correctly!), showing that there is some extra room for pushing our limits at a certain time, or that we should be more conservative on another occasion. If testing, I would prioritize long-stage submaximal tests that give insight into LT1, running economy, and metabolic flexibility, more than the top-end of things. Don’t test just to get a number; test to guide a decision that would otherwise be uncertain.

I hope there was something useful in there for you. Happy training!

How to Show Your Support

No paywalls here. All my content is and will remain free.

As an HRV4Training user, the best way to help is to sign up for HRV4Training Pro.

Thank you for supporting my work.

Coaching

If you are interested in working with me, please learn more here, and join the waiting list by filling in the athlete intake form, here.

Marco holds a PhD cum laude in applied machine learning, an M.Sc. cum laude in computer science engineering, and an M.Sc. cum laude in human movement sciences and high-performance coaching. He is a certified ultrarunning coach.

Marco has published more than 50 papers and patents at the intersection between physiology, health, technology, and human performance.

He is co-founder of HRV4Training, advisor at Oura, guest lecturer at VU Amsterdam, and editor for IEEE Pervasive Computing Magazine. He loves running.
