AI Futures Model: Timelines & Takeoff

Explanation

This website presents the AI Futures Model (Dec 2025 version), following up on the timelines and takeoff models we published alongside AI 2027. Predicting superhuman AI capabilities inherently requires much intuition and guesswork, but we've nonetheless found quantitative modeling to be useful.

To the right you can see our model's output with each parameter set to Daniel's median estimate. (Eli's used to be the default, which had AC in 2031 and ASI in 2034.)

Important: Read about how our forecasts have changed since AI 2027 here. The default milestone dates differ from Daniel's median forecasts; those are on the Forecasts page.

Date of Automated Coder (AC): 03/2030

Date of Superintelligence (ASI): 07/2031
[Interactive charts: Takeoff Period: AI Software R&D Uplift; Coding Time Horizon (AC time horizon 3.3 work years); Effective & Training Compute]

[Interactive sliders for model quantities: Software Efficiency; AI Research Taste; AI Serial Coding Labor Multiplier; Cumulative Research Effort; Software Efficiency Growth Rate; Coding Automation Fraction; Software Research Effort; AI Parallel Coding Labor Multiplier; Experiment Throughput; Inference Compute for Coding Automation; Experiment Compute; Human Coding Labor]

[Key Parameters: 5.5 months; 3.3 work years; 0.92x/doubling; 3.97x; 3.0 SDs/2025-effective-FLOP-growth]

Model Architecture Details

Now we describe how our model works in more detail: written descriptions of each piece, the equations forming the model, the reasoning behind our modeling choices, and how the most important parameters are set. We'll start with how the coding time horizon progresses, then discuss how we model improvements in AI efficiency and the effects of automation, then describe how we model intelligence explosions. You can use the model diagram or table of contents on the right to navigate. If you'd instead like to see the forecasts produced by our model, go to the forecast page.

The final section Recap of the Model is a succinct summary of the whole model, which some readers may prefer and which can serve as a useful reference. It contains little motivation or explanation, and mainly consists of the formulas.

[Model diagram with nodes: Human Research Taste, Aggregate Research Taste, Software Efficiency, Automated Research Taste, Coding Automation Fraction and Efficiency, Automated Coder Time Horizon, Software Research Effort, Automation Compute, Experiment Compute, Experiment Throughput, Aggregate Coding Labor, Human Coding Labor, Effective Compute, Training Compute; plus anchors and automation feedback loops]
Introduction
  Stage 1: Automating Coding; Stage 2: Research Taste; Stage 3: Intelligence Explosion
1. Time Horizon & AC Milestone
  1.1 Using Effective Compute; 1.2 Horizon Progression; 1.3 AC Effective Compute
2. Modeling Effective Compute
  2.1 Compute Forecasts; 2.2 Experiment Throughput; 2.3 Research Effort; 2.4 Software Efficiency
3. Coding Automation
  3.1 Automation Fraction; 3.2 Automation Efficiency; 3.3 Aggregate Coding Labor
4. Research Taste Automation
  4.1 Human Taste; 4.2 Automated Taste; 4.3 Aggregate Taste
5. After Full AI R&D Automation
6. Recap of the Model


The AI Futures Model

We've significantly upgraded our timelines and takeoff model! We’ve found it useful for clarifying and informing our thinking, and we hope others do too. We plan to continue thinking about these topics, and update our models and forecasts accordingly.

Why do timelines and takeoff modeling?

The future is very hard to predict. We don't think this model, or any other model, should be trusted completely. The model takes into account what we think are the most important dynamics and factors, but it doesn't take into account everything. Also, only some of the parameter values in the model are grounded in empirical data; the rest are intuitive guesses. If you disagree with our guesses, you can change them above.

Nevertheless, we think that modeling work is important. Our overall view is the result of weighing many considerations, factors, arguments, etc.; a model is a way to do this transparently and explicitly, as opposed to implicitly and all in our head. By reading about our model, you can come to understand why we have the views we do, what arguments and trends seem most important to us, etc.

The future is uncertain, but we shouldn’t just wait for it to arrive. If we try to predict what will happen, if we pay attention to the trends and extrapolate them, if we build models of the underlying dynamics, then we'll have a better sense of what is likely, and we'll be less unprepared for what happens. We’ll also be able to better incorporate future empirical data into our forecasts.

In fact, the improvements we’ve made to this model compared to our timelines model at the time of AI 2027 (Apr 2025) have resulted in a roughly 2-4 year shift in our median for full coding automation. The largest share of this shift comes from improved modeling of AI R&D automation, but other factors, such as revised parameter estimates, have also contributed. These modeling improvements have changed our views more than the new empirical evidence we’ve observed has. You can read more about the shift in our blog post.

Why our approach to modeling? Comparing to other approaches

In our blog post, we give a brief survey of other methods for forecasting AI timelines and takeoff speeds, including other formal models such as FTM and GATE, as well as simpler methods like revenue extrapolation and intuition; you can read that here.

What predictions does the model make?

You can find our forecasts here. These are based on simulating the model lots of times, with each simulation resampling from our parameter distributions. We also display our overall views after adjusting for out-of-model considerations.
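As an illustration of this procedure, here is a minimal Monte Carlo sketch. The parameter names, distributions, and the toy trajectory function are hypothetical stand-ins, not the model's actual inputs or equations; the real model runs its full system of equations forward in each simulation.

```python
import numpy as np

rng = np.random.default_rng(0)
N_SIMS = 10_000

def sample_params(rng):
    """Hypothetical stand-ins for the model's parameter distributions."""
    return {
        # OOMs of effective compute needed per time-horizon doubling (illustrative).
        "ooms_per_doubling": rng.lognormal(np.log(0.4), 0.3),
        # Effective compute growth in OOMs per year (illustrative).
        "ooms_per_year": rng.normal(2.0, 0.5),
        # Coding time horizon (work hours) required for AC (illustrative).
        "ac_horizon_hours": rng.lognormal(np.log(6000), 1.0),
    }

def ac_year(p, start_year=2025.0, start_horizon_hours=2.0):
    """Toy trajectory: the horizon doubles once per `ooms_per_doubling` OOMs of
    effective compute, which grows at `ooms_per_year`. Solve for the crossing year."""
    doublings_needed = np.log2(p["ac_horizon_hours"] / start_horizon_hours)
    ooms_needed = doublings_needed * p["ooms_per_doubling"]
    return start_year + ooms_needed / max(p["ooms_per_year"], 1e-6)

years = np.array([ac_year(sample_params(rng)) for _ in range(N_SIMS)])
print("median AC year:", round(float(np.median(years)), 1))
print("P(AC by end of 2027):", float(np.mean(years <= 2028.0)))
```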

With Eli's parameter estimates, the model gives a median of late 2030 for full automation of coding (which we refer to as the Automated Coder milestone or AC), with 19% probability by the end of 2027. With Daniel’s parameter estimates, the median for AC is late 2029.

In our results analysis, we investigate which parameters have the greatest impact on the model’s behavior. We also discuss the correlation between shorter timelines and faster takeoff speeds.

Our blog post discusses how these predictions compare to the predictions of our previous model from AI 2027.

Limitations and all-things-considered views

In terms of how much they affect Eli’s all-things-considered views (i.e. Eli’s overall forecast of what will happen, taking into account factors outside the model), the top known limitations are:

Not explicitly modeling data as an input to AI progress. Our model currently assumes, implicitly, that data progress is proportional to algorithmic progress, but in practice data could be either more or less of a bottleneck. Along with accounting for unknown unknowns, this limitation pushes Eli’s median timelines back by 2 years.

Not modeling automation of hardware R&D, hardware production, and general economic automation. While these have longer lead times than software R&D, a year might be enough for them to make a substantial difference. This limitation makes Eli place more probability on fast takeoffs from AC to ASI than the model does, especially increasing the probability of <3 year takeoffs (from ~43% to ~60%).

Meanwhile, Daniel increases his uncertainty somewhat in both directions in an attempt to account for model limitations and unknown unknowns and makes his takeoff somewhat faster for reasons similar to Eli’s.

A more thorough discussion of this model’s limitations can be found here, and discussion of how they affect our all-things-considered views is here.

High-level description of the model behavior

We start with a high-level description of how our model can intuitively be divided into 3 stages. The next section contains the actual explanation of the model, and accompanies the arrowed diagram on the right.

Our model's primary output is the trajectory of AIs' abilities to automate and accelerate AI software R&D. But although the same formulas are used in Stages 1, 2, and 3, new dynamics emerge at certain milestones (Automated Coder, Superhuman AI Researcher), and so these milestones delineate natural stages.

Stage 1: Automating coding

Stage 1 predicts when coding in the AGI project will be fully automated. This stage centrally involves extrapolating the METR-HRS coding time horizon study.

Milestone endpoint: Automated Coder (AC). An AC can fully automate an AGI project's coding work, replacing the project’s entire software engineering staff.

The main drivers in Stage 1 are:

  1. Coding time horizon progression. We model how the coding time horizon trend progresses via parameters for (a) the effective compute increase currently required to double time horizon, (b) how this doubling requirement changes over time (i.e. whether time horizon growth is superexponential in log(effective compute)), and (c) the time horizon required for AC. In some simulations, we include a “gap” on top of reaching the time horizon requirement, in case doing well on the METR dataset doesn't immediately translate to doing well on real-world coding tasks. (A minimal numeric sketch of this progression follows this list.)
  2. Partial automation of coding speeding up progress.
  3. Training compute growth slowing. We project that training compute growth will slow over time, due to limits on investment and the speed of building new fabs. This has a big impact in ~2035+ timelines.
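Here is the sketch promised above: a minimal, illustrative implementation of driver 1, assuming the effective compute needed for each successive horizon doubling shrinks by a constant factor (making the horizon superexponential in log effective compute). All constants are placeholders, not the model's calibrated values.

```python
def horizon_after_ooms(
    total_ooms: float,
    start_horizon_hours: float = 2.0,
    ooms_per_doubling: float = 0.4,    # (a) effective compute now needed per doubling
    shrink_per_doubling: float = 0.92, # (b) <1 means each doubling gets cheaper
) -> float:
    """Coding time horizon after spending `total_ooms` OOMs of effective compute.
    With shrink < 1, the OOMs needed for infinitely many doublings converge to
    ooms_per_doubling / (1 - shrink_per_doubling): a finite-compute singularity."""
    horizon, cost, spent = start_horizon_hours, ooms_per_doubling, 0.0
    for _ in range(500):  # cap the loop; the cost series converges when shrink < 1
        if spent + cost > total_ooms:
            break
        spent += cost
        horizon *= 2.0
        cost *= shrink_per_doubling
    return horizon

for ooms in (1, 2, 4, 8):
    print(f"{ooms} OOMs of effective compute -> horizon ~ {horizon_after_ooms(ooms):.3g} hours")
```

Setting shrink_per_doubling to 1 recovers plain exponential growth of the horizon in log effective compute; the AC milestone in (c) is then reached once the horizon crosses the required threshold.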

Stage 2: Automating research taste

Besides coding, we track one other type of skill that is needed to automate AI software R&D: research taste. While automating coding makes an AI project faster at implementing experiments, automating research taste makes the project better at setting research directions, selecting experiments, and learning from experiments.

Stage 2 predicts how quickly we will go from an Automated Coder (AC) to a Superhuman AI Researcher (SAR), an AI with research taste matching the top human researcher.

Milestone endpoint: Superhuman AI Researcher (SAR): A SAR can fully automate AI R&D.

The main drivers of how quickly Stage 2 goes are:

  1. How much automating coding speeds up AI R&D. This depends on a few factors, for example how severely the project gets bottlenecked on experiment compute.
  2. How good AIs' research taste is at the time AC is created. If AIs are better at research taste relative to coding, Stage 2 goes more quickly.
  3. How quickly AIs' research taste improves. For each 10x of effective compute, how much more value does one get per experiment? (See the sketch after this list.)
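The sketch referenced in item 3, combined with driver 1. It assumes, purely for illustration, that research taste is scored in standard deviations (SDs) of the human researcher distribution and grows linearly in log10 of effective compute; the starting taste, growth rate, SAR threshold, and uplift figure are guesses, not the model's calibrated values.

```python
def ooms_from_ac_to_sar(
    taste_at_ac_sd: float = 0.0,     # AI taste at the AC milestone, in SDs (driver 2)
    sar_taste_sd: float = 3.0,       # top-human-researcher threshold for SAR
    taste_sd_per_oom: float = 0.75,  # taste gained per 10x of effective compute (driver 3)
) -> float:
    """OOMs of effective compute needed to go from AC-level to SAR-level taste."""
    return (sar_taste_sd - taste_at_ac_sd) / taste_sd_per_oom

def years_from_ac_to_sar(ooms_needed: float,
                         base_ooms_per_year: float = 2.0,
                         rnd_speedup: float = 4.0) -> float:
    """Crude treatment of driver 1: the post-AC R&D speedup is modeled as
    uniformly accelerating effective compute growth (ignoring compute bottlenecks)."""
    return ooms_needed / (base_ooms_per_year * rnd_speedup)

ooms = ooms_from_ac_to_sar()
print(f"OOMs needed: {ooms:.1f}; years at 4x uplift: {years_from_ac_to_sar(ooms):.2f}")
```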

Stage 3: The intelligence explosion

Finally, we model how quickly AIs can self-improve once AI R&D is fully automated and humans are obsolete. Stage 3 ends when progress asymptotes at the limits of intelligence.

The primary milestones we track in Stage 3 are:

  1. Superintelligent AI Researcher (SIAR). The gap between a SIAR and the top AGI project human researcher is 2x greater than the gap between the top AGI project human researcher and the median researcher. (This “gap” notion is written out after this list.)
  2. Top-human-Expert-Dominating AI (TED-AI). A TED-AI is at least as good as top human experts at virtually all cognitive tasks. (Note that the translation in our model from AI R&D capabilities to general capabilities is very rough.)
  3. Artificial Superintelligence (ASI). The gap between an ASI and the best humans is 2x greater than the gap between the best humans and the median professional, at virtually all cognitive tasks.
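To write out the "gap" notion used in items 1 and 3 (our notation, reading "2x greater than" as "twice as large as"): with $c$ denoting capability on whatever scale the comparison is made,

$$c_{\mathrm{SIAR}} = c_{\mathrm{top}} + 2\left(c_{\mathrm{top}} - c_{\mathrm{median\ researcher}}\right), \qquad c_{\mathrm{ASI}} = c_{\mathrm{top\ human}} + 2\left(c_{\mathrm{top\ human}} - c_{\mathrm{median\ professional}}\right).$$

For example, if the top researcher sits 3 SDs above the median researcher on some research taste scale, a SIAR would sit a further 6 SDs above the top researcher.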

In our simulations, we see a wide variety of outcomes ranging from a weeks-long takeoff from SAR to ASI, to a fizzling out of the intelligence explosion requiring further increases in compute to get to ASI.

To achieve a fast takeoff, there usually needs to be a feedback loop such that each successive doubling of AI capabilities takes less time than the last. In the fastest takeoffs, this is usually possible via a taste-only singularity, i.e. the doublings would get faster solely from improvements in research taste (and not increases in compute or improvements in coding). Whether a taste-only singularity occurs depends on which of the following dominates:

  1. The rate at which (experiment) ideas become harder to find. Specifically, how much new “research effort” is needed to achieve a given increase in AI capabilities.
  2. How quickly AIs' research taste improves. For a given amount of inputs to AI progress, how much more value does one get per experiment?

Continued improvements in coding automation matter less and less, as the project gets bottlenecked by its limited supply of experiment compute.
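To make this condition concrete, here is a minimal sketch of the race between the two forces above, under simple constant-ratio assumptions (all numbers illustrative): each successive capability doubling requires q times more research effort than the last (ideas getting harder to find), while each doubling multiplies the effort produced per unit time by p (better research taste). Doublings then accelerate, a taste-only singularity, exactly when p > q.

```python
def doubling_times(p: float, q: float, n: int = 8) -> list[float]:
    """Time for each successive capability doubling when the effort *required*
    per doubling grows by q while the effort *produced* per unit time grows by p.
    Each doubling's duration is the previous one times q / p, so p > q means
    accelerating doublings (singularity) and p < q means a fizzle."""
    times, rate, cost = [], 1.0, 1.0
    for _ in range(n):
        times.append(cost / rate)  # effort needed / effort per unit time
        rate *= p                  # taste improvement compounds
        cost *= q                  # ideas get harder to find
    return times

print([round(t, 3) for t in doubling_times(p=1.5, q=1.3)])  # accelerating takeoff
print([round(t, 3) for t in doubling_times(p=1.2, q=1.5)])  # fizzling out
```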
