
== 4.2 Methodology ==

<div id="4.2.1" class="h2-container"></div>

<span id="models-model-intercomparison-projects-and-ensemble-methodologies"></span>
=== 4.2.1 Models, Model Intercomparison Projects, and Ensemble Methodologies ===

<div id="h2-7-siblings" class="h2-siblings"></div>

Similar to the approach used in AR5 ([[#Flato--2013|Flato et al., 2013]]), the primary lines of evidence of this chapter are comprehensive climate models (atmosphere–ocean general circulation models, AOGCMs) and Earth system models (ESMs); ESMs differ from AOGCMs by including representations of various biogeochemical cycles. We also build on results from ESMs of intermediate complexity (EMICs; [[#Claussen--2002|Claussen et al., 2002]] ; [[#Eby--2013|Eby et al., 2013]]) and other types of models where appropriate. This chapter focuses on a particular set of coordinated multi-model experiments known as model intercomparison projects (MIPs). These frameworks recommend and document standards for experimental design for running AOGCMs and ESMs to minimize the chance of differences in results being misinterpreted. The Coupled Model Intercomparison Project (CMIP) is an activity of the World Climate Research Programme (WCRP), and its latest phase is CMIP6 ([[#Eyring--2016|Eyring et al., 2016]]). To establish robustness of results, it is vital to assess the performance of these models in terms of mean state, variability, and the response to external forcings. That evaluation has been undertaken using the CMIP6 ‘Diagnostic, Evaluation and Characterization of Klima’ (DECK) and historical simulations in [[IPCC:Wg1:Chapter:Chapter-3|Chapter 3]] of this Report, which concludes that there is ''high confidence'' that the CMIP6 multi-model mean captures most aspects of observed climate change well ([[IPCC:Wg1:Chapter:Chapter-3#3.8.3.1|Section 3.8.3.1]]).

This chapter draws mainly on future projections referenced both against the period 1850–1900 and the recent past, 1995–2014, performed primarily under ScenarioMIP ([[#O’Neill--2016|O’Neill et al., 2016]]). This allows us to assess both dimensions of integration across scenarios ([[#4.3|Section 4.3]]) and global warming levels ([[#4.6|Section 4.6]]) as discussed in [[IPCC:Wg1:Chapter:Chapter-1|Chapter 1]] ([[IPCC:Wg1:Chapter:Chapter-1#1.6|Section 1.6]]). Other MIPs also target future scenarios with a focus on specific processes or feedbacks and are summarized in Table 4.1.

<div id="_idContainer013" class="mt-3"></div>

'''Table 4.1 |''' '''Model Intercomparison Projects (MIPs) utilized in Chapter 4.'''

{| class="wikitable"
|-
! MIP/Experiment

! Usage

! Chapter/Section

! Reference

|-
| DECK, 1%, 4×CO<sub>2</sub>

| Diagnosing climate sensitivity

| Assessed in Chapter 7

ECS and TCR used in GSAT assessment

| [[#Eyring--2016|Eyring et al. (2016)]]

|-
| CMIP6 Historical

| Evaluation, baseline

| Assessed in Chapter 3,

used in Chapter 4 to cover reference period

| [[#Eyring--2016|Eyring et al. (2016)]]

|-
| ScenarioMIP

| Future projections

| Used throughout Chapter 4

| [[#O’Neill--2016|O’Neill et al. (2016)]]

|-
| AerChemMIP

| Aerosols and trace gases

| 4.4.4

| [[#Collins--2017|Collins et al. (2017)]]

|-
| C4MIP

| CO<sub>2</sub> emissions-driven simulations

| 4.3.1

| C.D. [[#Jones--2016a|Jones et al. (2016a)]]

|-
| CDRMIP

| Carbon dioxide removal

| 4.6.3

| [[#Keller--2018|Keller et al. (2018)]]

|-
| DCPP

| Near-term climate change

| 4.2.3, Box 4.1, 4.4

| [[#Boer--2016|Boer et al. (2016)]]

|-
| GeoMIP

| Solar radiation modification

| 4.6.3

| [[#Kravitz--2011|Kravitz et al. (2011)]]

|-
| PDRMIP

| Forcing dependence of precipitation

| 4.5.1

| [[#Myhre--2017|Myhre et al. (2017)]]

|-
| SIMIP

| Sea ice assessment

| 4.3

| [[#Notz--2016|Notz et al. (2016)]]

|-
| ZECMIP

| Zero emissions commitment

| 4.7.1

| [[#Jones--2019|Jones et al. (2019)]]

|-
| CMIP5

| RCP scenario assessment

| 4.6.2, 4.7.1

| [[#Taylor--2012|Taylor et al. (2012)]]

|}

Multi-model ensembles provide the central focus of projection assessment. While single-model experiments have great value for exploring new results and theories, multi-model ensembles additionally underpin the assessment of the robustness, reproducibility, and uncertainty attributable to model internal structure and processes variability ([[#4.2.5|Section 4.2.5]] ; [[#Hawkins--2009|Hawkins and Sutton, 2009]]). Techniques underlying the combination of evaluation and weighting that are applied in this chapter are synthesized in Box 4.1.

Climate model simulations can be performed in either ‘concentration-driven’ or ‘emissions-driven’ configurations, reflecting whether the CO<sub>2</sub> concentration is prescribed to follow a pre-defined pathway or is simulated by the Earth system models in response to prescribed emissions of CO<sub>2</sub> (Box 6.4, [[#Ciais--2013|Ciais et al., 2013]]). The majority of CMIP6 experiments are conducted in concentration-driven configurations in order to enable models without a fully interactive carbon cycle to perform them, and throughout most of this chapter we present results from those simulations unless otherwise stated. Concentrations of other greenhouse gases are always prescribed. However, the SSP5-8.5 scenario has also been performed in emissions-driven configuration (‘esm-ssp585’) by 10 ESMs, and in [[#4.3.1.1|Section 4.3.1.1]] we assess the impact on simulated climate over the 21st century.

Internal variability complicates the identification of forced climate signals, especially when considering regional climate signals over short time scales (up to a few decades), such as local trends over the satellite era ([[#Hawkins--2009|Hawkins and Sutton, 2009]] ; [[#Deser--2012a|Deser et al., 2012a]] ; [[#Xie--2015|Xie et al., 2015]] ; [[#Lovenduski--2016|Lovenduski et al., 2016]] ; [[#Suárez-Gutiérrez--2017|Suárez-Gutiérrez et al., 2017]]). Large initial-condition ensembles, where the same model is run repeatedly under identical forcing but with initial conditions varied through small perturbations or by sampling different times of a pre-industrial control run, have substantially grown in their use since AR5 ([[#Deser--2012a|Deser et al., 2012a]] ; [[#Kay--2015|Kay et al., 2015]] ; [[#Rodgers--2015|Rodgers et al., 2015]] ; [[#Hedemann--2017|Hedemann et al., 2017]] ; [[#Stolpe--2018|Stolpe et al., 2018]] ; [[#Maher--2019|Maher et al., 2019]]). Such large ensembles have shown potential to quantify uncertainty due to internal variability ([[#Hawkins--2016|Hawkins et al., 2016]] ; [[#McCusker--2016|McCusker et al., 2016]] ; [[#Sigmond--2016|Sigmond and Fyfe, 2016]] ; [[#Lehner--2017|Lehner et al., 2017]] ; [[#McKinnon--2017|McKinnon et al., 2017]] ; [[#Marotzke--2019|Marotzke, 2019]]) and thereby extract the forced signal from the internal variability, which can be calibrated against observational data to improve the reliability of probabilistic climate projections over the near and mid-term ([[#O’Reilly--2020|O’Reilly et al., 2020]]). Moreover, they allow the investigation of forced changes in internal variability (e.g., [[#Maher--2018|Maher et al., 2018]]). A key assumption is that a given model skilfully represents internal variability; structural uncertainty is not accounted for.
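
The separation of forced signal and internal variability described above can be illustrated with a minimal sketch. It assumes synthetic data and the simplest possible estimators: the forced response is taken as the ensemble mean at each time step, and internal variability as the spread of members about that mean. The function name <code>decompose_ensemble</code> and all numerical values are hypothetical and are not taken from any of the studies cited.

```python
import random
import statistics


def decompose_ensemble(members):
    """Split an initial-condition ensemble into forced signal and internal spread.

    members: list of equal-length time series (one list of floats per member).
    Returns (forced, spread): the ensemble mean and the across-member
    sample standard deviation at each time step.
    """
    n_time = len(members[0])
    forced = [statistics.fmean([m[t] for m in members]) for t in range(n_time)]
    spread = [statistics.stdev([m[t] for m in members]) for t in range(n_time)]
    return forced, spread


# Synthetic example (assumed values): a linear forced trend of 0.02 per year
# plus independent Gaussian noise standing in for internal variability.
rng = random.Random(0)
trend = [0.02 * year for year in range(50)]
ensemble = [[x + rng.gauss(0.0, 0.15) for x in trend] for _ in range(100)]

forced, spread = decompose_ensemble(ensemble)
```

With 100 members the ensemble mean tracks the prescribed trend closely, whereas any single member can deviate from it by several tenths of a unit. As the paragraph above notes, such an estimate is only as good as the model's representation of internal variability.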

A complementary approach that represents structural uncertainty in a given model is stochastic physics ([[#Berner--2017|Berner et al., 2017]]). The approach has proven useful in representing structural uncertainty on seasonal climate time scales ([[#Weisheimer--2014|Weisheimer et al., 2014]] ; [[#Batté--2015|Batté and Doblas-Reyes, 2015]] ; [[#MacLachlan--2015|MacLachlan et al., 2015]]). Stochastic physics can markedly improve the internal variability in a given model ([[#Dawson--2015|Dawson and Palmer, 2015]] ; [[#Wang--2016|Wang et al., 2016]] ; [[#Christensen--2017|Christensen et al., 2017]] ; [[#Davini--2017|Davini et al., 2017]] ; [[#Watson--2017|Watson et al., 2017]] ; [[#Strømmen--2018|Strømmen et al., 2018]] ; [[#Yang--2019|Yang et al., 2019]]). Stochastic physics can also correct long-standing mean-state biases ([[#Sanchez-Gomez--2016|Sanchez-Gomez et al., 2016]]) and can influence the predicted climate sensitivity ([[#Christensen--2019|Christensen and Berner, 2019]] ; [[#Strommen--2019|Strommen et al., 2019]] ; [[#Meccia--2020|Meccia et al., 2020]]).

Perturbed-physics ensembles ([[#Murphy--2004|Murphy et al., 2004]]) are also used to systematically account for parameter uncertainty in a given model. Uncertain model parameters are identified and ranges in their values selected that conform to emergent observational constraints (see [[IPCC:Wg1:Chapter:Chapter-1#1.5.4.2|Section 1.5.4.2]]). These parameters are then changed between ensemble members to sample the effect of parameter uncertainty on climate ([[#Piani--2005|Piani et al., 2005]] ; [[#Sexton--2012|Sexton et al., 2012]] ; [[#Johnson--2018|Johnson et al., 2018]] ; [[#Regayre--2018|Regayre et al., 2018]]). It is possible to weight ensemble members according to some performance metric or emergent constraint (e.g., [[#Fasullo--2015|Fasullo et al., 2015]] ; [[IPCC:Wg1:Chapter:Chapter-1#1.5.4.7|Section 1.5.4.7]]) to improve the ensemble distribution (Box 4.1).
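
As a rough illustration of weighting ensemble members by a performance metric, the sketch below assigns each member a weight that decays with its distance from observations and then forms a weighted mean of a projected quantity. The Gaussian form of the weight, the shape parameter <code>sigma</code>, and all numbers are assumptions made for illustration; the weighting schemes actually applied in this chapter are those synthesized in Box 4.1.

```python
import math


def performance_weights(distances, sigma):
    """Normalized weights that decay with each member's distance from observations.

    distances: non-negative model-observation distances (any consistent metric).
    sigma: assumed shape parameter; larger values give more equal weights.
    """
    raw = [math.exp(-((d / sigma) ** 2)) for d in distances]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_mean(values, weights):
    """Weighted average of one projected quantity across ensemble members."""
    return sum(v * w for v, w in zip(values, weights))


# Hypothetical ensemble: projected change and distance from observations per member.
projected_change = [2.1, 2.8, 3.5, 4.2]
distance_to_obs = [0.2, 0.4, 0.9, 1.5]

weights = performance_weights(distance_to_obs, sigma=0.5)
constrained = weighted_mean(projected_change, weights)
unweighted = sum(projected_change) / len(projected_change)
```

In this made-up case the members closest to observations project the smallest change, so the weighted mean falls below the unweighted one. The choice of <code>sigma</code> controls how strongly the ensemble is narrowed and would itself need to be justified in a real application.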

<div id="4.2.2" class="h2-container"></div>

<span id="scenarios"></span>
=== 4.2.2 Scenarios ===

<div id="h2-7-siblings" class="h2-siblings"></div>

The AR5 drew heavily on four main scenarios, known as Representative Concentration Pathways (RCPs: [[#Meinshausen--2011|Meinshausen et al., 2011]] ; [[#van%20Vuuren--2011|van Vuuren et al., 2011]]), and simulation results from CMIP5 ([[#4.2.1|Section 4.2.1]] ; [[#Taylor--2012|Taylor et al., 2012]]). The RCPs were labelled by the approximate radiative forcing reached at the year 2100: 2.6, 4.5, 6.0 and 8.5 W m<sup>–2</sup>.

This chapter draws on model simulations from CMIP6 ([[#Eyring--2016|Eyring et al., 2016]]) using a new range of scenarios based on Shared Socio-economic Pathways (SSPs; [[#O’Neill--2016|O’Neill et al., 2016]]). The set of SSPs is described in detail in [[IPCC:Wg1:Chapter:Chapter-1|Chapter 1]] ([[IPCC:Wg1:Chapter:Chapter-1#1.6|Section 1.6]]) and recognizes that global radiative forcing levels can be achieved by different pathways of CO<sub>2</sub>, non-CO<sub>2</sub> greenhouse gases (GHGs), aerosols ([[#Amann--2013|Amann et al., 2013]] ; [[#Rao--2017|Rao et al., 2017]]) and land use; the set of SSPs therefore establishes a matrix of global forcing levels and socio-economic storylines. ScenarioMIP ([[#O’Neill--2016|O’Neill et al., 2016]]) identifies four priority (tier-1) scenarios that participating modelling groups are asked to perform: SSP1-2.6 for sustainable pathways, SSP2-4.5 for middle-of-the-road, SSP3-7.0 for regional rivalry, and SSP5-8.5 for fossil fuel-rich development. This chapter focuses its assessment on these, plus the SSP1-1.9 scenario, which is directly relevant to the assessment of the 1.5°C Paris Agreement goal. Further, this chapter discusses these scenarios and their extensions past 2100 in the context of the very long-term climate change in [[#4.7.1|Section 4.7.1]]. Projections of short-lived climate forcers (SLCFs) are assessed in more detail in [[IPCC:Wg1:Chapter:Chapter-6|Chapter 6]] (Section 6.7).

In presenting results and evidence, this chapter tries to be as comprehensive as possible. In tables we show multi-model mean change and 5–95% range for all five SSPs, while in time series figures we show multi-model mean change for all five SSPs but for clarity 5–95% range only for SSP1-2.6 and SSP3-7.0. Where maps are presented, due to space restrictions we focus on showing multi-model mean change for SSP1-2.6 and SSP3-7.0. SSP1-2.6 is preferred over SSP1-1.9 because the latter has far fewer simulations available. The high-end scenarios RCP8.5 or SSP5-8.5 have recently been argued to be implausible to unfold (e.g., [[#Hausfather--2020|Hausfather and Peters, 2020]] ; see Chapter 3 of the AR6 WGIII). However, where relevant we show results for SSP5-8.5, for example to enable backwards compatibility with AR5, for comparison between emissions-driven and concentration-driven simulations, and because there is greater data availability of daily output for SSP5-8.5. When presenting low-likelihood, high-warming storylines we also show results from the high-end SSP5-8.5 scenario.
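
To make the summary statistics quoted in tables and time series concrete, the following sketch computes a multi-model mean and a 5–95% range from one value per model. It assumes empirical quantiles with linear interpolation, which is only one of several possible estimators and is not necessarily the one used for the figures and tables of this chapter; the function name and the input values are hypothetical.

```python
import statistics


def mean_and_5_95_range(changes):
    """Return (mean, p05, p95) for a list holding one value per model.

    Uses empirical quantiles with linear interpolation between sorted values;
    other estimators (for example a fitted normal distribution) would give
    somewhat different bounds.
    """
    # n=20 yields cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(changes, n=20, method="inclusive")
    return statistics.fmean(changes), cuts[0], cuts[-1]


# Hypothetical warming values (degrees C) from 21 models under one scenario.
changes = [1.0 + 0.1 * i for i in range(21)]
mean, p05, p95 = mean_and_5_95_range(changes)
```

For scenarios with few available simulations, such as SSP1-1.9, the tails of an empirical range rest on very few models, which is one practical reason to treat such ranges with caution.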

ScenarioMIP simulations include advances in techniques to better harmonize with historical forcings relative to CMIP5. For example, projected changes in the solar cycle include long-term modulation rather than a repeating solar cycle ([[#Matthes--2017|Matthes et al., 2017]]). Background natural aerosols are ramped down by 2025 to an average historical level used in the control simulation, and background volcanic forcing is ramped up from the value at the end of the historical simulation period (2015) over 10 years to the same constant value prescribed for the pre-industrial control (piControl) simulations in the DECK, and then kept fixed. Both changes are intended to prevent inconsistent model treatment of unknowable natural forcing from affecting the near-term projected warming.

Complete backward comparability between CMIP5 and CMIP6 scenarios cannot be established for detailed regional assessments, because the SSP scenarios include regional forcings – especially from land use and aerosols – that are different from the CMIP5 RCPs. Even at a global level, a quantitative comparison is challenging between corresponding SSP and RCP radiative forcing levels due to differing contributions to the forcing ([[#Meinshausen--2020|Meinshausen et al., 2020]]) and evidence of differing model responses ([[#4.6.2.2|Section 4.6.2.2]] ; [[#Wyser--2020|Wyser et al., 2020]]). The RCP scenarios assessed in AR5 all showed similar, rapid reductions in SLCFs and emissions of SLCF precursor species over the 21st century; the CMIP5 projections hence did not sample a wide range of possible trajectories for future SLCFs ([[#Chuwah--2013|Chuwah et al., 2013]]). The SSP scenarios assessed in the AR6 offer more scope to explore SLCF pathways as they sample a broader range of air quality policy options ([[#Gidden--2019|Gidden et al., 2019]]) and relationships of CO<sub>2</sub> to non-CO<sub>2</sub> greenhouse gases ([[#Meinshausen--2020|Meinshausen et al., 2020]]). [[#4.6.2.2|Section 4.6.2.2]] assesses RCP and SSP differences. Other MIPs (see [[#4.2.1|Section 4.2.1]]) have been designed to explicitly explore some of the implications of the different socio-economic storylines for a given radiative forcing level.

<div id="4.2.3" class="h2-container"></div>

<span id="sources-of-near-term-information"></span>
=== 4.2.3 Sources of Near-term Information ===

<div id="h2-8-siblings" class="h2-siblings"></div>

This subsection describes the three main sources of near-term information used in Chapter 4. These are (i) the projections from the CMIP6 multi-model ensemble introduced in [[#4.2.1|Section 4.2.1]] ([[#Eyring--2016|Eyring et al., 2016]] ; [[#O’Neill--2016|O’Neill et al., 2016]]); (ii) observationally constrained projections ([[#Gillett--2013|Gillett et al., 2013]] ; [[#Stott--2013|Stott et al., 2013]]); and (iii) the initialized predictions contributed to CMIP6 from the Decadal Climate Prediction Project (DCPP; [[#Boer--2016|Boer et al., 2016]]). The projections under (i) and the observational constraints under (ii) are used for all time horizons considered in this chapter, whereas the initialized predictions under (iii) are relevant only in the near term.

Observationally constrained projections ([[#Gillett--2013|Gillett et al., 2013]] , 2021; [[#Shiogama--2016|Shiogama et al., 2016]] ; [[#Ribes--2021|Ribes et al., 2021]]) use detection and attribution methods to attempt to reach consistency between observations and models and thus provide improved projections of near-term change. Notable advances have been made since AR5, for example the ability to observationally constrain estimates of Arctic sea ice loss for global warming of 1.5°C, 2.0°C, and 3.0°C above pre-industrial levels ([[#Screen--2017|Screen and Williamson, 2017]] ; [[#Jahn--2018|Jahn, 2018]] ; [[#Screen--2018|Screen, 2018]] ; [[#Sigmond--2018|Sigmond et al., 2018]]). There is ''high confidence'' that these approaches can reduce the uncertainties involved in such estimates.

The AR5 was the first IPCC report to assess decadal climate predictions initialized from the observed climate state ([[#Kirtman--2013|Kirtman et al., 2013]]), and assessed with ''high confidence'' that these predictions exhibit positive skill for near-term average surface temperature information, globally and over large regions, for up to ten years. Substantially more experience in producing initialized decadal predictions has been gained since AR5; the remainder of this subsection assesses the advances made.

Because the ‘memory’ that potentially enables prediction of multi-year to decadal internal variability resides mainly in the ocean, some systems initialize the ocean state only (e.g., [[#Müller--2012|Müller et al., 2012]] ; [[#Yeager--2018|Yeager et al., 2018]]), whereas others incorporate observed information in the initial atmospheric states (e.g., [[#Pohlmann--2013|Pohlmann et al., 2013]] ; [[#Knight--2014|Knight et al., 2014]]) or other non-oceanic drivers that provide further sources of predictability ([[#Alessandri--2014|Alessandri et al., 2014]] ; [[#Weiss--2014|Weiss et al., 2014]] ; [[#Bellucci--2015a|Bellucci et al., 2015a]]).

Ocean initialization techniques generally use one of two strategies. Under full-field initialization, estimates of observed climate states are represented directly on the model grid. A potential drawback is that predictions initialized using the full-field approach tend to drift toward the biased climate preferred by the model ([[#Smith--2013a|Smith et al., 2013a]] ; [[#Bellucci--2015b|Bellucci et al., 2015b]] ; [[#Sanchez-Gomez--2016|Sanchez-Gomez et al., 2016]] ; [[#Kröger--2018|Kröger et al., 2018]] ; [[#Nadiga--2019|Nadiga et al., 2019]]). Such drifts can be as large as, or larger than, the climate anomaly being predicted and may therefore obscure predicted climate anomalies ([[#Kröger--2018|Kröger et al., 2018]]) unless corrected for through post-processing. By contrast, anomaly initialization reduces drifts by adding observed anomalies (i.e., deviations from mean climate) to the mean model climate ([[#Pohlmann--2013|Pohlmann et al., 2013]] ; [[#Smith--2013a|Smith et al., 2013a]] ; [[#Thoma--2015b|Thoma et al., 2015b]] ; [[#Cassou--2018|Cassou et al., 2018]]), but has the disadvantage that the model state is then further from the real world from the start of the prediction. For both approaches, unrealistic features in the observation data used for initialization may induce unrealistic transient behavior ([[#Pohlmann--2017|Pohlmann et al., 2017]] ; [[#Teng--2017|Teng et al., 2017]] ; [[#Nadiga--2019|Nadiga et al., 2019]]), and non-linearity can reduce forecast skill ([[#Chikamoto--2019|Chikamoto et al., 2019]]). 
As yet, neither of the initialization strategies has been clearly shown to be superior ([[#Hazeleger--2013|Hazeleger et al., 2013]] ; [[#Magnusson--2013|Magnusson et al., 2013]] ; [[#Smith--2013a|Smith et al., 2013a]] ; [[#Marotzke--2016|Marotzke et al., 2016]]), although such comparisons may be sensitive to the model, region, and details of the initialization and forecast assessment procedures considered ([[#Polkova--2014|Polkova et al., 2014]] ; [[#Bellucci--2015b|Bellucci et al., 2015b]]).
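
The difference between the two initialization strategies can be stated compactly. In the sketch below, which treats the state as a plain list of values and uses hypothetical function names and numbers, full-field initialization starts the model from the observed state itself, whereas anomaly initialization adds the observed anomaly to the model's own mean climate.

```python
def full_field_init(obs_state):
    """Full-field: start the model from the observed state itself."""
    return list(obs_state)


def anomaly_init(obs_state, obs_climatology, model_climatology):
    """Anomaly: add the observed anomaly to the model's own mean climate."""
    return [
        m_clim + (obs - o_clim)
        for obs, o_clim, m_clim in zip(obs_state, obs_climatology, model_climatology)
    ]


# Hypothetical two-point state (degrees C) with a model that is biased
# cold at the first point and warm at the second.
obs_state = [15.5, 10.2]
obs_climatology = [15.0, 10.0]
model_climatology = [14.0, 11.0]

start_full = full_field_init(obs_state)
start_anom = anomaly_init(obs_state, obs_climatology, model_climatology)
```

The anomaly-initialized state sits on the model's biased climatology, so it drifts less but starts further from the observed state, which mirrors the trade-off described above.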

There is also a wide range of techniques employed to assimilate observed information into models in order to generate suitable initial conditions ([[#Polkova--2019|Polkova et al., 2019]]). These range in complexity from simple relaxation towards observed time series of sea surface temperature (SST) ([[#Mignot--2016|Mignot et al., 2016]]) or wind stress anomalies ([[#Thoma--2015a|Thoma et al., 2015a]] , b), to relaxation toward three-dimensional ocean and sometimes atmospheric state estimates from various sources (e.g., [[#Pohlmann--2013|Pohlmann et al., 2013]] ; [[#Knight--2014|Knight et al., 2014]] ; [[#Dunstone--2016|Dunstone et al., 2016]]), or hybrid relaxation combining surface and three-dimensional restoring as a function of ocean basins and depth ([[#Sanchez-Gomez--2016|Sanchez-Gomez et al., 2016]]), to sophisticated data assimilation methods such as the ensemble Kalman filter ([[#Nadiga--2013|Nadiga et al., 2013]] ; [[#Counillon--2014|Counillon et al., 2014]] , 2016; [[#Msadek--2014|Msadek et al., 2014]] ; [[#Karspeck--2015|Karspeck et al., 2015]] ; [[#Brune--2018|Brune et al., 2018]] ; [[#Cassou--2018|Cassou et al., 2018]] ; [[#Polkova--2019|Polkova et al., 2019]]), the four-dimensional ensemble-variational hybrid data assimilation ([[#He--2017|He et al., 2017]] , 2020) and the initialization of sea ice ([[#Guemas--2016|Guemas et al., 2016]] ; [[#Kimmritz--2018|Kimmritz et al., 2018]]). In addition, decadal predictions necessarily consist of ensembles of forecasts to quantify uncertainty, as discussed in [[#4.2.1|Section 4.2.1]]. A common way to generate an ensemble is through sets of initial conditions containing small variations that lead to different subsequent climate trajectories. A variety of methods and assumptions has been employed to generate and filter initial-condition ensembles for decadal prediction (e.g., [[#Marini--2016|Marini et al., 2016]] ; [[#Kadow--2017|Kadow et al., 2017]]). 
As yet, there is no clear consensus as to which initialization and ensemble generation techniques are most effective, and evaluations of their comparative performance within a single modelling framework are needed ([[#Cassou--2018|Cassou et al., 2018]]).

A consequence of model imperfections and resulting model systematic errors is that estimates of these errors must be removed from the prediction to isolate the predicted climate anomaly and the phase of the decadal modes of climate variability ([[#4.4.3.5|Sections 4.4.3.5]] and [[#4.4.3.6|4.4.3.6]] , and Annex IV, Sections AIV.2.6 and AIV.2.7). Because of the tendency for systematic drifts to occur following initialization, bias corrections generally depend on time since the start of the forecast, often referred to as lead time. In practice, the lead-time-dependent biases are calculated using ensemble retrospective predictions, also known as hindcasts, and recommended basic procedures for such corrections are provided in previous studies ([[#Goddard--2013|Goddard et al., 2013]] ; [[#Boer--2016|Boer et al., 2016]]). The biases are also dynamically corrected during hindcasts and predictions by incorporating the multi-year monthly mean analysis increments from the initialization into the initial condition at each integration step ([[#Wang--2013b|Wang et al., 2013b]]). Besides mean climate as a function of lead time, further aspects of decadal predictions may be biased, such as the modes of variability (see [[IPCC:Wg1:Chapter:Chapter-3#3.7|Section 3.7]] and Annex IV) upon which drift patterns are projected ([[#Sanchez-Gomez--2016|Sanchez-Gomez et al., 2016]]), and additional correction procedures have thus been proposed to remove biases in representing long-term trends ([[#Kharin--2012|Kharin et al., 2012]] ; [[#Kruschke--2016|Kruschke et al., 2016]] ; [[#Balaji--2018|Balaji et al., 2018]] ; [[#Pasternack--2018|Pasternack et al., 2018]]), as well as more general dependences of drift on initial conditions ([[#Fučkar--2014|Fučkar et al., 2014]] ; [[#Pasternack--2018|Pasternack et al., 2018]] ; [[#Nadiga--2019|Nadiga et al., 2019]]).
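
A minimal sketch of the basic lead-time-dependent correction is given below. It assumes the simplest case: the bias at each lead time is the mean hindcast-minus-observation difference over all start dates, and that bias is subtracted from a new forecast. The function names and data are hypothetical, and the sketch omits refinements found in the recommended procedures, such as cross-validation and the trend-dependent or initial-condition-dependent corrections mentioned above.

```python
import statistics


def lead_time_bias(hindcasts, observations):
    """Mean forecast-minus-observation error at each lead time.

    hindcasts[s][k]: ensemble-mean hindcast from start date s at lead time k.
    observations[s][k]: verifying observation for the same start date and lead.
    """
    n_lead = len(hindcasts[0])
    return [
        statistics.fmean([h[k] - o[k] for h, o in zip(hindcasts, observations)])
        for k in range(n_lead)
    ]


def correct_forecast(forecast, bias):
    """Subtract the lead-time-dependent bias from a new forecast."""
    return [f - b for f, b in zip(forecast, bias)]


# Hypothetical hindcasts that drift warm by 0.0, 0.5 and 1.0 at leads 1 to 3.
observations = [[0.0, 0.1, 0.2], [0.1, 0.2, 0.3], [0.2, 0.3, 0.4]]
drift = [0.0, 0.5, 1.0]
hindcasts = [[o + d for o, d in zip(obs, drift)] for obs in observations]

bias = lead_time_bias(hindcasts, observations)
corrected = correct_forecast([0.3, 0.9, 1.5], bias)
```

Here the estimated bias recovers the imposed drift, and the corrected forecast is the raw forecast with that drift removed at each lead time.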

Many skill measures exist that describe different aspects of the correspondence between predicted and observed conditions, and no single metric should be considered exclusively. Important aspects of forecast performance captured by different skill measures include: (i) the ability to predict the sign and phases of the main modes of decadal variability and their regional fingerprint through teleconnections; (ii) the typical magnitude of differences between predicted and observed values, forecast reliability and resolution ([[#Corti--2012|Corti et al., 2012]]); and (iii) whether the forecast ensemble appropriately represents uncertainty in the predictions. A framework for skill assessment that encompasses each of these aspects of forecast quality has been proposed ([[#Goddard--2013|Goddard et al., 2013]]). A new, process-based method to assess forecast skill in decadal predictions is to analyse how well a specific mechanism is represented at each lead time ([[#Mohino--2016|Mohino et al., 2016]]).
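
Two widely used deterministic measures are sketched below for illustration: the correlation between predicted and observed anomalies, and a mean squared skill score (MSSS) that compares the forecast error with that of a reference forecast. Both relate mainly to aspect (ii) above and say nothing about ensemble reliability. The function names and the anomaly values are hypothetical, and the sketch is not a reproduction of the framework proposed in the cited literature.

```python
import math
import statistics


def correlation(forecast, observed):
    """Pearson correlation between forecast and observed anomalies."""
    f_mean, o_mean = statistics.fmean(forecast), statistics.fmean(observed)
    cov = sum((f - f_mean) * (o - o_mean) for f, o in zip(forecast, observed))
    f_var = sum((f - f_mean) ** 2 for f in forecast)
    o_var = sum((o - o_mean) ** 2 for o in observed)
    return cov / math.sqrt(f_var * o_var)


def msss(forecast, reference, observed):
    """Mean squared skill score of a forecast relative to a reference forecast.

    Positive values mean the forecast has a lower mean squared error
    than the reference.
    """
    mse_f = statistics.fmean([(f - o) ** 2 for f, o in zip(forecast, observed)])
    mse_r = statistics.fmean([(r - o) ** 2 for r, o in zip(reference, observed)])
    return 1.0 - mse_f / mse_r


# Hypothetical anomalies; the reference is a zero-anomaly (climatological) forecast.
observed = [0.1, -0.2, 0.3, 0.0, 0.4]
initialized = [0.2, -0.1, 0.2, 0.1, 0.3]
climatology = [0.0, 0.0, 0.0, 0.0, 0.0]

r = correlation(initialized, observed)
score = msss(initialized, climatology, observed)
```

Replacing the climatological reference with an uninitialized projection would turn the same score into a measure of the value added by initialization.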

One additional aspect of forecast quality assessment is that estimated skill can be degraded by errors in observational datasets used for verification, in addition to errors in the predictions ([[#Massonnet--2016|Massonnet et al., 2016]] ; [[#Ferro--2017|Ferro, 2017]] ; [[#Karspeck--2017|Karspeck et al., 2017]] ; [[#Juricke--2018|Juricke et al., 2018]]). This suggests that skill may tend to be underestimated, particularly for climate variables whose observational uncertainties are relatively large, and that the predictions themselves may prove useful for assessing the quality of observational datasets ([[#Massonnet--2019|Massonnet, 2019]]).

Skill assessments have shown that initialized predictions can out-perform their uninitialized counterparts ([[#Doblas-Reyes--2013|Doblas-Reyes et al., 2013]] ; [[#Meehl--2014|Meehl et al., 2014]] ; [[#Bellucci--2015a|Bellucci et al., 2015a]] ; D.M. [[#Smith--2018|Smith et al., 2018]] , 2019; [[#Yeager--2018|Yeager et al., 2018]]), although such comparisons are sensitive to the region and variable considered. Multi-model predictions are generally more skilful than individual models ([[#Doblas-Reyes--2013|Doblas-Reyes et al., 2013]] ; D.M. [[#Smith--2013b|Smith et al., 2013b]] , 2019). Considerable skill, especially for temperature, can be attributed to external forcings such as changes in GHG, aerosol concentrations, and volcanic eruptions. On a global scale, this contribution to skill has been found to exceed that from the prediction of internal variability except in the early stages (about one year for global SST) of the forecast (Corti et al., 2015; [[#Sospedra-Alfonso--2020|Sospedra-Alfonso and Boer, 2020]] ; [[#Bilbao--2021|Bilbao et al., 2021]]), though idealized potential skill measures and observations-based studies suggest that improving the prediction of internal variability could extend this crossover to longer lead times ([[#Boer--2013|Boer et al., 2013]] ; [[#Årthun--2017|Årthun et al., 2017]]). In some cases, part of the skill arises from the ability of initialized predictions to capture observed transitions of major modes of climate variability ([[#Meehl--2016|Meehl et al., 2016]]) such as the Pacific Decadal Variability (PDV) and the Atlantic Multi-decadal Variability (AMV; see Sections 4.4.3.5 and 4.4.3.6, and Annex IV, Sections AIV.2.6 and AIV.2.7).

Initialized predictions of near-surface temperature are particularly skilful over the North Atlantic, a region of high potential and realized predictability ([[#Keenlyside--2008|Keenlyside et al., 2008]] ; [[#Pohlmann--2009|Pohlmann et al., 2009]] ; [[#Boer--2013|Boer et al., 2013]] ; [[#Yeager--2017|Yeager and Robson, 2017]]). Much of this predictability is associated with the North Atlantic subpolar gyre ([[#Wouters--2013|Wouters et al., 2013]]), where skill in predicting ocean conditions is typically high ([[#Hazeleger--2013|Hazeleger et al., 2013]] ; [[#Brune--2020|Brune and Baehr, 2020]]) and shifts in ocean temperature and salinity potentially affecting surface climate can be predicted up to several years in advance ([[#Robson--2012|Robson et al., 2012]] ; [[#Hermanson--2014|Hermanson et al., 2014]]), although such assessments remain challenging due to incomplete knowledge of the state of the ocean during the hindcast evaluation periods ([[#Menary--2018|Menary and Hermanson, 2018]]). A substantial improvement of the subpolar gyre SST prediction is found in CMIP6 models, which is attributed to a more accurate response to the AMOC-related delayed response to volcanic eruptions ([[#4.4.3|Section 4.4.3]] ; [[#Borchert--2021|Borchert et al., 2021]]). A significant improvement in GSAT prediction skill is also found over some land regions including East Asia ([[#Monerie--2018|Monerie et al., 2018]]), Eurasia ([[#Wu--2019|Wu et al., 2019]]), Europe ([[#Müller--2012|Müller et al., 2012]] ; D.M. [[#Smith--2019|Smith et al., 2019]]) and the Middle East (D.M. [[#Smith--2019|Smith et al., 2019]]).

Skill for multi-year to decadal precipitation forecasts is generally much lower than for temperature, although one exception is Sahel rainfall ([[#Sheen--2017|Sheen et al., 2017]]), due to its dependence on predictable variations in North Atlantic SST through teleconnections (Annex IV; [[#Martin--2014a|Martin and Thorncroft, 2014a]]). Predictive skill on decadal time scales is found for extratropical storm-tracks and storm density ([[#Kruschke--2016|Kruschke et al., 2016]] ; [[#Schuster--2019|Schuster et al., 2019]]), atmospheric blocking ([[#Schuster--2019|Schuster et al., 2019]] ; [[#Athanasiadis--2020|Athanasiadis et al., 2020]]), the Quasi-Biennial Oscillation (QBO; [[#Scaife--2014|Scaife et al., 2014]] ; [[#Pohlmann--2019|Pohlmann et al., 2019]]) and over the tropical oceans (tropical trans-basin variability; [[#Chikamoto--2015|Chikamoto et al., 2015]]). In addition, decadal predictions with large ensemble sizes are able to predict multi-annual temperature (Peters et al., 2011; [[#Sienz--2016|Sienz et al., 2016]] ; [[#Borchert--2019|Borchert et al., 2019]] ; [[#Sospedra-Alfonso--2020|Sospedra-Alfonso and Boer, 2020]]), precipitation ([[#Yeager--2018|Yeager et al., 2018]] ; D.M. [[#Smith--2019|Smith et al., 2019]]), and atmospheric circulation ([[#Smith--2020|Smith et al., 2020]]) anomalies over certain land regions, although the ensemble-mean magnitudes are much weaker than observed. This discrepancy may be symptomatic of an apparent deficiency in climate models that causes some predictable signals, such as that associated with the North Atlantic Oscillation (NAO; Section AIV.2.1), to be much weaker than in nature ([[#Eade--2014|Eade et al., 2014]] ; [[#Scaife--2018|Scaife and Smith, 2018]] ; [[#Strommen--2019|Strommen and Palmer, 2019]] ; [[#Smith--2020|Smith et al., 2020]]), while others, such as that linked to the Southern Annular Mode (SAM; Section AIV.2.2), are more consistent with observations ([[#Byrne--2019|Byrne et al., 2019]]).

Evidence is accumulating that additional properties of the Earth system relating to ocean variability may be skilfully predicted on multi-annual time scales. These include levels of Atlantic hurricane activity ([[#Smith--2010|Smith et al., 2010]] ; [[#Caron--2017|Caron et al., 2017]]), winter sea ice in the Arctic ([[#Onarheim--2015|Onarheim et al., 2015]] ; [[#Dai--2020|Dai et al., 2020]]), drought and wildfire ([[#Chikamoto--2017|Chikamoto et al., 2017]] ; [[#Paxian--2019|Paxian et al., 2019]] ; [[#Solaraju-Murali--2019|Solaraju-Murali et al., 2019]]), and variations in the ocean carbon cycle including CO<sub>2</sub> uptake (H. [[#Li--2016|Li et al., 2016]] , 2019; [[#Lovenduski--2019|Lovenduski et al., 2019]] ; [[#Fransner--2020|Fransner et al., 2020]]) and chlorophyll ([[#Park--2019|Park et al., 2019]]).

In summary, despite challenges ([[#Cassou--2018|Cassou et al., 2018]]), there is ''high confidence'' that initialized predictions contribute information to near-term climate change for some regions over multi-annual to decadal time scales. Furthermore, there are indications that initialized predictions can constrain near-term projections ([[#Befort--2020|Befort et al., 2020]]). The clearest improvements through initialization are seen in the North Atlantic and related phenomena such as hurricane frequency, Sahel and European rainfall. By contrast, there is ''medium'' or ''low confidence'' that uncertainty is reduced for other climate variables.

<div id="4.2.4" class="h2-container"></div>

<span id="pattern-scaling"></span>
=== 4.2.4 Pattern Scaling ===

<div id="h2-9-siblings" class="h2-siblings"></div>

Projected climate change is typically represented in this chapter for specific future periods. One important source of uncertainty in projections presented for fixed future epochs (time-slabs/time-slices) is the underlying scenario used; another is the structural uncertainty associated with model climate sensitivity. Presenting projections and associated measures of uncertainty for specific periods (see Sections 4.4 and 4.5) remains the most widely applied methodology towards informing climate change impact studies. It is becoming increasingly important from the perspective of climate change and mitigation policy, however, to present projections also as a function of the change in global mean temperature (i.e., global warming levels, GWLs). They are expressed either in terms of changes of global mean surface temperature (GMST) or GSAT (see [[IPCC:Wg1:Chapter:Chapter-1#1.6.2|Section 1.6.2]] and Cross-Chapter Box 2.3). For example, the IPCC SR1.5 ([[#Hoegh-Guldberg--2018|Hoegh-Guldberg et al., 2018]]) assessed the regional patterns of warming and precipitation change for GMST increase of 1.5°C and 2°C above 1850–1900 levels. Techniques used to represent the spatial variations in climate at a given GWL are referred to as pattern scaling.

In the ‘traditional’ methodology as applied in AR5 ([[#Collins--2013|Collins et al., 2013]]), patterns of climate change in space are calculated as the product of the change in GSAT at a given point in time and a spatial pattern of change that is constant over time and across the scenarios under consideration, and which may or may not depend on a particular climate model ([[#Allen--2002|Allen and Ingram, 2002]] ; [[#Mitchell--2003|Mitchell, 2003]] ; [[#Lambert--2009|Lambert and Allen, 2009]] ; [[#Andrews--2010|Andrews and Forster, 2010]] ; [[#Bony--2013|Bony et al., 2013]] ; [[#Lopez--2014|Lopez et al., 2014]]). This approach assumes that external forcing does not affect the internal variability of the climate system, which may be regarded as a stringent assumption when taking into account decadal and multi-decadal variability ([[#Lopez--2014|Lopez et al., 2014]]) and the potential non-linearity of the climate change signal. Moreover, pattern scaling is expected to have lower skill for variables with large spatial variability ([[#Tebaldi--2014|Tebaldi and Arblaster, 2014]]). Pattern scaling also fails to capture changes in boundaries that move poleward, such as sea ice extent and snow cover ([[#Collins--2013|Collins et al., 2013]]), and temporal frequency quantities such as frost days that decrease under warming but are bounded at zero. Spatial patterns are also expected to be different between transient and equilibrium simulations because of the long adjustment time scale of the deep ocean.
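
The product form of traditional pattern scaling can be illustrated with a minimal sketch. It assumes that the time-invariant pattern is estimated by a least-squares regression through the origin of local change on GSAT change, which is one common choice and not necessarily the procedure of any study cited above; the function names <code>fit_pattern</code> and <code>scale_pattern</code> and the array layout are hypothetical.

```python
import numpy as np


def fit_pattern(local_change, gsat_change):
    """Estimate a time-invariant pattern (local change per degree of GSAT change).

    Assumption: regression through the origin at every grid point.
    local_change has shape (n_time, n_lat, n_lon); gsat_change has shape (n_time,).
    """
    local_change = np.asarray(local_change, dtype=float)
    gsat_change = np.asarray(gsat_change, dtype=float)
    # Slope through the origin: sum(T * x) / sum(T**2), evaluated per grid point.
    numerator = np.tensordot(gsat_change, local_change, axes=(0, 0))
    return numerator / np.sum(gsat_change ** 2)


def scale_pattern(pattern, gsat_target):
    """Scaled change field: the fixed pattern multiplied by a GSAT change."""
    return np.asarray(pattern, dtype=float) * gsat_target


# Synthetic illustration on a 2 x 2 grid (values are made up).
true_pattern = np.array([[1.0, 2.0], [0.5, 1.5]])
gsat = np.linspace(0.1, 3.0, 30)
fields = gsat[:, None, None] * true_pattern
estimated = fit_pattern(fields, gsat)
change_at_2c = scale_pattern(estimated, 2.0)
```

Because the pattern is fixed, any forcing dependence, non-linearity or change in internal variability is by construction absent from the scaled field, which is the limitation discussed above.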

Further developments of the AR5 approach have since explored the role of aerosols in modifying regional climate responses at a specific degree of global warming and also the effect of different GCMs and scenarios on the scaled spatial patterns ([[#Frieler--2012|Frieler et al., 2012]] ; [[#Levy--2013|Levy et al., 2013]]). Furthermore, the modified forcing-response framework ([[#Kamae--2012|Kamae and Watanabe, 2012]] , 2013; [[#Sherwood--2015|Sherwood et al., 2015]]), which decomposes the total climate change into fast adjustments and slow response, identifies the fast adjustment as forcing-dependent and the slow response as forcing-independent, scaling with the change in GSAT.

For precipitation change, there is near-zero fast adjustment for solar forcing but suppression during the fast-adjustment phase for CO<sub>2</sub> and black-carbon radiative forcing ([[#Andrews--2009|Andrews et al., 2009]] ; [[#Bala--2010|Bala et al., 2010]] ; [[#Cao--2015|Cao et al., 2015]]). By contrast, the slow response in precipitation change is independent of the forcing. This indicates that pattern scaling is not expected to work well for climate variables that have a large fast-adjustment component. Even in such cases, pattern scaling still works for the slow response component, but a correction for the forcing-dependent fast adjustment would be necessary to apply pattern scaling to the total climate change signal. In a multi-model setting, it has been shown that temperature change patterns conform better to the pattern scaling approximation than precipitation patterns ([[#Tebaldi--2014|Tebaldi and Arblaster, 2014]]).
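
As a hedged illustration of the fast-adjustment and slow-response decomposition, the sketch below regresses the change in a variable on the change in GSAT. It assumes that the intercept at zero GSAT change approximates the forcing-dependent fast adjustment and that the slope approximates the slow response per degree of warming. The function name and the synthetic numbers are hypothetical and are not taken from the studies cited above.

```python
import numpy as np


def fast_slow_decomposition(gsat_change, variable_change):
    """Split a response into a fast adjustment and a slow response.

    Assumption: a linear fit of variable_change on gsat_change, where the
    intercept is read as the fast adjustment and the slope as the slow
    response per degree of GSAT change.
    """
    slope, intercept = np.polyfit(gsat_change, variable_change, 1)
    return intercept, slope


# Synthetic example: precipitation change (%) with an initial suppression.
gsat = np.linspace(0.5, 4.0, 20)
precip = -2.5 + 2.0 * gsat
fast_adjustment, slow_response = fast_slow_decomposition(gsat, precip)
```

Under these assumptions, pattern scaling would be applied to the slope term only, and the intercept would be added as a separate, forcing-specific correction.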

[[#Herger--2015|Herger et al. (2015)]] have explored the use of multiple predictors for the spatial pattern of change at a given degree of global warming, following the approach of [[#Joshi--2013|Joshi et al. (2013)]] that explored the role of the land–sea warming ratio as a second predictor. They found that the land–sea warming contrast changes in a non-linear way with GSAT, and that it approximates the role of the rate of global warming in determining regional patterns of climate change. The inclusion of the land–sea warming contrast as the second predictor provides the largest improvement over the traditional technique. However, as pointed out by [[#Herger--2015|Herger et al. (2015)]] , multiple-predictor approaches still cannot detect non-linearities (or internal variability), such as the apparent dependence of spatial temperature variability in the mid- to high latitudes on GSAT (e.g., Fischer and Knutti, 2014; [[#Screen--2014|Screen, 2014]]).

An alternative to the traditional pattern scaling approach is the time-shift method described by [[#Herger--2015|Herger et al. (2015)]] which is applied in this chapter (also called the epoch approach; see [[#4.6.1|Section 4.6.1]]). When applied to a transient scenario such as SSP5-8.5, a future time-slab is referenced to a particular increase in the GSAT (e.g., 1.5°C or 2°C of global warming above pre-industrial levels). The spatial patterns that result represent a direct scaling of the spatial variations of climate change at the particular level of global warming. An important advantage of this approach is that it ensures physical consistency between the different variables for which changes are presented ([[#Herger--2015|Herger et al., 2015]]). The internal variability does not have to be scaled and is consistent with the GSAT change. Furthermore, the time-shift method allows for a partial comparison of how the rate of increase in GSAT influences the regional spatial patterns of climate change. For example, spatial patterns of change for global warming of 2°C can be compared across the SSP2-4.5 and SSP5-8.5 scenarios. Direct comparisons can also be obtained between variations in the regional impacts of climate change for the case where global warming stabilizes at, for instance, 1.5°C or 2°C of global warming, as opposed to the case where the GSAT reaches and then exceeds the 1.5°C or 2°C thresholds ([[#Tebaldi--2018|Tebaldi and Knutti, 2018]]). An important potential caveat is that forcing mechanisms such as aerosol radiative forcing are represented differently in different models, even for the same SSP. This may imply different regional aerosol direct and indirect effects, implying different regional climate change patterns. Hence, it is important to consider the variations in the forcing mechanisms responsible for a specific increase in GSAT towards understanding the uncertainty range associated with the variations in regional climate change. 
A minor practical limitation of this approach is that stabilization scenarios at 1.5°C or 2°C of global warming, such as SSP1-2.6, do not allow for spatial patterns of change to be calculated from these scenarios at higher levels of global warming, while it is possible in scenarios such as SSP5-8.5 ([[#Herger--2015|Herger et al., 2015]]).
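
The time-shift idea can be sketched as follows. The sketch assumes a 20-year averaging window and selects the first window whose mean GSAT anomaly reaches the chosen warming level; the window length, function names and array layout are illustrative assumptions rather than the exact procedure applied in this chapter.

```python
import numpy as np


def first_crossing_window(gsat_anom, years, gwl, window=20):
    """Return (start_year, end_year) of the first window reaching gwl, or None.

    gsat_anom: annual GSAT anomalies relative to the chosen baseline.
    window: length of the averaging period in years (an assumption here).
    """
    gsat_anom = np.asarray(gsat_anom, dtype=float)
    years = np.asarray(years)
    if gsat_anom.size < window:
        return None
    # Running mean over every complete window; entry i covers years[i:i + window].
    running = np.convolve(gsat_anom, np.ones(window) / window, mode="valid")
    hits = np.nonzero(running >= gwl)[0]
    if hits.size == 0:
        return None  # the scenario never reaches this warming level
    i = int(hits[0])
    return int(years[i]), int(years[i + window - 1])


def gwl_pattern(field_anom, gsat_anom, years, gwl, window=20):
    """Average a (n_time, n_lat, n_lon) anomaly field over the crossing window."""
    bounds = first_crossing_window(gsat_anom, years, gwl, window)
    if bounds is None:
        return None
    years = np.asarray(years)
    mask = (years >= bounds[0]) & (years <= bounds[1])
    return np.asarray(field_anom, dtype=float)[mask].mean(axis=0)


# Synthetic illustration: linear warming from 1.0°C in 2015 at 0.03°C per year.
years = np.arange(2015, 2101)
gsat = 1.0 + 0.03 * (years - 2015)
window_2c = first_crossing_window(gsat, years, 2.0)
```

Because every variable is averaged over the same simulated years, the resulting fields remain physically consistent with one another. A return value of <code>None</code> corresponds to the limitation noted above for scenarios that never reach the higher warming levels.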

In this chapter, the spatial patterns of change as a function of GWLs (defined in terms of the increase in the GSAT relative to 1850–1900) are thus constructed using the time-shift approach, thereby accounting for various non-linearities and internal variability that influence the projected climate change signal. This implies a reliance on large ensemble sizes to quantify the role of uncertainties in regional responses to different degrees of global warming. The assessment in [[#4.6.1|Section 4.6.1]] also explores how the rate of global warming (as represented by different SSPs), aerosol effects, and transient as opposed to stabilization scenarios influence the spatial variations in climate change at specific levels of global warming.

<div id="4.2.5" class="h2-container"></div>

<span id="quantifying-various-sources-of-uncertainty"></span>
=== 4.2.5 Quantifying Various Sources of Uncertainty ===

<div id="h2-10-siblings" class="h2-siblings"></div>

The AR5 assessed with ''very high confidence'' that climate models reproduce the general features of the global-scale annual mean surface temperature increase over the historical period, including the more rapid warming in the second half of the 20th century, and the cooling immediately following large volcanic eruptions. Furthermore, because climate and Earth system models are based on physical principles, they were assessed in AR5 to reproduce many important aspects of observed climate. Both aspects were argued to contribute to our confidence in the models’ suitability for their application in quantitative future predictions and projections ([[#Flato--2013|Flato et al., 2013]]). This Report assesses (in [[IPCC:Wg1:Chapter:Chapter-3#3.8.2|Section 3.8.2]]) with ''high confidence'' that for most large-scale indicators of climate change, the recent mean climate simulated by the latest generation climate models underpinning this assessment has improved compared to the models assessed in AR5, and with ''high confidence'' that the multi-model mean captures most aspects of observed climate change well. These assessments form the foundation of applying climate and Earth system models to the projections assessed in this chapter. Where appropriate, the assessment of projected changes is accompanied by an assessment of process understanding and model evaluation.

That said, fitness-for-purpose of the climate models used for long-term projections is fundamentally difficult to ascertain and remains an epistemological challenge ([[#Parker--2009|Parker, 2009]] ; [[#Frisch--2015|Frisch, 2015]] ; [[#Baumberger--2017|Baumberger et al., 2017]]). Some literature exists comparing previous IPCC projections to what has unfolded over the subsequent decades ([[#Cubasch--2013|Cubasch et al., 2013]]), and recent work has confirmed that climate models since around 1970 have projected global surface warming in reasonable agreement with observations once the difference between assumed and actual forcing has been taken into account ([[#Hausfather--2020|Hausfather et al., 2020]]). However, the long-term perspective to the end of the 21st century or even out to 2300 takes us beyond what can be observed in time for a standard evaluation of model projections, and in this sense the assessment of long-term projections will remain fundamentally limited.

The spread across individual runs within a multi-model ensemble represents the response to a combination of different sources of uncertainties ([[IPCC:Wg1:Chapter:Chapter-1#1.4.3|Section 1.4.3]]), specifically: scenario uncertainties, climate response uncertainties (also referred to as model uncertainties) related to parametric and other structural uncertainties in the model representation of the climate system, and internal variability (e.g., [[#Hawkins--2009|Hawkins and Sutton, 2009]] ; [[#Kirtman--2013|Kirtman et al., 2013]]). While the nature of these uncertainties was introduced in [[IPCC:Wg1:Chapter:Chapter-1#1.4.3|Section 1.4.3]] , this subsection assesses methods to disentangle different sources of uncertainties and quantify their contributions to the overall ensemble spread.

As discussed extensively in AR5 ([[#Collins--2013|Collins et al., 2013]]), ensemble spread in projections performed with different climate models accounts for only part of the entire model uncertainty, even when considering the uncertainty in the radiative forcing in projections ([[#Vial--2013|Vial et al., 2013]]) and forced response. The AR5 uncertainty characterisation ([[#Kirtman--2013|Kirtman et al., 2013]]) followed [[#Hawkins--2009|Hawkins and Sutton (2009)]] and diagnosed internal variability through a high-pass temporal filter. This approach has deficiencies, particularly if internal variability manifests on multi-decadal time scales ([[#Deser--2012a|Deser et al., 2012a]] ; [[#Marotzke--2015|Marotzke and Forster, 2015]]) and is then classified as (model) response uncertainty instead of internal variability. Single-model initial-condition large ensembles revealed that the AR5 approach underestimates the role of internal variability uncertainty and overestimates the role of model uncertainty ([[#Maher--2018|Maher et al., 2018]] ; [[#Stolpe--2018|Stolpe et al., 2018]] ; [[#Lehner--2020|Lehner et al., 2020]]), particularly at the local scale, while yielding a reasonable approximation for uncertainty separation for GSAT ([[#Lehner--2020|Lehner et al., 2020]]).

Single-model initial-condition large ensembles thus represent a crucial step towards a cleaner separation of model uncertainty and internal variability than available for AR5 ([[#Deser--2014|Deser et al., 2014]] , 2016; [[#Saffioti--2017|Saffioti et al., 2017]] ; [[#Sippel--2019|Sippel et al., 2019]] ; [[#Milinski--2020|Milinski et al., 2020]] ; [[#von%20Trentini--2020|von Trentini et al., 2020]] ; [[#Maher--2021|Maher et al., 2021]]). Novel approaches have been proposed to further quantify internal variability in multi-model ensembles ([[#Hingray--2014|Hingray and Saïd, 2014]] ; [[#Evin--2019|Evin et al., 2019]] ; [[#Hingray--2019|Hingray et al., 2019]]). For time horizons beyond the limit of decadal predictability ([[#Branstator--2010|Branstator and Teng, 2010]] ; [[#Meehl--2014|Meehl et al., 2014]] ; [[#Marotzke--2016|Marotzke et al., 2016]]), such as in the CMIP6 projections, the simulations are starting from random rather than assimilated initial conditions. Internal variability constitutes an uncertainty in the projection of the climate in a future period of 10 or 20 years that is irreducible, but can be precisely quantified for individual models using sufficiently large initial-condition ensembles ([[#Fischer--2013|Fischer et al., 2013]] ; [[#Deser--2016|Deser et al., 2016]] , 2020; [[#Hawkins--2016|Hawkins et al., 2016]] ; [[#Pendergrass--2017|Pendergrass et al., 2017]] ; [[#Luo--2018|Luo et al., 2018]] ; [[#Dai--2019|Dai and Bloecker, 2019]] ; [[#Maher--2019|Maher et al., 2019]]).
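
A minimal sketch of how a single-model initial-condition large ensemble separates the two quantities is given below. It assumes that the ensemble mean estimates the forced response and that the spread across members at each time step estimates internal variability; the function name and the array layout are hypothetical.

```python
import numpy as np


def forced_and_internal(ensemble):
    """Split a single-model ensemble into forced response and internal spread.

    ensemble has shape (n_member, n_time); all members share the same model
    and forcing and differ only in their initial conditions.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    forced = ensemble.mean(axis=0)                # ensemble mean per time step
    internal_std = ensemble.std(axis=0, ddof=1)   # sample spread across members
    return forced, internal_std


# Synthetic illustration: 50 members sharing one trend, with independent noise.
rng = np.random.default_rng(0)
trend = np.linspace(0.0, 3.0, 86)
members = trend + rng.normal(0.0, 0.15, size=(50, 86))
forced, internal_std = forced_and_internal(members)
```

The precision of both estimates improves with the number of members, which is why sufficiently large ensembles are needed.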

Uncertainties in emissions of greenhouse gases and aerosols that affect future radiative forcings are represented by selected SSP scenarios (Sections 1.6.1 and 4.2.2). In addition to emission uncertainties, SSPs represent uncertainties in land use changes ([[#van%20Vuuren--2011|van Vuuren et al., 2011]] ; [[#Ciais--2013|Ciais et al., 2013]] ; [[#O’Neill--2016|O’Neill et al., 2016]] ; [[#Christensen--2018|Christensen et al., 2018]]). Additional uncertainty comes from climate carbon-cycle feedbacks and the residence time of atmospheric constituents, and is at least partly accounted for in emissions-driven simulations as opposed to concentration-driven simulations ([[#Friedlingstein--2014|Friedlingstein et al., 2014]] ; [[#Hewitt--2016|Hewitt et al., 2016]]). The climate carbon-cycle feedbacks affect the transient climate response to cumulative CO<sub>2</sub> emissions (TCRE). Constraining this uncertainty is crucial for the assessment of remaining carbon budgets consistent with global mean temperature levels ([[#Millar--2017|Millar et al., 2017]] ; [[#IPCC--2018a|IPCC, 2018a]]) and is covered in [[IPCC:Wg1:Chapter:Chapter-5|Chapter 5]] of this Report. Finally, there are uncertainties in future solar and volcanic forcing ([[#cross-chapter-box-4.1|Cross-Chapter Box 4.1]]).

The relative magnitude of model uncertainty and internal variability depends on the time horizon of the projection, location, spatial and temporal aggregation, variable, and signal strength ([[#Rowell--2012|Rowell, 2012]] ; [[#Fischer--2013|Fischer et al., 2013]] ; [[#Deser--2014|Deser et al., 2014]] ; [[#Saffioti--2017|Saffioti et al., 2017]] ; [[#Kirchmeier-Young--2019|Kirchmeier-Young et al., 2019]]). New literature published after AR5 systematically discusses the role of different sources of uncertainty and shows that the relative contribution of internal variability is larger for short than for long projection horizons ([[#Marotzke--2015|Marotzke and Forster, 2015]] ; [[#Lehner--2020|Lehner et al., 2020]] ; [[#Maher--2021|Maher et al., 2021]]), larger for high latitudes than for low latitudes, larger for land than for ocean variables, larger at station level than for continental or global means, larger for annual maxima/minima than for multi-decadal means, larger for dynamic quantities (and, by implication, precipitation) than for temperature ([[#Fischer--2014|Fischer et al., 2014]]).

The method introduced by [[#Hawkins--2009|Hawkins and Sutton (2009)]] and applied to GSAT projections reveals that by the end of the 21st century, the fractional contribution of the climate model response uncertainty to the total uncertainty is larger in CMIP6 than in CMIP5, whereas the relative contribution of scenario uncertainty is smaller ([[#Lehner--2020|Lehner et al., 2020]]). This is the case even when sub-selecting pathways and scenarios that are most similar in CMIP5 and CMIP6, that is, the range from RCP2.6 to RCP8.5 vs SSP1-2.6 to SSP5-8.5, respectively ([[#Lehner--2020|Lehner et al., 2020]]). The larger range of response uncertainty is further consistent with the larger range of TCR and GSAT warming for a comparable pathway in CMIP6 than CMIP5 ([[#Forster--2020|Forster et al., 2020]] ; [[#Tokarska--2020|Tokarska et al., 2020]]).
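
The variance partitioning referred to above can be sketched in simplified form. The sketch assumes one run per model and scenario, an unweighted ensemble, and a fourth-order polynomial fit as the estimate of each run's forced response; it illustrates the general idea only and does not reproduce the published procedure in detail. The function name and the array layout are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def partition_uncertainty(runs, years, order=4):
    """Fractional contributions of internal, model and scenario uncertainty.

    runs has shape (n_scenario, n_model, n_time). Assumptions: each run is
    smoothed by a polynomial fit; residuals define internal variability
    (constant in time); all models are weighted equally.
    """
    runs = np.asarray(runs, dtype=float)
    n_scen, n_model, n_time = runs.shape
    t = np.asarray(years, dtype=float)
    t = (t - t.mean()) / t.std()  # standardize time for a well-conditioned fit
    flat = runs.reshape(-1, n_time)
    coefs = P.polyfit(t, flat.T, order)               # shape (order + 1, n_runs)
    fits = P.polyval(t, coefs).reshape(runs.shape)    # smooth forced estimates
    internal = (runs - fits).var(axis=-1).mean()      # mean residual variance
    model = fits.var(axis=1).mean(axis=0)             # spread across models
    scenario = fits.mean(axis=1).var(axis=0)          # spread across scenarios
    total = internal + model + scenario
    return {
        "internal": internal / total,
        "model": model / total,
        "scenario": scenario / total,
    }


# Synthetic illustration: two scenarios, three models, white-noise variability.
rng = np.random.default_rng(0)
years = np.arange(2015, 2101)
tt = (years - 2015) / 85.0
scen = np.array([1.0, 4.0])[:, None, None]
fac = np.array([0.8, 1.0, 1.2])[None, :, None]
runs = scen * fac * tt + rng.normal(0.0, 0.1, size=(2, 3, years.size))
fractions = partition_uncertainty(runs, years)
```

In this synthetic case the internal variability fraction dominates at the start of the projection and the scenario fraction dominates at the end, mirroring the qualitative dependence on projection horizon described in this subsection.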

Some uncertainties are not, or are only partially, accounted for in the CMIP6 experiments, such as uncertainties in natural forcings from solar and volcanic forcings, long-term Earth system feedbacks including land–ice feedbacks, groundwater feedbacks ([[#Smerdon--2017|Smerdon, 2017]]) or some long-term carbon-cycle feedbacks ([[#Fischer--2018|Fischer et al., 2018]]). Where appropriate, this chapter uses results from non-CMIP ESMs or EMICs to assess the role of these feedbacks. Still other uncertainties – such as further pandemics, nuclear holocaust, a global natural disaster such as a tsunami or asteroid impact, or fundamental technological change such as fusion – are not accounted for at all.

<div id="4.2.6" class="h2-container"></div>

<span id="display-of-model-agreement-and-spread"></span>
=== 4.2.6 Display of Model Agreement and Spread ===

<div id="h2-11-siblings" class="h2-siblings"></div>

Maps of multi-model mean changes provide an average estimate for the forced model climate response to a certain forcing. However, they do not include any information on the robustness of the response across models nor on the significance of the change with respect to unforced internal variability ([[#Tebaldi--2011|Tebaldi et al., 2011]]). Models can consistently show absence of significant change, in which case they should not be expected to agree on the sign of a change (e.g., [[#Tebaldi--2011|Tebaldi et al., 2011]] ; [[#Knutti--2013|Knutti and Sedláček, 2013]] ; [[#Fischer--2014|Fischer et al., 2014]]). If a multi-model mean map of precipitation shows no change, it is unclear whether the models consistently project insignificant changes or whether projections span both significant increases and significant decreases. Several methods have been proposed to distinguish significant conflicting signals from agreement on no significant change ([[#Tebaldi--2011|Tebaldi et al., 2011]] ; [[#Knutti--2013|Knutti and Sedláček, 2013]] ; [[#McSweeney--2013|McSweeney and Jones, 2013]] ; [[#Zappa--2021|Zappa et al., 2021]]). A set of different methods have been introduced in the literature to display model robustness and to put a climate change signal into the context of internal variability. Box 12.1 in AR5 provides a detailed assessment of different methods of mapping model robustness and Cross-Chapter Box Atlas.1 provides an update of recent proposals including the methods used in this Report.

Most methods for quantifying robustness assume that only one realization from each model is applied. There are challenges that arise from having heterogeneous multi-model ensembles with many members for some models and single members for others ([[#Olonscheck--2017|Olonscheck and Notz, 2017]] ; [[#Evin--2019|Evin et al., 2019]]). Furthermore, the methods that map model robustness usually ignore that sharing parametrizations or entire components across coupled models can lead to substantial model interdependence ([[#Fischer--2011|Fischer et al., 2011]] ; [[#Kharin--2012|Kharin et al., 2012]] ; [[#Knutti--2013|Knutti et al., 2013]] , 2017; [[#Leduc--2015|Leduc et al., 2015]] ; [[#Sanderson--2015|Sanderson et al., 2015]] , 2017; [[#Annan--2017|Annan and Hargreaves, 2017]] ; [[#Boé--2018|Boé, 2018]] ; [[#Abramowitz--2019|Abramowitz et al., 2019]]). This may lead to a biased estimate of model agreement if a substantial fraction of models is interdependent. The methodologies and results in this literature since AR5 are higher in quality and clarity. However, quantifying and accounting for model dependence in a robust way remains challenging ([[#Abramowitz--2019|Abramowitz et al., 2019]]). Furthermore, absence of significant mean change in a certain climate variable does not imply absence of substantial impact, because there may be substantial change in variability, which is typically not mapped ([[#McSweeney--2013|McSweeney and Jones, 2013]]).

Chapter 4 uses the advanced approach, taking into account the sign and significance of the change (Cross-Chapter Box Atlas.1, approach C). Where not applicable, such as due to a lack of the necessary model output, the simple method is used taking into account only agreement on the sign of the change across the multi-model ensemble (Cross-Chapter Box Atlas.1, approach B). The advanced approach is similar to the method used in AR5 but isolates conflicting signals as proposed in [[#Zappa--2021|Zappa et al. (2021)]] . It uses three mutually exclusive categories and distinguishes (i) areas with significant change and high model agreement (no overlay), (ii) areas with no change or no robust change (diagonal lines), and (iii) areas with significant change but '''low agreement''' (crossed lines). Category (i) marks areas where the climate change signals ''likely'' emerge from internal variability, where two-thirds or more of the models project changes greater than internal variability and 80% or more of the models agree on the sign of the change. Category (ii) marks areas where fewer than two-thirds of the models project changes greater than internal variability, and category (iii) marks areas with significant but conflicting signals, where two-thirds or more of the models project changes greater than internal variability but less than 80% agree on the sign of the change.

In this chapter, variability is defined as <code>1.645 × √2 σ<sub>yr</sub></code>, where <code>σ<sub>yr</sub></code> is the standard deviation of 20-year means in the pre-industrial control simulations (see Cross-Chapter Box Atlas.1). Category (i) uses a definition very similar to the AR5 method for stippling, except that each model's signal is compared to its own internal variability rather than to the multi-model mean variability, to account for the substantial model differences in pre-industrial internal variability ([[#Parsons--2020|Parsons et al., 2020]]). Changes smaller than internal variability can nevertheless have impacts, particularly if they persist over sustained periods such as several decades. Finally, even when changes do not exceed variability at the grid-point level, they may exceed variability if aggregated over catchment basins, regions, or continents (Cross-Chapter Box Atlas.1). Maps of mean changes also ignore potential changes in variability, which are addressed by a more comprehensive assessment of changes in temperature variability ([[#4.5.1|Section 4.5.1]]) and modes of internal variability ([[#4.4.3|Section 4.4.3]]).
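
The three-category scheme and the variability threshold described above can be sketched as follows. The thresholds (two-thirds of models, 80% sign agreement, and the factor 1.645 × √2) are taken from the text; the function names, the array layout and the integer coding of categories are illustrative assumptions.

```python
import numpy as np


def variability_threshold(sigma):
    """Threshold 1.645 * sqrt(2) * sigma from the definition in the text."""
    return 1.645 * np.sqrt(2.0) * np.asarray(sigma, dtype=float)


def classify_robustness(changes, sigma, signif_frac=2 / 3, agree_frac=0.8):
    """Assign each grid point to category 1, 2 or 3.

    changes and sigma have shape (n_model, ...); each model's change is
    compared with that model's own variability threshold.
    1: significant change with high agreement (no overlay)
    2: no change or no robust change (diagonal lines)
    3: significant but conflicting signals (crossed lines)
    """
    changes = np.asarray(changes, dtype=float)
    n_model = changes.shape[0]
    exceeds = np.abs(changes) > variability_threshold(sigma)
    frac_signif = exceeds.sum(axis=0) / n_model
    n_pos = (changes > 0).sum(axis=0)
    n_neg = (changes < 0).sum(axis=0)
    frac_agree = np.maximum(n_pos, n_neg) / n_model
    significant = frac_signif >= signif_frac
    agreeing = frac_agree >= agree_frac
    return np.where(significant & agreeing, 1, np.where(significant, 3, 2))


# Synthetic illustration: five models at three grid points, sigma = 1 everywhere.
changes = np.array([
    [3.0, 0.1, 3.0],
    [3.0, -0.2, 3.0],
    [3.0, 0.3, 3.0],
    [3.0, 0.1, -3.0],
    [3.0, -0.1, -3.0],
])
categories = classify_robustness(changes, np.ones_like(changes))
```

In this example the first grid point falls into category 1, the second into category 2 because no model exceeds its threshold, and the third into category 3 because only three of the five models agree on the sign.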

<div id="box-4.1" class="h2-container box-container"></div>
<div class="container-box col-regular">