Abstract

Horizontal Visibility Graphs (HVGs) are a recently developed method to construct networks based on time series. Values of the time series (the nodes of the network) are linked to each other if no value between them is higher. The network properties reflect the nonlinear dynamics of the time series. For some classes of stochastic processes and for periodic time series, analytic results can be obtained for the degree distribution, the local clustering coefficient distribution, the mean path length, and others. HVGs have the potential to distinguish between deterministic-chaotic and correlated-stochastic time series. We investigate a set of around 150 river runoff time series at daily resolution from Brazil with an average length of 65 years. Most of the rivers are exploited for power generation and thus represent heavily managed basins. We investigate both long-term trends and human influence (e.g. the effect of dam construction) in the runoff regimes (disregarding direct upstream operations). HVGs are used to determine the degree and distance distributions. Statistical and information-theoretic properties of these distributions are calculated: robust estimators of skewness and kurtosis, the maximum degree occurring in the time series, the Shannon entropy, permutation complexity and Fisher Information. For the information-theoretic quantities, we also compare the measures obtained from the degree distributions to those obtained from the original time series directly, to investigate the impact of graph construction on the dynamical properties as reflected in these measures. We also show that a specific pretreatment of the time series conventional in hydrology, the elimination of seasonality by a separate z-transformation for each calendar day, changes long-term correlations and the overall dynamics substantially, shifting them towards more random behaviour. 
Moreover, hydrological time series are typically limited in length and may contain ties, and we present empirical consequences and extensive simulations to investigate these issues from an HVG methodological perspective. The focus is on universal properties of the HVG, common to all runoff series, on the one hand, and on site-specific aspects on the other.
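
The construction rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes the commonly used strict visibility criterion (two values are linked if every value between them is strictly lower than both), so that a tie blocks visibility beyond the tied value. The function names `horizontal_visibility_edges`, `degree_distribution` and `shannon_entropy` are hypothetical.

```python
import math
from collections import Counter


def horizontal_visibility_edges(series):
    """Return HVG edges as (i, j) index pairs with i < j.

    Assumed criterion: i and j are linked if every value strictly
    between them is lower than both series[i] and series[j].
    """
    edges = []
    stack = []  # indices of values not yet blocked, strictly decreasing in value
    for j, value in enumerate(series):
        # Every smaller value on the stack sees j and is then blocked by it.
        while stack and series[stack[-1]] < value:
            edges.append((stack.pop(), j))
        if stack:
            # The first value that is >= the new one also sees j.
            edges.append((stack[-1], j))
            if series[stack[-1]] == value:
                stack.pop()  # a tie: the older value is blocked from now on
        stack.append(j)
    return edges


def degree_distribution(edges, n_nodes):
    """Relative frequency of each node degree, keyed by degree."""
    degree = [0] * n_nodes
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    counts = Counter(degree)
    return {k: counts[k] / n_nodes for k in sorted(counts)}


def shannon_entropy(distribution):
    """Shannon entropy (natural logarithm) of a discrete distribution."""
    return -sum(p * math.log(p) for p in distribution.values() if p > 0)


runoff = [3, 1, 2, 4]  # toy values, not real runoff data
edges = horizontal_visibility_edges(runoff)
p_k = degree_distribution(edges, len(runoff))
print(sorted(edges), p_k, shannon_entropy(p_k))
```

The stack-based scan visits each value once, so it stays practical for daily series spanning several decades. How ties are treated is a modelling choice; the strict rule used here is only one option.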

Abstract

Extreme climatic events, such as droughts and heat stress, induce anomalies in ecosystem–atmosphere CO2 fluxes, such as gross primary production (GPP) and ecosystem respiration (Reco), and, hence, can change the net ecosystem carbon balance. However, despite our increasing understanding of the underlying mechanisms, the magnitudes of the impacts of different types of extremes on GPP and Reco within and between ecosystems remain poorly predicted. Here we aim to identify the major factors controlling the amplitude of extreme-event impacts on GPP, Reco, and the resulting net ecosystem production (NEP). We focus on the impacts of heat and drought and their combination. We identified hydrometeorological extreme events in consistently downscaled water availability and temperature measurements over a 30-year time period. We then used FLUXNET eddy covariance flux measurements to estimate the CO2 flux anomalies during these extreme events across dominant vegetation types and climate zones. Overall, our results indicate that short-term heat extremes increased respiration more strongly than they downregulated GPP, resulting in a moderate reduction in the ecosystem's carbon sink potential. In the absence of heat stress, droughts tended to have smaller and similarly dampening effects on both GPP and Reco and, hence, often resulted in neutral NEP responses. The combination of drought and heat typically led to a strong decrease in GPP, whereas heat and drought impacts on respiration partially offset each other. Taken together, compound heat and drought events led to the strongest C sink reduction compared to any single-factor extreme. A key insight of this paper, however, is that duration matters most: for heat stress during droughts, the magnitude of impacts systematically increased with duration, whereas under heat stress without drought, the response of Reco over time turned from an initial increase to a downregulation after about 2 weeks. 
This confirms earlier theories that not only the magnitude but also the duration of an extreme event determines its impact. Our study corroborates the results of several local site-level case studies but as a novelty generalizes these findings on the global scale. Specifically, we find that the different response functions of the two antipodal land–atmosphere fluxes GPP and Reco can also result in increasing NEP during certain extreme conditions. Apparently counterintuitive findings of this kind bear great potential for scrutinizing the mechanisms implemented in state-of-the-art terrestrial biosphere models and provide a benchmark for future model development and testing.

Abstract

Daily precipitation extremes and annual totals have increased in large parts of the global land area over the past decades. These observations are consistent with theoretical considerations of a warming climate. However, until recently these trends have not been shown to consistently affect dry regions over land. A recent study by Donat et al. (2016) identified significant increases in annual-maximum daily extreme precipitation (Rx1d) and annual precipitation totals (PRCPTOT) in dry regions. Here, we revisit the applied methods and explore the sensitivity of changes in precipitation extremes and annual totals to alternative choices of defining a dry region (i.e. in terms of aridity as opposed to precipitation characteristics alone). We find that (a) statistical artifacts introduced by data pre-processing based on a time-invariant reference period lead to an overestimation of the reported trends by up to 40 %, and that (b) the reported trends of globally aggregated extremes and annual totals are highly sensitive to the definition of a "dry region of the globe". For example, using the same observational dataset, accounting for the statistical artifacts, and based on different aridity-based dryness definitions, we find a reduction in the positive trend of Rx1d from the originally reported +1.6 % per decade to +0.2 to +0.9 % per decade (period changes for 1981–2010 averages relative to 1951–1980 are reduced to −1.32 to +0.97 % as opposed to +4.85 % in the original study). If we include additional but less homogenized data to cover larger regions, the global trend increases slightly (Rx1d: +0.4 to +1.1 % per decade), and in this case we can indeed confirm (partly) significant increases in Rx1d. However, these globally aggregated estimates remain uncertain as considerable gaps in long-term observations in the Earth's arid and semi-arid regions remain. 
In summary, adequate data pre-processing and accounting for uncertainties regarding the definition of dryness are crucial to the quantification of spatially aggregated trends in precipitation extremes in the world's dry regions. In view of the high relevance of the question to many potentially affected stakeholders, we call for a well-reflected choice of specific data processing methods and the inclusion of alternative dryness definitions to guarantee that communicated results related to climate change be robust.

Abstract

Accurate model representation of land–atmosphere carbon fluxes is essential for climate projections. However, the exact responses of carbon cycle processes to climatic drivers often remain uncertain. Presently, knowledge derived from experiments, complemented by a steadily evolving body of mechanistic theory, provides the main basis for developing such models. The strongly increasing availability of measurements may facilitate new ways of identifying suitable model structures using machine learning. Here, we explore the potential of gene expression programming (GEP) to derive relevant model formulations based solely on the signals present in data by automatically applying various mathematical transformations to potential predictors and repeatedly evolving the resulting model structures. In contrast to most other machine learning regression techniques, the GEP approach generates readable models that allow for prediction and possibly for interpretation. Our study is based on two cases: artificially generated data and real observations. Simulations based on artificial data show that GEP is successful in identifying prescribed functions, with the prediction capacity of the models comparable to four state-of-the-art machine learning methods (random forests, support vector machines, artificial neural networks, and kernel ridge regressions). Based on real observations we explore the responses of the different components of terrestrial respiration at an oak forest in south-eastern England. We find that the GEP-retrieved models are often better in prediction than some established respiration models. Based on their structures, we find previously unconsidered exponential dependencies of respiration on seasonal ecosystem carbon assimilation and water dynamics. We also find that the GEP models are only partly portable across respiration components; the identification of a general terrestrial respiration model is possibly prevented by equifinality issues. 
Overall, GEP is a promising tool for uncovering new model structures for terrestrial ecology in the data-rich era, complementing more traditional modelling approaches.

Abstract

Data analysis and model-data comparisons in the environmental sciences require diagnostic measures that quantify time series dynamics and structure, and are robust to noise in observational data. This paper investigates the temporal dynamics of environmental time series using measures quantifying their information content and complexity. The measures are used to classify natural processes on one hand, and to compare models with observations on the other. The present analysis focuses on the global carbon cycle as an area of research in which model-data integration and comparisons are key to improving our understanding of natural phenomena. We investigate the dynamics of observed and simulated time series of Gross Primary Productivity (GPP), a key variable in terrestrial ecosystems that quantifies ecosystem carbon uptake. However, the dynamics, patterns and magnitudes of GPP time series, both observed and simulated, vary substantially on different temporal and spatial scales. We demonstrate here that information content and complexity, or Information Theory Quantifiers (ITQ) for short, serve as robust and efficient data-analytical and model benchmarking tools for evaluating the temporal structure and dynamical properties of simulated or observed time series at various spatial scales. At continental scale, we compare GPP time series simulated with two models and an observations-based product. This analysis reveals qualitative differences between model evaluation based on ITQ compared to traditional model performance metrics, indicating that good model performance in terms of absolute or relative error does not imply that the dynamics of the observations is captured well. 
Furthermore, we show, using an ensemble of site-scale measurements obtained from the FLUXNET archive in the Mediterranean, that model-data or model-model mismatches as indicated by ITQ can be attributed to and interpreted as differences in the temporal structure of the respective ecological time series. At global scale, our understanding of C fluxes relies on the use of consistently applied land models. Here, we use ITQ to evaluate model structure: the measures are largely insensitive to climatic scenarios, land use and atmospheric gas concentrations used to drive them, but clearly separate the structure of 13 different land models taken from the CMIP5 archive and an observations-based product. In conclusion, diagnostic measures of this kind provide data-analytical tools that distinguish different types of natural processes based solely on their dynamics, and are thus highly suitable for environmental science applications such as model structural diagnostics.

Abstract

In an attempt to discern stochastic and deterministic parts of measured signals, we analyze time series from the viewpoint of ordinal pattern statistics. After choosing a suitable embedding dimension $D$, the occurrences of all $D!$ patterns form a probability distribution $P$. The latter is input to information and complexity functionals describing, e.g., chaotic regimes or stochastic properties due to long-range correlations. Here, we use an information quantifier which is local in pattern probability space, the Fisher information $F$. This is calculable only after fixing a pattern coding scheme, i.e. assigning a number to every pattern. It has been demonstrated that $F$ discerns different dynamic regimes for the logistic map to a certain extent; however, this depends on the details of the coding scheme. Here, we seek an optimal coding scheme for long-range correlated stochastic processes, which mimic many records, e.g. from the geosciences. To increase the contrast between colored noise and deterministic processes, $F$ should be minimal for the former. Structurally similar ordinal patterns should be located adjacent to each other. Similarity is related to the number of inversions in the respective patterns. In practical terms, it is impossible to try all $(D!)!$ coding schemes whenever $D > 3$; however, we demonstrate a classification of coding schemes into equivalence classes based on the number of "jumps" in the patterns. These are used to improve the Keller and Lehmer coding schemes. The approach has the potential to provide an analytical understanding of the Fisher information for stochastic processes. Results for these optimizations will be shown for both the logistic map and colored ($k$-) noise. As a byproduct, an innovative method to estimate the scaling exponent $k$ emerges. Finally, we comment briefly on the importance of finite-size effects, which are always an issue when dealing with observed data.
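
To make the dependence of $F$ on the coding scheme concrete, the following Python sketch estimates the ordinal pattern distribution $P$ and evaluates a discrete Fisher information on it. It rests on several assumptions that are not taken from the abstract: patterns are numbered in plain lexicographic order as a stand-in coding scheme (the Keller and Lehmer schemes are not reproduced), ties within a window are broken by time order, and $F$ is taken as the sum of squared differences of $\sqrt{p_i}$ between neighbouring patterns with a constant prefactor of 1/2. The function names are hypothetical.

```python
import math
from itertools import permutations


def ordinal_pattern_distribution(series, D):
    """Relative frequencies of the D! ordinal patterns.

    Patterns are listed in lexicographic order, which serves here as an
    assumed coding scheme (the numbering of the patterns).
    """
    patterns = list(permutations(range(D)))
    counts = dict.fromkeys(patterns, 0)
    n_windows = len(series) - D + 1
    for t in range(n_windows):
        window = series[t:t + D]
        # Positions of the window sorted by value; the stable sort breaks
        # ties by time order (an assumption of this sketch).
        pattern = tuple(sorted(range(D), key=lambda k: window[k]))
        counts[pattern] += 1
    return [counts[p] / n_windows for p in patterns]


def fisher_information(P):
    """Discrete Fisher information of P in its given order.

    Assumed form: 0.5 * sum_i (sqrt(p_{i+1}) - sqrt(p_i))**2.
    """
    return 0.5 * sum((math.sqrt(b) - math.sqrt(a)) ** 2
                     for a, b in zip(P, P[1:]))


# The same probabilities under two different numberings of the patterns
# give different values of F.
P_apart = [0.5, 0.0, 0.5, 0.0, 0.0, 0.0]
P_adjacent = [0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
print(fisher_information(P_apart), fisher_information(P_adjacent))
```

In this toy case, placing the two occupied patterns next to each other lowers $F$, which illustrates why the ordering of structurally similar patterns matters.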