Numerical Solution for the Extrapolation Problem of Analytic Functions

In this work, a numerical solution for the extrapolation problem of a discrete set of n values of an unknown analytic function is developed. The proposed method is based on a novel numerical scheme for the rapid calculation of higher order derivatives, exhibiting high accuracy, with error magnitudes of O(10⁻¹⁰⁰) or less. A variety of integrated radial basis functions are utilized for the solution, as well as variable precision arithmetic for the calculations. Multiple alterations in the function's direction, with no curvature or periodicity information specified, are efficiently foreseen. Interestingly, the proposed procedure can be extended to multiple dimensions. The attained extrapolation spans are greater than two times the given domain length. The significance of the approximation errors is comprehensively analyzed and reported for 5832 test cases.


Introduction
Prediction always differs from observation [1], and extrapolation remains an open challenge [2], or even a hopelessly ill-conditioned [3] problem, in a vast variety of scientific disciplines. It relies on numerical methods which attempt to predict the future, unknown values of a studied phenomenon, given a limited set of observations. Its importance is reflected in the Scopus database [4]: a search with the term extrapolation yields nearly 150 thousand research items in a wide range of scientific fields (Table S1). The visual presentation of the corresponding keywords (Figure S1) in a bibliometric map [5] reveals that extrapolation is closely associated with modeling, simulation, uncertainty, and interpolation, the reciprocal problem of simulating the unknown mechanism which produced the given data, with the aim of predicting by means of the constructed model. In contrast to interpolation, and despite the significance of extrapolation and forecasting methods, they exhibit a decreasing pattern in the literature since the early 1980s (Figure S2), indicating the futility of predicting over extended time frames. The given data are often called a time series and their extrapolation forecasting; such forecasts usually depend on machine learning algorithms such as support vector machines [6][7][8] and artificial neural networks [2,9,10], which construct nonlinear regression models to predict beyond the known domain. Prediction algorithms are useful in a wide range of scientific disciplines, such as earth sciences [7,9], finance [8,11], computer science [12,13], and engineering [14][15][16]. Several methods have been proposed for time-series forecasting [2], and competitions regarding accuracy have been conducted, utilizing statistical [11,17] and machine learning procedures [2,10,15]; however, the prediction horizon covers only a small percentage (∼20-30%) of the given data extension. Similarly, for less uncertain problems, such as the extrapolation of curves defined by polynomials (splines) [18][19][20], extrapolation can produce unpredictable results, and the extension should be kept short [21]. Evidently, prediction procedures are essential for a vast variety of scientific fields, and they are always based on numerical interpolation methods.
More specifically, in the case of analytic functions, where the given data follow an unknown, rigorous mathematical law, any extrapolation results are too weak to be interesting [3]. Previous theoretical works focus on the interpolation problem, such as the ability of neural networks with one hidden layer to approximate analytic functions [22], proofs regarding the degree of rational approximation of an analytic function [23], and analysis of the rate of decrease of the best approximations of analytic functions [24]. Accuracy and convergence techniques are demonstrated in [25], yet it is shown theoretically [26] that the approximation of analytic functions cannot converge exponentially. The approximation of analytic functions has been investigated utilizing a variety of approximators [27][28][29], such as Hermite [30,31] and Airy functions [32]. Computational works address the approximation of a function [33] as well as of its derivatives [34,35], investigating the interpolation errors among the given nodal points. Such errors are higher than the approximation errors (Runge phenomenon), especially at the boundaries [33], dramatically affecting the extrapolation outside the given domain. Moreover, the round-off errors of computing machines follow some stochastic law [36], and recent works deal with the high impact of infinitesimal errors in the given data, utilizing extended arithmetic precision [37,38]. Accordingly, the construction of an accurate numerical method to approximate an analytic function and its derivatives, which is vital for the extrapolation problem, remains a challenge.
The purpose of this study was to provide a generic numerical solution for the extrapolation problem. Extended extrapolation horizons, even greater than 200% of the given domain length, were attained, provided that the set of data is derived from an unknown analytic function and its precision is high. The rationale of the proposed method adheres to the following three stages: (a) interpolation of the set of values (b_1, b_2, ..., b_N) at the specified points (x_1, x_2, ..., x_N) using integrated radial basis functions (IRBFs), (b) computation of high order derivatives {f′, f″, f^(3), ...} at any point of the closed domain [x_1, x_N] with high accuracy, based on a novel numerical scheme, and (c) successive application of the Taylor series formula to extrapolate at points outside the given domain. The method is capable of interpolating within the given data with high precision, avoiding the Runge phenomenon at the boundaries [33], as well as of computing the higher order derivatives with remarkable accuracy, both of which are fundamental problems in numerous applications. The effects of the method's parameters on the prediction errors were extensively investigated through analysis of variance (ANOVA) and feature extraction utilizing the Random Forest [39] method, for 5832 test cases. Illustrative examples of highly nonlinear analytic functions in two and three dimensions demonstrate the prediction extents attained by the proposed method.

Extrapolation of One Infinitesimal dx.
Let f(x) be an analytic function, which is unknown. It is given that the function takes the values b = (b_1, b_2, ..., b_N) at specified points x = (x_1, x_2, ..., x_N), as in Figure 1(a), for a generic analytic function. The extrapolation problem, that is, to predict the values the function takes outside the given domain [x_1, x_N], can be transformed into the computation of the ordinary derivatives of f, (f′, f″, ..., f^(n)), at the point x_N. In particular, applying the Taylor formula from the point x_N to the next point x_{N+1} = x_N + dx, the corresponding value of f is given by

f(x_N + dx) = f(x_N) + f′(x_N) dx + f″(x_N) dx²/2! + ... + f^(n)(x_N) dx^n/n!, (1)

for an infinitesimal step dx. f(x_N + dx) is the value of f at the point x_N + dx, depending on the ordinary derivatives of f at x_N. These derivatives should be computed, as they are unknown.
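As an illustration of this single Taylor step, the following sketch evaluates (1) for precomputed derivatives; the helper name taylor_step and the use of Python's mpmath library for variable precision arithmetic are illustrative assumptions, not the paper's actual implementation.

```python
from mpmath import mp, mpf, factorial

mp.dps = 500  # decimal digits of working precision (one of the levels studied: 500, 1000, 2000)

def taylor_step(f_xN, derivs, dx):
    """Evaluate f(x_N + dx) from f(x_N) and the derivatives [f', f'', ...] at x_N, as in (1)."""
    dx = mpf(dx)
    value = mpf(f_xN)
    for k, d_k in enumerate(derivs, start=1):
        value += d_k * dx ** k / factorial(k)
    return value
```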

Numerical Evaluation of the Derivatives.
Around the endpoint x_N, we used the Taylor formula for computing the value of the function at the first predicted point x_{N+1}. In order to calculate the derivatives, we need to consider the function values within the end interval, from the point with abscissa x_{N-1} to x_N. Accordingly, the Taylor expansion from the endpoint x_N to any previous point x_N - h_j (Figure 1(b)) may also be written in the form

f(x_N - h_j) = f(x_N) - f′(x_N) h_j + f″(x_N) h_j²/2! - f^(3)(x_N) h_j³/3! + ... (2)

By interchanging the positions of f(x_N) and f(x_N - h_j), we may write

f(x_N) - f(x_N - h_j) = f′(x_N) h_j - f″(x_N) h_j²/2! + f^(3)(x_N) h_j³/3! - ..., (3)

and by applying it to the n + 1 points h_j within dx (Figure 1(b)) and writing the resulting system of equations in matrix form, we obtain (4). It is important to note that all the derivatives are considered at the endpoint x_N. Hence, extracting the derivatives as a column vector, we obtain (5). Thus, only the first column of the matrix in (5) is involved, and the system takes the compact form

H D = C, (6)

where D is the column vector of the ordinary derivatives of f at x_N, C is the column vector of the differences f(x_N) - f(x_N - h_j) built from the function values at the points x_N - h_j (6c), and H is the matrix of the Taylor coefficients, with entries formed by the powers of h_j divided by the corresponding factorials (6d). Since (6) is a system of n + 1 equations, in order to calculate the vector D (containing the values of the derivatives of f at the point x_N), we deduce from (6) a reduced system of n equations (7). Thus, the vector of the n ordinary derivatives can be directly computed by

D = H⁻¹ C, (8)

where n, and hence the number of the calculated derivatives, can be arbitrarily selected. Equation (8) offers a direct computation of the n ordinary derivatives D. Since C is still unknown, we can compute it by interpolating f with IRBFs, as defined subsequently.
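A minimal sketch of this derivative recovery follows, assuming n offsets h_j and the corresponding differences C_j = f(x_N) - f(x_N - h_j) are already available; the simplified indexing (n points instead of n + 1), the helper name, and the use of mpmath are assumptions.

```python
from mpmath import mp, mpf, factorial, matrix, lu_solve

mp.dps = 500

def derivatives_at_endpoint(h, C):
    """Solve H D = C (6) for the derivatives of f at x_N, as in (8).

    Row j of H encodes f(x_N) - f(x_N - h_j) = sum_k (-1)^k * h_j^(k+1)/(k+1)! * f^(k+1)(x_N),
    so the solution vector D contains [f'(x_N), f''(x_N), ...].
    """
    n = len(h)
    H = matrix(n, n)
    for j in range(n):
        for k in range(n):
            H[j, k] = (-1) ** k * mpf(h[j]) ** (k + 1) / factorial(k + 1)
    return lu_solve(H, matrix([mpf(c) for c in C]))
```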

Function Approximation with IRBFs.
Radial Basis Function (RBF) networks are universal approximators, while integrated RBFs are capable of attaining precise approximations of a function and its derivatives [34,35], as well as of the solution of partial differential equations [37,38]. The essential formulation of RBFs, adopted in this work, is as follows.
Let x be a variable taking values in a domain Ω. The known values of the function f are given at the positions x_1, x_2, ..., x_N, as per Figure 1, while ∀ i ∈ {1, 2, ..., N - 1} the interval length between two adjacent points x_i, x_{i+1} is equal to dx. The given values f(x_i) of the unknown function can be approximated with RBFs φ(r), centered at the N points. The RBF approximation is defined by

f(x) ≅ Σ_{i=1}^{N} w_i φ(r_i), (9)

where w_i are the unknown weights of the approximation with RBFs, r_i = |x - x_i| represents the distance of a point x in the domain from any point x_i, and φ is a radial basis kernel such as the Gaussian, Shifted Multiquadrics, and Sigmoid, or their integrals, as in (14a)-(14f). Applying (9) at the given points results in the approximated values of f, b_i = f(x_i); hence we deduce (10), which in matrix form reads Φ w = b. Hence we derive

w = Φ⁻¹ b. (12)

The vector b is known, and the matrix Φ can be calculated from the distances of the known points x_i and x_j, so the weights w of the RBF approximation of the function are computed by numerically inverting the matrix Φ and multiplying it with the vector b. After the computation of the weights w, the radial basis functions φ(r) can be recomputed for each intermediate point x_N - h_j, in order to approximate the values of the unknown function at these points. Accordingly, each approximation vector corresponding to a point x_N - h_j, with entries φ(|x_i - (x_N - h_j)|), when multiplied with the calculated weights of (12), results in the interpolated value f(x_N - h_j) of (6c), and so

f(x_N - h_j) ≅ Σ_{i=1}^{N} w_i φ(|x_i - (x_N - h_j)|), (13)

where the φ(|x_i - (x_N - h_j)|) are the approximation vectors for each h_j and w_i are the RBF weights. These vectors are computed utilizing (14a)-(14f), resulting in the vector C, which is the approximation of the function f at the points x_N - h_j (Figure 1(b)).
The definitions of the radial basis functions utilized in (12) and (13) are specified in the following group of equations (14a)-(14f): the Gaussian and the Shifted Logarithmic, as well as their one- and two-fold integrations. Equations (14a)-(14f) are written in terms of r = |x - x_i|, in order to be generic for any x ∈ [x_1, x_N] and hence for the points x_i and h_j. The Gaussian RBF is defined by (14a), where c is a constant influencing the shape of the kernel [40][41][42]. By integration of the Gaussian kernel, we obtain (14b), and if we integrate a second time, we obtain (14c), where erf is the error function, erf(x) = (2/√π) ∫₀ˣ e^(-t²) dt, exhibiting the sigmoid shape commonly used in ANNs [43].
Respectively, the Shifted Logarithmic RBF is defined by (14d); by integration of the kernel we obtain (14e), and if we integrate a second time, we obtain (14f).
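The interpolation stage (9)-(13) can be sketched as follows; the Gaussian kernel convention exp(-(r/c)²), the use of the non-integrated kernel, and the mpmath-based helpers are assumptions for illustration, not the paper's exact formulation.

```python
from mpmath import mp, mpf, exp, matrix, lu_solve

mp.dps = 500

def phi(r, c):
    """Gaussian RBF kernel; exp(-(r/c)^2) is one common convention for the shape parameter c."""
    return exp(-(r / c) ** 2)

def rbf_weights(x, b, c):
    """Build Phi from the pairwise distances of the nodes and solve Phi w = b, as in (10)-(12)."""
    N = len(x)
    Phi = matrix(N, N)
    for i in range(N):
        for j in range(N):
            Phi[i, j] = phi(abs(mpf(x[i]) - mpf(x[j])), mpf(c))
    return lu_solve(Phi, matrix([mpf(v) for v in b]))

def interpolate(x, w, c, point):
    """Evaluate the RBF approximation (9)/(13) at an arbitrary point, e.g. x_N - h_j."""
    return sum(w[i] * phi(abs(mpf(point) - mpf(x[i])), mpf(c)) for i in range(len(x)))
```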

Iterative Implementation of Taylor Series Expansion.
Exploiting the derivatives computed with high accuracy at the end of the domain x_N, the predicted value of the function can be computed at the next step x_{N+1} by the Taylor series ((15a) and (15b)). Keeping the length dx between a predicted point and the subsequent one constant, the matrix Φ remains the same as in the initial interpolation domain. The same holds for the inverted matrix Φ⁻¹ as well as for the approximation vectors of (13): despite the sequential shifts outside the original domain, the relative distances r of (14a)-(14f) remain the same. The same holds for the matrices H and H⁻¹. Hence, the interpolation of the function within the end increment dx at each step is accomplished by matrix multiplication and not inversion, which is performed only once for Φ and once for H. Utilizing this approach during the iterations of the extrapolation procedure, the computational time is decreased dramatically, as these matrices are computed only once.
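The iterative procedure can then be sketched by chaining the helpers introduced above; for brevity, this sketch re-solves the interpolation system at every step, whereas the paper's optimization reuses Φ⁻¹ and H⁻¹, since the relative distances are unchanged, turning each iteration into matrix multiplications only.

```python
def extrapolate(x, b, dx, h_offsets, c, steps):
    """Predict `steps` values beyond x[-1], each spaced by dx (illustrative composition of the
    sketches above: rbf_weights, interpolate, derivatives_at_endpoint, taylor_step)."""
    xs, fs = list(x), list(b)
    for _ in range(steps):
        w = rbf_weights(xs, fs, c)                       # interpolation weights on current data
        x_end, f_end = xs[-1], fs[-1]
        C = [f_end - interpolate(xs, w, c, x_end - h) for h in h_offsets]
        D = derivatives_at_endpoint(h_offsets, C)        # derivatives of f at the endpoint
        fs.append(taylor_step(f_end, list(D), dx))       # Taylor step to the next point
        xs.append(x_end + dx)
    return xs, fs
```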

Parameters Affecting the Method and Preparation of Dataset to Test Performance.
The efficiency of the procedure was verified through numerical examples for a variety of highly nonlinear functions and their extrapolation spans. The extrapolation span is based on the accuracy of the interpolation method ((9), (10), (12), (13), (14a)-(14f)) as well as of the numerical differentiation ((1), (5), (6), (6a)-(6f), (8)). As demonstrated in Figure 1(b), the intervals h_j for the differentiation are selected near the (N - 1)st node of the domain discretization, in order to achieve the highest possible accuracy, because this region lies within the selected dx and, simultaneously, the interpolation error is minimized near this node, due to the Runge phenomenon [33]. Hence we select the offsets h_j up to a limit distance near the node x_{N-1}, where l indicates this limit. However, for extremely low values of h, the matrix H (6d) contains elements near zero, and its inversion becomes unstable (high condition number and inversion errors).
The features which affect the calculations, together with the values (in parentheses) utilized in the parametric investigation, are the number of Taylor terms utilized, indicated as the number of derivatives (25, 50, 75); the number of computer digits used for the arbitrary precision calculations (500, 1000, 2000); the span of the given domain L (1/2, 1); the number of divisions N (50, 100, 200); the limit l (5, 10, 20); the number of IRBF integrations (0, 1, 2); the kernel of the RBF networks (Gaussian = 1, Shifted Logarithmic = 2); their shape parameter c (1/5, 1, 5); and the unknown function which is studied (sin(x), e^cos(x)). Accordingly, the resulting database consists of 5832 records of test cases. The output of each parametric execution of the proposed procedure was the condition number of the H and Φ matrices, their inversion errors, and the errors of the first and second derivatives according to the proposed procedure. Moreover, the error of the first step of the extrapolation, e_f, was computed, in order to investigate its dependence on the problem statement parameters (N, L, dx, f) as well as on the method's attributes (number of derivatives, digits, kernel, c, number of integrations).
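The full factorial design over these levels indeed yields 5832 combinations, as the following enumeration sketch verifies; the dictionary keys are illustrative names for the parameters listed above.

```python
from itertools import product

parameter_levels = {
    "n_derivatives": (25, 50, 75),
    "digits": (500, 1000, 2000),
    "L": (0.5, 1),
    "N": (50, 100, 200),
    "limit_l": (5, 10, 20),
    "integrations": (0, 1, 2),
    "kernel": (1, 2),            # Gaussian = 1, Shifted Logarithmic = 2
    "c": (0.2, 1, 5),
    "function": ("sin(x)", "exp(cos(x))"),
}

cases = list(product(*parameter_levels.values()))
print(len(cases))  # 3*3*2*3*3*3*2*3*2 = 5832 test cases
```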

Effects of Parameters on Extrapolation Accuracy and Computational Time.
The measure E = log₁₀(|e_f| + 10⁻³²³) was utilized for the extrapolation error, as the values of e_f vary from 4.623 × 10⁻²⁸³ up to the IEEE arithmetic representation for positive infinity [44]. Given that the lower values of the error are the important ones, the analysis of variance was conducted for the cases with error values < 10⁻⁵⁰; hence the resulting database consists of 2292 records. ANOVA found statistically significant differences between the means (MD) of E for 25 and 50 derivatives (MD = 43.5803, p-value = 9.5606 × 10⁻¹⁰) and for 25 and 75 derivatives (MD = 54.2206, p-value = 9.5606 × 10⁻¹⁰), as demonstrated in Figure S3 and Table S2. Similarly, the number of digits (Figure S4 and Table S3) was found significant for E (MD = 20.3576, p-value = 9.6963 × 10⁻¹⁰ for 500 to 1000 digits and MD = 33.9628, p-value = 9.5606 × 10⁻¹⁰ for 500 to 2000 digits). The number of divisions exhibits a clear difference as well, in Figure S5 and Table S4 (MD = 22 ...).

In order to further investigate more complex associations among the studied parameters and the extrapolation error E, the random forests method [39] was utilized. The numerical dataset was divided into a train set (85% of the observations) and a test set (15%), in order to constitute and investigate the reliability of the predictive model. The R² for the predicted versus actual E for the test set was 0.8954 (Figure S18), indicating a reliable model. The significance of the features was evaluated in terms of their contribution to the prediction error in the constituted model (Figure S19A), signifying high values for the studied function, the condition number of Φ, the number of derivatives, and the domain length. Similarly, considering the computation time as the dependent variable, the R² for the test set was 0.9658, and the most predictive features were the number of divisions, the number of integrations, and the number of digits (Figure S19B).
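A sketch of the feature-importance analysis follows, using scikit-learn's RandomForestRegressor as a stand-in for the random forests method of [39]; the file name, column names, and numeric coding of the categorical parameters (e.g., Gaussian = 1, Shifted Logarithmic = 2) are assumptions.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Hypothetical layout of the 5832-record experiment database; categorical parameters are
# assumed to be numerically coded, and E = log10(|e_f| + 1e-323) is the target variable.
df = pd.read_csv("extrapolation_cases.csv")
features = ["n_derivatives", "digits", "L", "N", "limit_l",
            "integrations", "kernel", "c", "function_id", "cond_Phi"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["E"], test_size=0.15, random_state=0)  # 85% train / 15% test split

model = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_train, y_train)
print("test R^2:", r2_score(y_test, model.predict(X_test)))
print(dict(zip(features, model.feature_importances_)))       # per-feature contribution
```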

Illustrative Functions and Their Extrapolation Spans.
In Figure 2, the extrapolation results are demonstrated for four test functions, as indicated. The initial domain contains only the known given data for x and f(x), and after the vertical lines, each graph contains the exact function values, the predicted ones, and the normalized extrapolation error Ē = (log₁₀(|e|) - min(log₁₀(|e|)))/(max(log₁₀(|e|)) - min(log₁₀(|e|))), which takes values in [0, 1]. The test cases include a vast variety of analytic functions as well as their combinations. In Figures 2(a)-2(d), the given curvature or periodicity information is meager compared to the predicted evolution of the function's values. Interestingly, the logged error plot exhibits a logarithmic scheme, indicating a weak form of stability [3].
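For reference, the normalized error plotted in Figure 2 can be computed as in the following short sketch (NumPy is used here purely for illustration):

```python
import numpy as np

def normalized_log_error(err):
    """Map log10|e| linearly onto [0, 1] over the extrapolated points, as plotted in Figure 2."""
    log_e = np.log10(np.abs(np.asarray(err, dtype=float)))
    return (log_e - log_e.min()) / (log_e.max() - log_e.min())
```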
In Figure 3, the proposed procedure is applied first along the x-axis and then along the y-axis for the function f(x, y) = sin(x²) + cos(y^(3/2)). Given only the cyan region, the method can predict the highly nonlinear colored surface in the x, y, z space.

Numerical Differentiation.
Numerical differentiation is highly sensitive to noise [34], especially for the higher order derivatives. After an extensive examination of the literature utilizing IRBFs [35,45,46] for the approximation of derivatives or for the solution of specific problems [47,48], as well as of other differentiation methods [49,50], and after testing a variety of formulations for the IRBFs ((14a)-(14f)), the proposed procedure accomplished striking accuracy for the computation of the derivatives, with error magnitudes of O(10⁻¹⁰⁰) or less (Figures S3-S11 and supplementary database). This finally permitted the implementation of the Taylor method for significant extrapolation extents (Figures 2 and 3 and Data File S1). The numerical calculation of the derivatives with variable precision arithmetic offers high accuracy [37,38]. The inverse problem of numerical integration exhibits lower errors, of ∼O(10⁻¹⁰⁰⁰); however, integration is less sensitive to even small errors in the given data [51]. Cheng [37], similarly to this work, found that the errors of the derivative approximations are one or more orders of magnitude higher than the function's (Figures S16, S17, and Data File S1). However, the digits studied there were 100-200, while here the precise computation of the derivatives was vital for the extrapolation, and hence a higher number of digits was required. Similarly, Mai-Duy et al. [52] examined compactly integrated radial basis functions, with errors for the derivatives of ∼O(10⁻¹⁰) for fifty-digit accuracy.
The association of the condition number of Φ with E exhibits a low R² (0.25595) with a slightly negative slope (Figure S12), although a high condition number is considered to increase instability [37,41,52], indicating the need for high precision in the calculations; however, the majority of the condition number values are 10³²³, that is, the maximum real number considered by the software [44]. Sensitivity analysis was selected instead of optimizing each of the method's parameters, such as the number of RBF integrations, the kernel function, or its shape parameter [40][41][42], as the interpolation errors in a number of the test cases were equal to zero at the nodes, eliminating any relevant objective function. The number of integrations of the RBFs increased the computational time, as it complicates the computation of the Φ and Φ⁻¹ matrices, due to the more complex formulas ((14b), (14c), (14e), (14f)).

Summary of Findings and Limitations of the Proposed Solution.
In brief, a method for the extrapolation of analytic functions is developed, accompanied by a systematic investigation of the involved parameters. The proposed scheme for the numerical differentiation exhibited errors low enough to permit adequate extrapolation spans. The constituted database of the numerical experiments highlights the fact that, for the same problem formulation (L, N, dx, f), the calculation of the derivatives exhibits a high variation, indicating the vagueness of the transfer from the presumed theory to the actual calculations. The numerical investigation of the extrapolation errors suggests that extrapolation is attainable only by utilizing high accuracy and a precise approximation scheme for a function as well as for its derivatives. Thus, for real-world phenomena, with laboratory accuracies even of O(10⁻²⁰), the predictions are limited to short spans. The proposed solution is efficient only if the measurements contain some hundreds of significant digits. As this is difficult to accomplish with laboratory instruments, this work's findings provide strong evidence that we are far from lengthy predictions in physics, biology, and engineering, and even farther from them for phenomena studied in the health and social sciences. However, the parametric investigation suggests that the precision of the calculations and the utilized methods are vastly significant, as the extrapolation horizons achieved by the proposed numerical scheme are about an order of magnitude greater than those in the existing literature, highlighting the potential of predictions.

Materials and Methods
All the software operations ran on an Intel i7-6700 CPU @ 3.40 GHz with 32 GB memory and an SSD hard disk, and the execution time was tracked accordingly. The errors of the derivatives reported in the relevant literature are listed in the Supplementary Data File S1, and they highlight the immense accuracy of the numerical differentiation achieved in this work.

Conflicts of Interest
The author declares no conflicts of interest.

Supplementary Materials
Figure S12: scatter plot for condition number of Φ and E. Figure S13: scatter plot for condition number of H and E. Figure S14: scatter plot for inversion error of Φ and E. Figure S15: scatter plot for inversion error of H and E. Figure S16: scatter plot for error of first derivative and E. Figure S17: scatter plot for error of second derivative and E. Figure S18: scatter plot for actual versus predicted E. Table S2: differences between the means of E for groups with an uneven number of derivatives. Table S3: differences between the means of E for groups with an uneven number of digits. Table S4: differences between the means of E for groups with an uneven number of divisions. Table S5: differences between the means of E for groups with an uneven number of integrations. Table S6: differences between the means of E for groups with uneven domain lengths. Table S7: differences between the means of E for groups with uneven functions f. Table S8: differences between the means of E for groups with uneven RBF kernels. Table S9: differences between the means of E for groups with uneven limits l. Table S10: differences between the means of E for groups with uneven RBF shape parameters. Figure S19: features significance. (Supplementary Materials)