TIME SERIES ARIMA MODEL FOR PREDICTING MONTHLY NET RADIATION

Net radiation is not a routinely observed climatic variable. The methods used to determine it from data on other climatic variables have been shown to involve tedious numerical computations. This study aims to generate monthly synthetic net radiation data for Ibadan, Benue and Kano, Nigeria, using the Autoregressive Integrated Moving Average (ARIMA) model. Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) analyses were used to determine the parameters of the model, while the residual ACF and PACF plots, together with graphical plots of the backward model predictions against their respective actual values, were used for model validation. The study reveals that the first difference of monthly net radiation can be represented by ARIMA (2, 1, 2) for Ibadan and Kano, and ARIMA (1, 1, 1) for Benue. Further results showed a significant and fairly strong positive correlation between the monthly actual and predicted net radiation values across stations (p < 0.05). Lastly, the residual ACF and PACF plots for Benue, Ibadan and Kano were examined, and the residuals were observed to lie within the confidence intervals. This affirms that the ARIMA model is a good fit.


INTRODUCTION
Net radiation is the difference between the incoming solar (shortwave) radiation that reaches the earth's surface and the total terrestrial (longwave) radiation emitted from the earth's surface (Lincoln et al., 2015). This difference between the shortwave and longwave radiation creates an adiabatic heat sink over the polar regions and a heat source over the equatorial latitudes. Surplus net radiation from the equatorial region must be transferred by wind to the polar region in order to balance the heat energy between the two regions. This factor controls the climate of a place and leads to an increase or decrease in air temperature. Because recording instruments such as net radiometers are costly and require constant maintenance, net radiation (Rn) measurements are difficult to obtain. Santiago et al. (2002) and Gavilán et al. (2007) recommended the use of the Penman-Monteith (FAO-56) model in computing net radiation (Rn). This is because Von Randow and Alvalá (2006) and Galvão and Fisch (2000) encountered difficulties in computing net longwave (terrestrial) radiation using the FAO-24 equation. These methods, however, are characterized by tedious numerical computations. It therefore becomes imperative to develop predictive models of net radiation as a way of alleviating this problem. The prediction of net radiation is relevant to the study of climate change, agricultural meteorology, estimation of evapotranspiration and weather monitoring. It is also a difficult task because of the variability in climatic parameters such as air temperature, relative humidity and solar radiation. Numerical Weather Prediction (NWP) is generally available in meteorological organizations, but the direct implementation of this method in predicting solar radiation has been criticized (Mohammed et al., 2019). Solar radiation is one of the major climatic variables that affect net radiation.
Thus, Numerical Weather Prediction (NWP) cannot be used to predict net radiation, because it depends greatly on air quality and hydrological characteristics, which vary strongly with time and are sensitive to location (Bauer et al., 2015). This further justifies the use of a time series Auto-Regressive Integrated Moving Average (ARIMA) model in this study. ARIMA is regarded as a smooth method, and it is appropriate when the data series is reasonably long and the correlation between past observations is established (Farhath et al., 2016). The ARIMA model has already been used extensively in a number of related areas such as ecological and weather forecasting, economic time series forecasting, traffic flow forecasting and medical monitoring (Musa, 2013; Jadevicius and Huston, 2015; Colak et al., 2015; Boualit and Mellit, 2016; David et al., 2016). Nyatuame and Agodzo (2018) analyzed and predicted annual rainfall and maximum temperature over the Tordzie watershed in Ghana using the stochastic ARIMA model. The results of the various analyses indicated that the models were satisfactory and can assist in future water planning projections. Islam and Zakaria (2019) used the ARIMA model to produce 9-year predictions of monthly maximum and minimum temperatures in the Cox's Bazar and Teknaf area of Bangladesh. The forecast results reveal an increasing trend in maximum and minimum temperature, which is very alarming for this coastal area of Bangladesh. Bakar and Rosbi (2017) investigated the volatility of the bitcoin cryptocurrency using the ARIMA model. Several studies in the literature have used ARIMA models to predict climatic variables, but not net radiation in Benue, Ibadan and Kano, Nigeria. Benue State lies at latitude 7°44′ N and longitude 8°54′ E. Ibadan lies within latitude 7-9° N and longitude 2.8-4.5° E. Kano is located at latitude 11.7574° N and longitude 8.6601° E.

AUTOREGRESSIVE INTEGRATED MOVING AVERAGE (ARIMA) MODELS
An Auto-Regressive Integrated Moving Average (ARIMA) model combines the autoregressive AR(p) model with the moving average MA(q) model. The notation AR(p) indicates an autoregressive model of order p, a representation of a type of random process. It is usually used to express certain time-varying processes in time series data. The autoregressive model specifies that the output variable depends linearly on its own preceding values and on a stochastic term. Hence, the model is in the form of a stochastic difference equation. Differencing is important because it ensures that the data used in the analysis have stationary characteristics. The moving-average MA(q) model specifies that the output variable depends linearly on the present and a range of past values of a stochastic term. Mathematically, the autoregressive AR(p) model is given as (Nashirah and Sofian, 2017):

$$y_t = A + \sum_{i=1}^{p} \phi_i y_{t-i} + \varepsilon_t \qquad (10)$$

Equation (10) can be written as:

$$y_t = A + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \varepsilon_t \qquad (11)$$

where $\phi_1, \phi_2, \dots, \phi_p$ are the parameters associated with $y_{t-1}, y_{t-2}, \dots, y_{t-p}$ respectively, $A$ is a constant and $\varepsilon_t$ is the white noise.
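The moving average MA(q) model is given as:

$$y_t = \mu + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t \qquad (12)$$

Equation (12) is reduced to:

$$y_t = \mu + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t$$

where $\theta_1, \theta_2, \dots, \theta_q$ are the parameters associated with $\varepsilon_{t-1}, \varepsilon_{t-2}, \dots, \varepsilon_{t-q}$ and $\mu$ is the mean of the series. The values p and q are called the orders of the autoregressive AR(p) and moving average MA(q) models respectively.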
Autoregressive (AR) and moving average (MA) models can be combined to form a general class of time series models, known as ARMA models. Mathematically, an ARMA(p, q) model is represented as (Hipel and McLeod, 1994):

$$y_t = A + \varepsilon_t + \sum_{i=1}^{p} \phi_i y_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}$$

ARMA models are inadequate for describing non-stationary time series, which are commonly encountered in practice. On this basis, the ARIMA model is proposed, which generalizes the ARMA model to include the case of non-stationarity as well (Hipel and McLeod, 1994). For seasonal time series forecasting, the Seasonal Autoregressive Integrated Moving Average (SARIMA) model is usually used. The general ARIMA(p, d, q) model using lag polynomials is given as (Hipel and McLeod, 1994):

$$\left(1 - \sum_{i=1}^{p} \phi_i L^i\right)(1 - L)^d y_t = \left(1 + \sum_{j=1}^{q} \theta_j L^j\right) \varepsilon_t$$

where $L$ is the lag operator ($L y_t = y_{t-1}$), and p, d and q are integers greater than or equal to zero that refer to the order of the autoregressive, integrated (difference) and moving average parts of the model respectively.
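As a worked illustration of the lag-polynomial form, the ARIMA(1, 1, 1) case can be expanded by hand. The notation below is assumed for the sketch rather than taken from the original equations: $L$ is the lag operator ($L y_t = y_{t-1}$), $\phi_1$ and $\theta_1$ are the AR and MA coefficients, $\varepsilon_t$ is white noise, and the constant term is omitted for brevity.

```latex
\begin{align*}
(1 - \phi_1 L)(1 - L)\, y_t &= (1 + \theta_1 L)\, \varepsilon_t \\
(1 - \phi_1 L)\, \Delta y_t &= \varepsilon_t + \theta_1 \varepsilon_{t-1},
  \qquad \Delta y_t = y_t - y_{t-1} \\
\Delta y_t &= \phi_1 \Delta y_{t-1} + \varepsilon_t + \theta_1 \varepsilon_{t-1}
\end{align*}
```

In other words, an ARIMA(1, 1, 1) model is an ARMA(1, 1) model applied to the first-differenced series.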
In developing an ARIMA model, analysis of the autocorrelation function (ACF) and partial autocorrelation function (PACF) needs to be performed. The ACF plot shows the correlation of the series with itself at different lags, while the PACF plot shows the amount of autocorrelation at lag k that is not explained by lower-order autocorrelations. ACF and PACF are also used to determine the structure of the seasonal ARIMA model. The autocorrelation function (ACF) at lag k is given as:

$$\hat{\rho}_k = \frac{\sum_{t=1}^{N-k} (y_t - \bar{y})(y_{t+k} - \bar{y})}{\sum_{t=1}^{N} (y_t - \bar{y})^2}$$

where $N$ is the length of the series, $\bar{y}$ denotes the sample mean, and any $\hat{\rho}_k$ outside the approximate 95% confidence band of $\pm 2/\sqrt{N}$ is statistically significant.
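The sample autocorrelation described above can be computed directly from its definition. The following is a minimal sketch using only the Python standard library; the function names, the toy sinusoidal series and the 1.96/sqrt(N) white-noise band are illustrative assumptions, not part of the study (which used Minitab).

```python
import math

def sample_acf(series, max_lag):
    """Return sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    acf = []
    for k in range(max_lag + 1):
        # Sum of lagged cross-products of deviations from the mean.
        num = sum((series[t] - mean) * (series[t + k] - mean)
                  for t in range(n - k))
        acf.append(num / denom)
    return acf

def significance_band(n):
    """Approximate 95% band for the ACF of white noise (assumption)."""
    return 1.96 / math.sqrt(n)

# Made-up monthly series with a 12-month cycle, for illustration only.
toy = [10 + 3 * math.sin(2 * math.pi * t / 12) for t in range(120)]
acf = sample_acf(toy, 24)
band = significance_band(len(toy))
significant_lags = [k for k in range(1, 25) if abs(acf[k]) > band]
```

For a series with a 12-month cycle such as this toy example, the autocorrelation is strongly positive at lags 12 and 24 and strongly negative at lag 6, which is the kind of pattern read off an ACF plot when checking for seasonality.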

CORRELATION ANALYSIS
Correlation quantifies the extent to which two quantitative variables, X and Y, agree. When high values of X are associated with high values of Y, a positive correlation exists, but when high values of X are associated with low values of Y, a negative correlation exists. Methods of correlation summarize the relationship between two variables in a single number called the correlation coefficient. The correlation coefficient is generally denoted by the symbol $r$ and it ranges from -1 to +1. A correlation coefficient close to 0, whether positive or negative, connotes little or no relationship between the two variables. A correlation coefficient close to +1 implies a positive relationship between the two variables, with an increase in one of the variables being associated with an increase in the other variable. A correlation coefficient close to -1 means a negative relationship between the two variables, with an increase in one of the variables being associated with a decrease in the other variable. The two basic types of correlation coefficient are the Pearson correlation coefficient and the Spearman correlation coefficient. Mathematically, the Pearson correlation coefficient ($r$) is given as (Tukey, 1977):

$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$

The Spearman correlation coefficient ($r_s$) is given as:

$$r_s = 1 - \frac{6 \sum D^2}{n(n^2 - 1)}$$

where $D$ is the difference between the rank on variable X and the rank on variable Y, and $n$ is the number of paired observations.
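Both coefficients can be computed from their definitions in a few lines. The sketch below uses only the Python standard library; the function names and the five-point toy data are hypothetical, and the Spearman function assumes there are no tied values (the rank-difference formula is exact only in that case).

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks 1..n in ascending order; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def spearman(x, y):
    """Spearman coefficient via 1 - 6*sum(D^2) / (n*(n^2 - 1)); no ties."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Made-up "actual" and "predicted" values, for illustration only.
actual = [1, 2, 3, 4, 5]
predicted = [1, 3, 2, 5, 4]
r = pearson(actual, predicted)
rs = spearman(actual, predicted)
```

In this toy case the values are already ranks, so the two coefficients coincide; with real net radiation data they generally differ, because Spearman depends only on the ordering of the values.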

METHODOLOGY
The daily maximum and minimum relative humidity, maximum air temperature, minimum air temperature and solar radiation data were obtained for each state from the International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria, for a period of thirty-four (34) years. The daily net radiation ($R_n$) values were computed using the step-by-step Penman-Monteith model (equations 1-9) and averaged monthly. Plotting a graph is the first step in the analysis of any time series. Such a plot provides an initial clue about the possible nature of the time series, such as whether it shows an upward or downward trend, or seasonal or cyclical variations. The collected data were processed, and the first difference was applied to simplify the correlation structure and to reveal any underlying pattern. Stationary time series data are a prerequisite for developing and testing an ARIMA model. ARIMA model identification was done by considering the ACF and PACF of the stationary time series data. The computed monthly net radiation from 1977 to 2009 was used in the analysis of the ARIMA model using the Minitab software. The ARIMA model was then used to forecast net radiation for the next 12 months (i.e., the year 2010). The forecasted data were correlated with the actual data in order to determine the degree of association.
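The two data-preparation steps described above, monthly averaging of daily values and differencing, can be sketched as follows. This is an illustrative stand-in written with the Python standard library, not the Minitab procedure used in the study; the function names and the sample values are hypothetical.

```python
from collections import defaultdict

def monthly_means(daily):
    """Average daily values per month.

    `daily` is an iterable of ((year, month), value) pairs; the result is a
    dict keyed by (year, month) in chronological order.
    """
    buckets = defaultdict(list)
    for key, value in daily:
        buckets[key].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}

def difference(series, lag=1):
    """Return series[t] - series[t - lag]; lag=1 gives the first difference,
    lag=12 the seasonal difference for monthly data."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Made-up daily net radiation values, for illustration only.
daily = [((1977, 1), 2.0), ((1977, 1), 4.0), ((1977, 2), 5.0)]
means = monthly_means(daily)
first_diff = difference(list(means.values()))
```

The differenced series is one observation shorter than the input for `lag=1`, and twelve observations shorter for `lag=12`.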

RESULTS AND DISCUSSION
This section describes the results of the autoregressive integrated moving average (ARIMA) model used in forecasting the monthly net radiation in Ibadan, Benue and Kano. The time series plot for monthly net radiation is presented in Figure 1. The cyclical variation in the time series plot of Figure 1 illustrates medium-term changes in the series caused by circumstances that repeat in cycles. Changes in air temperature, relative humidity and solar radiation can result in changes in net radiation. Net radiation is in surplus in Kano and Benue, while both deficit and surplus states occur in Ibadan, as shown in Figure 1. The net radiation of any region ought to be equal to zero; that is, the amount of incoming solar radiation absorbed by the earth's surface equals the outgoing terrestrial radiation emitted by the earth's surface. Surplus net radiation occurs when incoming solar radiation is greater than terrestrial radiation; likewise, deficit net radiation occurs when the terrestrial radiation emitted by the earth's surface is greater than the solar radiation absorbed by the earth's surface. Changes in net radiation can lead to an increase or decrease in air temperature, which is one of the major indicators of climate change in any region. Forecasting and careful analysis of net radiation in Ibadan, Benue and Kano can help in monitoring the weather and climate, estimating evapotranspiration and studying climate change in these regions. Successful time series forecasting depends on fitting a suitable model. ARIMA modeling uses differencing, autocorrelation and partial autocorrelation functions in identifying an acceptable model. Differencing is used to simplify the correlation structure and to reveal any underlying pattern. Figure 2 shows the first difference of net radiation in Ibadan, Benue and Kano.
The first-order seasonal difference is the difference between an observation and the corresponding observation from the previous year, and it is used to remove non-stationarity from the series. A stationary time series is one whose statistical properties, such as mean, variance and autocorrelation, are all constant over time, as seen in Figure 2. This study performed autocorrelation function (ACF) and partial autocorrelation function (PACF) analysis on the monthly net radiation across stations, as shown in Figure 3. The ACF and PACF are significant at lag 1. There is slow decay in the autocorrelation and partial autocorrelation functions, as presented in Figure 3. As explained by Box and Jenkins (1970), the ACF plot is useful in determining the type of model to fit to a time series of length N, and the PACF plot helps in identifying the maximum order of an AR process. ARIMA model identification was done by taking into consideration the ACF and PACF of the stationary time series data after differencing across stations, as shown in Figure 4. The ACF and PACF were tested for 60 lags to investigate seasonality. Figure 4 shows that significant autocorrelations (spikes) are present at lags that are multiples of 12 at Ibadan station, which signifies a seasonal pattern every 12 months. Significant autocorrelations are, however, also present at smaller lags. The ACF shows a significant spike at lag two in Ibadan and Kano stations and at lag one in Benue. This indicates that the moving average order q is two for Ibadan and Kano, and one for Benue. In the same vein, the PACF shows a significant spike at lag two in Ibadan and Kano stations and at lag one in Benue. This indicates that the autoregressive part can be represented by order two in Ibadan and Kano stations and order one in Benue, as presented in Figure 4. The PACF and ACF decay gradually.
Thus, the first difference of monthly net radiation can be represented by ARIMA (2, 1, 2) for Ibadan and Kano, and ARIMA (1, 1, 1) for Benue.
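The partial autocorrelations used for order identification can be obtained from the autocorrelations with the Durbin-Levinson recursion. The sketch below is a standard-library Python illustration of that recursion, offered as one common way to compute a PACF and not as the routine used by Minitab; the function name and the theoretical AR(1) input are assumptions chosen so the result is easy to check.

```python
def pacf_from_acf(acf):
    """Partial autocorrelations for lags 1..len(acf)-1 via Durbin-Levinson.

    `acf` must start with the lag-0 value (equal to 1).
    """
    max_lag = len(acf) - 1
    pacf = []
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = acf[1]
            current = [phi_kk]
        else:
            num = acf[k] - sum(prev[j] * acf[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * acf[j + 1] for j in range(k - 1))
            phi_kk = num / den
            # Update the lower-order coefficients, then append phi_kk.
            current = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            current.append(phi_kk)
        pacf.append(phi_kk)
        prev = current
    return pacf

# Theoretical ACF of an AR(1) process with coefficient 0.5 (illustration).
ar1_acf = [0.5 ** k for k in range(6)]
ar1_pacf = pacf_from_acf(ar1_acf)
```

For an AR(p) process the PACF cuts off after lag p, which is why a single significant PACF spike points to an autoregressive order of one and two significant spikes to an order of two.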
Thus, for Ibadan and Kano, the derived equation for ARIMA (2, 1, 2) is given as:

$$\Delta y_t = A + \phi_1 \Delta y_{t-1} + \phi_2 \Delta y_{t-2} + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \varepsilon_t$$

where $\Delta y_t = y_t - y_{t-1}$ is the first difference of monthly net radiation, $A$ is a constant, $\phi_i$ and $\theta_j$ are the autoregressive and moving average parameters, and $\varepsilon_t$ is the white noise. Therefore, using the estimated parameters after running the analysis:

Benue:

$$\Delta y_t = -0.017328 - 0.4619\, \Delta y_{t-1} + 0.9514\, \varepsilon_{t-1} + \varepsilon_t \qquad (25)$$

Ibadan:

Kano:

Equations 24, 26 and 27 were used as the prediction models ARIMA (1, 1, 1), ARIMA (2, 1, 2) and ARIMA (2, 1, 2) for Benue, Ibadan and Kano respectively. To validate the ARIMA (2, 1, 2) and ARIMA (1, 1, 1) prediction models, the predicted values for the year 2010 were correlated with the actual values for the same year. The result of the Spearman correlation is presented in Table 1. There is a significant and fairly strong positive correlation between the monthly actual and predicted net radiation values across stations, with p-values less than 0.05, as shown in Table 1. Figure 5 is the graphical representation of the monthly actual and predicted net radiation in Ibadan, Benue and Kano.
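To show how a fitted equation of this kind produces predictions, the sketch below applies the Benue coefficients from equation (25) as a one-step-ahead recursion. It reads the printed equation as: differenced value = -0.017328 - 0.4619 × previous differenced value + 0.9514 × previous shock + current shock. The function name, the toy input levels and the choice to set the unknown initial shock to zero are assumptions made for illustration; this is not the Minitab forecasting routine.

```python
# Coefficients as printed in equation (25) for Benue.
C, PHI, THETA = -0.017328, -0.4619, 0.9514

def one_step_predictions(levels):
    """One-step-ahead predictions for levels[2:], given the observed levels.

    Each prediction uses the previous observed difference and the previous
    one-step residual as the estimate of the previous shock.
    """
    diffs = [levels[t] - levels[t - 1] for t in range(1, len(levels))]
    preds = []
    prev_resid = 0.0  # assumption: the unknown initial shock is set to zero
    for i in range(1, len(diffs)):
        diff_hat = C + PHI * diffs[i - 1] + THETA * prev_resid
        preds.append(levels[i] + diff_hat)  # undo the differencing
        prev_resid = diffs[i] - diff_hat    # residual feeds the next step
    return preds

# Made-up monthly net radiation levels, for illustration only.
toy_levels = [10.0, 11.0, 10.5, 10.8]
toy_preds = one_step_predictions(toy_levels)
```

The predicted level is recovered by adding the predicted difference back to the last observed level, which is the "integrated" step of the ARIMA model.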
From Figure 5, it is observed that the monthly actual and predicted net radiation follow the same pattern. After fitting the models, the residual plots of ACF and PACF for Benue, Ibadan and Kano were examined, and it was observed that the residuals were within the confidence intervals, as shown in Figure 6. This indicates a good fit and that the proposed ARIMA models are satisfactory. The residual is the difference between the observed and the forecast data. Residuals are employed to validate models. The positive and negative values of the residuals also indicate model goodness and show that the predicted values are sometimes higher and sometimes lower than the original values. Although a limited number of residuals fall beyond the ±2 limits, many residuals fall within the accepted 95% confidence interval and decay gradually. This implies that the residuals are independent and thus satisfy the residual criterion.
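The residual check described above can be expressed as a small routine that lists the lags at which the residual autocorrelation leaves the confidence band. This is a minimal standard-library sketch; the function names, the 2/sqrt(N) band and the deliberately poor toy predictions are assumptions for illustration and are not the study's data or software.

```python
import math

def residual_acf(residuals, max_lag):
    """Sample autocorrelations of the residuals for lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((e - mean) ** 2 for e in residuals)
    return [
        sum((residuals[t] - mean) * (residuals[t + k] - mean)
            for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]

def lags_outside_band(observed, predicted, max_lag):
    """Lags whose residual autocorrelation exceeds +/- 2/sqrt(N)."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    band = 2.0 / math.sqrt(len(residuals))  # approximate 95% limits
    acf = residual_acf(residuals, max_lag)
    return [k for k, r in enumerate(acf, start=1) if abs(r) > band]

# Made-up example of a poor fit: errors alternate in sign every month.
observed = [float(t) for t in range(100)]
predicted = [t + (1.0 if t % 2 else -1.0) for t in range(100)]
flagged = lags_outside_band(observed, predicted, 20)
```

An empty list indicates that no residual autocorrelation is significant at the chosen lags, consistent with independent residuals; the alternating errors in the toy example are flagged at every lag because they are strongly autocorrelated.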

CONCLUSION
Net radiation is in surplus in Kano and Benue, while both deficit and surplus states occur in Ibadan. The autocorrelation function (ACF) and partial autocorrelation function (PACF) are significant at lag 1 across stations. The first difference of monthly net radiation can be represented by ARIMA (2, 1, 2) for Ibadan and Kano, and by ARIMA (1, 1, 1) for Benue. The residual plots of ACF and PACF for Benue, Ibadan and Kano were examined, and it was observed that the residuals were within the confidence intervals. Hence, the autoregressive integrated moving average (ARIMA) approach generated a reliable predictive model. This information will help in weather and climate monitoring, the study of climate change, agricultural meteorology and the estimation of evapotranspiration.