A function to calculate correlation statistics used for comparision of model data to observations or between two different models INPUT Pred Data to compare. An m by n matrix with m timesteps and n items or points Obs The reference values. An m by n matrix the same size as Pred. OUTPUT gofstat A MATLAB structure. See NOTES NOTES This function returns a number of useful model comparision statitics, calculated from pair-wise comparisions the columns in each of the matrixes Pred and Obs. Column 1 in Pred is compared against column 1 in Obs; Column 2 with column 2; and so-on up to column n. The returned structure contains the following information: corr : An n-by-4 matrix with : The correlation coefficient (R) : The lower 95% confidence limit on R : The upper 95% confidence limit on R : The p-value on R Pbar : An n-by-1 vector. The nanmean of Pred Obar : An n-by-1 vector. The nanmean of Obs absPbar : An n-by-1 vector. The nanmean of abs(Pred) absObar : An n-by-1 vector. The nanmean of abs(Obs) bias : An n-by-4 matrix with : The nanmean of the pairwise differences : The nanmean of the absolute pairwise differences : The standard deviation of pairwise differences : The so-called bias-index RMS : An n-by-3 matrix with : The root mean square : The scatter index (RMS/Obar) : The scatter index using abs of signal (RMS/abs(Obar)) : The stdev of the differences (same as bias(:,3), above?) MEF : An n-by-1 vector. The modelling efficency as per Stow 2009 skill: An n-by-1 vector. The modelling skill as per Dias 2009 RI : An n-by-1 vector. The reliability index Note that this uses the nanmean and nanstd functions to calculate the mean of the two datasets, which will implicitly ignore missing values. Note that this function relies on linear correlations. If your data are circular, you might consider a circular correlation coefficient. OCTAVE COMPATIBILITY Untested. REFERENCES Stow, C. A., Jolliff, J., McGillicuddy, D. J., Doney, S. C., Allen, J. I., Friedrichs, M. A. M., Rose, K. A., and Wallhead, P. 2009. Skill assessment for coupled biological/physical models of marine systems. Journal of Marine Systems, 76: 4--15. Dias, J. M., Sousa, M. C., Bertin, X., Fortunato, A. B., and Oliveira, A. 2009. Numerical modeling of the impact of the Ancao Inlet relocation (Ria Formosa, Portugal). Environmental modelling and software, 24: 711-725. AUTHORS Clare Duggan Daniel Pritchard Bjoern Elsaesser LICENCE Code distributed as part of the MRG toolbox from the Marine Research Group at Queens Univeristy Belfast (QUB) School of Planning Architecture and Civil Engineering (SPACE). Distributed under a creative commons CC BY-SA licence, retaining full copyright of the original authors. http://creativecommons.org/licenses/by-sa/3.0/ http://www.qub.ac.uk/space/ http://www.qub.ac.uk/research-centres/eerc/ DEVELOPMENT v 1.0 2011-10-01 First version. Clare Duggan v 1.1 2012-10-01 Modified. Clare Duggan v 1.2 2013-07-28 Documentation. Cleanup. Daniel Pritchard v 1.3 2013-08-15 DP. Fixed error in upper confidence interval. More documentation. v 1.4 2013-10-01 BE. Added scatter index using absolute of obsbar and model skill. as per Dias et al. v 1.5 2013-10-23 BE corrected bug in size comparison % Function Begin!
0001 function gofstat = mrg_gofstat(Pred,Obs) 0002 % A function to calculate correlation statistics used for comparision of model 0003 % data to observations or between two different models 0004 % 0005 % INPUT 0006 % Pred Data to compare. An m by n matrix with m timesteps and n 0007 % items or points 0008 % Obs The reference values. An m by n matrix the same size as Pred. 0009 % 0010 % OUTPUT 0011 % gofstat A MATLAB structure. See NOTES 0012 % 0013 % NOTES 0014 % This function returns a number of useful model comparision statitics, 0015 % calculated from pair-wise comparisions the columns in each of the 0016 % matrixes Pred and Obs. Column 1 in Pred is compared against column 1 0017 % in Obs; Column 2 with column 2; and so-on up to column n. 0018 % 0019 % The returned structure contains the following information: 0020 % corr : An n-by-4 matrix with 0021 % : The correlation coefficient (R) 0022 % : The lower 95% confidence limit on R 0023 % : The upper 95% confidence limit on R 0024 % : The p-value on R 0025 % Pbar : An n-by-1 vector. The nanmean of Pred 0026 % Obar : An n-by-1 vector. The nanmean of Obs 0027 % absPbar : An n-by-1 vector. The nanmean of abs(Pred) 0028 % absObar : An n-by-1 vector. The nanmean of abs(Obs) 0029 % bias : An n-by-4 matrix with 0030 % : The nanmean of the pairwise differences 0031 % : The nanmean of the absolute pairwise differences 0032 % : The standard deviation of pairwise differences 0033 % : The so-called bias-index 0034 % RMS : An n-by-3 matrix with 0035 % : The root mean square 0036 % : The scatter index (RMS/Obar) 0037 % : The scatter index using abs of signal (RMS/abs(Obar)) 0038 % : The stdev of the differences (same as bias(:,3), above?) 0039 % MEF : An n-by-1 vector. The modelling efficency as per Stow 2009 0040 % skill: An n-by-1 vector. The modelling skill as per Dias 2009 0041 % RI : An n-by-1 vector. The reliability index 0042 % 0043 % Note that this uses the nanmean and nanstd functions to calculate the 0044 % mean of the two datasets, which will implicitly ignore missing values. 0045 % Note that this function relies on linear correlations. If your data are 0046 % circular, you might consider a circular correlation coefficient. 0047 % 0048 % OCTAVE COMPATIBILITY 0049 % Untested. 0050 % 0051 % REFERENCES 0052 % Stow, C. A., Jolliff, J., McGillicuddy, D. J., Doney, S. C., Allen, J. I., 0053 % Friedrichs, M. A. M., Rose, K. A., and Wallhead, P. 2009. Skill 0054 % assessment for coupled biological/physical models of marine systems. 0055 % Journal of Marine Systems, 76: 4--15. 0056 % Dias, J. M., Sousa, M. C., Bertin, X., Fortunato, A. B., and Oliveira, 0057 % A. 2009. Numerical modeling of the impact of the Ancao Inlet 0058 % relocation (Ria Formosa, Portugal). Environmental modelling and 0059 % software, 24: 711-725. 0060 % 0061 % AUTHORS 0062 % Clare Duggan 0063 % Daniel Pritchard 0064 % Bjoern Elsaesser 0065 % 0066 % LICENCE 0067 % Code distributed as part of the MRG toolbox from the Marine Research 0068 % Group at Queens Univeristy Belfast (QUB) School of Planning 0069 % Architecture and Civil Engineering (SPACE). Distributed under a 0070 % creative commons CC BY-SA licence, retaining full copyright of the 0071 % original authors. 0072 % 0073 % http://creativecommons.org/licenses/by-sa/3.0/ 0074 % http://www.qub.ac.uk/space/ 0075 % http://www.qub.ac.uk/research-centres/eerc/ 0076 % 0077 % DEVELOPMENT 0078 % v 1.0 2011-10-01 0079 % First version. Clare Duggan 0080 % v 1.1 2012-10-01 0081 % Modified. Clare Duggan 0082 % v 1.2 2013-07-28 0083 % Documentation. Cleanup. Daniel Pritchard 0084 % v 1.3 2013-08-15 DP. 0085 % Fixed error in upper confidence interval. 0086 % More documentation. 0087 % v 1.4 2013-10-01 BE. 0088 % Added scatter index using absolute of obsbar and model skill. 0089 % as per Dias et al. 0090 % v 1.5 2013-10-23 BE 0091 % corrected bug in size comparison 0092 %% Function Begin! 0093 k1 = size(Pred); 0094 k2 = size(Obs); 0095 0096 if any(k1~=k2) 0097 msgbox('Data sets are not the same size!'); 0098 return 0099 end 0100 0101 gofstat = struct(); 0102 0103 %% Performing Statistical Calculations 0104 for n = 1:k1(2); 0105 % Correlation coefficent with the 95% upper and lower confidence limit 0106 [R,P,RLO,RUP]=corrcoef(Pred(1:end,n),Obs(1:end,n),'rows','pairwise'); 0107 gofstat.corr(n,1) = R(2); 0108 gofstat.corr(n,2) = RLO(2); 0109 gofstat.corr(n,3) = RUP(2); 0110 gofstat.corr(n,4) = P(2); 0111 0112 % mean of the different data sets 0113 gofstat.Pbar(n,1) = nanmean(Pred(1:end,n)); 0114 gofstat.Obar(n,1) = nanmean(Obs(1:end,n)); 0115 gofstat.absPbar(n,1) = nanmean(abs(Pred(1:end,n))); 0116 gofstat.absObar(n,1) = nanmean(abs(Obs(1:end,n))); 0117 0118 % difference of the data sets 0119 delta = Pred(1:end,n) - Obs(1:end,n); 0120 delta2 = abs(Pred(1:end,n) - Obs(1:end,n)); 0121 % mean of difference 0122 gofstat.bias(n,1) = nanmean(delta); 0123 % average absolute error 0124 gofstat.bias(n,2) = nanmean(delta2); 0125 % standard deviation of difference 0126 gofstat.bias(n,3) = nanstd(delta); 0127 % bias index of the data set 0128 gofstat.bias(n,4) = gofstat.bias(n,1)/gofstat.Obar(n,1); 0129 0130 % RMS 0131 gofstat.RMS(n,1) = sqrt(nanmean(delta.^2)); 0132 % Scatter index 0133 gofstat.RMS(n,2) = gofstat.RMS(n,1)/gofstat.Obar(n,1); 0134 % Scatter index using absolute 0135 gofstat.RMS(n,3) = gofstat.RMS(n,1)/gofstat.absObar(n,1); 0136 % Standard deviation of the differences 0137 gofstat.RMS(n,4) = sqrt(nanmean((delta - nanmean(delta)).^2)); 0138 0139 % MEF - the modeling efficiency 0140 % observations minus average of observations 0141 difme = (Obs(1:end,n) - gofstat.Obar(n,1)).^2; 0142 % predictions minus average of predictions 0143 difmo = (Pred(1:end,n) - Obs(1:end,n)).^2; 0144 difme2 = nansum(difme); 0145 difmo2 = nansum(difmo); 0146 % MEF 0147 gofstat.MEF(n,1) =(difme2-difmo2)/difme2; 0148 0149 % SKILL - the modeling skill 0150 % observations minus average of observations 0151 difsko = (Obs(1:end,n) - gofstat.Obar(n,1)); 0152 difskp = (Pred(1:end,n) - gofstat.Obar(n,1)); 0153 % sum of differences 0154 difsk2 = nansum((difskp + difsko).^2); 0155 0156 % MEF 0157 gofstat.skill(n,1)=1-difmo2/difsk2; 0158 0159 % RI - Reliability Index 0160 rel = (log(Obs(1:end,n)./Pred(1:end,n))).^2; 0161 rel2 = nanmean(rel); 0162 rel3 = sqrt(rel2); 0163 gofstat.RI(n,1) = exp(rel3); 0164 end