pszemraj

Comparing asymmetric semantic search models - ex "strengths of ARIMA models"

Aug 29th, 2022 (edited)
  1. Comparison of SBERT models for asymmetric semantic search on course data (for quickly finding relevant docs)
  2.  
  3. The query is the same for all 5 models: strengths of ARIMA models
  4.  
  5.  
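A note on reading the scores below: the models whose names end in `dot` (and, presumably, `tas-b`) report scores in the tens or hundreds, while the `cos` models stay below 1, so "Search Score" values are not comparable across models. The following is a minimal sketch of why the two scoring functions give different scales and can give different rankings. It uses toy 2-d vectors in place of real embeddings; the helper names (`dot_score`, `cos_score`, `rank`) and the `doc_id` field are hypothetical and are not taken from the code that produced this paste.

```python
import math


def dot_score(a, b):
    """Unnormalised dot product: grows with the length of the vectors."""
    return sum(x * y for x, y in zip(a, b))


def cos_score(a, b):
    """Cosine similarity: dot product of the length-normalised vectors."""
    norm = math.sqrt(dot_score(a, a)) * math.sqrt(dot_score(b, b))
    return dot_score(a, b) / norm


def rank(query_vec, doc_vecs, score_fn, top_k=3):
    """Score every document against the query and return the top_k hits."""
    scored = [(doc_id, score_fn(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [
        {"Rank": i + 1, "Search Score": round(score, 4), "doc_id": doc_id}
        for i, (doc_id, score) in enumerate(scored[:top_k])
    ]


# Toy stand-ins for embeddings (assumption: real ones would come from the model).
query = [3.0, 4.0]
docs = {
    "a": [6.0, 8.0],   # same direction as the query, moderate length
    "b": [30.0, 0.0],  # different direction, but very long
    "c": [0.0, 1.0],   # short vector
}

dot_hits = rank(query, docs, dot_score)
cos_hits = rank(query, docs, cos_score)
```

With these toy vectors the dot-product ranking favours the long vector `b`, while the cosine ranking favours `a`, which points in the same direction as the query. This is one reason the top hits can differ between the dot-product and cosine sections below.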
  6. ## sentence-transformers/msmarco-distilbert-base-tas-b
  7.  
  8. [{'Rank': 1,
  9. 'Search Score': 104.0884,
  10. 'doc_dir': 'TB-forecasting-principles',
  11. 'doc_name': 'OCR_Ch-9-ARIMA models-FPP_',
  12. 'doc_relative_loc': 70.968,
  13. 'doc_text': '0, 1 ) 1, 1, 0 ) 12 of these models, the best is the arima ( 3, '
  14. '0, 1 ) ( 0, 1, 2 ) 12 model ( i. e, it has the smallest aicc '
  15. 'value ). ( fit < arima ( h02, order = c ( 3, 0, 1 ), seasonal - '
  16. 'c ( 0, 1, 2 ), lambda = 0 ) ) # > series : h0z # > arima ( 3, '
  17. '0, 1 ) ( 0, 1, 2 ) [ 12 ] # > box cox transformation : zambda = '
  18. '0 # > coefficients : # ari ar2 ar3 mal # > 0 160 0. 548 0. 568 '
  19. '0. 383 # > 5. e _ 0. 164 0. 088 0. 094 0. 190 smal sma2 0 5220 '
  20. '177 0. 086 0. 087 # > # > sigma ^ 2 estimated as 0. 00428 : log '
  21. 'likelihood - 250 # > aic = - 486 _ 1 aicc = - 485. 5 bic = - '
  22. '463. 3 checkresiduals ( fit, zag - 36 ) reslduals from arima',
  23. 'id_within_doc': 66},
  24. {'Rank': 3,
  25. 'Search Score': 101.2414,
  26. 'doc_dir': 'TS-Analysis-apps-in-R-pgsplit',
  27. 'doc_name': 'OCR_5_201_cryer - time series analysis apps in R_',
  28. 'doc_relative_loc': 94.231,
  29. 'doc_text': 'arima models are just special cases of our general arima '
  30. 'models. as such, all of our work on parameter estimation in '
  31. 'chapter 7 carries over t0 the seasonal case. exhibit 10. 10 '
  32. 'gives the maximum likelihood estimates and their standard '
  33. 'errors for the arima ( 0, 1, 1 ) x ( 0, 1, 1912 model for coz '
  34. 'levels. exhibit 10. 10 parameter estimates for the coz model '
  35. 'coefficient estimate 0. 5792 0. 8206 standard error 0. 0791 0. '
  36. '1137 82 0. 5446 : log - likelihood = 139. 54, aic = 283. 08 ml. '
  37. 'co2 - arima ( co2, order - c ( 0, 1, 1 ) seasonal - list ( '
  38. 'order - c ( 0, 1, 1 ), period - 12 ) ) ml _ co2 238 seasonal '
  39. 'models the coefficient estimates are all highly significant, '
  40. 'and we proceed to check further on this model _ diagnostic '
  41. 'checking to check the estimated the arima ( o, 1, 1 ) x ( 0, 3 '
  42. '1, 1 ) 12 model, we first look at the time series plot of the '
  43. 'residuals. exhibit 10. 11 gives this plot for standardized '
  44. 'residuals. other than some strange behavior in the middle of '
  45. 'the series,',
  46. 'id_within_doc': 98},
  47. {'Rank': 5,
  48. 'Search Score': 100.9896,
  49. 'doc_dir': 'course-slides',
  50. 'doc_name': 'OCR_ATS_Slides_v220216__7',
  51. 'doc_relative_loc': 0.0,
  52. 'doc_text': 'arima, sarima & garch fitting an arima in r plausible models '
  53. 'for the logged oil prices after inspection of acfipacf of the '
  54. 'differenced series ( that seems stationary ) : arima ( 1, 1, 1 '
  55. ') or arima ( 2, 1, 1 ), the former has lower aic arima ( lop, '
  56. 'order - c ( 11, 1 ) ) coefficients : arl mal0. 2987 0. 5700 s. '
  57. 'e. 0. 2009 0. 1723 sigma ^ 2 = 0. 006642 : 11 261. 11, a = 518. '
  58. '22 alternative r command with equivalent result : arima ( drop, '
  59. 'order - c ( 1, 0, 1 ), include mean - false ) 291 arima, sarima '
  60. '& garch example : residuals for arima ( 1, 1, 1 ) residuals '
  61. 'from arima ( 1, 1, 1 ) rwwlimhlwkv wwmmv 5 1990 1995 2000 2005 '
  62. '3 3 3 g 8 3 3 3 2 8 3 3 5 10 15 20 25 30 35 5 10 15 20 25 30 35 '
  63. 'lag lag 292 ivrwukv arima, sarima & garch rewriting arima as '
  64. 'non - stationary arm',
  65. 'id_within_doc': 0},
  66. {'Rank': 6,
  67. 'Search Score': 100.7397,
  68. 'doc_dir': 'TB-forecasting-principles',
  69. 'doc_name': 'OCR_Ch-10-Dynamic regression models-FPP_',
  70. 'doc_relative_loc': 61.765,
  71. 'doc_text': 'more " wiggly " seasonal pattern and simpler arima models are '
  72. 'required to capture other dynamics. the aicc value is minimised '
  73. 'for k 5, with a significant jump going from k = 4 to k = 5, '
  74. 'hence the forecasts generated from this model would be the ones '
  75. 'used : cafe04 < window ( auscafe, start - 2004 ) plots < list ( '
  76. ') for ( i in seq ( 6 ) ) { fit < auto. arima ( cafe04, xreg '
  77. 'fourier ( cafe04, k = i ), seasonal false, iambda 0 ) plots [ [ '
  78. 'i ] ] < autoplot ( forecast ( fit, xreg - fourier ( cafe04, k = '
  79. 'i, h = 24 ) ) ) + xlab ( paste ( " k = " 1, aicc = " round ( '
  80. 'fit [ [ " aicc " ] ], 2 ) ) ) + ylab ( " " ) + ylim ( 1. 5, 4. '
  81. '7 ) gridextra : : grid. arrange ( plots [ [ 1 ] ], plots [ [ 2 '
  82. '] ], plots [ [ 3 ] ], plots [ [ 4 ] ], plots [ [ 5 ] ], plots [ '
  83. '[ 6',
  84. 'id_within_doc': 21}]
  85.  
  86.  
  87. --------------------------------------------------------------------------------------------------------
  88.  
  89. ## sentence-transformers/msmarco-bert-base-dot-v5
  90.  
  91. [{'Rank': 1,
  92. 'Search Score': 169.94,
  93. 'doc_dir': 'TB-forecasting-principles',
  94. 'doc_name': 'OCR_Ch-9-ARIMA models-FPP_',
  95. 'doc_relative_loc': 45.161,
  96. 'doc_text': 'an arima ( 3, 1, 0 ) model along with variations including '
  97. 'arima ( 4, 1, 0 ), arima ( 2, 1, 0 ), arima ( 3, 1, 1 ), etc. '
  98. 'of these, the arima ( 3, 1, 1 ) has a slightly smaller aicc '
  99. 'value. ( fit < arima ( eeadj order - c ( 3, 1, 1 ) ) ) # > '
  100. 'series : eeadj # > arima ( 3, 1, 1 ) # > # > coefficients : # > '
  101. 'arl ar2 ar3 # > 0. 004 0. 092 0. 370 mal 0. 392 # > 5. e. 0. '
  102. '220 0. 098 0. 067 0. 243 # > # > sigma ^ 2 estimated as 9. 58 : '
  103. 'log likelihood = - 492 7 # > aic - 995. 4 aicc = 995. 7 bic = '
  104. '1012 lag lag 6. the acf plot of the residuals from the arima ( '
  105. '3, 1, 1 ) model shows that all autocorrelations are within the '
  106. 'threshold limits, indicating that the residuals are behaving '
  107. 'like white noise. a portmantea',
  108. 'id_within_doc': 42},
  109. {'Rank': 2,
  110. 'Search Score': 168.7032,
  111. 'doc_dir': 'TB-forecasting-principles',
  112. 'doc_name': 'OCR_Ch-13-Some practical forecasting issues-FPP_',
  113. 'doc_relative_loc': 76.471,
  114. 'doc_text': 'test < arima ( test, model - cafe. train ) accuracy ( cafe. '
  115. 'test ) # > me rmse mae mpe mape # > training set 0 002622 0. '
  116. '04591 0. 034130. 07301 1. 002 # > mase acf1 # > train ing set 0 '
  117. '1899 ~ 0. 05704 note that arima ( does not re - estimate in '
  118. 'this case. instead, the model obtained previously ( and stored '
  119. 'as cafe. train ) is applied to the test data. because the model '
  120. 'was not re - estimated, the " residuals " obtained here are '
  121. 'actually one - step forecast errors consequently, the results '
  122. 'produced from the accuracy ( ) command are actually on the test '
  123. 'set ( despite the output saying ( training set " ) 12. 9 '
  124. 'dealing with missing values and outliers real data often '
  125. 'contains missing values, outlying observations, and other messy '
  126. 'features. dealing with them can sometimes be troublesome '
  127. 'missing values missing data can arise for many reasons, and it '
  128. 'is worth considering whether the missingness will induce bias '
  129. 'in the forecasting model. for example, suppose we are studying '
  130. 'sales data for a store, and missing values occur on public '
  131. 'holidays when the store is closed. the following day may have '
  132. 'increased sales as',
  133. 'id_within_doc': 26},
  134. {'Rank': 3,
  135. 'Search Score': 168.5765,
  136. 'doc_dir': 'course-script',
  137. 'doc_name': 'OCR_ATS_Script_v220214__6',
  138. 'doc_relative_loc': 55.0,
  139. 'doc_text': 'most plausible parsimonious integrated models include the arima '
  140. '( 0, 1, 1 ) and the arima ( 1, 1, 1 ). the former cannot '
  141. 'reasonably capture the dependencies ; the residuals are still '
  142. 'correlated and violate the white noise assumption. the arima ( '
  143. '1, 1, 1 ) is much better in this regard. however, its aic value '
  144. 'is worse than the one of the arima ( 2, 0, 1 ) considered '
  145. 'previously : we again employ auto. arima ( ) for a non - '
  146. 'stepwise grid search over all arima ( p, 1, 4 ) with p, q < 5 '
  147. 'and p + q < 5 _ since we want to avoid a drift - term and '
  148. 'directly work on the differenced data, we have to set allowmean '
  149. '- false _ fit < auto. arima ( diff ( tdf ) max p - 5, max 9 - '
  150. '5, stationary - true, allow mean - false, stepwise - false, ic '
  151. '= " a " ) 123 lag 6 sarima and garch models fit series : diff ( '
  152. 'tdf ) arima ( 2, 0, 1 ) with zero mean coefficients : arl ar2 '
  153. 'mal 0. 4219 0. 12490. 961',
  154. 'id_within_doc': 22},
  155. {'Rank': 4,
  156. 'Search Score': 168.526,
  157. 'doc_dir': 'TB-theory-and-methods-1992',
  158. 'doc_name': 'OCR_11_Model Building and Forecasting with ARIMA Processes_Time '
  159. 'Series Theory and Methods_',
  160. 'doc_relative_loc': 5.0,
  161. 'doc_text': 'an arima model is the slowly decaying positive sample '
  162. 'autocorrelation function seen in figure 9. 1. if therefore we '
  163. 'were given only the data and wished to find an appropriate '
  164. 'model it would be natural to apply the operator v = 1 b '
  165. 'repeatedly in the hope that for some j, { vix, } will have a '
  166. 'rapidly decaying sample autocorrelation function compatible '
  167. 'with that ofan arma process with no zeroes of the '
  168. 'autoregressive polynomial near the unit circle. for the '
  169. 'particular time series in this example, one application of the '
  170. 'operator produces the realization shown in figure 9. 2, whose '
  171. 'sample autocorrelation and partial autocorrelation functions '
  172. 'suggest an ar ( l ) model for { vx, } the maximum likelihood '
  173. 'estimates of $ and 02 obtained from pest ( under the assumption '
  174. 'that e ( vx, ) = 0 ) are. 808 and. 978 respectively, giving the '
  175. 'model, 89. 1. arima models for non - stationary time series 277 '
  176. '3 2 ~ 2 5 20 40 60 80 100 ( a ) 120 140 160 180 200 0. 9 0. 8 '
  177. '0. 7 0. 6 0. 5 0. 4 0. 3 0. 2 0. 1 8 & - & 0 _ ~ 0',
  178. 'id_within_doc': 6}]
  179.  
  180.  
  181.  
  182. --------------------------------------------------------------------------------------------------------
  183.  
  184. ## sentence-transformers/msmarco-distilbert-cos-v5
  185.  
  186.  
  187. [{'Rank': 1,
  188. 'Search Score': 0.5348,
  189. 'doc_dir': 'course-slides',
  190. 'doc_name': 'OCR_ATS_Slides_v220216__7',
  191. 'doc_relative_loc': 0.0,
  192. 'doc_text': 'arima, sarima & garch fitting an arima in r plausible models '
  193. 'for the logged oil prices after inspection of acfipacf of the '
  194. 'differenced series ( that seems stationary ) : arima ( 1, 1, 1 '
  195. ') or arima ( 2, 1, 1 ), the former has lower aic arima ( lop, '
  196. 'order - c ( 11, 1 ) ) coefficients : arl mal0. 2987 0. 5700 s. '
  197. 'e. 0. 2009 0. 1723 sigma ^ 2 = 0. 006642 : 11 261. 11, a = 518. '
  198. '22 alternative r command with equivalent result : arima ( drop, '
  199. 'order - c ( 1, 0, 1 ), include mean - false ) 291 arima, sarima '
  200. '& garch example : residuals for arima ( 1, 1, 1 ) residuals '
  201. 'from arima ( 1, 1, 1 ) rwwlimhlwkv wwmmv 5 1990 1995 2000 2005 '
  202. '3 3 3 g 8 3 3 3 2 8 3 3 5 10 15 20 25 30 35 5 10 15 20 25 30 35 '
  203. 'lag lag 292 ivrwukv arima, sarima & garch rewriting arima as '
  204. 'non - stationary arm',
  205. 'id_within_doc': 0},
  206. {'Rank': 3,
  207. 'Search Score': 0.5068,
  208. 'doc_dir': 'course-script',
  209. 'doc_name': 'OCR_ATS_Script_v220214__6',
  210. 'doc_relative_loc': 10.0,
  211. 'doc_text': 'is at 0. 3056, providing further evidence that the remaining '
  212. 'dependence is insignificant : 5. 5. 3 aic - based model choice '
  213. 'we have explained above how the order of arma ( p, q ) models '
  214. 'can be found by inspecting acf and pacf and complementing this '
  215. 'with classical model selection approaches and residual '
  216. 'analysis. another alternative is to run a criterion - based '
  217. 'model selection. in r, this is conveniently possible by using '
  218. 'the function auto arima ( ) from the library ( forecast ). '
  219. 'however, handle this with care : the function will always '
  220. 'identify a " best fitting " arma ( p, q ) model, but it is, of '
  221. 'course, not guaranteed that it fits the data well. moreover, '
  222. 'usage of the function is somewhat 112 5 stationary time series '
  223. 'models are tricky, as many arguments need to be set. we first '
  224. 'address the definition of the information criteria, as they are '
  225. 'central to the auto. arima ( ) function : aic = 2log ( l ) + 2 '
  226. '( p + q + k + 1 ) here, the first term measures how well the '
  227. 'model fits the training data with the value of the log - '
  228. 'likelihood function as the goodness - of - fit measure. the '
  229. 'second term penalizes model complexity, where p and',
  230. 'id_within_doc': 4},
  231. {'Rank': 5,
  232. 'Search Score': 0.4985,
  233. 'doc_dir': 'TS-Analysis-apps-in-R-pgsplit',
  234. 'doc_name': 'OCR_4_151_cryer - time series analysis apps in R_',
  235. 'doc_relative_loc': 91.667,
  236. 'doc_text': '( 1, 1 ) model for the color series coefficients : ar1 ma1 '
  237. 'intercept 0. 6721 ~ 0. 1467 74. 1730 s. e 0. 2147 0. 2742 2. '
  238. '1357 sigma ^ 2 estimated as 24. 63 : log - likelihood = 105. '
  239. '94, aic = 219. 88 arima ( color order - c ( 1, 0, 1 ) ) as we '
  240. 'have noted, any arma ( p, q ) model can be considered as '
  241. 'special case of a more general arma model with the additional '
  242. 'parameters equal t0 zero. however ; when generalizing arma '
  243. 'models, we must be aware of the problem of parameter redundancy '
  244. 'or lack of identifiability : to make these points clear ; '
  245. 'consider an arma ( 1, 2 ) model : yt = $ y _ 1 + e101e1 - 1 ~ '
  246. '02e, - 2 8. 2. 1 ) now replace t by t _ 1 to obtain yi _ 1 = $ '
  247. 'y, _ 2 + e _ 1 ~ 01e2 ~ 02e, - 3 8. 2. 2 ) if we multiply both '
  248. 'sides of equation ( 8. 2. 2 ) by any constant c and then '
  249. 'subtract it from',
  250. 'id_within_doc': 99},
  251. {'Rank': 7,
  252. 'Search Score': 0.4933,
  253. 'doc_dir': 'course-slides',
  254. 'doc_name': 'OCR_ATS_Slides_v220216__4',
  255. 'doc_relative_loc': 85.0,
  256. 'doc_text': '68 62839 f. arima mle < _ arima ( log ( lynx ) 1 order - c ( 2, '
  257. '0, 0 ) ) coefficients : arl ar2 intercept 1. 37760. 7399 6. 68 '
  258. '63 s. e. 0. 0614 0. 0612 0. 1349 sigma ^ 2 - 0. 271 ; log - '
  259. 'likelihood - - 88. 58 ; aic185. 15 while mle by default assumes '
  260. 'gaussian innovations, it performs reasonably in coefficient '
  261. 'estimation and points predictions for other distributions as '
  262. 'long as they are not extremely skewed or have very precarious '
  263. 'outliers. however, the standard errors are biased. 186 '
  264. 'autoregressive models practical aspects all four estimation '
  265. 'methods are asymptotically equivalent, and the differences are '
  266. 'usually small, even on finite samples. all four estimation '
  267. 'methods are non - robust against outliers and perform best on '
  268. 'approximately gaussian data : function arima ( ) provides '
  269. 'standard errors for m ; 01, 0 so p that statements about '
  270. 'significance become feasible, and confidence intervals for the '
  271. 'parameters can be built. ar. ols ( ), ar. yw ( ) & ar burg ( ) '
  272. 'allow for a convenient choice of the optimal',
  273. 'id_within_doc': 17}]
  274.  
  275.  
  276. --------------------------------------------------------------------------------------------------------
  277.  
  278. ## sentence-transformers/multi-qa-mpnet-base-dot-v1
  279.  
  280. [{'Rank': 1,
  281. 'Search Score': 23.9894,
  282. 'doc_dir': 'TB-time-seriesR-cowpertwait',
  283. 'doc_name': 'OCR_10_Non-stationary Models_intro time series in R - '
  284. 'cowperwait_',
  285. 'doc_relative_loc': 42.5,
  286. 'doc_text': 'range of models by a trial - and - error approach involving '
  287. 'just editing a command on each trial to see if an improvement '
  288. 'in the aic occurs. alternatively ; we could write a simple '
  289. 'function that fits a range of arima models and selects the best '
  290. '- fitting model this approach works better when the conditional '
  291. 'sum of squares method css is selected in the arima function ; '
  292. 'as the algorithm is more robust _ to avoid over parametrisation '
  293. '; the consistent akaike information criteria ( caic ; see '
  294. 'bozdogan ; 1987 ) can be used in model selection an example '
  295. 'program follows _ get. best arima < function ( x. ts, maxord c '
  296. '( 1, 1, 1, 1, 1, 1 ) ) best aic < 1e8 < length ( x. ts ) for ( '
  297. 'p in 0 : maxord [ 1 ] ) for ( d in 0 : maxord [ 2 ] ) for ( q '
  298. 'in 0 : maxord [ 3 ] ) for ( p in 0 : maxord [ 4 ] ) for ( d in '
  299. '0 : maxord [ 5 ] ) for ( q in 0 : maxord [ 6 ] ) { fit < arima '
  300. '( x. ts _ order c ( p, d, 9 ) seas list ( order c',
  301. 'id_within_doc': 17},
  302. {'Rank': 2,
  303. 'Search Score': 23.9238,
  304. 'doc_dir': 'TB-forecasting-principles',
  305. 'doc_name': 'OCR_Ch-9-ARIMA models-FPP_',
  306. 'doc_relative_loc': 75.269,
  307. 'doc_text': '0 ) 12 0. 0679 the models chosen manually and with auto. arimal '
  308. ') are both in the top four models based on their rmse values. '
  309. 'when models are compared using aicc values, it is important '
  310. 'that all models have the same orders of differencing : however, '
  311. 'when comparing models using a test set, it does not matter how '
  312. 'the forecasts were produced the comparisons are always valid '
  313. 'consequently, in the table above, we can include some models '
  314. 'with only seasonal differencing and some models with both first '
  315. 'and seasonal differencing, while in the earlier table '
  316. 'containing aicc values, we only compared models with seasonal '
  317. 'differencing but no first differencing : none of the models '
  318. 'considered here pass all of the residual tests. in practice, we '
  319. 'would normally use the best model we could find, even if it did '
  320. 'not pass all of the tests. forecasts from the arima ( 3, 0, 1 ) '
  321. '( 0, 1, 2 ) 12 model ( which has the lowest rmse value on the '
  322. 'test set, and the best aicc value amongst models with only '
  323. 'seasonal differencing ) are shown in figure 8. 26. h0z % > '
  324. 'arima ( order - c ( 3, 0, 1 ), seasonal - c',
  325. 'id_within_doc': 70},
  326. {'Rank': 3,
  327. 'Search Score': 23.4669,
  328. 'doc_dir': 'lecture-audio',
  329. 'doc_name': 'SC_lecture_7_apr_4_v_2_c_transcription_10',
  330. 'doc_relative_loc': 88.679,
  331. 'doc_text': "the other hand, it's also not so easy to develop a process that "
  332. "removes this dependency. you'd have to increase the model "
  333. 'orders quite a bit and estimate many more certifications, which '
  334. 'also brings some disadvantages, so to some extent, one '
  335. 'sometimes also accept is certainly a remaining dependency is a '
  336. "lot more disturbing if it's on the first couple of flags rather "
  337. "than besides at the higher lag. it's more tolerated if it's "
  338. "small in magnitude rather than when it's large and magnitude. "
  339. "it's more tolerated when it's only at the single lack, which "
  340. "here, in fact, it is not. there's a second in both, but it's "
  341. "very small. ya. so that's how modeling works. so you always "
  342. 'have this tirade off into the complexity of the model. if the '
  343. 'larger model does not clean advantages and practical '
  344. 'advantages, this is not just removing this, but also practical '
  345. 'advantages. one often proceeds with the smaller model oak. so '
  346. "that's at the end of this example, the end of this chapter on "
  347. 'armapcu. and well, we go to the first application of these '
  348. 'arima processes, which is serious regression at times. so let '
  349. 'me try to explain. so time',
  350. 'id_within_doc': 47},
  351. {'Rank': 4,
  352. 'Search Score': 23.1942,
  353. 'doc_dir': 'TB-intro-TS-and-Forecasting-Brockwell',
  354. 'doc_name': 'OCR_21_Index_Introduction to Time Series and Forecasting_',
  355. 'doc_relative_loc': 40.0,
  356. 'doc_text': 'based 0n confidence regions, forecasting arima processes, 173 - '
  357. '177 369 - 370 forecast function, 182 - 183 uniformly most '
  358. 'powerful test ; 369 h - step predictor ; 175 mean square error '
  359. '0f, 174 forecast density, 289 forward prediction errors, 130 '
  360. 'iarch ( o ) process, 209 fourier frequencies, 107, 109 igarch ( '
  361. 'p, q ) process, 208 fourier indices, 11 independent random '
  362. 'variables, 30, 36, 214 fractionally integrated arma process, '
  363. '339 identification techniques, 163 - 169 estimation of, 340 for '
  364. 'arma processes, 164 422 index identification techniques ( cont '
  365. ': ) for ar ( p ) processes, 142 for ma ( q ) processes, 153 for '
  366. 'seasonal arima processes, 177 igarch ( p, 4 ) process, 208, 209 '
  367. 'iid noise, 6 _ 7, 14 sample acf of, 53 multivariate, 235 '
  368. 'innovations, 62, 271 innovations algorithm, 62 - 65, 132 - 137 '
  369. 'fitted innovations ma ( m ) model, 133 multivariate, 247 input, '
  370. '45, 112, 333 integrated volatility, 217, 218, 220, 226 '
  371. 'intervention analysis, 331 - 334 invertible arma process, 76 '
  372. 'multivariate arma process, 244 investment strategy, 221',
  373. 'id_within_doc': 10}]
  374.  
  375. --------------------------------------------------------------------------------------------------------
  376.  
  377. ## sentence-transformers/msmarco-MiniLM-L6-cos-v5
  378.  
  379. [{'Rank': 1,
  380. 'Search Score': 0.6511,
  381. 'doc_dir': 'TB-time-seriesR-cowpertwait',
  382. 'doc_name': 'OCR_10_Non-stationary Models_intro time series in R - '
  383. 'cowperwait_',
  384. 'doc_relative_loc': 42.5,
  385. 'doc_text': 'range of models by a trial - and - error approach involving '
  386. 'just editing a command on each trial to see if an improvement '
  387. 'in the aic occurs. alternatively ; we could write a simple '
  388. 'function that fits a range of arima models and selects the best '
  389. '- fitting model this approach works better when the conditional '
  390. 'sum of squares method css is selected in the arima function ; '
  391. 'as the algorithm is more robust _ to avoid over parametrisation '
  392. '; the consistent akaike information criteria ( caic ; see '
  393. 'bozdogan ; 1987 ) can be used in model selection an example '
  394. 'program follows _ get. best arima < function ( x. ts, maxord c '
  395. '( 1, 1, 1, 1, 1, 1 ) ) best aic < 1e8 < length ( x. ts ) for ( '
  396. 'p in 0 : maxord [ 1 ] ) for ( d in 0 : maxord [ 2 ] ) for ( q '
  397. 'in 0 : maxord [ 3 ] ) for ( p in 0 : maxord [ 4 ] ) for ( d in '
  398. '0 : maxord [ 5 ] ) for ( q in 0 : maxord [ 6 ] ) { fit < arima '
  399. '( x. ts _ order c ( p, d, 9 ) seas list ( order c',
  400. 'id_within_doc': 17},
  401. {'Rank': 2,
  402. 'Search Score': 0.6476,
  403. 'doc_dir': 'TB-forecasting-principles',
  404. 'doc_name': 'OCR_Ch-9-ARIMA models-FPP_',
  405. 'doc_relative_loc': 73.118,
  406. 'doc_text': ': the model can still be used for forecasting, but the '
  407. 'prediction intervals may not be accurate due to the correlated '
  408. 'residuals. next we will try using the automatic arima algorithm '
  409. ': running auto. arimal ) with all arguments left at their '
  410. 'default values led to an arima ( 2, 1, 3 ) ( 0, 1, 1 ) 12 '
  411. 'model. however ; the model still fails the ljung - box test : '
  412. 'sometimes it is just not possible to find a model that passes '
  413. 'all of the tests. test set evaluation : we will compare some of '
  414. 'the models fitted so far using a test set consisting of the '
  415. 'last two years of data : thus, we fit the models using data '
  416. 'from july 1991 to june 2006, and forecast the script sales for '
  417. 'july 2006 june 2008. the results are summarised in the '
  418. 'following table table 8. 2 : rmse values for various arima '
  419. 'models applied to the hoz monthly script sales data : model '
  420. 'rmse arima ( 3, 0, 1 ) ( 0, 1, 2 ) 12 0. 0622 arima ( 3, 0, 1 ) '
  421. '( 1, 1, 1 ) 12 0. 0630 arima ( 2, 1, 4 ) ( 0, 1, 1 ) 12 0. 0632',
  422. 'id_within_doc': 68},
  423. {'Rank': 3,
  424. 'Search Score': 0.6426,
  425. 'doc_dir': 'course-script',
  426. 'doc_name': 'OCR_ATS_Script_v220214__6',
  427. 'doc_relative_loc': 72.5,
  428. 'doc_text': 'searching for cut - offs. mostly, these are far from evident ; '
  429. 'and thus, an often applied alternative is to consider all '
  430. 'models with p, 9, p, q < 2 and doing an aic - based grid '
  431. 'search, function auto _ arima ( ) may be very handy for this '
  432. 'task for our example, the sarima ( 2, 1, 2 2 ) ( 2, 1, 2 ) " 2 '
  433. 'has the lowest value and also shows satisfactory residuals, '
  434. 'although it seems to perform slightly less well than the sarima '
  435. "( 14, 1, 11 ) 00, 1, 0 )'12 the r - command for the former is : "
  436. 'fit < = arima ( log ( beer ) order - c ( 2, 1, 2 ) seasonal = c '
  437. '( 2, 1, 2 ) ) forecast of log ( beer ) with sarima ( 2, 1, 2 ) '
  438. '( 2, 1, 2 ) 3 3 [ 5 3 9 3 1985 wu 1986 1987 1988 time 1989 1990 '
  439. '1991 as it was mentioned in the introduction to this section, '
  440. 'one of the main advantages of arima and sarima models is that '
  441. 'they allow for quick and convenient forecasting : while this '
  442. 'will be discussed in depth later in section 8, we here provide '
  443. 'a first example to show the',
  444. 'id_within_doc': 29},
  445. {'Rank': 8,
  446. 'Search Score': 0.61,
  447. 'doc_dir': 'course-slides',
  448. 'doc_name': 'OCR_ATS_Slides_v220216__7',
  449. 'doc_relative_loc': 0.0,
  450. 'doc_text': 'arima, sarima & garch fitting an arima in r plausible models '
  451. 'for the logged oil prices after inspection of acfipacf of the '
  452. 'differenced series ( that seems stationary ) : arima ( 1, 1, 1 '
  453. ') or arima ( 2, 1, 1 ), the former has lower aic arima ( lop, '
  454. 'order - c ( 11, 1 ) ) coefficients : arl mal0. 2987 0. 5700 s. '
  455. 'e. 0. 2009 0. 1723 sigma ^ 2 = 0. 006642 : 11 261. 11, a = 518. '
  456. '22 alternative r command with equivalent result : arima ( drop, '
  457. 'order - c ( 1, 0, 1 ), include mean - false ) 291 arima, sarima '
  458. '& garch example : residuals for arima ( 1, 1, 1 ) residuals '
  459. 'from arima ( 1, 1, 1 ) rwwlimhlwkv wwmmv 5 1990 1995 2000 2005 '
  460. '3 3 3 g 8 3 3 3 2 8 3 3 5 10 15 20 25 30 35 5 10 15 20 25 30 35 '
  461. 'lag lag 292 ivrwukv arima, sarima & garch rewriting arima as '
  462. 'non - stationary arm',
  463. 'id_within_doc': 0}]