Guest User

Untitled

a guest
Jan 17th, 2018
99
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 30.90 KB | None | 0 0
  1. {
  2. "cells": [
  3. {
  4. "cell_type": "markdown",
  5. "metadata": {},
  6. "source": [
  7. "# GLM, variance function, var_weights and scale\n",
  8. "## or families faking each other\n",
  9. "\n",
  10. "This notebook illustrates a few results for GLM and shows that parameter estimates and their standard errors only depend on the link and the variance but not on other properties of the distribution, i.e. family. The variance is composed of three terms, variance function implied by the model, pre-defined var_weights and the overall scale.\n",
  11. "\n",
  12. "In the first example we use Gaussian and Poisson to replicate equivalent results."
  13. ]
  14. },
  15. {
  16. "cell_type": "code",
  17. "execution_count": 1,
  18. "metadata": {
  19. "collapsed": true
  20. },
  21. "outputs": [],
  22. "source": [
  23. "import numpy as np\n",
  24. "import pandas as pd\n",
  25. "#import statsmodels.api as sm\n",
  26. "from statsmodels.tools.tools import add_constant\n",
  27. "from statsmodels.genmod.generalized_linear_model import GLM\n",
  28. "from statsmodels.genmod.families import family\n",
  29. "from statsmodels.genmod.families import links\n",
  30. "from statsmodels.genmod.families import varfuncs\n",
  31. "from statsmodels.iolib.summary2 import summary_col"
  32. ]
  33. },
  34. {
  35. "cell_type": "code",
  36. "execution_count": 2,
  37. "metadata": {
  38. "collapsed": false
  39. },
  40. "outputs": [],
  41. "source": [
  42. "nobs, k_exog = 100, 5\n",
  43. "np.random.seed(987125)\n",
  44. "x = np.random.randn(nobs, k_exog - 1)\n",
  45. "x = add_constant(x)\n",
  46. "\n",
  47. "y_true = x.sum(1) / 2\n",
  48. "y = y_true + 2 * np.random.randn(nobs)"
  49. ]
  50. },
  51. {
  52. "cell_type": "markdown",
  53. "metadata": {},
  54. "source": [
  55. "## Poisson or positive valued data\n",
  56. "\n",
  57. "In the first group of examples we use log link for positive valued data. We generate the dependent variable using Poisson but the same results can be obtained for continuous non-negative valued data."
  58. ]
  59. },
  60. {
  61. "cell_type": "code",
  62. "execution_count": 3,
  63. "metadata": {
  64. "collapsed": false
  65. },
  66. "outputs": [
  67. {
  68. "name": "stdout",
  69. "output_type": "stream",
  70. "text": [
  71. " Generalized Linear Model Regression Results \n",
  72. "==============================================================================\n",
  73. "Dep. Variable: y No. Observations: 100\n",
  74. "Model: GLM Df Residuals: 95\n",
  75. "Model Family: Poisson Df Model: 4\n",
  76. "Link Function: log Scale: 1.0000\n",
  77. "Method: IRLS Log-Likelihood: -154.10\n",
  78. "Date: Mon, 08 Jan 2018 Deviance: 92.192\n",
  79. "Time: 17:22:10 Pearson chi2: 86.3\n",
  80. "No. Iterations: 5 Covariance Type: nonrobust\n",
  81. "==============================================================================\n",
  82. " coef std err z P>|z| [0.025 0.975]\n",
  83. "------------------------------------------------------------------------------\n",
  84. "const 0.4741 0.090 5.283 0.000 0.298 0.650\n",
  85. "x1 0.5156 0.070 7.349 0.000 0.378 0.653\n",
  86. "x2 0.4700 0.056 8.323 0.000 0.359 0.581\n",
  87. "x3 0.5534 0.067 8.216 0.000 0.421 0.685\n",
  88. "x4 0.5583 0.073 7.674 0.000 0.416 0.701\n",
  89. "==============================================================================\n"
  90. ]
  91. }
  92. ],
  93. "source": [
  94. "yp = np.random.poisson(np.exp(y_true))\n",
  95. "\n",
  96. "fam = family.Poisson()\n",
  97. "mod_poi = GLM(yp, x, family=fam)\n",
  98. "res_poi = mod_poi.fit()\n",
  99. "print(res_poi.summary())"
  100. ]
  101. },
  102. {
  103. "cell_type": "code",
  104. "execution_count": 4,
  105. "metadata": {
  106. "collapsed": false
  107. },
  108. "outputs": [
  109. {
  110. "name": "stdout",
  111. "output_type": "stream",
  112. "text": [
  113. " Generalized Linear Model Regression Results \n",
  114. "==============================================================================\n",
  115. "Dep. Variable: y No. Observations: 100\n",
  116. "Model: GLM Df Residuals: 95\n",
  117. "Model Family: Gaussian Df Model: 4\n",
  118. "Link Function: log Scale: 1.8064\n",
  119. "Method: IRLS Log-Likelihood: -168.96\n",
  120. "Date: Mon, 08 Jan 2018 Deviance: 171.60\n",
  121. "Time: 17:22:11 Pearson chi2: 172.\n",
  122. "No. Iterations: 8 Covariance Type: nonrobust\n",
  123. "==============================================================================\n",
  124. " coef std err z P>|z| [0.025 0.975]\n",
  125. "------------------------------------------------------------------------------\n",
  126. "const 0.4049 0.082 4.940 0.000 0.244 0.566\n",
  127. "x1 0.5119 0.045 11.335 0.000 0.423 0.600\n",
  128. "x2 0.5252 0.037 14.119 0.000 0.452 0.598\n",
  129. "x3 0.6049 0.041 14.804 0.000 0.525 0.685\n",
  130. "x4 0.5444 0.053 10.241 0.000 0.440 0.649\n",
  131. "==============================================================================\n"
  132. ]
  133. }
  134. ],
  135. "source": [
  136. "fam_gau = family.Gaussian(link=links.log)\n",
  137. "mod_gau = GLM(yp, x, family=fam_gau)\n",
  138. "res_gau = mod_gau.fit()\n",
  139. "print(res_gau.summary())"
  140. ]
  141. },
  142. {
  143. "cell_type": "code",
  144. "execution_count": 5,
  145. "metadata": {
  146. "collapsed": false
  147. },
  148. "outputs": [
  149. {
  150. "name": "stdout",
  151. "output_type": "stream",
  152. "text": [
  153. " Generalized Linear Model Regression Results \n",
  154. "==============================================================================\n",
  155. "Dep. Variable: y No. Observations: 100\n",
  156. "Model: GLM Df Residuals: 95\n",
  157. "Model Family: Poisson Df Model: 4\n",
  158. "Link Function: log Scale: 0.90875\n",
  159. "Method: IRLS Log-Likelihood: -169.57\n",
  160. "Date: Mon, 08 Jan 2018 Deviance: 92.192\n",
  161. "Time: 17:22:11 Pearson chi2: 86.3\n",
  162. "No. Iterations: 7 Covariance Type: nonrobust\n",
  163. "==============================================================================\n",
  164. " coef std err z P>|z| [0.025 0.975]\n",
  165. "------------------------------------------------------------------------------\n",
  166. "const 0.4741 0.086 5.542 0.000 0.306 0.642\n",
  167. "x1 0.5156 0.067 7.709 0.000 0.385 0.647\n",
  168. "x2 0.4700 0.054 8.731 0.000 0.365 0.576\n",
  169. "x3 0.5534 0.064 8.618 0.000 0.428 0.679\n",
  170. "x4 0.5583 0.069 8.050 0.000 0.422 0.694\n",
  171. "==============================================================================\n"
  172. ]
  173. }
  174. ],
  175. "source": [
  176. "fam = family.Poisson()\n",
  177. "mod_poiq = GLM(yp, x, family=fam)\n",
  178. "res_poiq = mod_poiq.fit(scale='x2')\n",
  179. "print(res_poiq.summary())"
  180. ]
  181. },
  182. {
  183. "cell_type": "code",
  184. "execution_count": 6,
  185. "metadata": {
  186. "collapsed": false
  187. },
  188. "outputs": [
  189. {
  190. "name": "stdout",
  191. "output_type": "stream",
  192. "text": [
  193. " Generalized Linear Model Regression Results \n",
  194. "==============================================================================\n",
  195. "Dep. Variable: y No. Observations: 100\n",
  196. "Model: GLM Df Residuals: 95\n",
  197. "Model Family: Gaussian Df Model: 4\n",
  198. "Link Function: log Scale: 1.8064\n",
  199. "Method: IRLS Log-Likelihood: -168.96\n",
  200. "Date: Mon, 08 Jan 2018 Deviance: 171.60\n",
  201. "Time: 17:22:11 Pearson chi2: 172.\n",
  202. "No. Iterations: 8 Covariance Type: nonrobust\n",
  203. "==============================================================================\n",
  204. " coef std err z P>|z| [0.025 0.975]\n",
  205. "------------------------------------------------------------------------------\n",
  206. "const 0.4049 0.082 4.940 0.000 0.244 0.566\n",
  207. "x1 0.5119 0.045 11.335 0.000 0.423 0.600\n",
  208. "x2 0.5252 0.037 14.119 0.000 0.452 0.598\n",
  209. "x3 0.6049 0.041 14.804 0.000 0.525 0.685\n",
  210. "x4 0.5444 0.053 10.241 0.000 0.440 0.649\n",
  211. "==============================================================================\n"
  212. ]
  213. }
  214. ],
  215. "source": [
  216. "fam_gaup = family.Gaussian(link=links.log)\n",
  217. "mod_gaup = GLM(yp, x, family=fam_gau, var_weights=1/res_poi.fittedvalues)\n",
  218. "res_gaup = mod_gaup.fit()\n",
  219. "print(res_gau.summary())"
  220. ]
  221. },
  222. {
  223. "cell_type": "code",
  224. "execution_count": 7,
  225. "metadata": {
  226. "collapsed": false
  227. },
  228. "outputs": [
  229. {
  230. "name": "stdout",
  231. "output_type": "stream",
  232. "text": [
  233. " Generalized Linear Model Regression Results \n",
  234. "==============================================================================\n",
  235. "Dep. Variable: y No. Observations: 100\n",
  236. "Model: GLM Df Residuals: 95\n",
  237. "Model Family: Gaussian Df Model: 4\n",
  238. "Link Function: log Scale: 1.0000\n",
  239. "Method: IRLS Log-Likelihood: -159.62\n",
  240. "Date: Mon, 08 Jan 2018 Deviance: 86.331\n",
  241. "Time: 17:22:11 Pearson chi2: 86.3\n",
  242. "No. Iterations: 8 Covariance Type: nonrobust\n",
  243. "==============================================================================\n",
  244. " coef std err z P>|z| [0.025 0.975]\n",
  245. "------------------------------------------------------------------------------\n",
  246. "const 0.4741 0.090 5.283 0.000 0.298 0.650\n",
  247. "x1 0.5156 0.070 7.349 0.000 0.378 0.653\n",
  248. "x2 0.4700 0.056 8.323 0.000 0.359 0.581\n",
  249. "x3 0.5534 0.067 8.216 0.000 0.421 0.685\n",
  250. "x4 0.5583 0.073 7.674 0.000 0.416 0.701\n",
  251. "==============================================================================\n"
  252. ]
  253. }
  254. ],
  255. "source": [
  256. "fam_gaup = family.Gaussian(link=links.log)\n",
  257. "mod_gaups = GLM(yp, x, family=fam_gau, var_weights=1/res_poi.fittedvalues)\n",
  258. "res_gaups = mod_gaups.fit(scale=1.)\n",
  259. "print(res_gaups.summary())"
  260. ]
  261. },
  262. {
  263. "cell_type": "code",
  264. "execution_count": 8,
  265. "metadata": {
  266. "collapsed": false
  267. },
  268. "outputs": [
  269. {
  270. "name": "stdout",
  271. "output_type": "stream",
  272. "text": [
  273. " Generalized Linear Model Regression Results \n",
  274. "==============================================================================\n",
  275. "Dep. Variable: y No. Observations: 100\n",
  276. "Model: GLM Df Residuals: 95\n",
  277. "Model Family: Poisson Df Model: 4\n",
  278. "Link Function: log Scale: 1.8064\n",
  279. "Method: IRLS Log-Likelihood: -281.10\n",
  280. "Date: Mon, 08 Jan 2018 Deviance: 192.31\n",
  281. "Time: 17:22:11 Pearson chi2: 172.\n",
  282. "No. Iterations: 7 Covariance Type: nonrobust\n",
  283. "==============================================================================\n",
  284. " coef std err z P>|z| [0.025 0.975]\n",
  285. "------------------------------------------------------------------------------\n",
  286. "const 0.4049 0.082 4.940 0.000 0.244 0.566\n",
  287. "x1 0.5119 0.045 11.335 0.000 0.423 0.600\n",
  288. "x2 0.5252 0.037 14.119 0.000 0.452 0.598\n",
  289. "x3 0.6049 0.041 14.804 0.000 0.525 0.685\n",
  290. "x4 0.5444 0.053 10.241 0.000 0.440 0.649\n",
  291. "==============================================================================\n"
  292. ]
  293. }
  294. ],
  295. "source": [
  296. "mod_poig = GLM(yp, x, family=family.Poisson(), var_weights=res_gau.fittedvalues)\n",
  297. "res_poig = mod_poig.fit(scale='x2')\n",
  298. "print(res_poig.summary())"
  299. ]
  300. },
  301. {
  302. "cell_type": "code",
  303. "execution_count": 9,
  304. "metadata": {
  305. "collapsed": false
  306. },
  307. "outputs": [
  308. {
  309. "data": {
  310. "text/html": [
  311. "<table class=\"simpletable\">\n",
  312. "<tr>\n",
  313. " <td></td> <th>Gaussian</th> <th>Poisson_g</th> <th>Poisson</th> <th>Gauss_p</th> <th>Poisson-QMLE</th> <th>Gaussian-varweights</th>\n",
  314. "</tr>\n",
  315. "<tr>\n",
  316. " <th>const</th> <td>0.4049</td> <td>0.4049</td> <td>0.4741</td> <td>0.4741</td> <td>0.4741</td> <td>0.4741</td> \n",
  317. "</tr>\n",
  318. "<tr>\n",
  319. " <th></th> <td>(0.0820)</td> <td>(0.0820)</td> <td>(0.0897)</td> <td>(0.0897)</td> <td>(0.0855)</td> <td>(0.0855)</td> \n",
  320. "</tr>\n",
  321. "<tr>\n",
  322. " <th>x1</th> <td>0.5119</td> <td>0.5119</td> <td>0.5156</td> <td>0.5156</td> <td>0.5156</td> <td>0.5156</td> \n",
  323. "</tr>\n",
  324. "<tr>\n",
  325. " <th></th> <td>(0.0452)</td> <td>(0.0452)</td> <td>(0.0702)</td> <td>(0.0702)</td> <td>(0.0669)</td> <td>(0.0669)</td> \n",
  326. "</tr>\n",
  327. "<tr>\n",
  328. " <th>x2</th> <td>0.5252</td> <td>0.5252</td> <td>0.4700</td> <td>0.4700</td> <td>0.4700</td> <td>0.4700</td> \n",
  329. "</tr>\n",
  330. "<tr>\n",
  331. " <th></th> <td>(0.0372)</td> <td>(0.0372)</td> <td>(0.0565)</td> <td>(0.0565)</td> <td>(0.0538)</td> <td>(0.0538)</td> \n",
  332. "</tr>\n",
  333. "<tr>\n",
  334. " <th>x3</th> <td>0.6049</td> <td>0.6049</td> <td>0.5534</td> <td>0.5534</td> <td>0.5534</td> <td>0.5534</td> \n",
  335. "</tr>\n",
  336. "<tr>\n",
  337. " <th></th> <td>(0.0409)</td> <td>(0.0409)</td> <td>(0.0674)</td> <td>(0.0674)</td> <td>(0.0642)</td> <td>(0.0642)</td> \n",
  338. "</tr>\n",
  339. "<tr>\n",
  340. " <th>x4</th> <td>0.5444</td> <td>0.5444</td> <td>0.5583</td> <td>0.5583</td> <td>0.5583</td> <td>0.5583</td> \n",
  341. "</tr>\n",
  342. "<tr>\n",
  343. " <th></th> <td>(0.0532)</td> <td>(0.0532)</td> <td>(0.0728)</td> <td>(0.0728)</td> <td>(0.0694)</td> <td>(0.0694)</td> \n",
  344. "</tr>\n",
  345. "</table>"
  346. ],
  347. "text/plain": [
  348. "<class 'statsmodels.iolib.summary2.Summary'>\n",
  349. "\"\"\"\n",
  350. "\n",
  351. "===========================================================================\n",
  352. " Gaussian Poisson_g Poisson Gauss_p Poisson-QMLE Gaussian-varweights\n",
  353. "---------------------------------------------------------------------------\n",
  354. "const 0.4049 0.4049 0.4741 0.4741 0.4741 0.4741 \n",
  355. " (0.0820) (0.0820) (0.0897) (0.0897) (0.0855) (0.0855) \n",
  356. "x1 0.5119 0.5119 0.5156 0.5156 0.5156 0.5156 \n",
  357. " (0.0452) (0.0452) (0.0702) (0.0702) (0.0669) (0.0669) \n",
  358. "x2 0.5252 0.5252 0.4700 0.4700 0.4700 0.4700 \n",
  359. " (0.0372) (0.0372) (0.0565) (0.0565) (0.0538) (0.0538) \n",
  360. "x3 0.6049 0.6049 0.5534 0.5534 0.5534 0.5534 \n",
  361. " (0.0409) (0.0409) (0.0674) (0.0674) (0.0642) (0.0642) \n",
  362. "x4 0.5444 0.5444 0.5583 0.5583 0.5583 0.5583 \n",
  363. " (0.0532) (0.0532) (0.0728) (0.0728) (0.0694) (0.0694) \n",
  364. "===========================================================================\n",
  365. "Standard errors in parentheses.\n",
  366. "\"\"\""
  367. ]
  368. },
  369. "execution_count": 9,
  370. "metadata": {},
  371. "output_type": "execute_result"
  372. }
  373. ],
  374. "source": [
  375. "model_list = [res_gau, res_poig, res_poi, res_gaups, res_poiq, res_gaup]\n",
  376. "model_names = 'Gaussian Poisson_g Poisson Gauss_p Poisson-QMLE Gaussian-varweights'.split()\n",
  377. "summary_col(model_list, model_names=model_names)"
  378. ]
  379. },
  380. {
  381. "cell_type": "markdown",
  382. "metadata": {},
  383. "source": [
  384. "### Interpretation of results\n",
  385. "\n",
  386. "`Gaussian` and `Poisson` are the standard likelihood models both with log-link. The second model in the table is a Poisson model that replicates the variance of the Gaussian model. The remaining three models use the variance function corresponding to Poisson but use different scale estimates. The scale estimate only affects the standard errors but not the parameter estimate.\n",
  387. "\n",
  388. "- Poisson_g : Poisson model that replicates the standard Gaussian model using var_weights and Pearson chi2 scale estimate\n",
  389. "- Gauss_p : Gaussian with var_weights corresponding to the Poisson estimate and scale fixed at 1. This replicates parameters and standard errors of the standard Poisson MLE\n",
  390. "- Poisson_QMLE : This model estimates a standard Poisson model, but uses Pearson chi-square as scale estimate. This is the standard QMLE in the GLM literature.\n",
  391. "- Gaussian-varweights : Gaussian model with var_weights with scale estimated by default using Pearson chi-square. This reproduces Poisson-QMLE."
  392. ]
  393. },
  394. {
  395. "cell_type": "code",
  396. "execution_count": null,
  397. "metadata": {
  398. "collapsed": true
  399. },
  400. "outputs": [],
  401. "source": []
  402. },
  403. {
  404. "cell_type": "markdown",
  405. "metadata": {},
  406. "source": [
  407. "## Faking Binomial Model with Poisson\n",
  408. "\n",
  409. "In this example we create Binomial or Bernoulli data with logit link and replicate parameters and their standard errors of the binomial GLM by Poisson with adjusted variance. We override the variance in two ways, in the first we change the underlying variance function. In the second case we use var_weights to replicate the implicite variance of the Binomial model."
  410. ]
  411. },
  412. {
  413. "cell_type": "code",
  414. "execution_count": 10,
  415. "metadata": {
  416. "collapsed": false
  417. },
  418. "outputs": [
  419. {
  420. "name": "stdout",
  421. "output_type": "stream",
  422. "text": [
  423. " Generalized Linear Model Regression Results \n",
  424. "==============================================================================\n",
  425. "Dep. Variable: y No. Observations: 100\n",
  426. "Model: GLM Df Residuals: 95\n",
  427. "Model Family: Binomial Df Model: 4\n",
  428. "Link Function: logit Scale: 1.0000\n",
  429. "Method: IRLS Log-Likelihood: -56.779\n",
  430. "Date: Mon, 08 Jan 2018 Deviance: 113.56\n",
  431. "Time: 17:22:11 Pearson chi2: 99.3\n",
  432. "No. Iterations: 4 Covariance Type: nonrobust\n",
  433. "==============================================================================\n",
  434. " coef std err z P>|z| [0.025 0.975]\n",
  435. "------------------------------------------------------------------------------\n",
  436. "const 0.4600 0.232 1.986 0.047 0.006 0.914\n",
  437. "x1 0.6948 0.253 2.745 0.006 0.199 1.191\n",
  438. "x2 0.5570 0.226 2.470 0.014 0.115 0.999\n",
  439. "x3 0.5921 0.243 2.441 0.015 0.117 1.068\n",
  440. "x4 0.3510 0.238 1.472 0.141 -0.116 0.818\n",
  441. "==============================================================================\n"
  442. ]
  443. }
  444. ],
  445. "source": [
  446. "yb = np.random.binomial(1, 1 / (1 + np.exp(-y_true)))\n",
  447. "\n",
  448. "mod_bin = GLM(yb, x, family=family.Binomial())\n",
  449. "res_bin = mod_bin.fit()\n",
  450. "print(res_bin.summary())"
  451. ]
  452. },
  453. {
  454. "cell_type": "code",
  455. "execution_count": 11,
  456. "metadata": {
  457. "collapsed": false
  458. },
  459. "outputs": [
  460. {
  461. "name": "stdout",
  462. "output_type": "stream",
  463. "text": [
  464. " Generalized Linear Model Regression Results \n",
  465. "==============================================================================\n",
  466. "Dep. Variable: y No. Observations: 100\n",
  467. "Model: GLM Df Residuals: 95\n",
  468. "Model Family: Poisson Df Model: 4\n",
  469. "Link Function: logit Scale: 1.0000\n",
  470. "Method: IRLS Log-Likelihood: -85.525\n",
  471. "Date: Mon, 08 Jan 2018 Deviance: 51.050\n",
  472. "Time: 17:22:11 Pearson chi2: 99.3\n",
  473. "No. Iterations: 5 Covariance Type: nonrobust\n",
  474. "==============================================================================\n",
  475. " coef std err z P>|z| [0.025 0.975]\n",
  476. "------------------------------------------------------------------------------\n",
  477. "const 0.4600 0.232 1.986 0.047 0.006 0.914\n",
  478. "x1 0.6948 0.253 2.745 0.006 0.199 1.191\n",
  479. "x2 0.5570 0.226 2.470 0.014 0.115 0.999\n",
  480. "x3 0.5921 0.243 2.441 0.015 0.117 1.068\n",
  481. "x4 0.3510 0.238 1.472 0.141 -0.116 0.818\n",
  482. "==============================================================================\n"
  483. ]
  484. },
  485. {
  486. "name": "stderr",
  487. "output_type": "stream",
  488. "text": [
  489. "m:\\josef_new\\eclipse_ws\\statsmodels\\statsmodels_py34_pr\\statsmodels\\genmod\\generalized_linear_model.py:309: DomainWarning: The logit link function does not respect the domain of the Poisson family.\n",
  490. " DomainWarning)\n"
  491. ]
  492. }
  493. ],
  494. "source": [
  495. "fam = family.Poisson()\n",
  496. "# need to work around link restriction\n",
  497. "if links.Logit not in fam.links:\n",
  498. " fam.links.append(links.Logit)\n",
  499. "fam.link=links.logit()\n",
  500. "fam.variance = varfuncs.Binomial(n=1)\n",
  501. "mod_poi_vb = GLM(yb, x, family=fam)\n",
  502. "res_poi_vb = mod_poi_vb.fit()\n",
  503. "print(res_poi_vb.summary())"
  504. ]
  505. },
  506. {
  507. "cell_type": "code",
  508. "execution_count": 12,
  509. "metadata": {
  510. "collapsed": false
  511. },
  512. "outputs": [
  513. {
  514. "name": "stdout",
  515. "output_type": "stream",
  516. "text": [
  517. " Generalized Linear Model Regression Results \n",
  518. "==============================================================================\n",
  519. "Dep. Variable: y No. Observations: 100\n",
  520. "Model: GLM Df Residuals: 95\n",
  521. "Model Family: Poisson Df Model: 4\n",
  522. "Link Function: logit Scale: 1.0000\n",
  523. "Method: IRLS Log-Likelihood: -353.77\n",
  524. "Date: Mon, 08 Jan 2018 Deviance: 154.74\n",
  525. "Time: 17:22:11 Pearson chi2: 99.3\n",
  526. "No. Iterations: 21 Covariance Type: nonrobust\n",
  527. "==============================================================================\n",
  528. " coef std err z P>|z| [0.025 0.975]\n",
  529. "------------------------------------------------------------------------------\n",
  530. "const 0.4600 0.232 1.986 0.047 0.006 0.914\n",
  531. "x1 0.6948 0.253 2.745 0.006 0.199 1.191\n",
  532. "x2 0.5570 0.226 2.470 0.014 0.115 0.999\n",
  533. "x3 0.5921 0.243 2.441 0.015 0.117 1.068\n",
  534. "x4 0.3510 0.238 1.472 0.141 -0.116 0.818\n",
  535. "==============================================================================\n"
  536. ]
  537. },
  538. {
  539. "name": "stderr",
  540. "output_type": "stream",
  541. "text": [
  542. "m:\\josef_new\\eclipse_ws\\statsmodels\\statsmodels_py34_pr\\statsmodels\\genmod\\generalized_linear_model.py:309: DomainWarning: The logit link function does not respect the domain of the Poisson family.\n",
  543. " DomainWarning)\n"
  544. ]
  545. }
  546. ],
  547. "source": [
  548. "var_binom = res_bin.model.family.variance(res_bin.fittedvalues)\n",
  549. "\n",
  550. "if links.Logit not in fam.links:\n",
  551. " family.Poisson.links.append(links.Logit)\n",
  552. "\n",
  553. "fam = family.Poisson(links.logit)\n",
  554. "mod_poi_vw = GLM(yb, x, family=fam, var_weights = res_bin.fittedvalues/var_binom)\n",
  555. "res_poi_vw = mod_poi_vw.fit()\n",
  556. "print(res_poi_vw.summary())"
  557. ]
  558. },
  559. {
  560. "cell_type": "code",
  561. "execution_count": 13,
  562. "metadata": {
  563. "collapsed": false
  564. },
  565. "outputs": [
  566. {
  567. "data": {
  568. "text/html": [
  569. "<table class=\"simpletable\">\n",
  570. "<tr>\n",
  571. " <td></td> <th>Binomial</th> <th>Poisson_varfunc</th> <th>Poisson-varweights</th>\n",
  572. "</tr>\n",
  573. "<tr>\n",
  574. " <th>const</th> <td>0.4600</td> <td>0.4600</td> <td>0.4600</td> \n",
  575. "</tr>\n",
  576. "<tr>\n",
  577. " <th></th> <td>(0.2316)</td> <td>(0.2316)</td> <td>(0.2316)</td> \n",
  578. "</tr>\n",
  579. "<tr>\n",
  580. " <th>x1</th> <td>0.6948</td> <td>0.6948</td> <td>0.6948</td> \n",
  581. "</tr>\n",
  582. "<tr>\n",
  583. " <th></th> <td>(0.2531)</td> <td>(0.2531)</td> <td>(0.2531)</td> \n",
  584. "</tr>\n",
  585. "<tr>\n",
  586. " <th>x2</th> <td>0.5570</td> <td>0.5570</td> <td>0.5570</td> \n",
  587. "</tr>\n",
  588. "<tr>\n",
  589. " <th></th> <td>(0.2255)</td> <td>(0.2255)</td> <td>(0.2255)</td> \n",
  590. "</tr>\n",
  591. "<tr>\n",
  592. " <th>x3</th> <td>0.5921</td> <td>0.5921</td> <td>0.5921</td> \n",
  593. "</tr>\n",
  594. "<tr>\n",
  595. " <th></th> <td>(0.2426)</td> <td>(0.2426)</td> <td>(0.2426)</td> \n",
  596. "</tr>\n",
  597. "<tr>\n",
  598. " <th>x4</th> <td>0.3510</td> <td>0.3510</td> <td>0.3510</td> \n",
  599. "</tr>\n",
  600. "<tr>\n",
  601. " <th></th> <td>(0.2384)</td> <td>(0.2384)</td> <td>(0.2384)</td> \n",
  602. "</tr>\n",
  603. "</table>"
  604. ],
  605. "text/plain": [
  606. "<class 'statsmodels.iolib.summary2.Summary'>\n",
  607. "\"\"\"\n",
  608. "\n",
  609. "=================================================\n",
  610. " Binomial Poisson_varfunc Poisson-varweights\n",
  611. "-------------------------------------------------\n",
  612. "const 0.4600 0.4600 0.4600 \n",
  613. " (0.2316) (0.2316) (0.2316) \n",
  614. "x1 0.6948 0.6948 0.6948 \n",
  615. " (0.2531) (0.2531) (0.2531) \n",
  616. "x2 0.5570 0.5570 0.5570 \n",
  617. " (0.2255) (0.2255) (0.2255) \n",
  618. "x3 0.5921 0.5921 0.5921 \n",
  619. " (0.2426) (0.2426) (0.2426) \n",
  620. "x4 0.3510 0.3510 0.3510 \n",
  621. " (0.2384) (0.2384) (0.2384) \n",
  622. "=================================================\n",
  623. "Standard errors in parentheses.\n",
  624. "\"\"\""
  625. ]
  626. },
  627. "execution_count": 13,
  628. "metadata": {},
  629. "output_type": "execute_result"
  630. }
  631. ],
  632. "source": [
  633. "model_list = [res_bin, res_poi_vb, res_poi_vw]\n",
  634. "model_names = 'Binomial Poisson_varfunc Poisson-varweights'.split()\n",
  635. "summary_col(model_list, model_names=model_names)"
  636. ]
  637. },
  638. {
  639. "cell_type": "code",
  640. "execution_count": null,
  641. "metadata": {
  642. "collapsed": true
  643. },
  644. "outputs": [],
  645. "source": []
  646. }
  647. ],
  648. "metadata": {
  649. "kernelspec": {
  650. "display_name": "Python 3",
  651. "language": "python",
  652. "name": "python3"
  653. },
  654. "language_info": {
  655. "codemirror_mode": {
  656. "name": "ipython",
  657. "version": 3
  658. },
  659. "file_extension": ".py",
  660. "mimetype": "text/x-python",
  661. "name": "python",
  662. "nbconvert_exporter": "python",
  663. "pygments_lexer": "ipython3",
  664. "version": "3.4.4"
  665. }
  666. },
  667. "nbformat": 4,
  668. "nbformat_minor": 1
  669. }
Add Comment
Please, Sign In to add comment