Untitled

a guest
Jun 19th, 2019
  1. {
  2. "cells": [
  3. {
  4. "cell_type": "markdown",
  5. "metadata": {},
  6. "source": [
  7. "# Exercises"
  8. ]
  9. },
  10. {
  11. "cell_type": "markdown",
  12. "metadata": {},
  13. "source": [
  14. "## 1) Considering the code below:\n",
  15. " -> Create a df from data whose index is labels. Add the necessary imports\n",
  16. " -> Fill the missing values with the mean of the values\n",
  17. " -> Return a list of the unique animals\n",
  18. " -> Compute the basic statistics (count, sum, mean, standard deviation, and variance)"
  19. ]
  20. },
  21. {
  22. "cell_type": "code",
  23. "execution_count": 69,
  24. "metadata": {},
  25. "outputs": [
  26. {
  27. "name": "stdout",
  28. "output_type": "stream",
  29. "text": [
  30. " age animal priority visits\n",
  31. "a 2.5000 cat yes 1\n",
  32. "b 3.0000 cat yes 3\n",
  33. "c 0.5000 snake no 2\n",
  34. "d 3.4375 dog yes 3\n",
  35. "e 5.0000 dog no 2\n",
  36. "f 2.0000 cat no 3\n",
  37. "g 4.5000 snake no 1\n",
  38. "h 3.4375 cat yes 1\n",
  39. "i 7.0000 dog no 2\n",
  40. "j 3.0000 dog no 1\n",
  41. "a cat\n",
  42. "b cat\n",
  43. "c snake\n",
  44. "d dog\n",
  45. "e dog\n",
  46. "f cat\n",
  47. "g snake\n",
  48. "h cat\n",
  49. "i dog\n",
  50. "j dog\n",
  51. "Name: animal, dtype: object\n",
  52. " age visits\n",
  53. "count 10.000000 10.000000\n",
  54. "mean 3.437500 1.900000\n",
  55. "std 1.770711 0.875595\n",
  56. "min 0.500000 1.000000\n",
  57. "25% 2.625000 1.000000\n",
  58. "50% 3.218750 2.000000\n",
  59. "75% 4.234375 2.750000\n",
  60. "max 7.000000 3.000000\n"
  61. ]
  62. }
  63. ],
  64. "source": [
  65. "import pandas as pd\n",
  66. "import numpy as np\n",
  67. "from pandas import DataFrame\n",
  68. "\n",
  69. "\n",
  70. "\n",
  71. "data = {'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],\n",
  72. " 'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],\n",
  73. " 'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],\n",
  74. " 'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}\n",
  75. "\n",
  76. "\n",
  77. "\n",
  78. "labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']\n",
  79. "\n",
  80. "obj = DataFrame(data, index=labels)\n",
  81. "\n",
  82. "print(obj.fillna(obj.mean(numeric_only=True)))  # fill missing ages with the column mean\n",
  83. "print(obj.animal)\n",
  84. "print(obj.fillna(obj.mean(numeric_only=True)).describe())"
  85. ]
  86. },
  87. {
  88. "cell_type": "code",
  89. "execution_count": null,
  90. "metadata": {},
  91. "outputs": [],
  92. "source": [
  93. "# answer 1"
  94. ]
  95. },
  96. {
  97. "cell_type": "markdown",
  98. "metadata": {},
  99. "source": [
  100. "## 2) Using the same code:\n",
  101. " -> Create a function that multiplies the number of visits by 2. Return the complete DataFrame with the updated values\n",
  102. " -> Create a function that inserts a column containing a ranking of the most visited animals. Return the DataFrame sorted in descending order\n",
  103. " -> Check whether the animals in the following list appear in the original data: New_animals = ['cow', 'horse', 'shark']"
  104. ]
  105. },
  106. {
  107. "cell_type": "code",
  108. "execution_count": 74,
  109. "metadata": {},
  110. "outputs": [
  111. {
  112. "name": "stdout",
  113. "output_type": "stream",
  114. "text": [
  115. " age animal priority visits ranked_animal\n",
  116. "a 2.5 cat yes 1 8.5\n",
  117. "g 4.5 snake no 1 8.5\n",
  118. "h NaN cat yes 1 8.5\n",
  119. "j 3.0 dog no 1 8.5\n",
  120. "c 0.5 snake no 2 5.0\n",
  121. "e 5.0 dog no 2 5.0\n",
  122. "i 7.0 dog no 2 5.0\n",
  123. "b 3.0 cat yes 3 2.0\n",
  124. "d NaN dog yes 3 2.0\n",
  125. "f 2.0 cat no 3 2.0\n",
  126. " age animal priority visits ranked_animal validation\n",
  127. "a 2.5 cat yes 1 8.5 False\n",
  128. "b 3.0 cat yes 3 2.0 False\n",
  129. "c 0.5 snake no 2 5.0 False\n",
  130. "d NaN dog yes 3 2.0 False\n",
  131. "e 5.0 dog no 2 5.0 False\n",
  132. "f 2.0 cat no 3 2.0 False\n",
  133. "g 4.5 snake no 1 8.5 False\n",
  134. "h NaN cat yes 1 8.5 False\n",
  135. "i 7.0 dog no 2 5.0 False\n",
  136. "j 3.0 dog no 1 8.5 False\n"
  137. ]
  138. },
  139. {
  140. "name": "stderr",
  141. "output_type": "stream",
  142. "text": [
  143. "C:\\ProgramData\\Anaconda3\\lib\\site-packages\\ipykernel_launcher.py:8: FutureWarning: by argument to sort_index is deprecated, please use .sort_values(by=...)\n",
  144. " \n"
  145. ]
  146. }
  147. ],
  148. "source": [
  149. "# answer 2\n",
  150. "\n",
  151. "#obj['visits'] = obj['visits'].map(lambda x: x*2)\n",
  152. "#print(obj)\n",
  153. "\n",
  154. "obj['ranked_animal'] = obj['visits'].rank(ascending=False)\n",
  155. "\n",
  156. "print(obj.sort_values(by='ranked_animal', ascending=False))  # sort_index(by=...) is deprecated\n",
  157. "\n",
  158. "obj['validation'] = obj['animal'].isin(['cow','horse','shark'])\n",
  159. "print(obj)"
  160. ]
  161. },
  162. {
  163. "cell_type": "markdown",
  164. "metadata": {},
  165. "source": [
  166. "## 3) Create a function that filters the rows of the df in the code below according to the keyword passed as a parameter (state only):"
  167. ]
  168. },
  169. {
  170. "cell_type": "code",
  171. "execution_count": 56,
  172. "metadata": {},
  173. "outputs": [
  174. {
  175. "name": "stdout",
  176. "output_type": "stream",
  177. "text": [
  178. " DateofBirth State\n",
  179. "Jane 1986-11-11 NY\n",
  180. "Nick 1999-05-12 TX\n",
  181. "Aaron 1976-01-01 FL\n",
  182. "Penelope 1986-06-01 AL\n",
  183. "Dean 1983-06-04 AK\n",
  184. "Christina 1990-03-07 TX\n",
  185. "Cornelia 1999-07-09 TX\n"
  186. ]
  187. }
  188. ],
  189. "source": [
  190. "df = pd.DataFrame({'DateofBirth':['1986-11-11','1999-05-12','1976-01-01',\n",
  191. " '1986-06-01','1983-06-04','1990-03-07',\n",
  192. " '1999-07-09'],\n",
  193. " 'State':['NY','TX','FL','AL','AK','TX','TX']},\n",
  194. " index=['Jane','Nick','Aaron','Penelope','Dean',\n",
  195. " 'Christina','Cornelia'])\n",
  196. "print(df)"
  197. ]
  198. },
  199. {
  200. "cell_type": "code",
  201. "execution_count": 59,
  202. "metadata": {},
  203. "outputs": [
  204. {
  205. "name": "stdout",
  206. "output_type": "stream",
  207. "text": [
  208. " DateofBirth State\n",
  209. "Nick 1999-05-12 TX\n",
  210. "Christina 1990-03-07 TX\n",
  211. "Cornelia 1999-07-09 TX\n"
  212. ]
  213. }
  214. ],
  215. "source": [
  216. "\n",
  217. "def filtra_estado(df,estado):\n",
  218. " return df.loc[df['State'] == estado]\n",
  219. "\n",
  220. "print(filtra_estado(df,'TX'))\n",
  221. " "
  222. ]
  223. }
  224. ],
  225. "metadata": {
  226. "kernelspec": {
  227. "display_name": "Python 3",
  228. "language": "python",
  229. "name": "python3"
  230. },
  231. "language_info": {
  232. "codemirror_mode": {
  233. "name": "ipython",
  234. "version": 3
  235. },
  236. "file_extension": ".py",
  237. "mimetype": "text/x-python",
  238. "name": "python",
  239. "nbconvert_exporter": "python",
  240. "pygments_lexer": "ipython3",
  241. "version": "3.6.4"
  242. }
  243. },
  244. "nbformat": 4,
  245. "nbformat_minor": 2
  246. }
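
The recorded answers in the notebook print the whole `animal` column and call `describe()`, which reports neither the sum nor the variance, and exercise 2 asks for functions that the answer cell does not define. The following is a minimal sketch of one way to cover exercises 1 and 2, not the author's solution. All function names (`fill_missing`, `unique_animals`, `basic_stats`, `double_visits`, `add_visit_ranking`, `animals_present`) are hypothetical, and the sketch assumes a pandas version that accepts `numeric_only` in `DataFrame.mean`.

```python
import numpy as np
import pandas as pd

# Same data and labels as in the notebook.
data = {
    'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
    'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
    'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
    'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no'],
}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']


def fill_missing(df):
    """Return a copy with NaN in numeric columns replaced by the column mean."""
    return df.fillna(df.mean(numeric_only=True))


def unique_animals(df):
    """Return the distinct animals as a plain list, in order of first appearance."""
    return df['animal'].unique().tolist()


def basic_stats(df):
    """Count, sum, mean, standard deviation and variance of the numeric columns."""
    return df[['age', 'visits']].agg(['count', 'sum', 'mean', 'std', 'var'])


def double_visits(df):
    """Return a copy of the frame with the visits column multiplied by 2."""
    out = df.copy()
    out['visits'] = out['visits'] * 2
    return out


def add_visit_ranking(df):
    """Add a rank column (1 = most visited) and sort by visits, descending."""
    out = df.copy()
    out['ranked_animal'] = out['visits'].rank(ascending=False)
    return out.sort_values(by='visits', ascending=False)


def animals_present(df, candidates):
    """Map each candidate name to True if it occurs in the animal column."""
    known = set(df['animal'])
    return {name: name in known for name in candidates}


obj = pd.DataFrame(data, index=labels)
filled = fill_missing(obj)

print(unique_animals(filled))
print(basic_stats(filled))
print(double_visits(filled))
print(add_visit_ranking(filled))
print(animals_present(obj, ['cow', 'horse', 'shark']))
```

Each function works on a copy, so `obj` itself is left unchanged. Sorting by `visits` in descending order lists the most visited animals first; the recorded answer sorts by the rank column instead, which lists the least visited animals first.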