Advertisement
Guest User

Untitled

a guest
Oct 23rd, 2019
81
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 16.22 KB | None | 0 0
  1. {
  2. "cells": [
  3. {
  4. "cell_type": "markdown",
  5. "metadata": {},
  6. "source": [
  7. "Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\\rightarrow$Run All).\n",
  8. "\n",
  9. "Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\", as well as your name and collaborators below:"
  10. ]
  11. },
  12. {
  13. "cell_type": "code",
  14. "execution_count": 1,
  15. "metadata": {},
  16. "outputs": [],
  17. "source": [
  18. "NAME = \"Lucas Mudo de Araujo\"\n",
  19. "COLLABORATORS = \"\""
  20. ]
  21. },
  22. {
  23. "cell_type": "markdown",
  24. "metadata": {},
  25. "source": [
  26. "---"
  27. ]
  28. },
  29. {
  30. "cell_type": "markdown",
  31. "metadata": {
  32. "deletable": false,
  33. "editable": false,
  34. "nbgrader": {
  35. "checksum": "69779d40de39e2fcb822ffd099ed946a",
  36. "grade": false,
  37. "grade_id": "cell-361a67c139252f60",
  38. "locked": true,
  39. "schema_version": 1,
  40. "solution": false
  41. }
  42. },
  43. "source": [
  44. "# CS110 Pre-class Work 5.1\n",
  45. "\n",
  46. "## Question 1.\n",
  47. "\n",
  48. "### Question 1a.\n",
  49. "\n",
  50. "Write the worst sort function in the world, `worst_sort`. This function takes a list, and then randomly shuffles it until it happens to be in sorted order. Once it is in sorted order then the list is returned.\n"
  51. ]
  52. },
  53. {
  54. "cell_type": "code",
  55. "execution_count": 5,
  56. "metadata": {
  57. "deletable": false,
  58. "nbgrader": {
  59. "checksum": "c144735a0a3c9ee1e5a60e781fec46a9",
  60. "grade": false,
  61. "grade_id": "cell-ead9c74054a1c5da",
  62. "locked": false,
  63. "schema_version": 1,
  64. "solution": true
  65. }
  66. },
  67. "outputs": [
  68. {
  69. "data": {
  70. "text/plain": [
  71. "[2, 3, 4, 5]"
  72. ]
  73. },
  74. "execution_count": 5,
  75. "metadata": {},
  76. "output_type": "execute_result"
  77. }
  78. ],
  79. "source": [
  80. "import random, numpy as np\n",
  81. "def worst_sort(A):\n",
  82. " \"\"\"\n",
  83. " Sort A in ascending order by randomly shuffling its \n",
  84. " elements until they are in order.\n",
  85. " \n",
  86. " Input:\n",
  87. " - A: list of numerical values\n",
  88. " \n",
  89. " Output:\n",
  90. " - A: Sorted list\n",
  91. " \"\"\"\n",
  92. " B = A.copy()\n",
  93. " while A != sorted(B):\n",
  94. " A = randomize_in_place(A)\n",
  95. " return A\n",
  96. " \n",
  97. "def randomize_in_place(A):\n",
  98. " for i in range (len(A)):\n",
  99. " random_i = np.random.randint(0,len(A))\n",
  100. " A[i],A[random_i] = A[random_i],A[i]\n",
  101. " return A\n",
  102. "\n",
  103. "worst_sort([4,5,3,2])"
  104. ]
  105. },
  106. {
  107. "cell_type": "code",
  108. "execution_count": 6,
  109. "metadata": {
  110. "deletable": false,
  111. "editable": false,
  112. "nbgrader": {
  113. "checksum": "09c3f02f46b5ba86c3aa80498b9c8c34",
  114. "grade": true,
  115. "grade_id": "cell-ff4db3a3e8b04515",
  116. "locked": true,
  117. "points": 1,
  118. "schema_version": 1,
  119. "solution": false
  120. }
  121. },
  122. "outputs": [],
  123. "source": [
  124. "# Please ignore this cell. This cell is for us to implement the tests \n",
  125. "# to see if your code works properly. "
  126. ]
  127. },
  128. {
  129. "cell_type": "markdown",
  130. "metadata": {
  131. "deletable": false,
  132. "editable": false,
  133. "nbgrader": {
  134. "checksum": "18d2bd43c2db165cb6806cb1e14df4f1",
  135. "grade": false,
  136. "grade_id": "cell-3ffcfdae9ec3ea41",
  137. "locked": true,
  138. "schema_version": 1,
  139. "solution": false
  140. }
  141. },
  142. "source": [
  143. "### Question 1b.\n",
  144. "What is the best case complexity of this algorithm?"
  145. ]
  146. },
  147. {
  148. "cell_type": "markdown",
  149. "metadata": {
  150. "deletable": false,
  151. "nbgrader": {
  152. "checksum": "76bf94004078f73d21791758ac6db72b",
  153. "grade": true,
  154. "grade_id": "cell-afac80171b3b6232",
  155. "locked": false,
  156. "points": 0,
  157. "schema_version": 1,
  158. "solution": true
  159. }
  160. },
  161. "source": [
  162. "When the first permutation results in the ascending sorted array. Because I am using one of the randomization method introduced by Cormen et al. that produces a uniform random permutation, the probability of such combination is $\\frac{1}{n!}$."
  163. ]
  164. },
  165. {
  166. "cell_type": "markdown",
  167. "metadata": {
  168. "deletable": false,
  169. "editable": false,
  170. "nbgrader": {
  171. "checksum": "1b1210d5a27f9933086d0f3a0d234361",
  172. "grade": false,
  173. "grade_id": "cell-d98018127b7e79f4",
  174. "locked": true,
  175. "schema_version": 1,
  176. "solution": false
  177. }
  178. },
  179. "source": [
  180. "### Question 1c.\n",
  181. "\n",
  182. "What is the average case complexity?\n"
  183. ]
  184. },
  185. {
  186. "cell_type": "markdown",
  187. "metadata": {
  188. "deletable": false,
  189. "nbgrader": {
  190. "checksum": "97d1e74314efdae93ea73d510e233468",
  191. "grade": true,
  192. "grade_id": "cell-9d52d34daa1e3478",
  193. "locked": false,
  194. "points": 0,
  195. "schema_version": 1,
  196. "solution": true
  197. }
  198. },
  199. "source": [
  200. "We can expect the average-case complexity for such an algorithm to be the probability of finding the right permutation. Since such chance is $1/n!$, we can expect an average performance to be $n!$, assuming that we would obtain the desired combination at least once after trying $n!$ permutations. "
  201. ]
  202. },
  203. {
  204. "cell_type": "markdown",
  205. "metadata": {
  206. "deletable": false,
  207. "editable": false,
  208. "nbgrader": {
  209. "checksum": "fdbf5403114d2a68085e8172d0122685",
  210. "grade": false,
  211. "grade_id": "cell-d6d8ad45088488eb",
  212. "locked": true,
  213. "schema_version": 1,
  214. "solution": false
  215. }
  216. },
  217. "source": [
  218. "### Question 1d.\n",
  219. "\n",
  220. "For what size lists is this a feasible method?"
  221. ]
  222. },
  223. {
  224. "cell_type": "markdown",
  225. "metadata": {
  226. "deletable": false,
  227. "nbgrader": {
  228. "checksum": "080a5eecbe3afcffed278b7cbba7f51a",
  229. "grade": true,
  230. "grade_id": "cell-ab40f08768579798",
  231. "locked": false,
  232. "points": 0,
  233. "schema_version": 1,
  234. "solution": true
  235. }
  236. },
  237. "source": [
  238. "This sorting algorithm is reasonable only for inputs values that are substantially small. For big input size, this approach is very ineffective and doesn't apply any optmization technique. "
  239. ]
  240. },
  241. {
  242. "cell_type": "markdown",
  243. "metadata": {
  244. "deletable": false,
  245. "editable": false,
  246. "nbgrader": {
  247. "checksum": "cbe19c6348c19b5df04e84b537f6ebf0",
  248. "grade": false,
  249. "grade_id": "cell-56ae4e75a4086ee8",
  250. "locked": true,
  251. "schema_version": 1,
  252. "solution": false
  253. }
  254. },
  255. "source": [
  256. "### [Optional] Question 1e.\n",
  257. "\n",
  258. "Can you think of an even worse sorting method? In such a case, what would its complexity be? How big a list could you sort?"
  259. ]
  260. },
  261. {
  262. "cell_type": "markdown",
  263. "metadata": {
  264. "deletable": false,
  265. "nbgrader": {
  266. "checksum": "564b860df18fb1bb2fc987ec4240f98d",
  267. "grade": true,
  268. "grade_id": "cell-ac05806942341926",
  269. "locked": false,
  270. "points": 0,
  271. "schema_version": 1,
  272. "solution": true
  273. }
  274. },
  275. "source": [
  276. "YOUR ANSWER HERE"
  277. ]
  278. },
  279. {
  280. "cell_type": "markdown",
  281. "metadata": {
  282. "deletable": false,
  283. "editable": false,
  284. "nbgrader": {
  285. "checksum": "c8d4696c279cfb37108a84fdde271057",
  286. "grade": false,
  287. "grade_id": "cell-16012d2af0ef00ec",
  288. "locked": true,
  289. "schema_version": 1,
  290. "solution": false
  291. }
  292. },
  293. "source": [
  294. "## Question 2.\n",
  295. "Approximate median finder. Given a list and δ (a number between 0 and 0.5), the approximate median finder function returns a number that is guaranteed to lie between the (50-δ/2)% and (50+δ/2)% percentiles. Implement such a function by randomly selecting an element from the list and testing whether or not it lies within the bounds. If it doesn’t then retry with a new random element, but only a limited number of retries are allowed (you decide the maximum number of retries.)\n",
  296. "\n",
  297. "### Question 2a.\n",
  298. "Complete the following function\n"
  299. ]
  300. },
  301. {
  302. "cell_type": "code",
  303. "execution_count": 7,
  304. "metadata": {
  305. "deletable": false,
  306. "nbgrader": {
  307. "checksum": "9ce93a3c3e022b8a93a85566123df36f",
  308. "grade": false,
  309. "grade_id": "cell-2ab65512a6148d3c",
  310. "locked": false,
  311. "schema_version": 1,
  312. "solution": true
  313. }
  314. },
  315. "outputs": [],
  316. "source": [
  317. "import statistics \n",
  318. "def check_approx_med(A, num, delta):\n",
  319. " isApproxMed = False\n",
  320. " if len(A) == 1:\n",
  321. " return isApproxMed\n",
  322. " else:\n",
  323. " #Sorting the list to make sure that the percentiles\n",
  324. " #property holds\n",
  325. " A = sorted(A)\n",
  326. " #calculating the index of the left and right-hand side\n",
  327. " #percentiles \n",
  328. " low = int((0.5-delta/2)*len(A))\n",
  329. " high = int((0.5+delta/2)*len(A))\n",
  330. " if (num >= A[low] and num <= A[high]):\n",
  331. " isApproxMed = True\n",
  332. " return isApproxMed\n",
  333. " \"\"\"\n",
  334. " Given a list and a number in a list, determine whether it is within (50-delta/2)% and \n",
  335. " (50+delta/2)% percentiles.\n",
  336. " X% percentile of a list is defined to be the smallest number in the list that is as large as at least\n",
  337. " X% numbers in the list. \n",
  338. " \n",
  339. " Input:\n",
  340. " - A: list of numerical values\n",
  341. " - num: the number of interest\n",
  342. " - delta: the error, lies between 0 and 0.5\n",
  343. " \n",
  344. " Output:\n",
  345. " - isApproxMed: a boolean value indicating whether num is within the bound\n",
  346. " \"\"\"\n",
  347. "assert(check_approx_med([0,1], 0, .25) == True)\n",
  348. "assert(check_approx_med([0,1,2,3,4,5,6], 3, .25) == True)\n",
  349. "assert(check_approx_med([0], 0, .25) == False)"
  350. ]
  351. },
  352. {
  353. "cell_type": "code",
  354. "execution_count": 8,
  355. "metadata": {
  356. "deletable": false,
  357. "editable": false,
  358. "nbgrader": {
  359. "checksum": "e36da35c2400ca389ff7bb609167a787",
  360. "grade": true,
  361. "grade_id": "cell-df56d7433abb6517",
  362. "locked": true,
  363. "points": 1,
  364. "schema_version": 1,
  365. "solution": false
  366. }
  367. },
  368. "outputs": [],
  369. "source": [
  370. "assert(check_approx_med([0,1], 0, .25) == True)\n",
  371. "assert(check_approx_med([0], 0, .25) == False)"
  372. ]
  373. },
  374. {
  375. "cell_type": "markdown",
  376. "metadata": {
  377. "deletable": false,
  378. "editable": false,
  379. "nbgrader": {
  380. "checksum": "ea4f76410780e7134e2bc46fbe5663e9",
  381. "grade": false,
  382. "grade_id": "cell-18a16dfd102f137c",
  383. "locked": true,
  384. "schema_version": 1,
  385. "solution": false
  386. }
  387. },
  388. "source": [
  389. "### Question 2b.\n",
  390. "Complete the following function that makes use of `check_approx_med` above.\n"
  391. ]
  392. },
  393. {
  394. "cell_type": "code",
  395. "execution_count": 14,
  396. "metadata": {
  397. "deletable": false,
  398. "nbgrader": {
  399. "checksum": "2b3cfdf82e0efce4078cc52b3680f7be",
  400. "grade": false,
  401. "grade_id": "cell-e19685b545147dde",
  402. "locked": false,
  403. "schema_version": 1,
  404. "solution": true
  405. }
  406. },
  407. "outputs": [
  408. {
  409. "name": "stdout",
  410. "output_type": "stream",
  411. "text": [
  412. "2\n"
  413. ]
  414. },
  415. {
  416. "data": {
  417. "text/plain": [
  418. "4"
  419. ]
  420. },
  421. "execution_count": 14,
  422. "metadata": {},
  423. "output_type": "execute_result"
  424. }
  425. ],
  426. "source": [
  427. "def approx_med_finder(A, delta):\n",
  428. " A = sorted(A)\n",
  429. " \n",
  430. " for trial in range(int(100/delta)):\n",
  431. " guess = np.random.randint(0,len(A)+1)\n",
  432. " if (check_approx_med(A, guess, delta)):\n",
  433. " print(trial)\n",
  434. " return guess\n",
  435. " return None\n",
  436. " \"\"\"\n",
  437. " Given a list, find a number in the list that is between 50-delta/2% and 50+delta/2% percentiles\n",
  438. " \n",
  439. " Input:\n",
  440. " - A: list of numerical values\n",
  441. " - delta: the error, lies between 0 and 0.5\n",
  442. " \n",
  443. " Output:\n",
  444. " - num: the approximated median, if it is found within 100//delta trials (this is one possible \n",
  445. " maximum number of retries. While you are encouraged to play around with another limits, the \n",
  446. " submitted work must use this number.)\n",
  447. " - None: finding failed, if nothing found within 100//delta trials\n",
  448. "\n",
  449. " Note: the 100//delta is chosen here as it is the expected number of trials for the first successful finding to occur (using geometric distribution)\n",
  450. " \"\"\"\n",
  451. " \n",
  452. " # YOUR CODE HERE\n",
  453. " raise NotImplementedError()\n",
  454. "approx_med_finder([1,2,3,4,5,6,7],0.25)"
  455. ]
  456. },
  457. {
  458. "cell_type": "markdown",
  459. "metadata": {
  460. "deletable": false,
  461. "editable": false,
  462. "nbgrader": {
  463. "checksum": "77b77f7047dea39f9b7526a785f0e652",
  464. "grade": false,
  465. "grade_id": "cell-1863e0846b176982",
  466. "locked": true,
  467. "schema_version": 1,
  468. "solution": false
  469. }
  470. },
  471. "source": [
  472. "### Question 2c.\n",
  473. "\n",
  474. "What is the probability of failure in each random trial? What is the probability of failure after all the allowed random trials? Does this scale with δ or N (the number of elements in a list)?\n"
  475. ]
  476. },
  477. {
  478. "cell_type": "markdown",
  479. "metadata": {
  480. "deletable": false,
  481. "nbgrader": {
  482. "checksum": "f7d6ed353a91d12d66b53a735d84f81b",
  483. "grade": true,
  484. "grade_id": "cell-efe7ab76f6a2546a",
  485. "locked": false,
  486. "points": 0,
  487. "schema_version": 1,
  488. "solution": true
  489. }
  490. },
  491. "source": [
  492. "Since _numpy.random.randint_ generates random numbers based on a uniform distribution; the probability of failure directly depends on the delta. Since delta defines which interval in the array can be considered, with a delta of 0.25, there is a 75% chance of failure, since we are only interested in sampling elements that compose 25% of the array. <p>\n",
  493. "The probability of failures after a given number of $x$ trials is given by $0.75^x$. For example, the likelihood of all three trails failing is the probability of the first trial, times the possibility of the second (considering that the first failed), and so on. <p>\n",
  494. "As delta increases, the probability of failure also decreases. If delta is equaled to 0.5, the first trial has only a 50% chance of failure, while the second has 25% and so forth. Values of N will change the absolute values of these representations, although it will not change the probability."
  495. ]
  496. },
  497. {
  498. "cell_type": "markdown",
  499. "metadata": {
  500. "deletable": false,
  501. "editable": false,
  502. "nbgrader": {
  503. "checksum": "a602a3cad0ff35029b447d00dcee6b6c",
  504. "grade": false,
  505. "grade_id": "cell-a41fe377076cd283",
  506. "locked": true,
  507. "schema_version": 1,
  508. "solution": false
  509. }
  510. },
  511. "source": [
  512. "### Question 2d.\n",
  513. "Analyze the expected runtime of `approx_med_finder`. Note that because the function uses `check_approx_med`, you will most likely need to analyze the runtime of that function, too."
  514. ]
  515. },
  516. {
  517. "cell_type": "markdown",
  518. "metadata": {
  519. "deletable": false,
  520. "nbgrader": {
  521. "checksum": "255d88cf8e4f6ff8b3eb05cab2aa5f48",
  522. "grade": true,
  523. "grade_id": "cell-86b199798477fc2a",
  524. "locked": false,
  525. "points": 0,
  526. "schema_version": 1,
  527. "solution": true
  528. }
  529. },
  530. "source": [
  531. "For `approx_med_finder`, we know that there is a delta probability of sucess, since we are sampling from the entire list. Thus, after $1/\\delta$ trails we are expected the sample the correct number. For example, if Delta is 50%, half of the array will be in the desired interval. Thus, we have 50% of chance of selecting one of those numbers. Moreover, `check_approx_med` performs a constant time analysis (only checks if the number is in the desired interval with a if statement). <p>\n",
  532. "Thus, we can say that the runtime depends on $\\delta$ instead of $n$, and we expect to run `approx_med_finder`, in average, $1/\\delta$ times"
  533. ]
  534. },
  535. {
  536. "cell_type": "code",
  537. "execution_count": null,
  538. "metadata": {},
  539. "outputs": [],
  540. "source": []
  541. }
  542. ],
  543. "metadata": {
  544. "kernelspec": {
  545. "display_name": "Python 3",
  546. "language": "python",
  547. "name": "python3"
  548. },
  549. "language_info": {
  550. "codemirror_mode": {
  551. "name": "ipython",
  552. "version": 3
  553. },
  554. "file_extension": ".py",
  555. "mimetype": "text/x-python",
  556. "name": "python",
  557. "nbconvert_exporter": "python",
  558. "pygments_lexer": "ipython3",
  559. "version": "3.7.1"
  560. }
  561. },
  562. "nbformat": 4,
  563. "nbformat_minor": 2
  564. }
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement