Advertisement
Not a member of Pastebin yet?
Sign Up,
it unlocks many cool features!
- {
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\\rightarrow$Run All).\n",
- "\n",
- "Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\", as well as your name and collaborators below:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {},
- "outputs": [],
- "source": [
- "NAME = \"Lucas Mudo de Araujo\"\n",
- "COLLABORATORS = \"\""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "---"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "editable": false,
- "nbgrader": {
- "checksum": "69779d40de39e2fcb822ffd099ed946a",
- "grade": false,
- "grade_id": "cell-361a67c139252f60",
- "locked": true,
- "schema_version": 1,
- "solution": false
- }
- },
- "source": [
- "# CS110 Pre-class Work 5.1\n",
- "\n",
- "## Question 1.\n",
- "\n",
- "### Question 1a.\n",
- "\n",
- "Write the worst sort function in the world, `worst_sort`. This function takes a list, and then randomly shuffles it until it happens to be in sorted order. Once it is in sorted order then the list is returned.\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "metadata": {
- "deletable": false,
- "nbgrader": {
- "checksum": "c144735a0a3c9ee1e5a60e781fec46a9",
- "grade": false,
- "grade_id": "cell-ead9c74054a1c5da",
- "locked": false,
- "schema_version": 1,
- "solution": true
- }
- },
- "outputs": [
- {
- "data": {
- "text/plain": [
- "[2, 3, 4, 5]"
- ]
- },
- "execution_count": 5,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "import random, numpy as np\n",
- "def worst_sort(A):\n",
- " \"\"\"\n",
- " Sort A in ascending order by randomly shuffling its \n",
- " elements until they are in order.\n",
- " \n",
- " Input:\n",
- " - A: list of numerical values\n",
- " \n",
- " Output:\n",
- " - A: Sorted list\n",
- " \"\"\"\n",
- " B = A.copy()\n",
- " while A != sorted(B):\n",
- " A = randomize_in_place(A)\n",
- " return A\n",
- " \n",
- "def randomize_in_place(A):\n",
- " for i in range (len(A)):\n",
- " random_i = np.random.randint(0,len(A))\n",
- " A[i],A[random_i] = A[random_i],A[i]\n",
- " return A\n",
- "\n",
- "worst_sort([4,5,3,2])"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "metadata": {
- "deletable": false,
- "editable": false,
- "nbgrader": {
- "checksum": "09c3f02f46b5ba86c3aa80498b9c8c34",
- "grade": true,
- "grade_id": "cell-ff4db3a3e8b04515",
- "locked": true,
- "points": 1,
- "schema_version": 1,
- "solution": false
- }
- },
- "outputs": [],
- "source": [
- "# Please ignore this cell. This cell is for us to implement the tests \n",
- "# to see if your code works properly. "
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "editable": false,
- "nbgrader": {
- "checksum": "18d2bd43c2db165cb6806cb1e14df4f1",
- "grade": false,
- "grade_id": "cell-3ffcfdae9ec3ea41",
- "locked": true,
- "schema_version": 1,
- "solution": false
- }
- },
- "source": [
- "### Question 1b.\n",
- "What is the best case complexity of this algorithm?"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "nbgrader": {
- "checksum": "76bf94004078f73d21791758ac6db72b",
- "grade": true,
- "grade_id": "cell-afac80171b3b6232",
- "locked": false,
- "points": 0,
- "schema_version": 1,
- "solution": true
- }
- },
- "source": [
- "When the first permutation results in the ascending sorted array. Because I am using one of the randomization method introduced by Cormen et al. that produces a uniform random permutation, the probability of such combination is $\\frac{1}{n!}$."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "editable": false,
- "nbgrader": {
- "checksum": "1b1210d5a27f9933086d0f3a0d234361",
- "grade": false,
- "grade_id": "cell-d98018127b7e79f4",
- "locked": true,
- "schema_version": 1,
- "solution": false
- }
- },
- "source": [
- "### Question 1c.\n",
- "\n",
- "What is the average case complexity?\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "nbgrader": {
- "checksum": "97d1e74314efdae93ea73d510e233468",
- "grade": true,
- "grade_id": "cell-9d52d34daa1e3478",
- "locked": false,
- "points": 0,
- "schema_version": 1,
- "solution": true
- }
- },
- "source": [
- "We can expect the average-case complexity for such an algorithm to be the probability of finding the right permutation. Since such chance is $1/n!$, we can expect an average performance to be $n!$, assuming that we would obtain the desired combination at least once after trying $n!$ permutations. "
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "editable": false,
- "nbgrader": {
- "checksum": "fdbf5403114d2a68085e8172d0122685",
- "grade": false,
- "grade_id": "cell-d6d8ad45088488eb",
- "locked": true,
- "schema_version": 1,
- "solution": false
- }
- },
- "source": [
- "### Question 1d.\n",
- "\n",
- "For what size lists is this a feasible method?"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "nbgrader": {
- "checksum": "080a5eecbe3afcffed278b7cbba7f51a",
- "grade": true,
- "grade_id": "cell-ab40f08768579798",
- "locked": false,
- "points": 0,
- "schema_version": 1,
- "solution": true
- }
- },
- "source": [
- "This sorting algorithm is reasonable only for inputs values that are substantially small. For big input size, this approach is very ineffective and doesn't apply any optmization technique. "
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "editable": false,
- "nbgrader": {
- "checksum": "cbe19c6348c19b5df04e84b537f6ebf0",
- "grade": false,
- "grade_id": "cell-56ae4e75a4086ee8",
- "locked": true,
- "schema_version": 1,
- "solution": false
- }
- },
- "source": [
- "### [Optional] Question 1e.\n",
- "\n",
- "Can you think of an even worse sorting method? In such a case, what would its complexity be? How big a list could you sort?"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "nbgrader": {
- "checksum": "564b860df18fb1bb2fc987ec4240f98d",
- "grade": true,
- "grade_id": "cell-ac05806942341926",
- "locked": false,
- "points": 0,
- "schema_version": 1,
- "solution": true
- }
- },
- "source": [
- "YOUR ANSWER HERE"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "editable": false,
- "nbgrader": {
- "checksum": "c8d4696c279cfb37108a84fdde271057",
- "grade": false,
- "grade_id": "cell-16012d2af0ef00ec",
- "locked": true,
- "schema_version": 1,
- "solution": false
- }
- },
- "source": [
- "## Question 2.\n",
- "Approximate median finder. Given a list and δ (a number between 0 and 0.5), the approximate median finder function returns a number that is guaranteed to lie between the (50-δ/2)% and (50+δ/2)% percentiles. Implement such a function by randomly selecting an element from the list and testing whether or not it lies within the bounds. If it doesn’t then retry with a new random element, but only a limited number of retries are allowed (you decide the maximum number of retries.)\n",
- "\n",
- "### Question 2a.\n",
- "Complete the following function\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "metadata": {
- "deletable": false,
- "nbgrader": {
- "checksum": "9ce93a3c3e022b8a93a85566123df36f",
- "grade": false,
- "grade_id": "cell-2ab65512a6148d3c",
- "locked": false,
- "schema_version": 1,
- "solution": true
- }
- },
- "outputs": [],
- "source": [
- "import statistics \n",
- "def check_approx_med(A, num, delta):\n",
- " isApproxMed = False\n",
- " if len(A) == 1:\n",
- " return isApproxMed\n",
- " else:\n",
- " #Sorting the list to make sure that the percentiles\n",
- " #property holds\n",
- " A = sorted(A)\n",
- " #calculating the index of the left and right-hand side\n",
- " #percentiles \n",
- " low = int((0.5-delta/2)*len(A))\n",
- " high = int((0.5+delta/2)*len(A))\n",
- " if (num >= A[low] and num <= A[high]):\n",
- " isApproxMed = True\n",
- " return isApproxMed\n",
- " \"\"\"\n",
- " Given a list and a number in a list, determine whether it is within (50-delta/2)% and \n",
- " (50+delta/2)% percentiles.\n",
- " X% percentile of a list is defined to be the smallest number in the list that is as large as at least\n",
- " X% numbers in the list. \n",
- " \n",
- " Input:\n",
- " - A: list of numerical values\n",
- " - num: the number of interest\n",
- " - delta: the error, lies between 0 and 0.5\n",
- " \n",
- " Output:\n",
- " - isApproxMed: a boolean value indicating whether num is within the bound\n",
- " \"\"\"\n",
- "assert(check_approx_med([0,1], 0, .25) == True)\n",
- "assert(check_approx_med([0,1,2,3,4,5,6], 3, .25) == True)\n",
- "assert(check_approx_med([0], 0, .25) == False)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "metadata": {
- "deletable": false,
- "editable": false,
- "nbgrader": {
- "checksum": "e36da35c2400ca389ff7bb609167a787",
- "grade": true,
- "grade_id": "cell-df56d7433abb6517",
- "locked": true,
- "points": 1,
- "schema_version": 1,
- "solution": false
- }
- },
- "outputs": [],
- "source": [
- "assert(check_approx_med([0,1], 0, .25) == True)\n",
- "assert(check_approx_med([0], 0, .25) == False)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "editable": false,
- "nbgrader": {
- "checksum": "ea4f76410780e7134e2bc46fbe5663e9",
- "grade": false,
- "grade_id": "cell-18a16dfd102f137c",
- "locked": true,
- "schema_version": 1,
- "solution": false
- }
- },
- "source": [
- "### Question 2b.\n",
- "Complete the following function that makes use of `check_approx_med` above.\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {
- "deletable": false,
- "nbgrader": {
- "checksum": "2b3cfdf82e0efce4078cc52b3680f7be",
- "grade": false,
- "grade_id": "cell-e19685b545147dde",
- "locked": false,
- "schema_version": 1,
- "solution": true
- }
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "2\n"
- ]
- },
- {
- "data": {
- "text/plain": [
- "4"
- ]
- },
- "execution_count": 14,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "def approx_med_finder(A, delta):\n",
- " A = sorted(A)\n",
- " \n",
- " for trial in range(int(100/delta)):\n",
- " guess = np.random.randint(0,len(A)+1)\n",
- " if (check_approx_med(A, guess, delta)):\n",
- " print(trial)\n",
- " return guess\n",
- " return None\n",
- " \"\"\"\n",
- " Given a list, find a number in the list that is between 50-delta/2% and 50+delta/2% percentiles\n",
- " \n",
- " Input:\n",
- " - A: list of numerical values\n",
- " - delta: the error, lies between 0 and 0.5\n",
- " \n",
- " Output:\n",
- " - num: the approximated median, if it is found within 100//delta trials (this is one possible \n",
- " maximum number of retries. While you are encouraged to play around with another limits, the \n",
- " submitted work must use this number.)\n",
- " - None: finding failed, if nothing found within 100//delta trials\n",
- "\n",
- " Note: the 100//delta is chosen here as it is the expected number of trials for the first successful finding to occur (using geometric distribution)\n",
- " \"\"\"\n",
- " \n",
- " # YOUR CODE HERE\n",
- " raise NotImplementedError()\n",
- "approx_med_finder([1,2,3,4,5,6,7],0.25)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "editable": false,
- "nbgrader": {
- "checksum": "77b77f7047dea39f9b7526a785f0e652",
- "grade": false,
- "grade_id": "cell-1863e0846b176982",
- "locked": true,
- "schema_version": 1,
- "solution": false
- }
- },
- "source": [
- "### Question 2c.\n",
- "\n",
- "What is the probability of failure in each random trial? What is the probability of failure after all the allowed random trials? Does this scale with δ or N (the number of elements in a list)?\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "nbgrader": {
- "checksum": "f7d6ed353a91d12d66b53a735d84f81b",
- "grade": true,
- "grade_id": "cell-efe7ab76f6a2546a",
- "locked": false,
- "points": 0,
- "schema_version": 1,
- "solution": true
- }
- },
- "source": [
- "Since _numpy.random.randint_ generates random numbers based on a uniform distribution; the probability of failure directly depends on the delta. Since delta defines which interval in the array can be considered, with a delta of 0.25, there is a 75% chance of failure, since we are only interested in sampling elements that compose 25% of the array. <p>\n",
- "The probability of failures after a given number of $x$ trials is given by $0.75^x$. For example, the likelihood of all three trails failing is the probability of the first trial, times the possibility of the second (considering that the first failed), and so on. <p>\n",
- "As delta increases, the probability of failure also decreases. If delta is equaled to 0.5, the first trial has only a 50% chance of failure, while the second has 25% and so forth. Values of N will change the absolute values of these representations, although it will not change the probability."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "editable": false,
- "nbgrader": {
- "checksum": "a602a3cad0ff35029b447d00dcee6b6c",
- "grade": false,
- "grade_id": "cell-a41fe377076cd283",
- "locked": true,
- "schema_version": 1,
- "solution": false
- }
- },
- "source": [
- "### Question 2d.\n",
- "Analyze the expected runtime of `approx_med_finder`. Note that because the function uses `check_approx_med`, you will most likely need to analyze the runtime of that function, too."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "deletable": false,
- "nbgrader": {
- "checksum": "255d88cf8e4f6ff8b3eb05cab2aa5f48",
- "grade": true,
- "grade_id": "cell-86b199798477fc2a",
- "locked": false,
- "points": 0,
- "schema_version": 1,
- "solution": true
- }
- },
- "source": [
- "For `approx_med_finder`, we know that there is a delta probability of sucess, since we are sampling from the entire list. Thus, after $1/\\delta$ trails we are expected the sample the correct number. For example, if Delta is 50%, half of the array will be in the desired interval. Thus, we have 50% of chance of selecting one of those numbers. Moreover, `check_approx_med` performs a constant time analysis (only checks if the number is in the desired interval with a if statement). <p>\n",
- "Thus, we can say that the runtime depends on $\\delta$ instead of $n$, and we expect to run `approx_med_finder`, in average, $1/\\delta$ times"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": []
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.7.1"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
- }
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement