Untitled

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\\rightarrow$Run All).\n",
    "\n",
    "Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\", as well as your name and collaborators below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "NAME = \"Lucas Mudo de Araujo\"\n",
    "COLLABORATORS = \"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "checksum": "69779d40de39e2fcb822ffd099ed946a",
     "grade": false,
     "grade_id": "cell-361a67c139252f60",
     "locked": true,
     "schema_version": 1,
     "solution": false
    }
   },
   "source": [
    "# CS110 Pre-class Work 5.1\n",
    "\n",
    "## Question 1.\n",
    "\n",
    "### Question 1a.\n",
    "\n",
    "Write the worst sort function in the world, `worst_sort`. This function takes a list, and then randomly shuffles it until it happens to be in sorted order. Once it is in sorted order then the list is returned.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "checksum": "c144735a0a3c9ee1e5a60e781fec46a9",
     "grade": false,
     "grade_id": "cell-ead9c74054a1c5da",
     "locked": false,
     "schema_version": 1,
     "solution": true
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[2, 3, 4, 5]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import random, numpy as np\n",
    "def worst_sort(A):\n",
    "    \"\"\"\n",
    "    Sort A in ascending order by randomly shuffling its \n",
    "    elements until they are in order.\n",
    "    \n",
    "    Input:\n",
    "    - A: list of numerical values\n",
    "    \n",
    "    Output:\n",
    "    - A: Sorted list\n",
    "    \"\"\"\n",
    "    B = A.copy()\n",
    "    while A != sorted(B):\n",
    "        A = randomize_in_place(A)\n",
    "    return A\n",
    "    \n",
    "def randomize_in_place(A):\n",
    "    for i in range (len(A)):\n",
    "        random_i =  np.random.randint(0,len(A))\n",
    "        A[i],A[random_i] = A[random_i],A[i]\n",
    "    return A\n",
    "\n",
    "worst_sort([4,5,3,2])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "checksum": "09c3f02f46b5ba86c3aa80498b9c8c34",
     "grade": true,
     "grade_id": "cell-ff4db3a3e8b04515",
     "locked": true,
     "points": 1,
     "schema_version": 1,
     "solution": false
    }
   },
   "outputs": [],
   "source": [
    "# Please ignore this cell. This cell is for us to implement the tests \n",
    "# to see if your code works properly. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "checksum": "18d2bd43c2db165cb6806cb1e14df4f1",
     "grade": false,
     "grade_id": "cell-3ffcfdae9ec3ea41",
     "locked": true,
     "schema_version": 1,
     "solution": false
    }
   },
   "source": [
    "### Question 1b.\n",
    "What is the best case complexity of this algorithm?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "checksum": "76bf94004078f73d21791758ac6db72b",
     "grade": true,
     "grade_id": "cell-afac80171b3b6232",
     "locked": false,
     "points": 0,
     "schema_version": 1,
     "solution": true
    }
   },
   "source": [
    "When the first permutation results in the ascending sorted array. Because I am using one of the randomization method introduced by Cormen et al. that produces a uniform random permutation, the probability of such combination is $\\frac{1}{n!}$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "checksum": "1b1210d5a27f9933086d0f3a0d234361",
     "grade": false,
     "grade_id": "cell-d98018127b7e79f4",
     "locked": true,
     "schema_version": 1,
     "solution": false
    }
   },
   "source": [
    "### Question 1c.\n",
    "\n",
    "What is the average case complexity?\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "checksum": "97d1e74314efdae93ea73d510e233468",
     "grade": true,
     "grade_id": "cell-9d52d34daa1e3478",
     "locked": false,
     "points": 0,
     "schema_version": 1,
     "solution": true
    }
   },
   "source": [
    "We can expect the average-case complexity for such an algorithm to be the probability of finding the right permutation. Since such chance is $1/n!$, we can expect an average performance to be $n!$, assuming that we would obtain the desired combination at least once after trying $n!$ permutations. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "checksum": "fdbf5403114d2a68085e8172d0122685",
     "grade": false,
     "grade_id": "cell-d6d8ad45088488eb",
     "locked": true,
     "schema_version": 1,
     "solution": false
    }
   },
   "source": [
    "### Question 1d.\n",
    "\n",
    "For what size lists is this a feasible method?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "checksum": "080a5eecbe3afcffed278b7cbba7f51a",
     "grade": true,
     "grade_id": "cell-ab40f08768579798",
     "locked": false,
     "points": 0,
     "schema_version": 1,
     "solution": true
    }
   },
   "source": [
    "This sorting algorithm is reasonable only for inputs values that are substantially small. For big input size, this approach is very ineffective and doesn't apply any optmization technique. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "checksum": "cbe19c6348c19b5df04e84b537f6ebf0",
     "grade": false,
     "grade_id": "cell-56ae4e75a4086ee8",
     "locked": true,
     "schema_version": 1,
     "solution": false
    }
   },
   "source": [
    "### [Optional] Question 1e.\n",
    "\n",
    "Can you think of an even worse sorting method? In such a case, what would its complexity be? How big a list could you sort?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "checksum": "564b860df18fb1bb2fc987ec4240f98d",
     "grade": true,
     "grade_id": "cell-ac05806942341926",
     "locked": false,
     "points": 0,
     "schema_version": 1,
     "solution": true
    }
   },
   "source": [
    "YOUR ANSWER HERE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "checksum": "c8d4696c279cfb37108a84fdde271057",
     "grade": false,
     "grade_id": "cell-16012d2af0ef00ec",
     "locked": true,
     "schema_version": 1,
     "solution": false
    }
   },
   "source": [
    "## Question 2.\n",
    "Approximate median finder. Given a list and δ (a number between 0 and 0.5), the approximate median finder function returns a number that is guaranteed to lie between the (50-δ/2)% and (50+δ/2)% percentiles. Implement such a function by randomly selecting an element from the list and testing whether or not it lies within the bounds. If it doesn’t then retry with a new random element, but only a limited number of retries are allowed (you decide the maximum number of retries.)\n",
    "\n",
    "### Question 2a.\n",
    "Complete the following function\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "checksum": "9ce93a3c3e022b8a93a85566123df36f",
     "grade": false,
     "grade_id": "cell-2ab65512a6148d3c",
     "locked": false,
     "schema_version": 1,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
    "import statistics \n",
    "def check_approx_med(A, num, delta):\n",
    "    isApproxMed = False\n",
    "    if len(A) == 1:\n",
    "        return isApproxMed\n",
    "    else:\n",
    "        #Sorting the list to make sure that the percentiles\n",
    "        #property holds\n",
    "        A = sorted(A)\n",
    "        #calculating the index of the left and right-hand side\n",
    "        #percentiles \n",
    "        low = int((0.5-delta/2)*len(A))\n",
    "        high = int((0.5+delta/2)*len(A))\n",
    "        if (num >= A[low] and num <= A[high]):\n",
    "            isApproxMed = True\n",
    "    return isApproxMed\n",
    "    \"\"\"\n",
    "    Given a list and a number in a list, determine whether it is within (50-delta/2)% and \n",
    "    (50+delta/2)% percentiles.\n",
    "    X% percentile of a list is defined to be the smallest number in the list that is as large as at least\n",
    "    X% numbers in the list. \n",
    "    \n",
    "    Input:\n",
    "    - A: list of numerical values\n",
    "    - num: the number of interest\n",
    "    - delta: the error, lies between 0 and 0.5\n",
    "    \n",
    "    Output:\n",
    "    - isApproxMed: a boolean value indicating whether num is within the bound\n",
    "    \"\"\"\n",
    "assert(check_approx_med([0,1], 0, .25) == True)\n",
    "assert(check_approx_med([0,1,2,3,4,5,6], 3, .25) == True)\n",
    "assert(check_approx_med([0], 0, .25) == False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "checksum": "e36da35c2400ca389ff7bb609167a787",
     "grade": true,
     "grade_id": "cell-df56d7433abb6517",
     "locked": true,
     "points": 1,
     "schema_version": 1,
     "solution": false
    }
   },
   "outputs": [],
   "source": [
    "assert(check_approx_med([0,1], 0, .25) == True)\n",
    "assert(check_approx_med([0], 0, .25) == False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "checksum": "ea4f76410780e7134e2bc46fbe5663e9",
     "grade": false,
     "grade_id": "cell-18a16dfd102f137c",
     "locked": true,
     "schema_version": 1,
     "solution": false
    }
   },
   "source": [
    "### Question 2b.\n",
    "Complete the following function that makes use of `check_approx_med` above.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "checksum": "2b3cfdf82e0efce4078cc52b3680f7be",
     "grade": false,
     "grade_id": "cell-e19685b545147dde",
     "locked": false,
     "schema_version": 1,
     "solution": true
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "4"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def approx_med_finder(A, delta):\n",
    "    A = sorted(A)\n",
    "    \n",
    "    for trial in range(int(100/delta)):\n",
    "        guess = np.random.randint(0,len(A)+1)\n",
    "        if (check_approx_med(A, guess, delta)):\n",
    "            print(trial)\n",
    "            return guess\n",
    "    return None\n",
    "    \"\"\"\n",
    "    Given a list, find a number in the list that is between 50-delta/2% and 50+delta/2% percentiles\n",
    "    \n",
    "    Input:\n",
    "    - A: list of numerical values\n",
    "    - delta: the error, lies between 0 and 0.5\n",
    "    \n",
    "    Output:\n",
    "    - num: the approximated median, if it is found within 100//delta trials (this is one possible \n",
    "    maximum number of retries. While you are encouraged to play around with another limits, the \n",
    "    submitted work must use this number.)\n",
    "    - None: finding failed, if nothing found within 100//delta trials\n",
    "\n",
    "    Note: the 100//delta is chosen here as it is the expected number of trials for the first successful finding to occur (using geometric distribution)\n",
    "    \"\"\"\n",
    "    \n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "approx_med_finder([1,2,3,4,5,6,7],0.25)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "checksum": "77b77f7047dea39f9b7526a785f0e652",
     "grade": false,
     "grade_id": "cell-1863e0846b176982",
     "locked": true,
     "schema_version": 1,
     "solution": false
    }
   },
   "source": [
    "### Question 2c.\n",
    "\n",
    "What is the probability of failure in each random trial? What is the probability of failure after all the allowed random trials? Does this scale with δ or N (the number of elements in a list)?\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "checksum": "f7d6ed353a91d12d66b53a735d84f81b",
     "grade": true,
     "grade_id": "cell-efe7ab76f6a2546a",
     "locked": false,
     "points": 0,
     "schema_version": 1,
     "solution": true
    }
   },
   "source": [
    "Since _numpy.random.randint_ generates random numbers based on a uniform distribution; the probability of failure directly depends on the delta. Since delta defines which interval in the array can be considered, with a delta of 0.25, there is a 75% chance of failure, since we are only interested in sampling elements that compose 25% of the array. <p>\n",
    "The probability of failures after a given number of $x$ trials is given by $0.75^x$. For example, the likelihood of all three trails failing is the probability of the first trial, times the possibility of the second (considering that the first failed), and so on. <p>\n",
    "As delta increases, the probability of failure also decreases. If delta is equaled to 0.5, the first trial has only a 50% chance of failure, while the second has 25% and so forth. Values of N will change the absolute values of these representations, although it will not change the probability."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "checksum": "a602a3cad0ff35029b447d00dcee6b6c",
     "grade": false,
     "grade_id": "cell-a41fe377076cd283",
     "locked": true,
     "schema_version": 1,
     "solution": false
    }
   },
   "source": [
    "### Question 2d.\n",
    "Analyze the expected runtime of `approx_med_finder`. Note that because the function uses `check_approx_med`, you will most likely need to analyze the runtime of that function, too."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "checksum": "255d88cf8e4f6ff8b3eb05cab2aa5f48",
     "grade": true,
     "grade_id": "cell-86b199798477fc2a",
     "locked": false,
     "points": 0,
     "schema_version": 1,
     "solution": true
    }
   },
   "source": [
    "For `approx_med_finder`, we know that there is a delta probability of sucess, since we are sampling from the entire list. Thus, after $1/\\delta$ trails we are expected the sample the correct number. For example, if Delta is 50%, half of the array will be in the desired interval. Thus, we have 50% of chance of selecting one of those numbers. Moreover, `check_approx_med` performs a constant time analysis (only checks if the number is in the desired interval with a if statement). <p>\n",
    "Thus, we can say that the runtime depends on $\\delta$ instead of $n$, and we expect to run `approx_med_finder`, in average, $1/\\delta$ times"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}