g belongs to O(f):

- if g is bounded above by a constant multiple of f
- if g grows no faster than (a constant multiple of) f
- if the ratio g/f is bounded above by a constant

Formally: there exist constants C > 0 and n0 > 0 such that for all n > n0, g(n) <= C * f(n).

- If lim (n --> infinity) g(n)/f(n) = 0, then for any C > 0 the ratio is less than C once n is sufficiently large, so g belongs to O(f). If the ratio tends to infinity, then there is no C > 0 such that g(n) <= C * f(n) for all sufficiently large n, so g does not belong to O(f).

L'Hôpital's Rule:

If lim (n --> infinity) g(n)/f(n) is an indeterminate form (such as infinity/infinity or 0/0), take the derivative of the numerator and of the denominator, then apply the limit to the resulting ratio.
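
Worked example (a standard fact, not in the original notes): to check that log n belongs to O(n), note that lim (n --> infinity) (log n)/n has the indeterminate form infinity/infinity; applying L'Hôpital's rule gives lim (n --> infinity) (1/n)/1 = 0, so log n belongs to O(n) and in fact grows strictly more slowly than n.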

Geometric sum of the first n terms: 1 + r + r^2 + ... + r^(n-1) = (1 - r^n)/(1 - r)

If |r| < 1, the geometric sum to infinity is 1/(1 - r)

Sum of the first n integers: 1 + 2 + ... + n = n(n+1)/2

nth Harmonic number: Hn = sum (i = 1 to n) of 1/i

Hn = Theta(log n)
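
A quick numerical check of that last fact (a minimal sketch, not part of the original notes; the helper name harmonic is made up here):

import math

def harmonic(n):
    """H_n = sum of 1/i for i = 1..n."""
    return sum(1.0 / i for i in range(1, n + 1))

# H_n - ln(n) approaches the Euler-Mascheroni constant (~0.5772),
# consistent with Hn = Theta(log n).
for n in (10, 100, 1000):
    print(n, harmonic(n), math.log(n))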

- In probability theory, an event is a subset of the sample space. (So an event is a set of outcomes.)

Expectation:

- The expected value, or expectation, of a random variable X represents its "average value".
- Formally: if X is a random variable with a finite number of possible values, E(X) = sum (over possible values x) of x * P(X = x)
- Linearity of expectation: E(X + Y) = E(X) + E(Y)
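
A standard illustration of linearity (not from the notes): let X_i be 1 if the i-th of n fair coin flips is heads and 0 otherwise, so E(X_i) = 1/2. Then the expected number of heads is E(X_1 + ... + X_n) = E(X_1) + ... + E(X_n) = n/2, with no need to analyze the flips jointly.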

-----------------------------

Dynamic Arrays:

- Similar to arrays, but the size can be increased or decreased (ArrayList in Java, list in Python)


- Inserting or deleting an item in the middle of a linked list is fast. Accessing a cell given its index (i.e., finding the kth item in the list) is slow

- Hashing is fast: O(1) expected time for access and insertion


Hashing Disadvantages:

- Access time (except for cuckoo hashing) and insertion time (for all strategies) is expected time, not worst-case time, so there is no absolute guarantee on performance
- Performance depends on the load factor: (number of items stored) / (table size)
- Hashing cannot tell us about nearby keys: it supports exact-match queries but not, for example, successor queries
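
A minimal sketch of hashing with separate chaining (illustrative only; the class name ChainedHashTable is made up here, and in practice Python's built-in dict already provides this behavior):

class ChainedHashTable:
    def __init__(self, table_size=16):
        self.table = [[] for _ in range(table_size)]
        self.count = 0                        # load factor = count / table_size

    def _bucket(self, key):
        return self.table[hash(key) % len(self.table)]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)      # overwrite an existing key
                return
        bucket.append((key, value))           # expected O(1) when the load factor is small
        self.count += 1

    def find(self, key):                      # exact-match query only
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None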

Facts about binary trees:

- there are at most 2^k nodes at level k
- a binary tree with depth d has:
  - at most 2^d leaves and at most 2^(d+1) - 1 nodes
- a binary tree with n leaves has depth >= ceiling(log n)


Binary search trees:

- Find, insert and remove can all be done in O(h) time, where h is the tree height; this is O(log n) for AVL trees



The lower bound on the worst-case number of 3-way comparisons (<, >, ==) required to find an item in an array of size n is floor(log n) + 1.
Hence any algorithm for locating an item in an array of size n using only comparisons must perform at least floor(log n) + 1 comparisons in the worst case.
Hence binary search is optimal with respect to worst-case performance.
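
For reference, a minimal binary search sketch (assuming a sorted array; it makes one 3-way comparison per halving step, i.e. at most floor(log n) + 1 in the worst case):

def binary_search(A, target):
    """Return an index of target in sorted array A, or -1 if it is absent."""
    lo, hi = 0, len(A) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if A[mid] == target:          # one 3-way comparison per level
            return mid
        elif A[mid] < target:
            lo = mid + 1              # discard the left half
        else:
            hi = mid - 1              # discard the right half
    return -1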

--------------------------------------------------

SORTING: measure of time = number of comparisons


- an inversion in a sequence or list is a pair of items such that the larger one precedes the smaller one


Insertion Sort:

- Work from left to right across the array
- insert each item into its correct position with respect to the (sorted) elements to its left
(Like arranging a hand of cards)

def insertionSort(n, A):
    for k in range(1, n):                 # work left to right
        x = A[k]                          # item to insert
        j = k - 1
        while (j >= 0) and (A[j] > x):
            A[j+1] = A[j]                 # shift larger elements one slot right
            j = j - 1
        A[j+1] = x                        # drop x into its correct position

Analysis of Insertion Sort:

- Storage: in place, O(1) extra storage
- Worst-case running time: Theta(n^2)
- Average-case time: also Theta(n^2)
- Insertion sort is efficient if the input is almost sorted (Time <= n - 1 + (# inversions))


Selection Sort:

2 variations:

- Repeatedly (for i from 0 to n-1) find the minimum value, output it, delete it
- Repeatedly (for i from n-1 down to 1) find the maximum of A[0] ... A[i] and swap this value with A[i] (a sketch of this variation follows below)

- Both variations run in O(n^2) time if we use the straightforward approach to finding the max/min. They can be improved by treating the items as items in an appropriately designed priority queue.
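
A minimal sketch of the second (swap-based) variation referenced above:

def selectionSort(A):
    n = len(A)
    for i in range(n - 1, 0, -1):         # i from n-1 down to 1
        max_index = 0
        for j in range(1, i + 1):         # find the index of the maximum of A[0..i]
            if A[j] > A[max_index]:
                max_index = j
        A[i], A[max_index] = A[max_index], A[i]   # swap the maximum into position i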

Quick Sort:

def quicksort(A, first, last):
    if first < last:
        splitpoint = split(A, first, last)
        quicksort(A, first, splitpoint - 1)
        quicksort(A, splitpoint + 1, last)

def split(A, first, last):
    splitpoint = first
    x = A[first]                              # pivot = first element
    for k in range(first + 1, last + 1):
        if A[k] < x:
            A[splitpoint + 1], A[k] = A[k], A[splitpoint + 1]
            splitpoint = splitpoint + 1
    A[first], A[splitpoint] = A[splitpoint], A[first]   # move pivot to its final position
    return splitpoint

Worst-case running time: Theta(n^2) (worst-case number of comparisons = O(n^2))


Quicksort key fact: during a run of Quicksort, two values x and y get compared if and only if the first key in the range [x..y] to be chosen as a pivot is either x or y.

Average time for quicksort is O(n log n)

Implementation tricks for improving Quicksort:

1. Better choice of pivot element:
- Instead of a single element, choose the median of 3 (or 5, or 7)
- Choose a random element
- Combine the two

2. Reduce procedure-call overhead
- Explicitly manipulate a stack in the program (rather than making recursive calls)
- For small lists, use a non-recursive sort such as insertion sort or selection sort

3. Reduce stack space (see the sketch below)
- Push the larger sublist and immediately work on the smaller sublist
- Reduces worst-case stack usage from O(n) to O(log n)
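
A sketch combining tricks 2 and 3 (an illustration, not from the notes): an explicit stack from which we always keep working on the smaller sublist, pushing the larger one for later, so at most O(log n) ranges are ever stacked. It reuses the split() routine above.

def quicksort_iterative(A):
    stack = [(0, len(A) - 1)]
    while stack:
        first, last = stack.pop()
        while first < last:
            splitpoint = split(A, first, last)
            # push the larger sublist, keep working on the smaller one
            if splitpoint - first < last - splitpoint:
                stack.append((splitpoint + 1, last))
                last = splitpoint - 1
            else:
                stack.append((first, splitpoint - 1))
                first = splitpoint + 1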

Mergesort

def mergeSort(A, first, last):
    if first < last:
        mid = (first + last) // 2             # floor of the midpoint
        mergeSort(A, first, mid)
        mergeSort(A, mid + 1, last)
        merge(A, first, mid, mid + 1, last)

def merge(A, first1, last1, first2, last2):
    temp = []                                 # O(n) extra storage for the merge
    index1, index2 = first1, first2

    while index1 <= last1 and index2 <= last2:
        if A[index1] <= A[index2]:            # <= keeps the sort stable
            temp.append(A[index1]); index1 += 1
        else:
            temp.append(A[index2]); index2 += 1

    while index1 <= last1:                    # copy any leftovers from the first run
        temp.append(A[index1]); index1 += 1
    while index2 <= last2:                    # copy any leftovers from the second run
        temp.append(A[index2]); index2 += 1

    A[first1:last2 + 1] = temp                # copy the merged run back into A


The mergesort recurrence T(n) = 2T(n/2) + Theta(n) has asymptotic solution T(n) = Theta(n log n)


To count inversions using mergesort, add an invCount variable to the merge step and, each time an element is copied from the second (right) run, do:
invCount = invCount + last1 - index1 + 1
and then return invCount at the end.

- If we additionally wanted to report the inversions, the extra running time is proportional to the number of inversions reported: report k inversions in O(n log n + k) time.
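
A minimal Python sketch of the counting idea (out-of-place, for illustration; the helper name sort_and_count is made up here):

def sort_and_count(A):
    """Return (sorted copy of A, number of inversions in A)."""
    if len(A) <= 1:
        return list(A), 0
    mid = len(A) // 2
    left, inv_left = sort_and_count(A[:mid])
    right, inv_right = sort_and_count(A[mid:])
    merged, inv_count = [], inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            inv_count += len(left) - i    # right[j] jumps ahead of the remaining left items
            merged.append(right[j]); j += 1
    merged.extend(left[i:]); merged.extend(right[j:])
    return merged, inv_count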

---------------------------------------------------------------------------

- Priority queues are implemented in the form of a heap


Observation: floor(log n) (where n is the number of nodes) is the depth of a complete binary tree such as a heap, with the depth of the root being 0


Viewing the array as a binary tree:

- left child of H[i] = H[2i+1]
- right child of H[i] = H[2i+2]
- parent of H[i] = H[floor((i-1)/2)]


Binary Heap:

- for a SiftUp, at most 1 comparison at each level, so the total time is O(log n)
- for a SiftDown, at most 2 comparisons at each level, so the total time is O(log n)


- Insert operation: O(log n) (because the SiftUp time dominates)
- Deletion: the SiftUp/SiftDown time dominates, so the total time is O(log n)
- ExtractMax: the Delete time dominates, so the total time is O(log n)
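
A minimal sketch of SiftDown for a max-heap stored in a Python list, matching the index formulas above (the heapify routine below assumes something like this; Python's heapq module provides the min-heap equivalent):

def SiftDown(H, i, n):
    """Sift H[i] down within the first n entries of max-heap H."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and H[left] > H[largest]:      # at most 2 comparisons per level
            largest = left
        if right < n and H[right] > H[largest]:
            largest = right
        if largest == i:
            return
        H[i], H[largest] = H[largest], H[i]        # swap down one level and continue
        i = largest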

- To construct a heap in O(n) time, put the data in H in arbitrary order and then run the heapify function:

def heapify(H, n):
    for i in range((n - 2) // 2, -1, -1):    # start with the parent of the last node
        SiftDown(H, i, n)


def heapsort(A, n):
    heapify(A, n)
    for k in range(n - 1, 0, -1):
        A[k] = ExtractMax(A)        # ExtractMax removes the max from the heap occupying A[0..k]

Total time is O(n log n)

- If we use a min-heap instead, we can just output the elements in sorted order, as opposed to a max-heap, where we would have to store the elements back into an array.


Comparison-based sorts: summary

Insertion sort
Worst-case time: Theta(n^2)
Storage requirement: in-place
Remarks: good if the input is almost sorted
Stable

QuickSort
Worst-case time: Theta(n^2)
Storage requirement: O(log n) extra for the stack
Remarks: O(n log n) expected time
Not stable

Mergesort
Worst-case time: Theta(n log n)
Storage requirement: O(n) extra for the merge
Stable

Heapsort
Worst-case time: Theta(n log n)
Storage requirement: in-place
Remarks: can output the k smallest in sorted order in O(n + k log n) time
Not stable


- Any algorithm for sorting a list of size n using only comparisons must perform at least ceiling(lg(n!)) comparisons in the worst case (decision-tree argument: a binary decision tree that distinguishes all n! orderings has at least n! leaves, hence depth at least ceiling(lg(n!))).
ceiling(lg(n!)) = Theta(n log n), since lg(n!) <= n lg n and lg(n!) >= (n/2) lg(n/2). Hence any algorithm for sorting a list of size n using only comparisons must perform Omega(n log n) comparisons in the worst case.
Hence heapsort and mergesort are asymptotically optimal, and the lower bound is asymptotically tight (it cannot be improved asymptotically).



Bucket Sort: storage is O(n + b) for n items and b buckets
Runtime for sorting within a bucket: O(s^2), where s is the worst-case number of items in a bucket

Special case of bucket sort: the number of buckets equals the number of keys, and the keys are distributed independently and uniformly over the buckets.
Then the expected total cost of the intra-bucket sorts is O(n)


- Suppose we have n integers in the range 1..b. Then we can use b buckets and get a worst-case runtime of O(n + b)
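
A minimal sketch for the integer special case just described (n integers in the range 1..b, one bucket per possible key, no intra-bucket sorting needed):

def bucket_sort_ints(A, b):
    """Sort integers in the range 1..b using b buckets; O(n + b) worst case."""
    buckets = [[] for _ in range(b + 1)]     # index 0 unused; O(n + b) extra storage
    for x in A:
        buckets[x].append(x)                 # distribute the keys
    result = []
    for bucket in buckets:                   # concatenate the buckets in key order
        result.extend(bucket)
    return result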

Radix sort:

Useful for sorting multi-field keys (e.g. dates) and multi-digit numbers

algorithm radix_sort(A, n):
    for field ranging from rightmost (least significant) to leftmost (most significant) do:
        sort A on field using a stable sort
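
A small sketch of this for date-like tuples (illustrative; it leans on Python's sorted(), which is a stable sort):

def radix_sort(A, num_fields):
    """Sort equal-length tuples field by field, least significant field first."""
    for field in range(num_fields - 1, -1, -1):      # rightmost field first
        A = sorted(A, key=lambda rec: rec[field])    # stable sort on one field
    return A

dates = [(2019, 7, 23), (2018, 12, 1), (2019, 1, 15)]
print(radix_sort(dates, 3))    # [(2018, 12, 1), (2019, 1, 15), (2019, 7, 23)]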

External Sorting:

When sorting a file larger than available memory, with n records in the file and room for m records in memory:

- Read in groups of m records
- Sort each group
- Write each run (sorted group) to a separate output file
- Choose f files (or all that are left if there are fewer than f)
- Merge the contents of the f input files into a new output file
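
A rough in-memory sketch of the merge phase (illustrative only; a real external sort streams runs from files, but heapq.merge shows the f-way merge itself):

import heapq

def merge_runs(runs, f):
    """Repeatedly merge up to f sorted runs at a time until a single run remains."""
    runs = list(runs)
    while len(runs) > 1:
        group, runs = runs[:f], runs[f:]        # choose f runs (or all that are left)
        runs.append(list(heapq.merge(*group)))  # f-way merge into a new run
    return runs[0] if runs else []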

Replacement Selection:

Makes the initial runs longer.

- When a key is written out and the next key is read in, if the new key is >= the last key written, make it part of the current run; if not, save it for the next run.
On average, replacement selection doubles the size of the runs, assuming a uniform distribution of the sort keys.
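
A hedged sketch of this idea using a min-heap for the current run (the names replacement_selection and memory_size are made up for illustration; a real implementation would read from and write to files):

import heapq

def replacement_selection(records, memory_size):
    """Split records into sorted runs; keys smaller than the last output wait for the next run."""
    it = iter(records)
    heap = [x for _, x in zip(range(memory_size), it)]   # fill available memory
    heapq.heapify(heap)
    runs, current_run, next_run = [], [], []
    while heap:
        smallest = heapq.heappop(heap)
        current_run.append(smallest)                     # write out the smallest key
        incoming = next(it, None)                        # read the next key
        if incoming is not None:
            if incoming >= smallest:
                heapq.heappush(heap, incoming)           # still fits in the current run
            else:
                next_run.append(incoming)                # save it for the next run
        if not heap:                                     # current run is finished
            runs.append(current_run)
            current_run, heap, next_run = [], next_run, []
            heapq.heapify(heap)
    return runs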

- A sorting algorithm is stable if, whenever two keys are equal, the algorithm preserves their original relative order (i.e., it does not reverse them).