Jakzon123

cis4930 study guide

Mar 12th, 2023
Python:
- data types
  1. Numbers - integers, floating-point numbers, and complex numbers.
  2. Strings - a sequence of characters enclosed in single or double quotes.
  3. Booleans - a type that can only take the values True or False.
  4. Lists - an ordered, mutable collection of elements of any data type, enclosed in square brackets and separated by commas.
  5. Tuples - similar to lists, but immutable and enclosed in parentheses.
  6. Sets - an unordered collection of unique elements, enclosed in curly braces.
  7. Dictionaries - a collection of key-value pairs, enclosed in curly braces, where the keys must be unique.
- loops & conditionals
  1. for loops
  2. while loops
  3. if, elif, else statements (conditionals, not loops)
- functions
    def function(x, y):
        z = y + x
        return z
- modules
  - import ____
  - import ____ as ____
  - from ____ import ____
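
A short sketch tying the Python outline above together: one value of each data type, a for loop, a while loop, an if/elif/else conditional, a function, and the three import forms. The names samples, classify, add, labels, and countdown are made up for illustration.

```python
import math
import math as m
from math import sqrt

# One value of each built-in type from the list above.
samples = [42, 3.14, 2 + 3j, "text", True, [1, 2], (1, 2), {1, 2}, {"k": "v"}]
type_names = [type(value).__name__ for value in samples]


def add(x, y):
    """Function: takes two arguments and returns their sum."""
    z = y + x
    return z


def classify(n):
    """Conditional: exactly one of the if / elif / else branches runs."""
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"


# for loop: visit each element of a sequence in order.
labels = []
for n in (-2, 0, 5):
    labels.append(classify(n))

# while loop: repeat until the condition becomes false.
countdown = []
i = 3
while i > 0:
    countdown.append(i)
    i -= 1

# All three import forms reach the same function.
roots = (math.sqrt(16), m.sqrt(16), sqrt(16))
```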

Key Value Databases:
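- From Arrays to Key-Value Databases
  - an array maps integer indexes to values; an associative array (like a Python dict) allows arbitrary keys; a key-value database is essentially an associative array kept in persistent storage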
- Essential Features of Key-Value Databases
  - scalable
  - high performance
  - high availability
  - flexibility
  - ease of use
  - low latency
- Keys
  - A key is a unique identifier for a particular piece of data.
  - The key is used to locate and retrieve the corresponding value in the database.
- Characteristics of Values
  - A value is the data associated with a particular key: the information that is stored in the database and retrieved using that key.
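
A minimal in-memory sketch of the key/value idea above, assuming a plain Python dict as the storage. KeyValueStore and its method names are hypothetical and do not come from any real database; the point is only that every operation goes through the key.

```python
class KeyValueStore:
    """Toy key-value store: each unique key maps to exactly one value."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Writing to an existing key replaces the old value.
        self._data[key] = value

    def get(self, key, default=None):
        # The key is the only way to locate a value.
        return self._data.get(key, default)

    def delete(self, key):
        # Returns True if the key existed and was removed.
        return self._data.pop(key, None) is not None


store = KeyValueStore()
store.put("cust:1234:name", "John")
store.put("cust:1234:name", "Jane")  # same key, so the value is overwritten
name = store.get("cust:1234:name")
missing = store.get("cust:9999:name")
removed = store.delete("cust:1234:name")
```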

Key-Value Database Terminology:
- Key-Value Database Data Modeling Terms
  - relational schema: formal description of a table; the blueprint of the information needed to create it
  - attribute domain: the set of values an attribute may take
- Key-Value Architecture Terms
  - tables: structures that store information
  - candidate keys: the attributes (or sets of attributes) that could each serve as the primary key
  - primary key: the main identifier for a row in a table
  - foreign key: an attribute in one table that refers to the primary key of another, linking the two tables
- Key-Value Implementation Terms (?)

Designing for Key-Value Databases:
- Key Design and Partitioning
  - follow a naming convention
  - include range-based components (a date or an integer counter)
  - use a common delimiter
  - partitioning can be done by range or by hash
- Designing Structured Values
  - common case: attributes that are used together
  - keep frequently used values in RAM and store logically linked information together
  - duplicating data can improve performance (denormalization)
- Limitations of Key-Value Databases
  - lookups are only possible by key
  - range queries are not supported by default
  - no standard query language
- Design Patterns for Key-Value Databases
  - TTL Keys
    - keys that expire after a set amount of time (time to live)
  - Emulating Tables
    - implement get and set operations so attributes can be assigned and retrieved
  - Aggregates
    - using a common table to store attributes of subtypes
  - Atomic Aggregates
    - all properties must be updated together or not at all
  - Enumerable Keys
    - using counters or sequences to create keys
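
A rough sketch of three ideas from this section: keys built with a naming convention and a common delimiter, partitioning by hash or by range, and TTL keys. Everything here (make_key, hash_partition, range_partition, TTLStore, the ":" delimiter) is a hypothetical illustration, not the API of a particular database.

```python
import hashlib
import time

DELIMITER = ":"  # assumed delimiter; any character used consistently works


def make_key(entity, entity_id, attribute):
    """Naming convention: entity type, identifier, attribute name."""
    return DELIMITER.join([entity, str(entity_id), attribute])


def hash_partition(key, num_partitions):
    """Hash partitioning: a stable hash of the key picks the partition."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def range_partition(entity_id, boundaries):
    """Range partitioning: index of the first upper bound above entity_id."""
    for index, upper in enumerate(boundaries):
        if entity_id < upper:
            return index
    return len(boundaries)


class TTLStore:
    """TTL keys: each entry carries an expiry time checked on read."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: drop it lazily
            return default
        return value


key = make_key("cust", 1234, "name")
partition = hash_partition(key, 4)
```

Hash partitioning spreads keys evenly but scatters neighbouring IDs across partitions. Range partitioning keeps neighbouring IDs together, which helps when keys contain a counter or a date.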

PickleDB:
import pickledb

# Load example.db, creating it if it does not exist.
# The second argument (auto_dump=True) writes every change to disk.
# Note: pickledb.load and db.rem are the API of older pickledb releases.
db = pickledb.load('example.db', True)

# Add key-value attributes to the PickleDB
db.set('name', 'John')
db.set('age', 25)
db.set('city', 'New York')

# Update a key-value attribute in the PickleDB (set overwrites the old value)
db.set('age', 26)

# Delete a key-value attribute from the PickleDB
db.rem('city')

# Locate and display key-value attributes from the PickleDB
name = db.get('name')
age = db.get('age')
city = db.get('city')  # 'city' was removed above; pickledb 0.9.x returns False for a missing key

print('Name:', name)
print('Age:', age)
print('City:', city)

Document Databases:
- What Is a Document?
  - A document is a self-contained data structure that holds all the information related to a specific object or entity.
  - It can be thought of as a unit of storage, and it typically contains multiple fields (attributes) that describe the object's properties.
- Avoid Explicit Schema Definitions
  - Not declaring a schema up front keeps the database flexible: documents with differing fields, including loosely structured data, can be stored without a prior specification step.
- Basic Operations on Document Databases
  1. Create: A new document is created by inserting a new JSON or BSON object into the database.
  2. Read: Documents are retrieved using query operators that filter, sort, and limit the results.
  3. Update: Documents are updated by modifying one or more of their fields.
  4. Delete: Documents are deleted with a delete operation that specifies the criteria for selecting them.
  5. Query: Documents can be queried using a query language that supports filtering, sorting, and aggregation.
  6. Indexing: Documents can be indexed for fast retrieval of data.
  7. Transaction: Documents can be updated or deleted as part of a transaction, which ensures that all changes are either committed or rolled back together.
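
A toy in-memory sketch of the create / read / update / delete operations listed above, assuming documents are plain dicts. InMemoryCollection is hypothetical; its method names only echo the style of document-database drivers and it is not a real driver class.

```python
import copy
import itertools


def _matches(doc, criteria):
    """A document matches when every criteria field has the given value."""
    return all(doc.get(field) == value for field, value in criteria.items())


class InMemoryCollection:
    def __init__(self):
        self._docs = {}  # _id -> document
        self._ids = itertools.count(1)

    def insert_one(self, doc):
        # Create: store a copy and assign a unique _id.
        doc_id = next(self._ids)
        self._docs[doc_id] = {**copy.deepcopy(doc), "_id": doc_id}
        return doc_id

    def find(self, criteria=None):
        # Read: return copies of all matching documents.
        criteria = criteria or {}
        return [copy.deepcopy(d) for d in self._docs.values()
                if _matches(d, criteria)]

    def update_many(self, criteria, changes):
        # Update: modify fields on every matching document.
        matched = [d for d in self._docs.values() if _matches(d, criteria)]
        for doc in matched:
            doc.update(changes)
        return len(matched)

    def delete_many(self, criteria):
        # Delete: remove every matching document.
        doomed = [i for i, d in self._docs.items() if _matches(d, criteria)]
        for doc_id in doomed:
            del self._docs[doc_id]
        return len(doomed)


people = InMemoryCollection()
people.insert_one({"name": "John", "age": 25})
people.insert_one({"name": "Jane", "city": "Anytown"})  # different fields, no schema
updated = people.update_many({"name": "John"}, {"age": 26})
deleted = people.delete_many({"name": "Jane"})
remaining = people.find()
```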

Document Database Terminology:
- Document and Collection Terms
  - document: a set of ordered key-value pairs
  - collection: a group of related documents
  - embedded document: a document stored within another document
  - polymorphic schema: documents within one collection take multiple different forms
  - schemaless: no schema specification step is required before adding a document to a collection
- Types of Partitions
  - vertical partitioning: splitting the columns of a relational table into multiple tables (this can be done within one server)
  - horizontal partitioning (sharding): splitting documents or rows across multiple servers
  - partitioning algorithm: assigns data to partitions by ranges, lists, or hash values
- Data Modeling and Query Processing
  - deletion anomaly: removing an entry also removes a piece of data that was stored only there
  - insertion anomaly: a fact cannot be inserted until other data is available, because partial rows are not allowed
  - update anomaly: one fact changes and must be updated in multiple places
  - normalization organizes data so that these modification anomalies cannot occur
    - this is done by splitting data into separate tables; joins recombine the tables at query time
  - query processor: takes queries and data about the document collections as input and produces the operations that retrieve the requested data
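
A small sketch of the update anomaly and how normalization avoids it, using plain Python dicts. The order and customer data and all function names are made up for illustration.

```python
# Denormalized: the customer's address is repeated in every order.
orders_denormalized = [
    {"order_id": 1, "customer": "Jane", "address": "Baker Street 221B"},
    {"order_id": 2, "customer": "Jane", "address": "Baker Street 221B"},
]

# Normalized: the address is stored once and orders refer to it by ID.
customers = {"c1": {"name": "Jane", "address": "Baker Street 221B"}}
orders_normalized = [
    {"order_id": 1, "customer_id": "c1"},
    {"order_id": 2, "customer_id": "c1"},
]


def change_address_denormalized(orders, customer, new_address):
    """Every copy must be touched; missing one leaves inconsistent data."""
    updated = 0
    for order in orders:
        if order["customer"] == customer:
            order["address"] = new_address
            updated += 1
    return updated


def change_address_normalized(customer_table, customer_id, new_address):
    """The fact lives in one place, so one write is enough."""
    customer_table[customer_id]["address"] = new_address
    return 1


def order_with_customer(order, customer_table):
    """The join happens at read time to rebuild the full record."""
    return {**order, **customer_table[order["customer_id"]]}


writes_denormalized = change_address_denormalized(
    orders_denormalized, "Jane", "Highway 37")
writes_normalized = change_address_normalized(customers, "c1", "Highway 37")
joined = order_with_customer(orders_normalized[1], customers)
```

The trade-off is that the normalized form needs one write but a join on every read, while the denormalized form reads without a join but needs one write per copy.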

Designing Document Databases:
- Normalization, Denormalization, and the Search for Proper Balance
  - In a document database, normalization means splitting data into separate collections or documents to avoid redundancy.
  - Instead of storing everything in a single document, the data is spread across multiple documents that are linked by references. Embedding related data in one document is the denormalized alternative: reads need no join, at the cost of duplicated data.
- Planning for Mutable Documents
  - allocate extra space for a document ahead of time, so a growing document is less likely to need moving to a new location (and its old location freeing)
- The Goldilocks Zone of Indexes
  - create enough indexes to keep reads fast, but not so many that write overhead becomes high
- Modeling Common Relations
  - one-to-many relationships (embed the "many" documents within the "one" document)
  - many-to-many relationships (use two collections; each document keeps a list of the IDs of its related documents in the other collection)
  - hierarchies (store a reference to the parent object within the child)
- being able to create JSON files / JSON formatting
  {
    "name": "John Smith",
    "age": 35,
    "email": "[email protected]",
    "address": {
      "street": "123 Main St",
      "city": "Anytown",
      "state": "CA",
      "zip": "12345"
    },
    "phoneNumbers": [
      {
        "type": "home",
        "number": "555-1234"
      },
      {
        "type": "work",
        "number": "555-5678"
      }
    ]
  }
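
One way to produce and read JSON like the example above from Python is the standard json module. This is a sketch using a trimmed copy of that document; the variable names are illustrative.

```python
import json

# A Python dict with an embedded document (address) and an array of
# embedded documents (phoneNumbers), mirroring the JSON example.
person = {
    "name": "John Smith",
    "age": 35,
    "address": {"street": "123 Main St", "city": "Anytown",
                "state": "CA", "zip": "12345"},
    "phoneNumbers": [
        {"type": "home", "number": "555-1234"},
        {"type": "work", "number": "555-5678"},
    ],
}

# dict -> JSON text (indent=2 pretty-prints it)
text = json.dumps(person, indent=2)

# JSON text -> dict
restored = json.loads(text)
work_number = restored["phoneNumbers"][1]["number"]
```

json.dump(person, file) and json.load(file) do the same with an open file instead of a string.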

MongoDB and Python:
import pymongo

# Connect to your local MongoDB
client = pymongo.MongoClient("mongodb://localhost:27017/")

# Drop your document database (if it exists)
client.drop_database("mydb")

# Create your document database (MongoDB creates it lazily on the first insert)
mydb = client["mydb"]

# Create a collection in your document database
mycol = mydb["mycollection"]

# Insert items into your collection
mydict1 = { "name": "John", "address": "Highway 37" }
mydict2 = { "name": "Jane", "address": "Baker Street 221B" }
mycol.insert_many([mydict1, mydict2])

# Using find, display all items in your collection to the screen
for x in mycol.find():
    print(x)

Locating items in a document DB with Python and displaying the results, with limiting attributes (projection), limiting results, and sorting:
import pymongo

# Connect to your local MongoDB
client = pymongo.MongoClient("mongodb://localhost:27017/")

# Retrieve a collection named "mycollection"
mycol = client["mydb"]["mycollection"]

# Define the filter (query) and the projection (which attributes to return)
query = { "name": "John" }
projection = { "name": 1, "age": 1 }

# Perform the find query with limiting attributes, limiting results, and sorting
result = mycol.find(query, projection).sort("age", pymongo.ASCENDING).limit(10)

# Print the results to the screen
for x in result:
    print(x)
In this example, the query object { "name": "John" } is the filter: only documents whose "name" is "John" match.
The projection object { "name": 1, "age": 1 } limits the attributes that are returned.
Each result contains only the "name" and "age" fields, plus "_id", which is included unless the projection sets "_id": 0.
The find call applies the filter and the projection, sort orders the results by "age" in ascending order, and limit caps the output at 10 documents.

In a single Python script, a find for each query operator:
import pymongo

# Connect to your local MongoDB
client = pymongo.MongoClient("mongodb://localhost:27017/")

# Using find, display all items in your collection to the screen
mycol = client["mydb"]["mycollection"]
for x in mycol.find():
    print(x)

# Cursor.count() was removed in PyMongo 4, so matches are counted
# with Collection.count_documents(query) instead.

# Find using $lt (less than)
query = { "age": { "$lt": 30 } }
result = mycol.find(query)
print(f"Documents where age < 30: {mycol.count_documents(query)}")

# Find using $gte (greater than or equal)
query = { "age": { "$gte": 30 } }
result = mycol.find(query)
print(f"Documents where age >= 30: {mycol.count_documents(query)}")

# Find using $eq (equal)
query = { "name": { "$eq": "John" } }
result = mycol.find(query)
print(f"Documents where name = 'John': {mycol.count_documents(query)}")

# Find using $ne (not equal)
query = { "name": { "$ne": "John" } }
result = mycol.find(query)
print(f"Documents where name != 'John': {mycol.count_documents(query)}")

# Find using $or
query = { "$or": [ { "name": "John" }, { "age": { "$lt": 30 } } ] }
result = mycol.find(query)
print(f"Documents where name = 'John' or age < 30: {mycol.count_documents(query)}")

# Find using $and
query = { "$and": [ { "name": "John" }, { "age": { "$lt": 30 } } ] }
result = mycol.find(query)
print(f"Documents where name = 'John' and age < 30: {mycol.count_documents(query)}")

# Find using $not (negates the inner operator expression)
query = { "name": { "$not": { "$eq": "John" } } }
result = mycol.find(query)
print(f"Documents where not (name = 'John'): {mycol.count_documents(query)}")

# Find using $exists
query = { "age": { "$exists": True } }
result = mycol.find(query)
print(f"Documents where age exists: {mycol.count_documents(query)}")

# Null search using {item: null}: matches item set to null OR item missing
query = { "item": None }
result = mycol.find(query)
print(f"Documents where item is null or missing: {mycol.count_documents(query)}")

# Null search using {item: {$exists: false}}: matches only documents without the field
query = { "item": { "$exists": False } }
result = mycol.find(query)
print(f"Documents where item does not exist: {mycol.count_documents(query)}")

# Null search using {item: {$type: 10}}: matches only item stored as BSON null (type 10)
query = { "item": { "$type": 10 } }
result = mycol.find(query)
print(f"Documents where item is null: {mycol.count_documents(query)}")
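
The three null searches at the end of the script match different documents. The sketch below emulates their matching rules in plain Python (no MongoDB needed) so the difference can be checked by hand. The helper names and sample documents are hypothetical.

```python
MISSING = object()  # sentinel meaning "the field is not in the document"


def matches_null(doc, field):
    """Emulates {field: None}: the field is null OR the field is missing."""
    value = doc.get(field, MISSING)
    return value is None or value is MISSING


def matches_not_exists(doc, field):
    """Emulates {field: {"$exists": False}}: the field is missing."""
    return field not in doc


def matches_type_null(doc, field):
    """Emulates {field: {"$type": 10}}: the field is present and null."""
    return field in doc and doc[field] is None


docs = [
    {"_id": 1, "item": None},   # field present, value null
    {"_id": 2},                 # field missing
    {"_id": 3, "item": "pen"},  # field present, real value
]

null_or_missing = [d["_id"] for d in docs if matches_null(d, "item")]
missing_only = [d["_id"] for d in docs if matches_not_exists(d, "item")]
null_only = [d["_id"] for d in docs if matches_type_null(d, "item")]
```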