- # ** EXAM OBJECTIVES: INDEXING DATA + MAPPINGS AND TEXT ANALYSIS **
- # (remove, if present, any `hamlet*` index and index template)
- # Create the index `hamlet_raw`, with one primary shard and four replicas
- # Index in `hamlet_raw` a document with id "1", default type, and the field `line` with value "To be, or not to be: that is the question"
- # Update the document with id "1" by adding the field `line_number` with value "3.1.64"
- # Index in `hamlet_raw` a new document without specifying any id. The fields of this document are: (i) `text_entry` with value "Whether tis nobler in the mind to suffer"; (ii) `line_number` with value "3.1.66"
- # Update the previous document by setting `line_number` to "3.1.65"
- # (in one request) Update all documents in `hamlet_raw` by adding a new field `speaker` with value "HAMLET"
- # Update the document with id "1" by renaming the field `line` to `text_entry`
- # Delete the `hamlet_raw` index
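The `hamlet_raw` tasks above can be sketched as a list of REST requests. This is a minimal sketch, not an official solution: the endpoint shapes assume a 7.x cluster (on 6.x the single-document update path would be `hamlet_raw/_doc/1/_update`), and the names `HAMLET_RAW_REQUESTS` and `to_console` are hypothetical helpers introduced only for illustration. The update of the auto-generated-id document is left out because its id is only known at run time.

```python
import json

# (HTTP method, path, JSON body or None). Paths assume Elasticsearch 7.x.
HAMLET_RAW_REQUESTS = [
    ("PUT", "/hamlet_raw",
     {"settings": {"number_of_shards": 1, "number_of_replicas": 4}}),
    ("PUT", "/hamlet_raw/_doc/1",
     {"line": "To be, or not to be: that is the question"}),
    # Partial update: merges the new field into the existing document.
    ("POST", "/hamlet_raw/_update/1",
     {"doc": {"line_number": "3.1.64"}}),
    # No id in the path, so Elasticsearch generates one.
    ("POST", "/hamlet_raw/_doc",
     {"text_entry": "Whether tis nobler in the mind to suffer",
      "line_number": "3.1.66"}),
    # One request that touches every document in the index.
    ("POST", "/hamlet_raw/_update_by_query",
     {"script": {"lang": "painless",
                 "source": "ctx._source.speaker = 'HAMLET'"}}),
    # Map.remove returns the removed value, which makes this a rename.
    ("POST", "/hamlet_raw/_update/1",
     {"script": {"lang": "painless",
                 "source": "ctx._source.text_entry = ctx._source.remove('line')"}}),
    ("DELETE", "/hamlet_raw", None),
]


def to_console(method, path, body=None):
    """Render one request in the style accepted by Kibana Console."""
    head = f"{method} {path}"
    return head if body is None else head + "\n" + json.dumps(body, indent=2)


if __name__ == "__main__":
    for request in HAMLET_RAW_REQUESTS:
        print(to_console(*request), end="\n\n")
```

Running the file prints each request so it can be pasted into Kibana Console or adapted for `curl`.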
- # Create the index template `hamlet_template`, which satisfies the following criteria: (i) it matches the index patterns "hamlet_*" and "hamlet-*"; (ii) it allocates one primary shard and no replicas for each matching index
- # Create two indices named `hamlet2` and `hamlet_test`. Verify that `hamlet_template` was applied only to `hamlet_test`
- # (in one request) Delete the `hamlet2` and `hamlet_test` indices
- # Update `hamlet_template` by defining a mapping that satisfies the following criteria: (i) it defines a "_doc" type, with three fields named `speaker`, `line_number` and `text_entry`; (ii) `speaker` and `line_number` map to non-analysed strings; (iii) `text_entry` is a text associated with the "english" analyzer
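One possible body for `PUT _template/hamlet_template`, covering the template tasks above, is sketched here as a Python dict. It assumes the legacy template API with a typed `_doc` mapping (as on 6.x; a 7.x cluster would need `include_type_name=true` or an untyped mapping), and it assumes "non-analysed string" means the `keyword` type. The helper `template_applies` is hypothetical and only emulates the wildcard match locally to show why `hamlet2` is not covered.

```python
from fnmatch import fnmatchcase

# Assumed body for PUT _template/hamlet_template (legacy template API).
HAMLET_TEMPLATE = {
    "index_patterns": ["hamlet_*", "hamlet-*"],
    "settings": {"number_of_shards": 1, "number_of_replicas": 0},
    "mappings": {
        "_doc": {
            "properties": {
                "speaker": {"type": "keyword"},
                "line_number": {"type": "keyword"},
                "text_entry": {"type": "text", "analyzer": "english"},
            }
        }
    },
}


def template_applies(index_name, template=HAMLET_TEMPLATE):
    """Local emulation of index-pattern matching, using '*' wildcards only."""
    return any(fnmatchcase(index_name, pattern)
               for pattern in template["index_patterns"])


if __name__ == "__main__":
    for name in ("hamlet2", "hamlet_test", "hamlet_1"):
        print(name, template_applies(name))
```

On a live cluster the same check is done by inspecting `GET hamlet2,hamlet_test/_settings` and comparing the shard and replica counts.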
- # Create the index `hamlet_1`, and populate it by running the _bulk command with the request-body below
- {"index":{"_index":"hamlet_1","_id":0}}
- {"line_number":"1.1.1","speaker":"BERNARDO","text_entry":"Whos there?"}
- {"index":{"_index":"hamlet_1","_id":1}}
- {"line_number":"1.1.2","speaker":"FRANCISCO","text_entry":"Nay, answer me: stand, and unfold yourself."}
- {"index":{"_index":"hamlet_1","_id":2}}
- {"line_number":"1.1.3","speaker":"BERNARDO","text_entry":"Long live the king!"}
- {"index":{"_index":"hamlet_1","_id":3}}
- {"line_number":"1.2.1","speaker":"KING CLAUDIUS","text_entry":"Though yet of Hamlet our dear brothers death"}
- {"index":{"_index":"hamlet_1","_id":4}}
- {"line_number":"1.2.2","speaker":"KING CLAUDIUS","text_entry":"The memory be green, and that it us befitted"}
- {"index":{"_index":"hamlet_1","_id":5}}
- {"line_number":"1.3.1","speaker":"LAERTES","text_entry":"My necessaries are embarkd: farewell:"}
- {"index":{"_index":"hamlet_1","_id":6}}
- {"line_number":"1.3.4","speaker":"LAERTES","text_entry":"But let me hear from you."}
- {"index":{"_index":"hamlet_1","_id":7}}
- {"line_number":"1.3.5","speaker":"OPHELIA","text_entry":"Do you doubt that?"}
- {"index":{"_index":"hamlet_1","_id":8}}
- {"line_number":"1.4.1","speaker":"HAMLET","text_entry":"The air bites shrewdly; it is very cold."}
- {"index":{"_index":"hamlet_1","_id":9}}
- {"line_number":"1.4.2","speaker":"HORATIO","text_entry":"It is a nipping and an eager air."}
- {"index":{"_index":"hamlet_1","_id":10}}
- {"line_number":"1.4.3","speaker":"HAMLET","text_entry":"What hour now?"}
- {"index":{"_index":"hamlet_1","_id":11}}
- {"line_number":"1.5.2","speaker":"Ghost","text_entry":"Mark me."}
- {"index":{"_index":"hamlet_1","_id":12}}
- {"line_number":"1.5.3","speaker":"HAMLET","text_entry":"I will."}
- # Create the index `hamlet_2`, and populate it by running the _bulk command with the request-body below
- {"index":{"_index":"hamlet_2","_id":14}}
- {"line_number":"2.1.1","speaker":"LORD POLONIUS","text_entry":"Give him this money and these notes, Reynaldo."}
- {"index":{"_index":"hamlet_2","_id":15}}
- {"line_number":"2.1.2","speaker":"REYNALDO","text_entry":"I will, my lord."}
- {"index":{"_index":"hamlet_2","_id":16}}
- {"line_number":"2.1.3","speaker":"LORD POLONIUS","text_entry":"You shall do marvellous wisely, good Reynaldo,"}
- {"index":{"_index":"hamlet_2","_id":17}}
- {"line_number":"2.1.4","speaker":"LORD POLONIUS","text_entry":"Before you visit him, to make inquire"}
- {"index":{"_index":"hamlet_2","_id":18}}
- {"line_number":"2.2.1","speaker":"KING CLAUDIUS","text_entry":"Welcome, dear Rosencrantz and Guildenstern!"}
- {"index":{"_index":"hamlet_2","_id":19}}
- {"line_number":"2.2.2","speaker":"KING CLAUDIUS","text_entry":"Moreover that we much did long to see you,"}
- {"index":{"_index":"hamlet_2","_id":20}}
- {"line_number":"2.2.3","speaker":"KING CLAUDIUS","text_entry":"The need we have to use you did provoke"}
- # Create an alias named `hamlet` that maps both `hamlet_1` and `hamlet_2`
- # Verify that the `hamlet` alias groups 20 documents
- # Configure `hamlet_1` as the write index of the `hamlet` alias
- # Index in `hamlet` a document with id "13", default type, and the following fields: (i) `text_entry` with value "My hour is almost come,"; (ii) `line_number` with value "1.5.4"; (iii) `speaker`, with value "Ghost"
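The alias tasks above can be sketched as a single body for `POST _aliases`. This is an assumption-laden sketch: it folds the alias creation and the write-index flag into one request, relies on the `is_write_index` option (available from 6.4), and the names `ALIAS_ACTIONS` and `expected_count` are hypothetical. The expected document count is derived from the ids in the two `_bulk` bodies.

```python
# Assumed body for POST _aliases: both indices join the alias and
# hamlet_1 is flagged as the target for writes sent to `hamlet`.
ALIAS_ACTIONS = {
    "actions": [
        {"add": {"index": "hamlet_1", "alias": "hamlet", "is_write_index": True}},
        {"add": {"index": "hamlet_2", "alias": "hamlet"}},
    ]
}

# Ids taken from the bulk bodies: 0..12 in hamlet_1 and 14..20 in hamlet_2.
hamlet_1_ids = range(0, 13)
hamlet_2_ids = range(14, 21)
expected_count = len(hamlet_1_ids) + len(hamlet_2_ids)

if __name__ == "__main__":
    # GET hamlet/_count on a live cluster should report the same number.
    print(expected_count)
```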
- # Update the mapping of `hamlet_template`, satisfying the following criteria: (i) remove the definitions of the `line_number` and `speaker` fields, (ii) disable aggregations for `text_entry`
- # Update the mapping of `hamlet_template` by adding a dynamic mapping that satisfies the following criteria: (i) it assigns an integer type to any field whose name starts with "number_"; (ii) it maps every string to a non-analysed string
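The dynamic-mapping task above could be expressed with a `dynamic_templates` array like the one below. It is a sketch under assumptions: the template names `number_fields` and `strings_as_keywords` are invented, and "non-analysed" is read as the `keyword` type. The function `pick_mapping` is a hypothetical local emulation of the first-match-wins rule and is not part of any Elasticsearch client.

```python
from fnmatch import fnmatchcase

# Goes under the mapping's "dynamic_templates" key. Order matters:
# the first template whose conditions all hold is used.
DYNAMIC_TEMPLATES = [
    {"number_fields": {"match": "number_*",
                       "mapping": {"type": "integer"}}},
    {"strings_as_keywords": {"match_mapping_type": "string",
                             "mapping": {"type": "keyword"}}},
]


def detected_type(value):
    """Rough emulation of the JSON type that dynamic mapping would detect."""
    if isinstance(value, str):
        return "string"
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    return "object"


def pick_mapping(field, value):
    """Return the type chosen by the first matching template, else None."""
    for template in DYNAMIC_TEMPLATES:
        (rule,) = template.values()
        if "match" in rule and not fnmatchcase(field, rule["match"]):
            continue
        if ("match_mapping_type" in rule
                and rule["match_mapping_type"] != detected_type(value)):
            continue
        return rule["mapping"]["type"]
    return None


if __name__ == "__main__":
    print(pick_mapping("number_act", "3"))
    print(pick_mapping("speaker", "HAMLET"))
```

Listing the `number_*` rule first matters: a string value such as "3" in `number_act` would otherwise be caught by the string rule and mapped as `keyword`.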
- # Create the index `hamlet_3`, and populate it by running the _bulk command with the request-body below
- {"index":{"_index":"hamlet_3","_id":21}}
- {"line_number":"3.1.4","speaker":"KING CLAUDIUS","text_entry":"With turbulent and dangerous lunacy?"}
- {"index":{"_index":"hamlet_3","_id":22}}
- {"line_number":"3.1.5","speaker":"ROSENCRANTZ","text_entry":"He does confess he feels himself distracted;"}
- {"index":{"_index":"hamlet_3","_id":23}}
- {"line_number":"3.1.64","speaker":"HAMLET","text_entry":"To be, or not to be: that is the question:"}
- {"index":{"_index":"hamlet_3","_id":24}}
- {"line_number":"3.1.65","speaker":"HAMLET","text_entry":"Whether tis nobler in the mind to suffer"}
- {"index":{"_index":"hamlet_3","_id":25}}
- {"line_number":"3.1.66","speaker":"HAMLET","text_entry":"The slings and arrows of outrageous fortune,"}
- {"index":{"_index":"hamlet_3","_id":26}}
- {"line_number":"3.1.67","speaker":"HAMLET","text_entry":"Or to take arms against a sea of troubles,"}
- {"index":{"_index":"hamlet_3","_id":27}}
- {"line_number":"3.1.68","speaker":"HAMLET","text_entry":"And by opposing end them? To die: to sleep;"}
- {"index":{"_index":"hamlet_3","_id":28}}
- {"line_number":"3.1.69","speaker":"HAMLET","text_entry":"No more; and by a sleep to say we end"}
- # Store in the cluster state a new script named `control_reindex_batch`, which checks whether the `reindexBatch` field exists in a document. If it does, the script increments the field value by a parameter named `increment`; otherwise, the script sets the field value to 1
- # Reindex `hamlet` into `hamlet_3`, satisfying the following criteria: (i) disable refreshes of `hamlet_3` during the operation; (ii) apply the `control_reindex_batch` script with the `increment` parameter set to 1; (iii) reindex in two parallel slices
- # (in one request) Add `hamlet_3` to the alias `hamlet`, and delete the `hamlet_1` and `hamlet_2` indices
- # Update all the documents in `hamlet_3` by running the `control_reindex_batch` script with an `increment` of 10
- # Remove from `hamlet_3` the documents that have "KING CLAUDIUS" as `speaker`
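The stored-script and reindex tasks above might look like the following. This is a hedged sketch rather than a reference answer: the Painless source is one plausible way to write the check, disabling refreshes is assumed to mean setting `index.refresh_interval` to `-1` on `hamlet_3` beforehand, and `apply_control_reindex_batch` is a hypothetical Python emulation used only to reason about the script's effect.

```python
# Assumed body for PUT _scripts/control_reindex_batch
STORED_SCRIPT = {
    "script": {
        "lang": "painless",
        "source": (
            "if (ctx._source.containsKey('reindexBatch')) {"
            " ctx._source.reindexBatch += params.increment;"
            " } else {"
            " ctx._source.reindexBatch = 1;"
            " }"
        ),
    }
}

# Assumed body for PUT hamlet_3/_settings, sent before the reindex.
DISABLE_REFRESH = {"index": {"refresh_interval": "-1"}}

# Assumed body for POST _reindex?slices=2
REINDEX_BODY = {
    "source": {"index": "hamlet"},
    "dest": {"index": "hamlet_3"},
    "script": {"id": "control_reindex_batch", "params": {"increment": 1}},
}


def apply_control_reindex_batch(source, increment):
    """Python emulation of the script's effect on one document source."""
    updated = dict(source)
    if "reindexBatch" in updated:
        updated["reindexBatch"] += increment
    else:
        updated["reindexBatch"] = 1
    return updated


if __name__ == "__main__":
    doc = apply_control_reindex_batch({"speaker": "HAMLET"}, increment=1)
    doc = apply_control_reindex_batch(doc, increment=10)
    print(doc)
```

A document without the field gets `reindexBatch` set to 1 regardless of `increment`, so a later `_update_by_query` run with an increment of 10 raises that value to 11.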
- # Store in the cluster state a new ingest pipeline named `split_act_scene_line`, which satisfies the following criteria: (i) it splits the value of `line_number` by using dots as the separator; (ii) it stores the split values into three new numeric fields, named `number_act`, `number_scene`, and `number_line`, respectively
- # Update all documents in `hamlet_3` using the `split_act_scene_line` pipeline
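The ingest-pipeline tasks above can be sketched as follows. This is only one plausible definition: it assumes the `split`, `set`, `convert`, and `remove` processors with a temporary field named `line_number_parts` (an invented name), and it assumes the Mustache templates can address array elements by index. The function `simulate_split` is a hypothetical local emulation of the intended result, not a call into Elasticsearch.

```python
# Assumed body for PUT _ingest/pipeline/split_act_scene_line
SPLIT_PIPELINE = {
    "description": "Split line_number into act, scene and line numbers",
    "processors": [
        # The separator is a regular expression, so the dot is escaped.
        {"split": {"field": "line_number", "separator": "\\.",
                   "target_field": "line_number_parts"}},
        {"set": {"field": "number_act", "value": "{{line_number_parts.0}}"}},
        {"set": {"field": "number_scene", "value": "{{line_number_parts.1}}"}},
        {"set": {"field": "number_line", "value": "{{line_number_parts.2}}"}},
        {"convert": {"field": "number_act", "type": "integer"}},
        {"convert": {"field": "number_scene", "type": "integer"}},
        {"convert": {"field": "number_line", "type": "integer"}},
        {"remove": {"field": "line_number_parts"}},
    ],
}


def simulate_split(source):
    """Local emulation of what the pipeline is meant to produce."""
    act, scene, line = source["line_number"].split(".")
    updated = dict(source)
    updated.update(number_act=int(act), number_scene=int(scene),
                   number_line=int(line))
    return updated


if __name__ == "__main__":
    print(simulate_split({"line_number": "3.1.64", "speaker": "HAMLET"}))
```

Applying the pipeline to existing documents would then be a request along the lines of `POST hamlet_3/_update_by_query?pipeline=split_act_scene_line`.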