Simple Multi-Step Prompt For Dataset Augmentation

a guest
Jun 3rd, 2024
# Simple Multi-Step Prompt For Dataset Augmentation

# Assuming we have a function `chatbot` that takes a prompt (str) and returns
# the chatbot's response as a single string.
def chatbot(prompt):
    # Placeholder for the actual method of interacting with the chatbot.
    raise NotImplementedError("Replace this with a real call to your chatbot.")


def parse_bullets(response):
    # Split a bulleted response into one clean string per bullet point.
    # Leading "-", "*", "•" and spaces are stripped from each line.
    lines = (line.strip().lstrip("-*• ").strip() for line in response.splitlines())
    return [line for line in lines if line]

# Step 1: Create a broad, comprehensive overview prompt.
# **Output example**: list of starting points: philosophy, math, physics, computer science, English, ...
first_global_prompt = "Give a detailed bullet-point list of all possible subjects in the universe."
first_global_list = parse_bullets(chatbot(first_global_prompt))

# Step 2: For each subject, branch out into a detailed list of topics
# (a descending tree built from a chain of thought).
# **Output example**: list of branches: easy guide to learning English, most common mistakes in English, etc.
start_point_branch_list = []
for bullet_point in first_global_list:
    start_point_branch_prompt = "Give a detailed list in bullet points of possible topics on the subject: " + bullet_point
    start_point_branch_list += parse_bullets(chatbot(start_point_branch_prompt))

# Step 3: Create tasks specific to each topic.
# **Output example**: list of possible augmented generation tasks: translation from English to French, summarization, information retrieval, multiple-choice question answering evaluation, language modeling, reading comprehension
task_specific_prompts = []
for point_branch in start_point_branch_list:
    task_specific_prompt = "List the possible augmented generation tasks for the topic: " + point_branch
    task_specific_prompts += parse_bullets(chatbot(task_specific_prompt))

# Step 4: Perform data augmentation planning for each topic.
# Results are collected in a list instead of being overwritten on every iteration.
data_augmentation_results = []
for point_branch in start_point_branch_list:
    data_augmentation_prompt = "Identify and determine the relevant fields that can be used to generate a plan for data augmentation for the topic: " + point_branch
    data_augmentation_results.append(chatbot(data_augmentation_prompt))

# Step 5: Generate a question and answer prompt from each result.
qa_results = []
for result in data_augmentation_results:
    # The result is appended so that "the following" actually refers to something.
    qa_prompt = "Based on the following, make a question and answer in the format of... **Instruction** [Question] \n [**Input**] \n [**Output**]\n\n" + result
    qa_results.append(chatbot(qa_prompt))

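The steps above fan out quickly: if every answer contains b bullet points, step 2 makes b calls and step 3 makes b^2 calls. The following is a minimal sketch, not part of the original paste, of how the tree expansion could be tried offline with a capped fan-out. `fake_chatbot`, `counting_chatbot`, `expand` and `max_children` are hypothetical names introduced only for this illustration, and the shortened prompts are placeholders.

```python
from typing import Callable, List


def parse_bullets(response: str) -> List[str]:
    """Split a bulleted response into one clean string per bullet."""
    items = []
    for line in response.splitlines():
        text = line.strip().lstrip("-*• ").strip()
        if text:
            items.append(text)
    return items


def expand(chatbot: Callable[[str], str], prompt_prefix: str,
           parents: List[str], max_children: int) -> List[str]:
    """Send one prompt per parent and keep at most `max_children` bullets each."""
    children = []
    for parent in parents:
        bullets = parse_bullets(chatbot(prompt_prefix + parent))
        children.extend(bullets[:max_children])
    return children


def fake_chatbot(prompt: str) -> str:
    """Hypothetical offline stand-in: always answers with three bullets."""
    topic = prompt.rsplit(": ", 1)[-1]
    return "\n".join(f"- {topic} / item {i}" for i in range(1, 4))


calls = []  # every prompt that was sent, to count the number of requests


def counting_chatbot(prompt: str) -> str:
    calls.append(prompt)
    return fake_chatbot(prompt)


# Step 1 (capped at 2 subjects), then steps 2 and 3 (capped at 2 children each).
subjects = parse_bullets(counting_chatbot("List of subjects: everything"))[:2]
topics = expand(counting_chatbot, "Possible topics on the subject: ", subjects, max_children=2)
tasks = expand(counting_chatbot, "Possible tasks for the topic: ", topics, max_children=2)

print(len(subjects), len(topics), len(tasks), len(calls))  # 2 4 8 7
```

Passing the chatbot in as an argument keeps the loop testable without a live model, and the cap bounds how many requests each level can trigger.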
--------------

Raw text to Q&A using an LLM (remove/replace [...]). Source: https://youtu.be/JJ5mcdEIbj8

Return the result in valid JSON format, such as:

```json
[
  {
    "instruction": "Ask question n here.",
    "input": "The same global context, kept for all questions: Touch Rugby International Playing Rules 2020.",
    "output": "The answer."
  },
  {
...
```

[INPUT TEXT]
"Drop-Off A procedure used to determine a Winner following equal the expiration of normal Duration. Duration The length of time a competition match lasts, which is forty-five minutes, inclusive of a five (5) minute Half Time. End of Play When the Referee indicates completion of the match. Exclusion When a player is sent to the nearest Sin Bin Area fol penalties by the Defending Team upon entering their Seven Metre Zone. The player is counted as a player on the Field of Play and cannot be replaced or interchanged. FIT Playing Rules. 5th Edition Touch Football Aust 1FIT Federation of International rouge Field of Play The playing area bounded by the Sidelines and Dea of which are out of bounds. See Appendix 1. Forced Interchange When a player is required to undertake a com an Infringement ruled more serious than a Penalty but less..."

[REQUEST for Q+A]
Provide 5 question and answer pair(s) based on the text above. The answers should borrow, verbatim, from the [INPUT TEXT] above and must incorporate ALL of its information. In providing each question, consider that the reader does not see or have access to any of the other questions for context. Vary the style and format of the questions so that the dataset is high-quality, unique and not redundant. Respond in JSON format on a new line for each question and answer. Do not include question numbers. Here is an example of two question answer pairs:

[
  {
    "instruction": "In the context of Touch Rugby and the International Playing Rules set in 2020, what does the Half refer to?\nThe Half is the player who takes Possession following a Rollball.\n\nIn the context of Touch Football Australia, what is the purpose of the TFA 8th Edition Playing Rules?",
    "input": "Touch Rugby International Playing Rules 2020.",
    "output": "The purpose of the TFA 8th Edition Playing Rules is to provide a standardized set of rules for the sport of touch football that can be applied across all levels of the game, from community competitions to the elite international level."
  },
  {
...

Possibly repeating/redundant.
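
Since the reply is requested as a JSON array of instruction/input/output objects and may contain repeated questions, it can help to validate and de-duplicate it before adding it to a dataset. The following is a rough sketch, not part of the original paste. It assumes the model returns bare JSON (no surrounding text or code fence). `parse_qa_records`, `drop_duplicate_questions` and `raw_reply` are hypothetical names, and the sample reply is hand-written from the [INPUT TEXT] above, not real model output.

```python
import json
from typing import Dict, List

REQUIRED_KEYS = ("instruction", "input", "output")


def parse_qa_records(raw: str) -> List[Dict[str, str]]:
    """Parse a JSON array and keep only objects with all three non-empty string fields."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of question/answer objects")
    records = []
    for item in data:
        if isinstance(item, dict) and all(
            isinstance(item.get(key), str) and item[key].strip()
            for key in REQUIRED_KEYS
        ):
            records.append({key: item[key].strip() for key in REQUIRED_KEYS})
    return records


def drop_duplicate_questions(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first record for each question, ignoring case and extra spaces."""
    seen = set()
    unique = []
    for record in records:
        key = " ".join(record["instruction"].lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hand-written stand-in for a model reply (assumption, for illustration only).
context = "Touch Rugby International Playing Rules 2020."
raw_reply = json.dumps([
    {"instruction": "What does Duration refer to?", "input": context,
     "output": "The length of time a competition match lasts, which is forty-five "
               "minutes, inclusive of a five (5) minute Half Time."},
    {"instruction": "what does  duration refer to?", "input": context,
     "output": "The length of time a competition match lasts."},
    {"instruction": "What does End of Play mean?", "input": context,
     "output": "When the Referee indicates completion of the match."},
    {"instruction": "A record with no output field", "input": context},
])

records = parse_qa_records(raw_reply)          # the malformed fourth object is dropped
unique_records = drop_duplicate_questions(records)
print(len(records), len(unique_records))       # 3 2
```

Exact-match de-duplication only catches questions that differ in case or spacing. Paraphrased duplicates would need a similarity measure, which is outside the scope of this sketch.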