# Natural Language Queries using the semparse demo

The current NLQ demo, which I am calling `semparse` in this document, is not ready for prime time. It gives poor results on common non-NL queries (e.g. "Deep Learning"). We don't have good data on how it performs on NL queries, but anecdotally we've seen it give useful results. As I see it, we have three options for using the existing component in production:

* Improve the `semparse` component to match the current customer experience.
* Create a labs.semanticscholar.org site to allow customers to opt in to semantic parsing.
* Use prefix filtering to opt customers into semantic parsing.

All three options have widely varying costs and customer impact.

## Recommended option

### "Labs" search landing page

I recommend we create a new landing page, or separate site, for natural language queries. This option is relatively cheap and has no impact on existing customers.

This solution would be composed of three new components.

1. A `semparse` subproject in SBT (see the build sketch after this list).
    1. Add a build step to make the serialized CcgParser.
    1. Fix up the dependencies (e.g. jklol.jar).
    1. Trim the code to the bare essentials.
2. A new page, similar to the homepage, for NL queries.
    1. A new `nql.jsx` page that contains a search bar.
    1. Routes each search through `semparse`.
    1. (Optional:) Constrains other searches to use `semparse` via a cookie or new SERP.
3. (Optional:) A route for the new URL.
    1. An ALB or Nginx route for the new URL to the new page.
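
For the first component, the SBT wiring could look roughly like the sketch below. It assumes a multi-project `build.sbt`; the project layout, the `ccgParser.ser` file name, the `server` project name, and the placeholder serialization step are all illustrative assumptions, not the actual S2 build.

```scala
// build.sbt (sketch) — wire in a semparse subproject.
lazy val semparse = (project in file("semparse"))
  .settings(
    name := "semparse",
    // jklol.jar dropped into semparse/lib/ (sbt's default unmanagedBase)
    // is picked up as an unmanaged dependency with no extra configuration.
    // Build step: serialize the CcgParser so the app can load it from the
    // classpath at startup instead of rebuilding it on boot.
    resourceGenerators in Compile += Def.task {
      val out = (resourceManaged in Compile).value / "ccgParser.ser"
      // Placeholder: the real task would write the serialized CcgParser here.
      IO.write(out, "placeholder serialized CcgParser model")
      Seq(out)
    }.taskValue
  )

// The existing web-app project would then depend on it, e.g.:
// lazy val server = (project in file("server")).dependsOn(semparse)
```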

#### Pros
* Cheap.
* Preserves existing search experience.

#### Cons
* Does nothing to directly improve the semantic parser.
* Adds some (~300MB) memory pressure to the existing S2 app.
* Potentially hard for customers to find and try.
* No way to measure success.

## Other options considered

### Production-ize demo parser

This option is very ambiguous. As an engineer with no prior NLU experience, I don't have great insight into how we would address the parser's problems. I believe this is a large research project already being pursued by other folks at AI2, so we may be able to leverage that work.

#### Pros
* Deliberate, measurable improvements to the parser.

#### Cons
* Expensive, potentially very expensive.
* Ambiguous.
* Requires investment in A/B testing, query scoring, etc.

### Prefix filtering

Simply funnel any query that begins with one of a predefined set of prefixes (e.g. "Who", "What", "How", ...) to the `semparse` module; otherwise, execute the normal query flow.
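
Only the prefix test itself is prescribed by this option. The Scala sketch below shows one way the routing could look; the `SearchBackend` trait, the placeholder result type, and the exact prefix list are illustrative assumptions rather than actual S2 components.

```scala
// Hypothetical router for the prefix-filtering option. The question-word
// prefixes come from this proposal; everything else is a stand-in.
trait SearchBackend {
  def search(query: String): String // placeholder result type
}

class PrefixQueryRouter(semparse: SearchBackend, default: SearchBackend) {
  // Trailing spaces keep queries like "whole genome sequencing" from
  // matching the "who" prefix and accidentally opting in.
  private val nlPrefixes = Seq("who ", "what ", "how ", "when ", "why ", "which ")

  def isLikelyNaturalLanguage(query: String): Boolean = {
    val normalized = query.trim.toLowerCase
    nlPrefixes.exists(p => normalized.startsWith(p))
  }

  // Route NL-looking queries to semparse; everything else follows the
  // normal query flow.
  def route(query: String): String =
    if (isLikelyNaturalLanguage(query)) semparse.search(query)
    else default.search(query)
}
```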

#### Pros
* Cheapest option.
* Most queries will be unaffected.

#### Cons
* Hard for customers to discover.
* May accidentally catch some non-NL queries.
* No way to measure success.
* Adds some (~300MB) memory pressure to the existing S2 app.
* Does nothing to directly improve the semantic parser.