Untitled

a guest
Oct 20th, 2019
  1. At Shopee, we strive to ensure fairness to both buyers and sellers, and improve user experience by identifying and discouraging negative behaviour. Listing quality is a major area where poor behaviours often occur.
  2.  
  3. Every transaction on Shopee starts from a product listing. To get more sales, some sellers engage in behaviour that increases their listings’ exposure and gives them an unfair advantage over other shops. One example is keyword spam, where sellers put irrelevant keywords in the listing title that do not accurately describe the products they are selling. For instance, the product title claims that the listing is for “pants, shirt, shoes”, while the item actually being sold is just a pair of pants. Sellers do this in the hope that their listings will also appear in the search results when buyers search for “shirt” or “shoes”.
  4.  
  5. Spamming irrelevant keywords in the title may confuse the search engine and reduce the accuracy of search results, resulting in a poor user experience. It is therefore important to identify, punish and deter such behaviour on Shopee.
  6.  
  7. At the same time, we also need to consider the case where sellers put multiple product keywords in the listing title and those keywords are all relevant to the product. For example, the underlying product is a pair of shoes, and the seller describes it in the listing title as "shoes, sneakers". In this case, the seller is trying to increase their search exposure but is not using a misleading product title, and therefore should not be penalized.
  8.  
  9. While it is important to deter negative behaviour, it is also very important to avoid wrongly discouraging positive behaviour.
  10.  
  11. Task:
  12.  
  13. Using the keyword directory, identify the product groups that are present in the product title.
  14.  
  15. Example:
  16.  
  17. Keyword list:
  18. Group: 0, Keywords: jacket
  19. Group: 1, Keywords: windbreaker, raincoat
  20. Product title:
  21. Index: 0, Name: red jacket windbreaker
  22. The product title contains keywords from both group 0 and group 1.
  23. --> groups found: [0,1]
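The example above can be reproduced with a minimal sketch. It assumes single-word keywords and simple whitespace tokenisation; the names `keyword_groups` and `find_groups` are hypothetical and not part of the task files.

```python
# Keyword directory from the example: group index -> keywords in that group.
keyword_groups = {
    0: ["jacket"],
    1: ["windbreaker", "raincoat"],
}

def find_groups(title, keyword_groups):
    """Return the sorted group indices whose keywords appear in the title.

    Assumption: every keyword is a single word, so splitting the title
    on whitespace is enough for this illustration.
    """
    words = title.lower().split()
    found = []
    for group, keywords in sorted(keyword_groups.items()):
        if any(keyword in words for keyword in keywords):
            found.append(group)
    return found

print(find_groups("red jacket windbreaker", keyword_groups))  # prints [0, 1]
```

Multi-word keywords and the substring rule described under Input need more than this; the sketch only covers the basic lookup.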
  24.  
  25. Input
  26.  
  27. 1. Extra Material 2 - keyword list_with substring.csv: List of product keywords, separated into product groups. Each row is a product group.
  28.  
  29. The same keyword may appear in multiple groups (eg. notebook)
  30. Some of the keywords are substrings of other keywords. In this case, the longer word should take priority over the substring.
  31. 2. Keyword_spam_question.csv: File containing the product names from which you need to extract the product keyword groups.
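One possible way to give the longer keyword priority over its substring is a regular expression whose alternatives are sorted longest first. This is only a sketch under assumptions: `match_keywords` is a hypothetical helper, matching is case-insensitive, and keywords are matched on word boundaries.

```python
import re

def match_keywords(title, keywords):
    """Return the keywords found in the title, preferring longer keywords.

    Python's regex alternation tries alternatives left to right, so
    listing longer keywords first lets 'printer toner' win over
    'printer' when both could match at the same position.
    """
    ordered = sorted((k.lower() for k in keywords), key=len, reverse=True)
    if not ordered:
        return []
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in ordered) + r")\b"
    )
    # findall returns non-overlapping matches in left-to-right order.
    return pattern.findall(title.lower())

print(match_keywords("Printer toner wallpaper ink",
                     ["printer", "printer toner", "wallpaper"]))
# prints ['printer toner', 'wallpaper']
```

Because matches do not overlap, the shorter keyword "printer" is never reported once "printer toner" has consumed that part of the title.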
  32.  
  33. Further Details
  34.  
  35. You will be given a directory of product keywords, organized into keyword groups. The .csv file provided will have 2 columns:
  36.  
  37. Group: arbitrary index of the product keyword grouping
  38.  
  39. Keywords: product keyword.
  40.  
  41. Keywords on the same row denote words that can refer to the same product, and therefore should be considered the same product type (eg. raincoat and windbreaker can refer to the same product)
  42. Keywords on different rows denote words that refer to different product types (eg. shirt and raincoat refer to different product types)
  43. One keyword may appear in multiple groups (eg. notebook could refer to a computing product or a stationery item)
  44. Note: you do not need to look into the correctness of the grouping, and should use it as-is.
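A hedged sketch of loading the directory follows. It assumes the header row is exactly `Group,Keywords` and that several keywords in one row are separated by commas inside the Keywords cell, as in the example keyword list; the real file may be laid out differently. `load_keyword_groups` and `build_index` are hypothetical names.

```python
import csv
import io

def load_keyword_groups(fileobj):
    """Read rows of (Group, Keywords) into {group index: [keywords]}.

    Assumption: keywords in one cell are comma-separated.
    """
    groups = {}
    for row in csv.DictReader(fileobj):
        keywords = [k.strip().lower()
                    for k in row["Keywords"].split(",") if k.strip()]
        groups[int(row["Group"])] = keywords
    return groups

def build_index(groups):
    """Invert the directory: keyword -> set of groups containing it."""
    index = {}
    for group, keywords in groups.items():
        for keyword in keywords:
            index.setdefault(keyword, set()).add(group)
    return index

# In-memory stand-in for the CSV file, using the example keyword list.
sample = io.StringIO('Group,Keywords\n0,jacket\n1,"windbreaker, raincoat"\n')
groups = load_keyword_groups(sample)
index = build_index(groups)
print(index["raincoat"])  # prints {1}
```

The inverted index makes it easy to see which keywords belong to more than one group, since their set has more than one element.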
  45. Using the keyword directory, you need to identify the product groups that are present in the product title. If two product groups are equally valid choices for the result (that is, choosing either one gives the same number of product groups), choose the group with the smaller index.
  46.  
  47. Eg 1: 'White netbook, ultrabook and gaming mousepad' should contain product groups [77, 85], because keyword 'netbook' is in group 77; keyword 'ultrabook' is also in group 77; keyword 'gaming mousepad' is in group 85.
  48. Eg 2: 'Beautiful red notebook shirt jeans' should contain product groups [6, 29, 77], because keyword 'notebook' is in groups 77 and 204; keyword 'shirt' is in group 29; keyword 'jeans' is in group 6. Since using group 77 or group 204 will both result in 3 product groups, we will choose group 77 due to the smaller index.
  49. Eg 3: 'Printer toner wallpaper ink' should contain product groups [81, 182], because keyword 'Printer toner' is in group 81; keyword 'wallpaper' is in group 182. Even though keyword 'Printer' is in another group (79), it is a substring of 'Printer toner'. Therefore 'Printer toner' takes priority over 'Printer'.
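The three examples can be checked with the sketch below. It is a heuristic reading of the rules, not a confirmed reference solution: keywords that belong to one group are resolved first, and a keyword that belongs to several groups reuses a group already chosen if possible, otherwise takes its smallest index. `INDEX` contains only the group memberships stated in the examples; `match_keywords` and `resolve_groups` are hypothetical names.

```python
import re

# Keyword -> groups, taken only from the three examples above.
INDEX = {
    "netbook": {77}, "ultrabook": {77}, "gaming mousepad": {85},
    "notebook": {77, 204}, "shirt": {29}, "jeans": {6},
    "printer toner": {81}, "printer": {79}, "wallpaper": {182},
}

def match_keywords(title, index):
    """Find keywords in the title, longer keywords taking priority."""
    ordered = sorted(index, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in ordered) + r")\b"
    )
    return pattern.findall(title.lower())

def resolve_groups(title, index):
    """Pick one group per matched keyword (heuristic, see lead-in)."""
    candidates = [index[kw] for kw in match_keywords(title, index)]
    chosen = set()
    # First pass: keywords with a single possible group.
    for groups in candidates:
        if len(groups) == 1:
            chosen |= groups
    # Second pass: ambiguous keywords. If one of their groups is already
    # chosen, nothing new is needed; otherwise take the smallest index.
    for groups in candidates:
        if len(groups) > 1 and not (groups & chosen):
            chosen.add(min(groups))
    return sorted(chosen)

for title in ("White netbook, ultrabook and gaming mousepad",
              "Beautiful red notebook shirt jeans",
              "Printer toner wallpaper ink"):
    print(resolve_groups(title, INDEX))
# prints [77, 85], then [6, 29, 77], then [81, 182]
```

With several ambiguous keywords in one title, this greedy pass is not guaranteed to find the smallest possible set of groups; it is only meant to illustrate the tie-break on the smaller index.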