Oct 17th, 2018
REGISTER 's3://bucket_name/udf/utils.py' using jython as utils;

-- Cogroup the email events and segments.
-- Each record holds the group key, a bag of email_event rows and a bag of segments rows.
grp_email_segments = cogroup email_event by subscriber_key, segments by customer_id;

-- Keep only the groups that contain at least one email event
with_email = filter grp_email_segments by not IsEmpty(email_event);

-- Groups with no matching segment; you may want these later, if just for error checking
without_segment = filter grp_email_segments by IsEmpty(segments);

-- Flatten everything: one row per (email event, segment) pair in each group.
-- Groups whose segments bag is empty produce no rows here.
email_segment_dates = foreach with_email generate
    flatten(email_event),
    flatten(segments);

-- Run the range UDF.
-- After flatten, fields are prefixed with their source alias using "::".
-- Run "describe email_segment_dates;" to confirm the exact names.
email_segments = foreach email_segment_dates generate
    email_event::subscriber_key,
    email_event::send_id,
    ...
    utils.is_contained_within(email_event::date, segments::start_date, segments::end_date) as is_contained;

-- Keep only the rows where the email date falls inside the segment's date range
emails_with_segments = filter email_segments by (is_contained == 1);
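
The script calls utils.is_contained_within but the paste does not include utils.py. What follows is only a minimal sketch of what such a Jython UDF might look like, not the original author's code. It assumes the three date fields arrive as ISO-8601 strings (YYYY-MM-DD), so that plain string comparison orders them correctly, and that the range is inclusive at both ends. The schema name is_contained:int and the None handling are also assumptions. It returns 1 or 0 so that the final filter on is_contained == 1 works.

```python
# utils.py (hypothetical contents; the original file is not shown in the paste)

# Pig defines the outputSchema decorator when it loads a Jython UDF file.
# The fallback below only exists so the file can also be run outside Pig.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def decorator(func):
            return func
        return decorator


@outputSchema("is_contained:int")
def is_contained_within(date, start_date, end_date):
    """Return 1 if start_date <= date <= end_date, otherwise 0.

    Assumes all three values are ISO-8601 date strings (YYYY-MM-DD),
    which compare correctly as plain strings. Any missing value yields 0.
    """
    if date is None or start_date is None or end_date is None:
        return 0
    if start_date <= date <= end_date:
        return 1
    return 0
```

If the dates are stored in another format, they would need to be parsed before comparing; the string comparison above is only valid for zero-padded year-month-day values.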