Untitled

a guest
Nov 27th, 2024
-- Two Million Dataset

-- Load every newline-delimited JSON post file, keep one author's posts,
-- and derive the public bsky.app URL for each post.
CREATE OR REPLACE TABLE infringing_data AS
WITH data AS (
    SELECT *
    FROM read_json('data/posts_*.jsonl', format = 'newline_delimited')
)
SELECT
    *,
    -- The record key is the last path segment of the at:// URI.
    'https://bsky.app/profile/<did>/post/'
        || regexp_extract(uri, 'at://[^/]+/app\.bsky\.feed\.post/([a-z0-9]+)', 1) AS post_url
FROM data
WHERE author = '<did>';

-- Export only the post URLs to CSV.
COPY (SELECT post_url FROM infringing_data) TO 'data/infringing_data.csv';