petrogradphilosopher

Removing watermarks from a PDF document under OS X

Nov 29th, 2016
#!/bin/sh

# USAGE: removewatermark file.pdf
# Writes the result to $OUTDIR/file.pdf

# How to use this script to remove watermarks from PDF documents on OS X (macOS):
#
# 1. Install qpdf with MacPorts: port install qpdf
#
# 2. Normally, you would also run "port install pdftk", but that doesn't work
# on OS X 10.11+ as of 2016-11, so instead download the installer from pdflabs
# referenced in http://stackoverflow.com/a/33248310.
#
# 3. Write a program called "stripwm" that reads the uncompressed PDF from
# standard input, removes all watermarks, and writes the result to standard
# output. Save it in the current directory and make it executable, or change
# the STRIPWM variable to point to its location. See below for an example.
#
# References:
#
# http://superuser.com/questions/455462/how-to-remove-a-watermark-from-a-pdf-file

# Require exactly one argument: the PDF to clean.
[ $# -eq 1 ] || { echo "usage: $0 file.pdf" >&2; exit 1; }

OUTDIR=fixed
STRIPWM=./stripwm
DECRYPTED=$(mktemp /tmp/fixit.XXXXXX)
UNCOMPRESSED=$(mktemp /tmp/fixit.XXXXXX)
DONE=$(mktemp /tmp/fixit.XXXXXX)
# Remove the temporary files when the script exits.
trap 'rm -f "${DECRYPTED}" "${UNCOMPRESSED}" "${DONE}"' EXIT

# Create the output directory if it does not exist yet.
mkdir -p "${OUTDIR}"

# Decrypt, then uncompress the streams so the watermark can be matched as text.
qpdf --decrypt "$1" "${DECRYPTED}"
pdftk "${DECRYPTED}" output "${UNCOMPRESSED}" uncompress
"${STRIPWM}" <"${UNCOMPRESSED}" >"${DONE}"
# Recompress. basename drops any directory part of $1 so the result lands in $OUTDIR.
pdftk "${DONE}" output "${OUTDIR}/$(basename "$1")" compress

# Here is an example stripwm that removes watermarks from MAA books. It sits in
# a quoted here-document fed to ":", so the shell skips it without expanding or
# running anything. Copy the body into ./stripwm to use it.

: <<'STRIPWM'
#!/usr/bin/perl

# Read the whole PDF into $_ so the pattern can match across lines.
$_ = join("", <>);

# Empty every stream that holds only the watermark text block (the one whose
# Tj operand contains "Your Name"), keeping the object header ($1) and the
# endstream/endobj trailer ($2).
s~
(\n\d+\s+\d+\s+obj\s+
<<\s+
/Length\s+\d+\s+
>>\s+
stream\s+)
Q\s+
q\s+
BT\s+
0\s+0\s+0\s+rg\s+
[^\n]*Tf\s+
[^\n]+Tm\s+
[^\n]*Your\s+Name[^\n]*Tj\s+
[^\n]*Tm\s+
ET\s+
Q\s+
(endstream\s+
endobj[^\S\n]*\n)~$1$2~sgx;

print;
STRIPWM
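
Before running a stripwm filter on a whole book, it can help to try it on a tiny synthetic input. The sketch below is not part of the original script. The function names make_sample and check_stripwm, the object number, the font name and the coordinates are all made up for illustration. The sample only mimics the operator sequence that the Perl pattern above expects, and it is not a valid PDF.

```shell
#!/bin/sh
# Hypothetical self-check for a stripwm filter. All names and sample
# values here are illustrative assumptions, not part of the original script.

# Print a fragment shaped like one uncompressed watermark object.
make_sample() {
    printf '%s\n' \
        '%PDF-1.4' \
        '12 0 obj' \
        '<<' \
        '/Length 99' \
        '>>' \
        'stream' \
        'Q' \
        'q' \
        'BT' \
        '0 0 0 rg' \
        '/F1 8 Tf' \
        '1 0 0 1 72 20 Tm' \
        '(Licensed to Your Name) Tj' \
        '1 0 0 1 0 0 Tm' \
        'ET' \
        'Q' \
        'endstream' \
        'endobj'
}

# Run the filter given as $1 on the sample. Succeed only if the watermark
# text is gone and the enclosing object is still closed by "endobj".
check_stripwm() {
    filter=$1
    out=$(make_sample | "$filter") || return 1
    if printf '%s\n' "$out" | grep -q 'Your Name'; then
        return 1
    fi
    printf '%s\n' "$out" | grep -q '^endobj'
}
```

A possible use, assuming the filter was saved as ./stripwm: check_stripwm ./stripwm && echo "filter looks sane". Passing this check says nothing about other watermark layouts, so inspecting the uncompressed PDF by hand is still the way to find the right pattern.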
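
Steps 1 and 2 in the script header assume that qpdf and pdftk end up on PATH. A small preflight check along the following lines could report missing tools before any temporary files are created. This is only a sketch: missing_tools and preflight are hypothetical helper names, not part of the original script.

```shell
#!/bin/sh
# Hypothetical preflight helpers; the function names are illustrative.

# Print the name of each listed command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the two external tools the script calls. Fails if any is missing.
preflight() {
    missing=$(missing_tools qpdf pdftk)
    if [ -n "$missing" ]; then
        # Unquoted on purpose: joins the names onto one line.
        echo "missing required tools:" $missing >&2
        return 1
    fi
}
```

Calling preflight || exit 1 near the top of the script would stop early with a readable message instead of failing halfway through the pipeline.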