Guest User

ineptpdf 8.4.48

a guest
Nov 5th, 2010
11,702
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
Python 114.72 KB | None | 0 0
  1. #! /usr/bin/python
  2.  
  3. # ineptpdf8.4.48.pyw
  4. # ineptpdf, version 8.4.48
  5.  
  6. # To run this program install Python 2.7 from http://www.python.org/download/
  7. #
  8. # PyCrypto from http://www.voidspace.org.uk/python/modules.shtml#pycrypto
  9. #
  10. # and PyWin Extension (Win32API module) from
  11. # http://sourceforge.net/projects/pywin32/files/
  12. #
  13. # Make sure to install the dedicated versions for Python 2.7.
  14. #
  15. # It's recommended to use the 32-Bit Python Windows versions (even with a 64-bit
  16. # Windows system).
  17. #
  18. # Save this script file as
  19. # ineptpdf8.4.48.pyw and double-click on it to run it.
  20.  
  21. # Revision history:
  22. #   1 - Initial release
  23. #   2 - Improved determination of key-generation algorithm
  24. #   3 - Correctly handle PDF >=1.5 cross-reference streams
  25. #   4 - Removal of ciando's personal ID (anon)
  26. #   5 - removing small bug with V3 ebooks (anon)
  27. #   6 - changed to adeptkey4.der format for 1.7.2 support (anon)
  28. #   6.1 - backward compatibility for 1.7.1 and old adeptkey.der (anon)
  29. #   7 - Get cross reference streams and object streams working for input.
  30. #       Not yet supported on output but this only effects file size,
  31. #       not functionality. (anon2)
  32. #   7.1 - Correct a problem when an old trailer is not followed by startxref (anon2)
  33. #   7.2 - Correct malformed Mac OS resource forks for Stanza
  34. #       - Support for cross ref streams on output (decreases file size) (anon2)
  35. #   7.3 - Correct bug in trailer with cross ref stream that caused the error (anon2)
  36. #         "The root object is missing or invalid" in Adobe Reader.
  37. #   7.4 - Force all generation numbers in output file to be 0, like in v6.
  38. #         Fallback code for wrong xref improved (search till last trailer
  39. #         instead of first) (anon2)
  40. #   8 - fileopen user machine identifier support (Tetrachroma)
  41. #   8.1 - fileopen user cookies support (Tetrachroma)
  42. #   8.2 - fileopen user name/password support (Tetrachroma)
  43. #   8.3 - fileopen session cookie support (Tetrachroma)
  44. #   8.3.1 - fix for the "specified key file does not exist" error (Tetrachroma)
  45. #   8.3.2 - improved server result parsing (Tetrachroma)
  46. #   8.4 - Ident4D and encrypted Uuid support (Tetrachroma)
  47. #   8.4.1 - improved MAC address processing (Tetrachroma)
  48. #   8.4.2 - FowP3Uuid fallback file processing (Tetrachroma)
  49. #   8.4.3 - improved user/password pdf file detection (Tetrachroma)
  50. #   8.4.4 - small bugfix (Tetrachroma)
  51. #   8.4.5 - improved cookie host searching (Tetrachroma)
  52. #   8.4.6 - STRICT parsing disabled (non-standard pdf processing) (Tetrachroma)
  53. #   8.4.7 - UTF-8 input file conversion (Tetrachroma)
  54. #   8.4.8 - fix for more rare utf8 problems (Tetrachroma)
  55. #   8.4.9 - solution for utf8 in comination with
  56. #           ident4id method (Tetrachroma)
  57. #   8.4.10 - line feed processing, non c system drive patch, nrbook support (Tetrachroma)
  58. #   8.4.11 - alternative ident4id calculation (Tetrachroma)
  59. #   8.4.12 - fix for capital username characters and
  60. #            other unusual user login names (Tetrachroma & ZeroPoint)
  61. #   8.4.13 - small bug fixes (Tetrachroma)
  62. #   8.4.14 - fix for non-standard-conform fileopen pdfs (Tetrachroma)
  63. #   8.4.15 - 'bad file descriptor'-fix (Tetrachroma)
  64. #   8.4.16 - improves user/pass detection (Tetrachroma)
  65. #   8.4.17 - fix for several '=' chars in a DPRM entity (Tetrachroma)
  66. #   8.4.18 - follow up bug fix for the DPRM problem,
  67. #            more readable error messages (Tetrachroma)
  68. #   8.4.19 - 2nd fix for 'bad file descriptor' problem (Tetrachroma)
  69. #   8.4.20 - follow up patch (Tetrachroma)
  70. #   8.4.21 - 3rd patch for 'bad file descriptor' (Tetrachroma)
  71. #   8.4.22 - disable prints for exception prevention (Tetrachroma)
  72. #   8.4.23 - check for additional security attributes (Tetrachroma)
  73. #   8.4.24 - improved cookie session support (Tetrachroma)
  74. #   8.4.25 - more compatibility with unicode files (Tetrachroma)
  75. #   8.4.26 - automated session/user cookie request function (works
  76. #            only with Firefox 3.x+) (Tetrachroma)
  77. #   8.4.27 - user/password fallback
  78. #   8.4.28 - AES decryption, improved misconfigured pdf handling,
  79. #            limited experimental APS support (Tetrachroma & Neisklar)
  80. #   8.4.29 - backport for bad formatted rc4 encrypted pdfs (Tetrachroma)
  81. #   8.4.30 - extended authorization attributes support (Tetrachroma)
  82. #   8.4.31 - improved session cookie and better server response error
  83. #            handling (Tetrachroma)
  84. #   8.4.33 - small cookie optimizations (Tetrachroma)
  85. #   8.4.33 - debug output option (Tetrachroma)
  86. #   8.4.34 - better user/password management
  87. #            handles the 'AskUnp' response) (Tetrachroma)
  88. #   8.4.35 - special handling for non-standard systems (Tetrachroma)
  89. #   8.4.36 - previous machine/disk handling [PrevMach/PrevDisk] (Tetrachroma)
  90. #   8.4.36 - FOPN_flock support (Tetrachroma)
  91. #   8.4.37 - patch for unicode paths/filenames (Tetrachroma)
  92. #   8.4.38 - small fix for user/password dialog (Tetrachroma)
  93. #   8.4.39 - sophisticated request mode differentiation, forced
  94. #            uuid calculation (Tetrachroma)
  95. #   8.4.40 - fix for non standard server responses (Tetrachroma)
  96. #   8.4.41 - improved user/password request windows,
  97. #            better server response tolerance (Tetrachroma)
  98. #   8.4.42 - improved nl/cr server response parsing (Tetrachroma)
  99. #   8.4.43 - fix for user names longer than 13 characters and special
  100. #            uuid encryption (Tetrachroma)
  101. #   8.4.44 - another fix for ident4d problem (Tetrachroma)
  102. #   8.4.45 - 2nd fix for ident4d problem (Tetrachroma)
  103. #   8.4.46 - script cleanup and optimizations (Tetrachroma)
  104. #   8.4.47 - script identification change to Adobe Reader (Tetrachroma)
  105. #   8.4.48 - improved tolerance for false file/registry entries (Tetrachroma)
  106.  
  107. """
  108. Decrypts Adobe ADEPT-encrypted and Fileopen PDF files.
  109. """
  110.  
  111. from __future__ import with_statement
  112.  
  113. __license__ = 'GPL v3'
  114.  
  115. import sys
  116. import os
  117. import re
  118. import zlib
  119. import struct
  120. import hashlib
  121. from itertools import chain, islice
  122. import xml.etree.ElementTree as etree
  123. import Tkinter
  124. import Tkconstants
  125. import tkFileDialog
  126. import tkMessageBox
  127. # added for fileopen support
  128. import urllib
  129. import urlparse
  130. import time
  131. import socket
  132. import string
  133. import uuid
  134. import subprocess
  135. import time
  136. import getpass
  137. from ctypes import *
  138. import traceback
  139. import inspect
  140. import tempfile
  141. import sqlite3
  142. try:
  143.     from Crypto.Cipher import ARC4
  144.     # needed for newer pdfs
  145.     from Crypto.Cipher import AES
  146.     from Crypto.Hash import SHA256
  147.     from Crypto.PublicKey import RSA
  148.    
  149. except ImportError:
  150.     ARC4 = None
  151.     RSA = None
  152. try:
  153.     from cStringIO import StringIO
  154. except ImportError:
  155.     from StringIO import StringIO
  156.  
  157. class ADEPTError(Exception):
  158.     pass
  159.  
  160. # global variable (needed for fileopen and password decryption)
  161. INPUTFILEPATH = ''
  162. KEYFILEPATH = ''
  163. PASSWORD = ''
  164. DEBUG_MODE = False
  165. IVERSION = '8.4.48'
  166.  
  167. # Do we generate cross reference streams on output?
  168. # 0 = never
  169. # 1 = only if present in input
  170. # 2 = always
  171.  
  172. GEN_XREF_STM = 1
  173.  
  174. # This is the value for the current document
  175. gen_xref_stm = False # will be set in PDFSerializer
  176.  
  177. ###
  178. ### ASN.1 parsing code from tlslite
  179.  
  180. def bytesToNumber(bytes):
  181.     total = 0L
  182.     for byte in bytes:
  183.         total = (total << 8) + byte
  184.     return total
  185.  
  186. class ASN1Error(Exception):
  187.     pass
  188.  
  189. class ASN1Parser(object):
  190.     class Parser(object):
  191.         def __init__(self, bytes):
  192.             self.bytes = bytes
  193.             self.index = 0
  194.    
  195.         def get(self, length):
  196.             if self.index + length > len(self.bytes):
  197.                 raise ASN1Error("Error decoding ASN.1")
  198.             x = 0
  199.             for count in range(length):
  200.                 x <<= 8
  201.                 x |= self.bytes[self.index]
  202.                 self.index += 1
  203.             return x
  204.    
  205.         def getFixBytes(self, lengthBytes):
  206.             bytes = self.bytes[self.index : self.index+lengthBytes]
  207.             self.index += lengthBytes
  208.             return bytes
  209.    
  210.         def getVarBytes(self, lengthLength):
  211.             lengthBytes = self.get(lengthLength)
  212.             return self.getFixBytes(lengthBytes)
  213.    
  214.         def getFixList(self, length, lengthList):
  215.             l = [0] * lengthList
  216.             for x in range(lengthList):
  217.                 l[x] = self.get(length)
  218.             return l
  219.    
  220.         def getVarList(self, length, lengthLength):
  221.             lengthList = self.get(lengthLength)
  222.             if lengthList % length != 0:
  223.                 raise ASN1Error("Error decoding ASN.1")
  224.             lengthList = int(lengthList/length)
  225.             l = [0] * lengthList
  226.             for x in range(lengthList):
  227.                 l[x] = self.get(length)
  228.             return l
  229.    
  230.         def startLengthCheck(self, lengthLength):
  231.             self.lengthCheck = self.get(lengthLength)
  232.             self.indexCheck = self.index
  233.    
  234.         def setLengthCheck(self, length):
  235.             self.lengthCheck = length
  236.             self.indexCheck = self.index
  237.    
  238.         def stopLengthCheck(self):
  239.             if (self.index - self.indexCheck) != self.lengthCheck:
  240.                 raise ASN1Error("Error decoding ASN.1")
  241.    
  242.         def atLengthCheck(self):
  243.             if (self.index - self.indexCheck) < self.lengthCheck:
  244.                 return False
  245.             elif (self.index - self.indexCheck) == self.lengthCheck:
  246.                 return True
  247.             else:
  248.                 raise ASN1Error("Error decoding ASN.1")
  249.  
  250.     def __init__(self, bytes):
  251.         p = self.Parser(bytes)
  252.         p.get(1)
  253.         self.length = self._getASN1Length(p)
  254.         self.value = p.getFixBytes(self.length)
  255.  
  256.     def getChild(self, which):
  257.         p = self.Parser(self.value)
  258.         for x in range(which+1):
  259.             markIndex = p.index
  260.             p.get(1)
  261.             length = self._getASN1Length(p)
  262.             p.getFixBytes(length)
  263.         return ASN1Parser(p.bytes[markIndex:p.index])
  264.  
  265.     def _getASN1Length(self, p):
  266.         firstLength = p.get(1)
  267.         if firstLength<=127:
  268.             return firstLength
  269.         else:
  270.             lengthLength = firstLength & 0x7F
  271.             return p.get(lengthLength)
  272.  
  273. ###
  274. ### PDF parsing routines from pdfminer, with changes for EBX_HANDLER
  275.  
  276. ##  Utilities
  277. ##
  278. def choplist(n, seq):
  279.     '''Groups every n elements of the list.'''
  280.     r = []
  281.     for x in seq:
  282.         r.append(x)
  283.         if len(r) == n:
  284.             yield tuple(r)
  285.             r = []
  286.     return
  287.  
  288. def nunpack(s, default=0):
  289.     '''Unpacks up to 4 bytes big endian.'''
  290.     l = len(s)
  291.     if not l:
  292.         return default
  293.     elif l == 1:
  294.         return ord(s)
  295.     elif l == 2:
  296.         return struct.unpack('>H', s)[0]
  297.     elif l == 3:
  298.         return struct.unpack('>L', '\x00'+s)[0]
  299.     elif l == 4:
  300.         return struct.unpack('>L', s)[0]
  301.     else:
  302.         return TypeError('invalid length: %d' % l)
  303.  
  304.  
  305. STRICT = 0
  306.  
  307.  
  308. ##  PS Exceptions
  309. ##
  310. class PSException(Exception): pass
  311. class PSEOF(PSException): pass
  312. class PSSyntaxError(PSException): pass
  313. class PSTypeError(PSException): pass
  314. class PSValueError(PSException): pass
  315.  
  316.  
  317. ##  Basic PostScript Types
  318. ##
  319.  
  320. # PSLiteral
  321. class PSObject(object): pass
  322.  
  323. class PSLiteral(PSObject):
  324.     '''
  325.    PS literals (e.g. "/Name").
  326.    Caution: Never create these objects directly.
  327.    Use PSLiteralTable.intern() instead.
  328.    '''
  329.     def __init__(self, name):
  330.         self.name = name
  331.         return
  332.    
  333.     def __repr__(self):
  334.         name = []
  335.         for char in self.name:
  336.             if not char.isalnum():
  337.                 char = '#%02x' % ord(char)
  338.             name.append(char)
  339.         return '/%s' % ''.join(name)
  340.  
  341. # PSKeyword
  342. class PSKeyword(PSObject):
  343.     '''
  344.    PS keywords (e.g. "showpage").
  345.    Caution: Never create these objects directly.
  346.    Use PSKeywordTable.intern() instead.
  347.    '''
  348.     def __init__(self, name):
  349.         self.name = name
  350.         return
  351.    
  352.     def __repr__(self):
  353.         return self.name
  354.  
  355. # PSSymbolTable
  356. class PSSymbolTable(object):
  357.    
  358.     '''
  359.    Symbol table that stores PSLiteral or PSKeyword.
  360.    '''
  361.    
  362.     def __init__(self, classe):
  363.         self.dic = {}
  364.         self.classe = classe
  365.         return
  366.    
  367.     def intern(self, name):
  368.         if name in self.dic:
  369.             lit = self.dic[name]
  370.         else:
  371.             lit = self.classe(name)
  372.             self.dic[name] = lit
  373.         return lit
  374.  
  375. PSLiteralTable = PSSymbolTable(PSLiteral)
  376. PSKeywordTable = PSSymbolTable(PSKeyword)
  377. LIT = PSLiteralTable.intern
  378. KWD = PSKeywordTable.intern
  379. KEYWORD_BRACE_BEGIN = KWD('{')
  380. KEYWORD_BRACE_END = KWD('}')
  381. KEYWORD_ARRAY_BEGIN = KWD('[')
  382. KEYWORD_ARRAY_END = KWD(']')
  383. KEYWORD_DICT_BEGIN = KWD('<<')
  384. KEYWORD_DICT_END = KWD('>>')
  385.  
  386.  
  387. def literal_name(x):
  388.     if not isinstance(x, PSLiteral):
  389.         if STRICT:
  390.             raise PSTypeError('Literal required: %r' % x)
  391.         else:
  392.             return str(x)
  393.     return x.name
  394.  
  395. def keyword_name(x):
  396.     if not isinstance(x, PSKeyword):
  397.         if STRICT:
  398.             raise PSTypeError('Keyword required: %r' % x)
  399.         else:
  400.             return str(x)
  401.     return x.name
  402.  
  403.  
  404. ##  PSBaseParser
  405. ##
  406. EOL = re.compile(r'[\r\n]')
  407. SPC = re.compile(r'\s')
  408. NONSPC = re.compile(r'\S')
  409. HEX = re.compile(r'[0-9a-fA-F]')
  410. END_LITERAL = re.compile(r'[#/%\[\]()<>{}\s]')
  411. END_HEX_STRING = re.compile(r'[^\s0-9a-fA-F]')
  412. HEX_PAIR = re.compile(r'[0-9a-fA-F]{2}|.')
  413. END_NUMBER = re.compile(r'[^0-9]')
  414. END_KEYWORD = re.compile(r'[#/%\[\]()<>{}\s]')
  415. END_STRING = re.compile(r'[()\134]')
  416. OCT_STRING = re.compile(r'[0-7]')
  417. ESC_STRING = { 'b':8, 't':9, 'n':10, 'f':12, 'r':13, '(':40, ')':41, '\\':92 }
  418.  
  419. class PSBaseParser(object):
  420.  
  421.     '''
  422.    Most basic PostScript parser that performs only basic tokenization.
  423.    '''
  424.     BUFSIZ = 4096
  425.  
  426.     def __init__(self, fp):
  427.         self.fp = fp
  428.         self.seek(0)
  429.         return
  430.  
  431.     def __repr__(self):
  432.         return '<PSBaseParser: %r, bufpos=%d>' % (self.fp, self.bufpos)
  433.  
  434.     def flush(self):
  435.         return
  436.    
  437.     def close(self):
  438.         self.flush()
  439.         return
  440.    
  441.     def tell(self):
  442.         return self.bufpos+self.charpos
  443.  
  444.     def poll(self, pos=None, n=80):
  445.         pos0 = self.fp.tell()
  446.         if not pos:
  447.             pos = self.bufpos+self.charpos
  448.         self.fp.seek(pos)
  449.         ##print >>sys.stderr, 'poll(%d): %r' % (pos, self.fp.read(n))
  450.         self.fp.seek(pos0)
  451.         return
  452.  
  453.     def seek(self, pos):
  454.         '''
  455.        Seeks the parser to the given position.
  456.        '''
  457.         self.fp.seek(pos)
  458.         # reset the status for nextline()
  459.         self.bufpos = pos
  460.         self.buf = ''
  461.         self.charpos = 0
  462.         # reset the status for nexttoken()
  463.         self.parse1 = self.parse_main
  464.         self.tokens = []
  465.         return
  466.  
  467.     def fillbuf(self):
  468.         if self.charpos < len(self.buf): return
  469.         # fetch next chunk.
  470.         self.bufpos = self.fp.tell()
  471.         self.buf = self.fp.read(self.BUFSIZ)
  472.         if not self.buf:
  473.             raise PSEOF('Unexpected EOF')
  474.         self.charpos = 0
  475.         return
  476.    
  477.     def parse_main(self, s, i):
  478.         m = NONSPC.search(s, i)
  479.         if not m:
  480.             return (self.parse_main, len(s))
  481.         j = m.start(0)
  482.         c = s[j]
  483.         self.tokenstart = self.bufpos+j
  484.         if c == '%':
  485.             self.token = '%'
  486.             return (self.parse_comment, j+1)
  487.         if c == '/':
  488.             self.token = ''
  489.             return (self.parse_literal, j+1)
  490.         if c in '-+' or c.isdigit():
  491.             self.token = c
  492.             return (self.parse_number, j+1)
  493.         if c == '.':
  494.             self.token = c
  495.             return (self.parse_float, j+1)
  496.         if c.isalpha():
  497.             self.token = c
  498.             return (self.parse_keyword, j+1)
  499.         if c == '(':
  500.             self.token = ''
  501.             self.paren = 1
  502.             return (self.parse_string, j+1)
  503.         if c == '<':
  504.             self.token = ''
  505.             return (self.parse_wopen, j+1)
  506.         if c == '>':
  507.             self.token = ''
  508.             return (self.parse_wclose, j+1)
  509.         self.add_token(KWD(c))
  510.         return (self.parse_main, j+1)
  511.                            
  512.     def add_token(self, obj):
  513.         self.tokens.append((self.tokenstart, obj))
  514.         return
  515.    
  516.     def parse_comment(self, s, i):
  517.         m = EOL.search(s, i)
  518.         if not m:
  519.             self.token += s[i:]
  520.             return (self.parse_comment, len(s))
  521.         j = m.start(0)
  522.         self.token += s[i:j]
  523.         # We ignore comments.
  524.         #self.tokens.append(self.token)
  525.         return (self.parse_main, j)
  526.    
  527.     def parse_literal(self, s, i):
  528.         m = END_LITERAL.search(s, i)
  529.         if not m:
  530.             self.token += s[i:]
  531.             return (self.parse_literal, len(s))
  532.         j = m.start(0)
  533.         self.token += s[i:j]
  534.         c = s[j]
  535.         if c == '#':
  536.             self.hex = ''
  537.             return (self.parse_literal_hex, j+1)
  538.         self.add_token(LIT(self.token))
  539.         return (self.parse_main, j)
  540.    
  541.     def parse_literal_hex(self, s, i):
  542.         c = s[i]
  543.         if HEX.match(c) and len(self.hex) < 2:
  544.             self.hex += c
  545.             return (self.parse_literal_hex, i+1)
  546.         if self.hex:
  547.             self.token += chr(int(self.hex, 16))
  548.         return (self.parse_literal, i)
  549.  
  550.     def parse_number(self, s, i):
  551.         m = END_NUMBER.search(s, i)
  552.         if not m:
  553.             self.token += s[i:]
  554.             return (self.parse_number, len(s))
  555.         j = m.start(0)
  556.         self.token += s[i:j]
  557.         c = s[j]
  558.         if c == '.':
  559.             self.token += c
  560.             return (self.parse_float, j+1)
  561.         try:
  562.             self.add_token(int(self.token))
  563.         except ValueError:
  564.             pass
  565.         return (self.parse_main, j)
  566.     def parse_float(self, s, i):
  567.         m = END_NUMBER.search(s, i)
  568.         if not m:
  569.             self.token += s[i:]
  570.             return (self.parse_float, len(s))
  571.         j = m.start(0)
  572.         self.token += s[i:j]
  573.         self.add_token(float(self.token))
  574.         return (self.parse_main, j)
  575.    
  576.     def parse_keyword(self, s, i):
  577.         m = END_KEYWORD.search(s, i)
  578.         if not m:
  579.             self.token += s[i:]
  580.             return (self.parse_keyword, len(s))
  581.         j = m.start(0)
  582.         self.token += s[i:j]
  583.         if self.token == 'true':
  584.             token = True
  585.         elif self.token == 'false':
  586.             token = False
  587.         else:
  588.             token = KWD(self.token)
  589.         self.add_token(token)
  590.         return (self.parse_main, j)
  591.  
  592.     def parse_string(self, s, i):
  593.         m = END_STRING.search(s, i)
  594.         if not m:
  595.             self.token += s[i:]
  596.             return (self.parse_string, len(s))
  597.         j = m.start(0)
  598.         self.token += s[i:j]
  599.         c = s[j]
  600.         if c == '\\':
  601.             self.oct = ''
  602.             return (self.parse_string_1, j+1)
  603.         if c == '(':
  604.             self.paren += 1
  605.             self.token += c
  606.             return (self.parse_string, j+1)
  607.         if c == ')':
  608.             self.paren -= 1
  609.             if self.paren:
  610.                 self.token += c
  611.                 return (self.parse_string, j+1)
  612.         self.add_token(self.token)
  613.         return (self.parse_main, j+1)
  614.     def parse_string_1(self, s, i):
  615.         c = s[i]
  616.         if OCT_STRING.match(c) and len(self.oct) < 3:
  617.             self.oct += c
  618.             return (self.parse_string_1, i+1)
  619.         if self.oct:
  620.             self.token += chr(int(self.oct, 8))
  621.             return (self.parse_string, i)
  622.         if c in ESC_STRING:
  623.             self.token += chr(ESC_STRING[c])
  624.         return (self.parse_string, i+1)
  625.  
  626.     def parse_wopen(self, s, i):
  627.         c = s[i]
  628.         if c.isspace() or HEX.match(c):
  629.             return (self.parse_hexstring, i)
  630.         if c == '<':
  631.             self.add_token(KEYWORD_DICT_BEGIN)
  632.             i += 1
  633.         return (self.parse_main, i)
  634.  
  635.     def parse_wclose(self, s, i):
  636.         c = s[i]
  637.         if c == '>':
  638.             self.add_token(KEYWORD_DICT_END)
  639.             i += 1
  640.         return (self.parse_main, i)
  641.  
  642.     def parse_hexstring(self, s, i):
  643.         m = END_HEX_STRING.search(s, i)
  644.         if not m:
  645.             self.token += s[i:]
  646.             return (self.parse_hexstring, len(s))
  647.         j = m.start(0)
  648.         self.token += s[i:j]
  649.         token = HEX_PAIR.sub(lambda m: chr(int(m.group(0), 16)),
  650.                                                  SPC.sub('', self.token))
  651.         self.add_token(token)
  652.         return (self.parse_main, j)
  653.  
  654.     def nexttoken(self):
  655.         while not self.tokens:
  656.             self.fillbuf()
  657.             (self.parse1, self.charpos) = self.parse1(self.buf, self.charpos)
  658.         token = self.tokens.pop(0)
  659.         return token
  660.  
  661.     def nextline(self):
  662.         '''
  663.        Fetches a next line that ends either with \\r or \\n.
  664.        '''
  665.         linebuf = ''
  666.         linepos = self.bufpos + self.charpos
  667.         eol = False
  668.         while 1:
  669.             self.fillbuf()
  670.             if eol:
  671.                 c = self.buf[self.charpos]
  672.                 # handle '\r\n'
  673.                 if c == '\n':
  674.                     linebuf += c
  675.                     self.charpos += 1
  676.                 break
  677.             m = EOL.search(self.buf, self.charpos)
  678.             if m:
  679.                 linebuf += self.buf[self.charpos:m.end(0)]
  680.                 self.charpos = m.end(0)
  681.                 if linebuf[-1] == '\r':
  682.                     eol = True
  683.                 else:
  684.                     break
  685.             else:
  686.                 linebuf += self.buf[self.charpos:]
  687.                 self.charpos = len(self.buf)
  688.         return (linepos, linebuf)
  689.  
  690.     def revreadlines(self):
  691.         '''
  692.        Fetches a next line backword. This is used to locate
  693.        the trailers at the end of a file.
  694.        '''
  695.         self.fp.seek(0, 2)
  696.         pos = self.fp.tell()
  697.         buf = ''
  698.         while 0 < pos:
  699.             prevpos = pos
  700.             pos = max(0, pos-self.BUFSIZ)
  701.             self.fp.seek(pos)
  702.             s = self.fp.read(prevpos-pos)
  703.             if not s: break
  704.             while 1:
  705.                 n = max(s.rfind('\r'), s.rfind('\n'))
  706.                 if n == -1:
  707.                     buf = s + buf
  708.                     break
  709.                 yield s[n:]+buf
  710.                 s = s[:n]
  711.                 buf = ''
  712.         return
  713.  
  714.  
  715. ##  PSStackParser
  716. ##
  717. class PSStackParser(PSBaseParser):
  718.  
  719.     def __init__(self, fp):
  720.         PSBaseParser.__init__(self, fp)
  721.         self.reset()
  722.         return
  723.    
  724.     def reset(self):
  725.         self.context = []
  726.         self.curtype = None
  727.         self.curstack = []
  728.         self.results = []
  729.         return
  730.  
  731.     def seek(self, pos):
  732.         PSBaseParser.seek(self, pos)
  733.         self.reset()
  734.         return
  735.  
  736.     def push(self, *objs):
  737.         self.curstack.extend(objs)
  738.         return
  739.     def pop(self, n):
  740.         objs = self.curstack[-n:]
  741.         self.curstack[-n:] = []
  742.         return objs
  743.     def popall(self):
  744.         objs = self.curstack
  745.         self.curstack = []
  746.         return objs
  747.     def add_results(self, *objs):
  748.         self.results.extend(objs)
  749.         return
  750.  
  751.     def start_type(self, pos, type):
  752.         self.context.append((pos, self.curtype, self.curstack))
  753.         (self.curtype, self.curstack) = (type, [])
  754.         return
  755.     def end_type(self, type):
  756.         if self.curtype != type:
  757.             raise PSTypeError('Type mismatch: %r != %r' % (self.curtype, type))
  758.         objs = [ obj for (_,obj) in self.curstack ]
  759.         (pos, self.curtype, self.curstack) = self.context.pop()
  760.         return (pos, objs)
  761.  
  762.     def do_keyword(self, pos, token):
  763.         return
  764.    
  765.     def nextobject(self, direct=False):
  766.         '''
  767.        Yields a list of objects: keywords, literals, strings,
  768.        numbers, arrays and dictionaries. Arrays and dictionaries
  769.        are represented as Python sequence and dictionaries.
  770.        '''
  771.         while not self.results:
  772.             (pos, token) = self.nexttoken()
  773.             ##print (pos,token), (self.curtype, self.curstack)
  774.             if (isinstance(token, int) or
  775.                     isinstance(token, float) or
  776.                     isinstance(token, bool) or
  777.                     isinstance(token, str) or
  778.                     isinstance(token, PSLiteral)):
  779.                 # normal token
  780.                 self.push((pos, token))
  781.             elif token == KEYWORD_ARRAY_BEGIN:
  782.                 # begin array
  783.                 self.start_type(pos, 'a')
  784.             elif token == KEYWORD_ARRAY_END:
  785.                 # end array
  786.                 try:
  787.                     self.push(self.end_type('a'))
  788.                 except PSTypeError:
  789.                     if STRICT: raise
  790.             elif token == KEYWORD_DICT_BEGIN:
  791.                 # begin dictionary
  792.                 self.start_type(pos, 'd')
  793.             elif token == KEYWORD_DICT_END:
  794.                 # end dictionary
  795.                 try:
  796.                     (pos, objs) = self.end_type('d')
  797.                     if len(objs) % 2 != 0:
  798.                         raise PSSyntaxError(
  799.                             'Invalid dictionary construct: %r' % objs)
  800.                     d = dict((literal_name(k), v) \
  801.                                  for (k,v) in choplist(2, objs))
  802.                     self.push((pos, d))
  803.                 except PSTypeError:
  804.                     if STRICT: raise
  805.             else:
  806.                 self.do_keyword(pos, token)
  807.             if self.context:
  808.                 continue
  809.             else:
  810.                 if direct:
  811.                     return self.pop(1)[0]
  812.                 self.flush()
  813.         obj = self.results.pop(0)
  814.         return obj
  815.  
  816.  
  817. LITERAL_CRYPT = PSLiteralTable.intern('Crypt')
  818. LITERALS_FLATE_DECODE = (PSLiteralTable.intern('FlateDecode'), PSLiteralTable.intern('Fl'))
  819. LITERALS_LZW_DECODE = (PSLiteralTable.intern('LZWDecode'), PSLiteralTable.intern('LZW'))
  820. LITERALS_ASCII85_DECODE = (PSLiteralTable.intern('ASCII85Decode'), PSLiteralTable.intern('A85'))
  821.  
  822.  
  823. ##  PDF Objects
  824. ##
  825. class PDFObject(PSObject): pass
  826.  
  827. class PDFException(PSException): pass
  828. class PDFTypeError(PDFException): pass
  829. class PDFValueError(PDFException): pass
  830. class PDFNotImplementedError(PSException): pass
  831.  
  832.  
  833. ##  PDFObjRef
  834. ##
  835. class PDFObjRef(PDFObject):
  836.    
  837.     def __init__(self, doc, objid, genno):
  838.         if objid == 0:
  839.             if STRICT:
  840.                 raise PDFValueError('PDF object id cannot be 0.')
  841.         self.doc = doc
  842.         self.objid = objid
  843.         self.genno = genno
  844.         return
  845.  
  846.     def __repr__(self):
  847.         return '<PDFObjRef:%d %d>' % (self.objid, self.genno)
  848.  
  849.     def resolve(self):
  850.         return self.doc.getobj(self.objid)
  851.  
  852.  
  853. # resolve
  854. def resolve1(x):
  855.     '''
  856.    Resolve an object. If this is an array or dictionary,
  857.    it may still contains some indirect objects inside.
  858.    '''
  859.     while isinstance(x, PDFObjRef):
  860.         x = x.resolve()
  861.     return x
  862.  
  863. def resolve_all(x):
  864.     '''
  865.    Recursively resolve X and all the internals.
  866.    Make sure there is no indirect reference within the nested object.
  867.    This procedure might be slow.
  868.    '''
  869.     while isinstance(x, PDFObjRef):
  870.         x = x.resolve()
  871.     if isinstance(x, list):
  872.         x = [ resolve_all(v) for v in x ]
  873.     elif isinstance(x, dict):
  874.         for (k,v) in x.iteritems():
  875.             x[k] = resolve_all(v)
  876.     return x
  877.  
  878. def decipher_all(decipher, objid, genno, x):
  879.     '''
  880.    Recursively decipher X.
  881.    '''
  882.     if isinstance(x, str):
  883.         return decipher(objid, genno, x)
  884.     decf = lambda v: decipher_all(decipher, objid, genno, v)
  885.     if isinstance(x, list):
  886.         x = [decf(v) for v in x]
  887.     elif isinstance(x, dict):
  888.         x = dict((k, decf(v)) for (k, v) in x.iteritems())
  889.     return x
  890.  
  891.  
  892. # Type cheking
  893. def int_value(x):
  894.     x = resolve1(x)
  895.     if not isinstance(x, int):
  896.         if STRICT:
  897.             raise PDFTypeError('Integer required: %r' % x)
  898.         return 0
  899.     return x
  900.  
  901. def float_value(x):
  902.     x = resolve1(x)
  903.     if not isinstance(x, float):
  904.         if STRICT:
  905.             raise PDFTypeError('Float required: %r' % x)
  906.         return 0.0
  907.     return x
  908.  
  909. def num_value(x):
  910.     x = resolve1(x)
  911.     if not (isinstance(x, int) or isinstance(x, float)):
  912.         if STRICT:
  913.             raise PDFTypeError('Int or Float required: %r' % x)
  914.         return 0
  915.     return x
  916.  
  917. def str_value(x):
  918.     x = resolve1(x)
  919.     if not isinstance(x, str):
  920.         if STRICT:
  921.             raise PDFTypeError('String required: %r' % x)
  922.         return ''
  923.     return x
  924.  
  925. def list_value(x):
  926.     x = resolve1(x)
  927.     if not (isinstance(x, list) or isinstance(x, tuple)):
  928.         if STRICT:
  929.             raise PDFTypeError('List required: %r' % x)
  930.         return []
  931.     return x
  932.  
  933. def dict_value(x):
  934.     x = resolve1(x)
  935.     if not isinstance(x, dict):
  936.         if STRICT:
  937.             raise PDFTypeError('Dict required: %r' % x)
  938.         return {}
  939.     return x
  940.  
  941. def stream_value(x):
  942.     x = resolve1(x)
  943.     if not isinstance(x, PDFStream):
  944.         if STRICT:
  945.             raise PDFTypeError('PDFStream required: %r' % x)
  946.         return PDFStream({}, '')
  947.     return x
  948.  
  949. # ascii85decode(data)
  950. def ascii85decode(data):
  951.   n = b = 0
  952.   out = ''
  953.   for c in data:
  954.     if '!' <= c and c <= 'u':
  955.       n += 1
  956.       b = b*85+(ord(c)-33)
  957.       if n == 5:
  958.         out += struct.pack('>L',b)
  959.         n = b = 0
  960.     elif c == 'z':
  961.       assert n == 0
  962.       out += '\0\0\0\0'
  963.     elif c == '~':
  964.       if n:
  965.         for _ in range(5-n):
  966.           b = b*85+84
  967.         out += struct.pack('>L',b)[:n-1]
  968.       break
  969.   return out
  970.  
  971.  
  972. ##  PDFStream type
  973. class PDFStream(PDFObject):
  974.     def __init__(self, dic, rawdata, decipher=None):
  975.         length = int_value(dic.get('Length', 0))
  976.         eol = rawdata[length:]
  977.         # quick and dirty fix for false length attribute,
  978.         # might not work if the pdf stream parser has a problem
  979.         if decipher != None and decipher.__name__ == 'decrypt_aes':
  980.             if (len(rawdata) % 16) != 0:
  981.                 cutdiv = len(rawdata) // 16
  982.                 rawdata = rawdata[:16*cutdiv]
  983.         else:
  984.             if eol in ('\r', '\n', '\r\n'):
  985.                 rawdata = rawdata[:length]
  986.                
  987.         self.dic = dic
  988.         self.rawdata = rawdata
  989.         self.decipher = decipher
  990.         self.data = None
  991.         self.decdata = None
  992.         self.objid = None
  993.         self.genno = None
  994.         return
  995.  
  996.     def set_objid(self, objid, genno):
  997.         self.objid = objid
  998.         self.genno = genno
  999.         return
  1000.    
  1001.     def __repr__(self):
  1002.         if self.rawdata:
  1003.             return '<PDFStream(%r): raw=%d, %r>' % \
  1004.                    (self.objid, len(self.rawdata), self.dic)
  1005.         else:
  1006.             return '<PDFStream(%r): data=%d, %r>' % \
  1007.                    (self.objid, len(self.data), self.dic)
  1008.  
  1009.     def decode(self):
  1010.         assert self.data is None and self.rawdata is not None
  1011.         data = self.rawdata
  1012.         if self.decipher:
  1013.             # Handle encryption
  1014.             data = self.decipher(self.objid, self.genno, data)
  1015.             if gen_xref_stm:
  1016.                 self.decdata = data # keep decrypted data
  1017.         if 'Filter' not in self.dic:
  1018.             self.data = data
  1019.             self.rawdata = None
  1020.             ##print self.dict
  1021.             return
  1022.         filters = self.dic['Filter']
  1023.         if not isinstance(filters, list):
  1024.             filters = [ filters ]
  1025.         for f in filters:
  1026.             if f in LITERALS_FLATE_DECODE:
  1027.                 # will get errors if the document is encrypted.
  1028.                 data = zlib.decompress(data)
  1029.             elif f in LITERALS_LZW_DECODE:
  1030.                 data = ''.join(LZWDecoder(StringIO(data)).run())
  1031.             elif f in LITERALS_ASCII85_DECODE:
  1032.                 data = ascii85decode(data)
  1033.             elif f == LITERAL_CRYPT:
  1034.                 raise PDFNotImplementedError('/Crypt filter is unsupported')
  1035.             else:
  1036.                 raise PDFNotImplementedError('Unsupported filter: %r' % f)
  1037.             # apply predictors
  1038.             if 'DP' in self.dic:
  1039.                 params = self.dic['DP']
  1040.             else:
  1041.                 params = self.dic.get('DecodeParms', {})
  1042.             if 'Predictor' in params:
  1043.                 pred = int_value(params['Predictor'])
  1044.                 if pred:
  1045.                     if pred != 12:
  1046.                         raise PDFNotImplementedError(
  1047.                             'Unsupported predictor: %r' % pred)
  1048.                     if 'Columns' not in params:
  1049.                         raise PDFValueError(
  1050.                             'Columns undefined for predictor=12')
  1051.                     columns = int_value(params['Columns'])
  1052.                     buf = ''
  1053.                     ent0 = '\x00' * columns
  1054.                     for i in xrange(0, len(data), columns+1):
  1055.                         pred = data[i]
  1056.                         ent1 = data[i+1:i+1+columns]
  1057.                         if pred == '\x02':
  1058.                             ent1 = ''.join(chr((ord(a)+ord(b)) & 255) \
  1059.                                                for (a,b) in zip(ent0,ent1))
  1060.                         buf += ent1
  1061.                         ent0 = ent1
  1062.                     data = buf
  1063.         self.data = data
  1064.         self.rawdata = None
  1065.         return
  1066.  
  1067.     def get_data(self):
  1068.         if self.data is None:
  1069.             self.decode()
  1070.         return self.data
  1071.  
  1072.     def get_rawdata(self):
  1073.         return self.rawdata
  1074.  
  1075.     def get_decdata(self):
  1076.         if self.decdata is not None:
  1077.             return self.decdata
  1078.         data = self.rawdata
  1079.         if self.decipher and data:
  1080.             # Handle encryption
  1081.             data = self.decipher(self.objid, self.genno, data)
  1082.         return data
  1083.  
  1084.        
  1085. ##  PDF Exceptions
  1086. ##
  1087. class PDFSyntaxError(PDFException): pass
  1088. class PDFNoValidXRef(PDFSyntaxError): pass
  1089. class PDFEncryptionError(PDFException): pass
  1090. class PDFPasswordIncorrect(PDFEncryptionError): pass
  1091.  
  1092. # some predefined literals and keywords.
  1093. LITERAL_OBJSTM = PSLiteralTable.intern('ObjStm')
  1094. LITERAL_XREF = PSLiteralTable.intern('XRef')
  1095. LITERAL_PAGE = PSLiteralTable.intern('Page')
  1096. LITERAL_PAGES = PSLiteralTable.intern('Pages')
  1097. LITERAL_CATALOG = PSLiteralTable.intern('Catalog')
  1098.  
  1099.  
  1100. ##  XRefs
  1101. ##
  1102.  
  1103. ##  PDFXRef
  1104. ##
  1105. class PDFXRef(object):
  1106.  
  1107.     def __init__(self):
  1108.         self.offsets = None
  1109.         return
  1110.  
  1111.     def __repr__(self):
  1112.         return '<PDFXRef: objs=%d>' % len(self.offsets)
  1113.  
  1114.     def objids(self):
  1115.         return self.offsets.iterkeys()
  1116.  
  1117.     def load(self, parser):
  1118.         self.offsets = {}
  1119.         while 1:
  1120.             try:
  1121.                 (pos, line) = parser.nextline()
  1122.             except PSEOF:
  1123.                 raise PDFNoValidXRef('Unexpected EOF - file corrupted?')
  1124.             if not line:
  1125.                 raise PDFNoValidXRef('Premature eof: %r' % parser)
  1126.             if line.startswith('trailer'):
  1127.                 parser.seek(pos)
  1128.                 break
  1129.             f = line.strip().split(' ')
  1130.             if len(f) != 2:
  1131.                 raise PDFNoValidXRef('Trailer not found: %r: line=%r' % (parser, line))
  1132.             try:
  1133.                 (start, nobjs) = map(int, f)
  1134.             except ValueError:
  1135.                 raise PDFNoValidXRef('Invalid line: %r: line=%r' % (parser, line))
  1136.             for objid in xrange(start, start+nobjs):
  1137.                 try:
  1138.                     (_, line) = parser.nextline()
  1139.                 except PSEOF:
  1140.                     raise PDFNoValidXRef('Unexpected EOF - file corrupted?')
  1141.                 f = line.strip().split(' ')
  1142.                 if len(f) != 3:
  1143.                     raise PDFNoValidXRef('Invalid XRef format: %r, line=%r' % (parser, line))
  1144.                 (pos, genno, use) = f
  1145.                 if use != 'n': continue
  1146.                 self.offsets[objid] = (int(genno), int(pos))
  1147.         self.load_trailer(parser)
  1148.         return
  1149.    
  1150.     KEYWORD_TRAILER = PSKeywordTable.intern('trailer')
  1151.     def load_trailer(self, parser):
  1152.         try:
  1153.             (_,kwd) = parser.nexttoken()
  1154.             assert kwd is self.KEYWORD_TRAILER
  1155.             (_,dic) = parser.nextobject(direct=True)
  1156.         except PSEOF:
  1157.             x = parser.pop(1)
  1158.             if not x:
  1159.                 raise PDFNoValidXRef('Unexpected EOF - file corrupted')
  1160.             (_,dic) = x[0]
  1161.         self.trailer = dict_value(dic)
  1162.         return
  1163.  
  1164.     def getpos(self, objid):
  1165.         try:
  1166.             (genno, pos) = self.offsets[objid]
  1167.         except KeyError:
  1168.             raise
  1169.         return (None, pos)
  1170.  
  1171.  
  1172. ##  PDFXRefStream
  1173. ##
  1174. class PDFXRefStream(object):
  1175.  
  1176.     def __init__(self):
  1177.         self.index = None
  1178.         self.data = None
  1179.         self.entlen = None
  1180.         self.fl1 = self.fl2 = self.fl3 = None
  1181.         return
  1182.  
  1183.     def __repr__(self):
  1184.         return '<PDFXRef: objids=%s>' % self.index
  1185.  
  1186.     def objids(self):
  1187.         for first, size in self.index:
  1188.             for objid in xrange(first, first + size):
  1189.                 yield objid
  1190.    
  1191.     def load(self, parser, debug=0):
  1192.         (_,objid) = parser.nexttoken() # ignored
  1193.         (_,genno) = parser.nexttoken() # ignored
  1194.         (_,kwd) = parser.nexttoken()
  1195.         (_,stream) = parser.nextobject()
  1196.         if not isinstance(stream, PDFStream) or \
  1197.            stream.dic['Type'] is not LITERAL_XREF:
  1198.             raise PDFNoValidXRef('Invalid PDF stream spec.')
  1199.         size = stream.dic['Size']
  1200.         index = stream.dic.get('Index', (0,size))
  1201.         self.index = zip(islice(index, 0, None, 2),
  1202.                          islice(index, 1, None, 2))
  1203.         (self.fl1, self.fl2, self.fl3) = stream.dic['W']
  1204.         self.data = stream.get_data()
  1205.         self.entlen = self.fl1+self.fl2+self.fl3
  1206.         self.trailer = stream.dic
  1207.         return
  1208.    
  1209.     def getpos(self, objid):
  1210.         offset = 0
  1211.         for first, size in self.index:
  1212.             if first <= objid  and objid < (first + size):
  1213.                 break
  1214.             offset += size
  1215.         else:
  1216.             raise KeyError(objid)
  1217.         i = self.entlen * ((objid - first) + offset)
  1218.         ent = self.data[i:i+self.entlen]
  1219.         f1 = nunpack(ent[:self.fl1], 1)
  1220.         if f1 == 1:
  1221.             pos = nunpack(ent[self.fl1:self.fl1+self.fl2])
  1222.             genno = nunpack(ent[self.fl1+self.fl2:])
  1223.             return (None, pos)
  1224.         elif f1 == 2:
  1225.             objid = nunpack(ent[self.fl1:self.fl1+self.fl2])
  1226.             index = nunpack(ent[self.fl1+self.fl2:])
  1227.             return (objid, index)
  1228.         # this is a free object
  1229.         raise KeyError(objid)
  1230.  
  1231.  
  1232. ##  PDFDocument
  1233. ##
  1234. ##  A PDFDocument object represents a PDF document.
  1235. ##  Since a PDF file is usually pretty big, normally it is not loaded
  1236. ##  at once. Rather it is parsed dynamically as processing goes.
  1237. ##  A PDF parser is associated with the document.
  1238. ##
  1239. class PDFDocument(object):
  1240.  
  1241.     def __init__(self):
  1242.         self.xrefs = []
  1243.         self.objs = {}
  1244.         self.parsed_objs = {}
  1245.         self.root = None
  1246.         self.catalog = None
  1247.         self.parser = None
  1248.         self.encryption = None
  1249.         self.decipher = None
  1250.         # dictionaries for fileopen
  1251.         self.fileopen = {}
  1252.         self.urlresult = {}        
  1253.         self.ready = False
  1254.         return
  1255.  
  1256.     # set_parser(parser)
  1257.     #   Associates the document with an (already initialized) parser object.
  1258.     def set_parser(self, parser):
  1259.         if self.parser: return
  1260.         self.parser = parser
  1261.         # The document is set to be temporarily ready during collecting
  1262.         # all the basic information about the document, e.g.
  1263.         # the header, the encryption information, and the access rights
  1264.         # for the document.
  1265.         self.ready = True
  1266.         # Retrieve the information of each header that was appended
  1267.         # (maybe multiple times) at the end of the document.
  1268.         self.xrefs = parser.read_xref()
  1269.         for xref in self.xrefs:
  1270.             trailer = xref.trailer
  1271.             if not trailer: continue
  1272.  
  1273.             # If there's an encryption info, remember it.
  1274.             if 'Encrypt' in trailer:
  1275.                 #assert not self.encryption
  1276.                 try:
  1277.                     self.encryption = (list_value(trailer['ID']),
  1278.                                    dict_value(trailer['Encrypt']))
  1279.                 # fix for bad files
  1280.                 except:
  1281.                     self.encryption = ('ffffffffffffffffffffffffffffffffffff',
  1282.                                        dict_value(trailer['Encrypt']))
  1283.             if 'Root' in trailer:
  1284.                 self.set_root(dict_value(trailer['Root']))
  1285.                 break
  1286.         else:
  1287.             raise PDFSyntaxError('No /Root object! - Is this really a PDF?')
  1288.         # The document is set to be non-ready again, until all the
  1289.         # proper initialization (asking the password key and
  1290.         # verifying the access permission, so on) is finished.
  1291.         self.ready = False
  1292.         return
  1293.  
  1294.     # set_root(root)
  1295.     #   Set the Root dictionary of the document.
  1296.     #   Each PDF file must have exactly one /Root dictionary.
  1297.     def set_root(self, root):
  1298.         self.root = root
  1299.         self.catalog = dict_value(self.root)
  1300.         if self.catalog.get('Type') is not LITERAL_CATALOG:
  1301.             if STRICT:
  1302.                 raise PDFSyntaxError('Catalog not found!')
  1303.         return
  1304.     # initialize(password='')
  1305.     #   Perform the initialization with a given password.
  1306.     #   This step is mandatory even if there's no password associated
  1307.     #   with the document.
  1308.     def initialize(self, password=''):
  1309.         if not self.encryption:
  1310.             self.is_printable = self.is_modifiable = self.is_extractable = True
  1311.             self.ready = True
  1312.             return
  1313.         (docid, param) = self.encryption
  1314.         type = literal_name(param['Filter'])
  1315.         if type == 'Adobe.APS':
  1316.             return self.initialize_adobe_ps(password, docid, param)
  1317.         if type == 'Standard':
  1318.             return self.initialize_standard(password, docid, param)
  1319.         if type == 'EBX_HANDLER':
  1320.             return self.initialize_ebx(password, docid, param)
  1321.         if type == 'FOPN_fLock':
  1322.             # remove of unnecessairy password attribute
  1323.             return self.initialize_fopn_flock(docid, param)  
  1324.         if type == 'FOPN_foweb':
  1325.             # remove of unnecessairy password attribute
  1326.             return self.initialize_fopn(docid, param)
  1327.         raise PDFEncryptionError('Unknown filter: param=%r' % param)
  1328.  
  1329.     def initialize_adobe_ps(self, password, docid, param):
  1330.         global KEYFILEPATH
  1331.         self.decrypt_key = self.genkey_adobe_ps(param)
  1332.         self.genkey = self.genkey_v4
  1333.         self.decipher = self.decrypt_aes
  1334.         self.ready = True
  1335.         return
  1336.  
  1337.     def genkey_adobe_ps(self, param):
  1338.         # nice little offline principal keys dictionary
  1339.         # global static principal key for German Onleihe / Bibliothek Digital
  1340.         principalkeys = { 'bibliothek-digital.de': 'rRwGv2tbpKov1krvv7PO0ws9S436/lArPlfipz5Pqhw='.decode('base64')}
  1341.         self.is_printable = self.is_modifiable = self.is_extractable = True
  1342. ##        print 'keyvalue'
  1343. ##        print len(keyvalue)
  1344. ##        print keyvalue.encode('hex')
  1345.         length = int_value(param.get('Length', 0)) / 8
  1346.         edcdata = str_value(param.get('EDCData')).decode('base64')
  1347.         pdrllic = str_value(param.get('PDRLLic')).decode('base64')
  1348.         pdrlpol = str_value(param.get('PDRLPol')).decode('base64')          
  1349.         #print 'ecd rights'
  1350.         edclist = []
  1351.         for pair in edcdata.split('\n'):
  1352.             edclist.append(pair)
  1353.         #print edclist
  1354.         #print 'edcdata decrypted'
  1355.         #print edclist[0].decode('base64').encode('hex')
  1356.         #print edclist[1].decode('base64').encode('hex')
  1357.         #print edclist[2].decode('base64').encode('hex')
  1358.         #print edclist[3].decode('base64').encode('hex')
  1359.         #print 'offlinekey'
  1360.         #print len(edclist[9].decode('base64'))
  1361.         #print pdrllic
  1362.         # principal key request
  1363.         for key in principalkeys:
  1364.             if key in pdrllic:
  1365.                 principalkey = principalkeys[key]
  1366.             else:
  1367.                 raise ADEPTError('Cannot find principal key for this pdf')
  1368.         shakey = SHA256.new(principalkey).digest()
  1369.         ivector = 16 * chr(0)
  1370.         #print shakey
  1371.         plaintext = AES.new(shakey,AES.MODE_CBC,ivector).decrypt(edclist[9].decode('base64'))
  1372.         if plaintext[-16:] != 16 * chr(16):
  1373.             raise ADEPTError('Offlinekey cannot be decrypted, aborting ...')
  1374.         pdrlpol = AES.new(plaintext[16:32],AES.MODE_CBC,edclist[2].decode('base64')).decrypt(pdrlpol)
  1375.         if ord(pdrlpol[-1]) < 1 or ord(pdrlpol[-1]) > 16:
  1376.             raise ADEPTError('Could not decrypt PDRLPol, aborting ...')
  1377.         else:
  1378.             cutter = -1 * ord(pdrlpol[-1])
  1379.             #print cutter
  1380.             pdrlpol = pdrlpol[:cutter]            
  1381.         #print plaintext.encode('hex')
  1382.         #print 'pdrlpol'
  1383.         #print pdrlpol
  1384.         return plaintext[:16]
  1385.    
  1386.     PASSWORD_PADDING = '(\xbfN^Nu\x8aAd\x00NV\xff\xfa\x01\x08..' \
  1387.                        '\x00\xb6\xd0h>\x80/\x0c\xa9\xfedSiz'
  1388.     # experimental aes pw support
  1389.     def initialize_standard(self, password, docid, param):
  1390.         # copy from a global variable
  1391.         V = int_value(param.get('V', 0))
  1392.         if (V <=0 or V > 4):
  1393.             raise PDFEncryptionError('Unknown algorithm: param=%r' % param)
  1394.         length = int_value(param.get('Length', 40)) # Key length (bits)
  1395.         O = str_value(param['O'])
  1396.         R = int_value(param['R']) # Revision
  1397.         if 5 <= R:
  1398.             raise PDFEncryptionError('Unknown revision: %r' % R)
  1399.         U = str_value(param['U'])
  1400.         P = int_value(param['P'])
  1401.         try:
  1402.             EncMetadata = str_value(param['EncryptMetadata'])
  1403.         except:
  1404.             EncMetadata = 'True'
  1405.         self.is_printable = bool(P & 4)        
  1406.         self.is_modifiable = bool(P & 8)
  1407.         self.is_extractable = bool(P & 16)
  1408.         self.is_annotationable = bool(P & 32)
  1409.         self.is_formsenabled = bool(P & 256)
  1410.         self.is_textextractable = bool(P & 512)
  1411.         self.is_assemblable = bool(P & 1024)
  1412.         self.is_formprintable = bool(P & 2048)
  1413.         # Algorithm 3.2
  1414.         password = (password+self.PASSWORD_PADDING)[:32] # 1
  1415.         hash = hashlib.md5(password) # 2
  1416.         hash.update(O) # 3
  1417.         hash.update(struct.pack('<l', P)) # 4
  1418.         hash.update(docid[0]) # 5
  1419.         # aes special handling if metadata isn't encrypted
  1420.         if EncMetadata == ('False' or 'false'):
  1421.             hash.update('ffffffff'.decode('hex'))
  1422.             # 6
  1423. ##            raise PDFNotImplementedError(
  1424. ##                'Revision 4 encryption is currently unsupported')
  1425.         if 5 <= R:
  1426.             # 8
  1427.             for _ in xrange(50):
  1428.                 hash = hashlib.md5(hash.digest()[:length/8])
  1429.         key = hash.digest()[:length/8]
  1430.         if R == 2:
  1431.             # Algorithm 3.4
  1432.             u1 = ARC4.new(key).decrypt(password)
  1433.         elif R >= 3:
  1434.             # Algorithm 3.5
  1435.             hash = hashlib.md5(self.PASSWORD_PADDING) # 2
  1436.             hash.update(docid[0]) # 3
  1437.             x = ARC4.new(key).decrypt(hash.digest()[:16]) # 4
  1438.             for i in xrange(1,19+1):
  1439.                 k = ''.join( chr(ord(c) ^ i) for c in key )
  1440.                 x = ARC4.new(k).decrypt(x)
  1441.             u1 = x+x # 32bytes total
  1442.         if R == 2:
  1443.             is_authenticated = (u1 == U)
  1444.         else:
  1445.             is_authenticated = (u1[:16] == U[:16])
  1446.         if not is_authenticated:
  1447.             raise ADEPTError('Password is not correct.')
  1448. ##            raise PDFPasswordIncorrect
  1449.         self.decrypt_key = key
  1450.         # genkey method
  1451.         if V == 1 or V == 2:
  1452.             self.genkey = self.genkey_v2
  1453.         elif V == 3:
  1454.             self.genkey = self.genkey_v3
  1455.         elif V == 4:
  1456.             self.genkey = self.genkey_v2
  1457.          #self.genkey = self.genkey_v3 if V == 3 else self.genkey_v2
  1458.         # rc4
  1459.         if V != 4:
  1460.             self.decipher = self.decipher_rc4  # XXX may be AES
  1461.         # aes
  1462.         elif V == 4 and Length == 128:
  1463.             elf.decipher = self.decipher_aes
  1464.         elif V == 4 and Length == 256:
  1465.             raise PDFNotImplementedError('AES256 encryption is currently unsupported')
  1466.         self.ready = True
  1467.         return
  1468.  
  1469.     def initialize_ebx(self, password, docid, param):
  1470.         global KEYFILEPATH
  1471.         self.is_printable = self.is_modifiable = self.is_extractable = True
  1472.         # keyfile path is wrong
  1473.         if KEYFILEPATH == False:
  1474.             errortext = 'Cannot find adeptkey.der keyfile. Use ineptkey to generate it.'
  1475.             raise ADEPTError(errortext)
  1476.         with open(password, 'rb') as f:
  1477.             keyder = f.read()
  1478.         #    KEYFILEPATH = ''
  1479.         key = ASN1Parser([ord(x) for x in keyder])
  1480.         key = [bytesToNumber(key.getChild(x).value) for x in xrange(1, 4)]
  1481.         rsa = RSA.construct(key)
  1482.         length = int_value(param.get('Length', 0)) / 8
  1483.         rights = str_value(param.get('ADEPT_LICENSE')).decode('base64')
  1484.         rights = zlib.decompress(rights, -15)
  1485.         rights = etree.fromstring(rights)
  1486.         expr = './/{http://ns.adobe.com/adept}encryptedKey'
  1487.         bookkey = ''.join(rights.findtext(expr)).decode('base64')
  1488.         bookkey = rsa.decrypt(bookkey)
  1489.         if bookkey[0] != '\x02':
  1490.             raise ADEPTError('error decrypting book session key')
  1491.         index = bookkey.index('\0') + 1
  1492.         bookkey = bookkey[index:]
  1493.         ebx_V = int_value(param.get('V', 4))
  1494.         ebx_type = int_value(param.get('EBX_ENCRYPTIONTYPE', 6))
  1495.         # added because of the booktype / decryption book session key error
  1496.         if ebx_V == 3:
  1497.             V = 3        
  1498.         elif ebx_V < 4 or ebx_type < 6:
  1499.             V = ord(bookkey[0])
  1500.             bookkey = bookkey[1:]
  1501.         else:
  1502.             V = 2
  1503.         if length and len(bookkey) != length:
  1504.             raise ADEPTError('error decrypting book session key')
  1505.         self.decrypt_key = bookkey
  1506.         self.genkey = self.genkey_v3 if V == 3 else self.genkey_v2
  1507.         self.decipher = self.decrypt_rc4
  1508.         self.ready = True
  1509.         return
  1510.  
  1511.     # fileopen support    
  1512.     def initialize_fopn_flock(self, docid, param):
  1513.         raise ADEPTError('FOPN_fLock not supported, yet ...')
  1514.         # debug mode processing
  1515.         global DEBUG_MODE
  1516.         global IVERSION
  1517.         if DEBUG_MODE == True:
  1518.             if os.access('.',os.W_OK) == True:
  1519.                 debugfile = open('ineptpdf-'+IVERSION+'-debug.txt','w')
  1520.             else:
  1521.                 raise ADEPTError('Cannot write debug file, current directory is not writable')
  1522.         self.is_printable = self.is_modifiable = self.is_extractable = True
  1523.         # get parameters and add it to the fo dictionary
  1524.         self.fileopen['V'] = int_value(param.get('V',2))        
  1525.         # crypt base
  1526.         (docid, param) = self.encryption
  1527.         #rights = dict_value(param['Info'])
  1528.         rights = param['Info']        
  1529.         #print rights
  1530.         if DEBUG_MODE == True: debugfile.write(rights + '\n\n')
  1531. ##        for pair in rights.split(';'):
  1532. ##            try:
  1533. ##                key, value = pair.split('=',1)
  1534. ##                self.fileopen[key] = value
  1535. ##            # fix for some misconfigured INFO variables
  1536. ##            except:
  1537. ##                pass
  1538. ##        kattr = { 'SVID': 'ServiceID', 'DUID': 'DocumentID', 'I3ID': 'Ident3ID', \
  1539. ##                  'I4ID': 'Ident4ID', 'VERS': 'EncrVer', 'PRID': 'USR'}
  1540. ##        for keys in  kattr:
  1541. ##            try:
  1542. ##                self.fileopen[kattr[keys]] = self.fileopen[keys]
  1543. ##                del self.fileopen[keys]
  1544. ##            except:
  1545. ##                continue
  1546.         # differentiate OS types
  1547. ##        sysplatform = sys.platform
  1548. ##        # if ostype is Windows
  1549. ##        if sysplatform=='win32':
  1550. ##            self.osuseragent = 'Windows NT 6.0'
  1551. ##            self.get_macaddress = self.get_win_macaddress
  1552. ##            self.fo_sethwids = self.fo_win_sethwids
  1553. ##            self.BrowserCookie = WinBrowserCookie
  1554. ##        elif sysplatform=='linux2':
  1555. ##            adeptout = 'Linux is not supported, yet.\n'
  1556. ##            raise ADEPTError(adeptout)
  1557. ##            self.osuseragent = 'Linux i686'
  1558. ##            self.get_macaddress = self.get_linux_macaddress            
  1559. ##            self.fo_sethwids = self.fo_linux_sethwids            
  1560. ##        else:
  1561. ##            adeptout = ''
  1562. ##            adeptout = adeptout + 'Due to various privacy violations from Apple\n'
  1563. ##            adeptout = adeptout + 'Mac OS X support is disabled by default.'
  1564. ##            raise ADEPTError(adeptout)            
  1565. ##        # add static arguments for http/https request
  1566. ##        self.fo_setattributes()
  1567. ##        # add hardware specific arguments for http/https request        
  1568. ##        self.fo_sethwids()
  1569. ##
  1570. ##        if 'Code' in self.urlresult:            
  1571. ##            if self.fileopen['Length'] == len(self.urlresult['Code']):
  1572. ##                self.decrypt_key = self.urlresult['Code']
  1573. ##            else:
  1574. ##                self.decrypt_key = self.urlresult['Code'].decode('hex')
  1575. ##        else:
  1576. ##            raise ADEPTError('Cannot find decryption key.')
  1577.         self.decrypt_key = 'stuff'
  1578.         self.genkey = self.genkey_v2
  1579.         self.decipher = self.decrypt_rc4
  1580.         self.ready = True
  1581.         return
  1582.  
  1583.     def initialize_fopn(self, docid, param):
  1584.         # debug mode processing
  1585.         global DEBUG_MODE
  1586.         global IVERSION
  1587.         if DEBUG_MODE == True:
  1588.             if os.access('.',os.W_OK) == True:
  1589.                 debugfile = open('ineptpdf-'+IVERSION+'-debug.txt','w')
  1590.             else:
  1591.                 raise ADEPTError('Cannot write debug file, current directory is not writable')
  1592.         self.is_printable = self.is_modifiable = self.is_extractable = True
  1593.         # get parameters and add it to the fo dictionary
  1594.         self.fileopen['Length'] = int_value(param.get('Length', 0)) / 8
  1595.         self.fileopen['VEID'] = str_value(param.get('VEID'))
  1596.         self.fileopen['BUILD'] = str_value(param.get('BUILD'))
  1597.         self.fileopen['SVID'] = str_value(param.get('SVID'))
  1598.         self.fileopen['DUID'] = str_value(param.get('DUID'))
  1599.         self.fileopen['V'] = int_value(param.get('V',2))        
  1600.         # crypt base
  1601.         rights = str_value(param.get('INFO')).decode('base64')
  1602.         rights = self.genkey_fileopeninfo(rights)
  1603.         if DEBUG_MODE == True: debugfile.write(rights + '\n\n')    
  1604.         for pair in rights.split(';'):
  1605.             try:
  1606.                 key, value = pair.split('=',1)
  1607.                 self.fileopen[key] = value
  1608.             # fix for some misconfigured INFO variables
  1609.             except:
  1610.                 pass
  1611.         kattr = { 'SVID': 'ServiceID', 'DUID': 'DocumentID', 'I3ID': 'Ident3ID', \
  1612.                   'I4ID': 'Ident4ID', 'VERS': 'EncrVer', 'PRID': 'USR'}
  1613.         for keys in  kattr:
  1614.             # fishing some misconfigured slashs out of it
  1615.             try:
  1616.                 self.fileopen[kattr[keys]] = urllib.quote(self.fileopen[keys],safe='')
  1617.                 del self.fileopen[keys]
  1618.             except:
  1619.                 continue
  1620.         # differentiate OS types
  1621.         sysplatform = sys.platform
  1622.         # if ostype is Windows
  1623.         if sysplatform=='win32':
  1624.             self.osuseragent = 'Windows NT 6.0'
  1625.             self.get_macaddress = self.get_win_macaddress
  1626.             self.fo_sethwids = self.fo_win_sethwids
  1627.             self.BrowserCookie = WinBrowserCookie
  1628.         elif sysplatform=='linux2':
  1629.             adeptout = 'Linux is not supported, yet.\n'
  1630.             raise ADEPTError(adeptout)
  1631.             self.osuseragent = 'Linux i686'
  1632.             self.get_macaddress = self.get_linux_macaddress            
  1633.             self.fo_sethwids = self.fo_linux_sethwids            
  1634.         else:
  1635.             adeptout = ''
  1636.             adeptout = adeptout + 'Due to various privacy violations from Apple\n'
  1637.             adeptout = adeptout + 'Mac OS X support is disabled by default.'
  1638.             raise ADEPTError(adeptout)            
  1639.         # add static arguments for http/https request
  1640.         self.fo_setattributes()
  1641.         # add hardware specific arguments for http/https request        
  1642.         self.fo_sethwids()
  1643.         #if DEBUG_MODE == True: debugfile.write(self.fileopen)
  1644.         if 'UURL' in self.fileopen:
  1645.             buildurl = self.fileopen['UURL']
  1646.         else:
  1647.             buildurl = self.fileopen['PURL']
  1648.         # fix for bad DPRM structure
  1649.         if self.fileopen['DPRM'][0] != r'/':
  1650.             self.fileopen['DPRM'] = r'/' + self.fileopen['DPRM']
  1651.         # genius fix for bad server urls (IMHO)
  1652.         if '?' in self.fileopen['DPRM']:
  1653.             buildurl = buildurl + self.fileopen['DPRM'] + '&'
  1654.         else:
  1655.             buildurl = buildurl + self.fileopen['DPRM'] + '?'            
  1656.  
  1657.         # debug customization
  1658.         #self.fileopen['Machine'] = ''
  1659.         #self.fileopen['Disk'] = ''
  1660.  
  1661.  
  1662.         surl = ( 'Stamp', 'Mode', 'USR', 'ServiceID', 'DocumentID',\
  1663.                  'Ident3ID', 'Ident4ID','DocStrFmt', 'OSType', 'OSName', 'OSData', 'Language',\
  1664.                  'LngLCID', 'LngRFC1766', 'LngISO4Char', 'Build', 'ProdVer', 'EncrVer',\
  1665.                  'Machine', 'Disk', 'Uuid', 'PrevMach', 'PrevDisk',\
  1666.                  'FormHFT',\
  1667.                  'SelServer', 'AcroVersion', 'AcroProduct', 'AcroReader',\
  1668.                  'AcroCanEdit', 'AcroPrefIDib', 'InBrowser', 'CliAppName',\
  1669.                  'DocIsLocal', 'DocPathUrl', 'VolName', 'VolType', 'VolSN',\
  1670.                  'FSName',  'FowpKbd', 'OSBuild',\
  1671.                   'RequestSchema')
  1672.        
  1673.         #settings request and special modes
  1674.         if 'EVER' in self.fileopen and float(self.fileopen['EVER']) < 3.8:
  1675.             self.fileopen['Mode'] = 'ICx'
  1676.        
  1677.         origurl = buildurl
  1678.         buildurl = buildurl + 'Request=Setting'        
  1679.         for keys in surl:
  1680.             try:
  1681.                 buildurl = buildurl + '&' + keys + '=' + self.fileopen[keys]
  1682.             except:
  1683.                 continue
  1684.         if DEBUG_MODE == True: debugfile.write( 'settings url:\n')
  1685.         if DEBUG_MODE == True: debugfile.write( buildurl+'\n\n')
  1686.         # custom user agent identification?
  1687.         if 'AGEN' in self.fileopen:
  1688.             useragent = self.fileopen['AGEN']
  1689.             urllib.URLopener.version = useragent
  1690.         # attribute doesn't exist - take the default user agent
  1691.         else:
  1692.             urllib.URLopener.version = self.osuseragent
  1693.         # try to open the url
  1694.         try:
  1695.             u = urllib.urlopen(buildurl)
  1696.             u.geturl()
  1697.             result = u.read()
  1698.         except:
  1699.             raise ADEPTError('No internet connection or a blocking firewall!')
  1700. ##        finally:
  1701. ##            u.close()
  1702.         # getting rid of the line feed
  1703.         if DEBUG_MODE == True: debugfile.write('Settings'+'\n')
  1704.         if DEBUG_MODE == True: debugfile.write(result+'\n\n')
  1705.         #get rid of unnecessary characters
  1706.         result = result.rstrip('\n')
  1707.         result = result.rstrip(chr(13))
  1708.         result = result.lstrip('\n')
  1709.         result = result.lstrip(chr(13))
  1710.         self.surlresult = {}
  1711.         for pair in result.split('&'):
  1712.             try:
  1713.                 key, value = pair.split('=',1)
  1714.                 # fix for bad server response
  1715.                 if key not in self.surlresult:
  1716.                     self.surlresult[key] = value
  1717.             except:
  1718.                 pass
  1719.         if 'RequestSchema' in self.surlresult:
  1720.             self.fileopen['RequestSchema'] = self.surlresult['RequestSchema']
  1721.         if 'ServerSessionData' in self.surlresult:
  1722.             self.fileopen['ServerSessionData'] = self.surlresult['ServerSessionData']
  1723.         #print self.surlresult
  1724.         if 'RetVal' in self.surlresult and (('Reason' in self.surlresult and \
  1725.            self.surlresult['Reason'] == 'AskUnp') or ('SetTarget' in self.surlresult and\
  1726.                                                self.surlresult['SetTarget'] == 'UnpDlg')):
  1727.             # get user and password dialog
  1728.             try:
  1729.                 self.gen_pw_dialog(self.surlresult['UnpUiName'], self.surlresult['UnpUiPass'],\
  1730.                                    self.surlresult['UnpUiTitle'], self.surlresult['UnpUiOk'],\
  1731.                                    self.surlresult['UnpUiSunk'], self.surlresult['UnpUiComm'])
  1732.             except:
  1733.                 self.gen_pw_dialog()
  1734.            
  1735.         # the fileopen check might not be always right because of strange server responses    
  1736.         if 'SEMO' in self.fileopen and (self.fileopen['SEMO'] == '1'\
  1737.             or self.fileopen['SEMO'] == '2') and ('CSES' in self.fileopen and\
  1738.                                                   self.fileopen['CSES'] != 'fileopen'):
  1739.             # get the url name for the cookie(s)
  1740.             if 'CURL' in self.fileopen:
  1741.                 self.surl = self.fileopen['CURL']
  1742.             if 'CSES' in self.fileopen:
  1743.                 self.cses = self.fileopen['CSES']
  1744.             elif 'PHOS' in self.fileopen:
  1745.                 self.surl = self.fileopen['PHOS']
  1746.             elif 'LHOS' in self.fileopen:
  1747.                 self.surl = self.fileopen['LHOS']
  1748.             else:
  1749.                 raise ADEPTError('unknown Cookie name.\n Check ineptpdf forum for further assistance')
  1750.             self.pwfieldreq = 1
  1751.             # session cookie processing
  1752.             if self.fileopen['SEMO'] == '1':
  1753.                 cookies = self.BrowserCookie()
  1754.                 #print self.cses
  1755.                 #print self.surl
  1756.                 csession = cookies.getcookie(self.cses,self.surl)
  1757.                 if csession != None:
  1758.                     self.fileopen['Session'] = csession
  1759.                     self.gui = False
  1760.                 # fallback
  1761.                 else:
  1762.                     self.pwtk = Tkinter.Tk()
  1763.                     self.pwtk.title('Ineptpdf8')
  1764.                     self.pwtk.minsize(150, 0)
  1765.                     infotxt1 = 'Get the session cookie key manually (Firefox step-by-step:\n'+\
  1766.                                'Start Firefox -> Tools -> Options -> Privacy -> Show Cookies\n'+\
  1767.                                '-> Search for a cookie from ' + self.surl +' with the\n'+\
  1768.                                'name ' + self.cses +' and copy paste the content field in the\n'+\
  1769.                                'Session Content field. Remove possible spaces or new lines at the '+\
  1770.                                'end\n (cursor must be blinking right behind the last character)'
  1771.                     self.label0 = Tkinter.Label(self.pwtk, text=infotxt1)
  1772.                     self.label0.pack()
  1773.                     self.label1 = Tkinter.Label(self.pwtk, text="Session Content")
  1774.                     self.pwfieldreq = 0
  1775.                     self.gui = True
  1776.             # user cookie processing                                    
  1777.             elif self.fileopen['SEMO'] == '2':
  1778.                 cookies = self.BrowserCookie()
  1779.                 #print self.cses
  1780.                 #print self.surl
  1781.                 name = cookies.getcookie('name',self.surl)
  1782.                 passw = cookies.getcookie('pass',self.surl)                    
  1783.                 if name != None or passw != None:
  1784.                     self.fileopen['UserName'] = urllib.quote(name)
  1785.                     self.fileopen['UserPass'] = urllib.quote(passw)
  1786.                     self.gui = False
  1787.                 # fallback
  1788.                 else:
  1789.                     self.pwtk = Tkinter.Tk()
  1790.                     self.pwtk.title('Ineptpdf8')
  1791.                     self.pwtk.minsize(150, 0)
  1792.                     self.label1 = Tkinter.Label(self.pwtk, text="Username")
  1793.                     infotxt1 = 'Get the user cookie keys manually (Firefox step-by-step:\n'+\
  1794.                                'Start Firefox -> Tools -> Options -> Privacy -> Show Cookies\n'+\
  1795.                                '-> Search for cookies from ' + self.surl +' with the\n'+\
  1796.                                'name name in the user field and copy paste the content field in the\n'+\
  1797.                                'username field. Do the same with the name pass in the password field).'
  1798.                     self.label0 = Tkinter.Label(self.pwtk, text=infotxt1)
  1799.                     self.label0.pack()                                      
  1800.                     self.pwfieldreq = 1
  1801.                     self.gui = True
  1802. ##            else:
  1803. ##                self.pwtk = Tkinter.Tk()
  1804. ##                self.pwtk.title('Ineptpdf8')
  1805. ##                self.pwtk.minsize(150, 0)
  1806. ##                self.pwfieldreq = 0
  1807. ##                self.label1 = Tkinter.Label(self.pwtk, text="Username")
  1808. ##                self.pwfieldreq = 1
  1809. ##                self.gui = True
  1810.             if self.gui == True:
  1811.                 self.un_entry = Tkinter.Entry(self.pwtk)
  1812.                 # cursor here
  1813.                 self.un_entry.focus()
  1814.                 self.label2 = Tkinter.Label(self.pwtk, text="Password")
  1815.                 self.pw_entry = Tkinter.Entry(self.pwtk, show="*")
  1816.                 self.button = Tkinter.Button(self.pwtk, text='Go for it!', command=self.fo_save_values)
  1817.                 # widget layout, stack vertical
  1818.                 self.label1.pack()
  1819.                 self.un_entry.pack()
  1820.                 # create a password label and field
  1821.                 if self.pwfieldreq == 1:
  1822.                     self.label2.pack()
  1823.                     self.pw_entry.pack()
  1824.                 self.button.pack()
  1825.                 self.pwtk.update()            
  1826.                 # start the event loop
  1827.                 self.pwtk.mainloop()
  1828.          
  1829.         # original request
  1830.         # drive through tupple for building the permission url
  1831.         burl = ( 'Stamp', 'Mode', 'USR', 'ServiceID', 'DocumentID',\
  1832.                  'Ident3ID', 'Ident4ID','DocStrFmt', 'OSType', 'Language',\
  1833.                  'LngLCID', 'LngRFC1766', 'LngISO4Char', 'Build', 'ProdVer', 'EncrVer',\
  1834.                  'Machine', 'Disk', 'Uuid', 'PrevMach', 'PrevDisk', 'User', 'SaUser', 'SaSID',\
  1835.                  # special security measures
  1836.                  'HostIsDomain', 'PhysHostname', 'LogiHostname', 'SaRefDomain',\
  1837.                  'FormHFT', 'UserName', 'UserPass', 'Session', \
  1838.                  'SelServer', 'AcroVersion', 'AcroProduct', 'AcroReader',\
  1839.                  'AcroCanEdit', 'AcroPrefIDib', 'InBrowser', 'CliAppName',\
  1840.                  'DocIsLocal', 'DocPathUrl', 'VolName', 'VolType', 'VolSN',\
  1841.                  'FSName', 'ServerSessionData', 'FowpKbd', 'OSBuild', \
  1842.                  'DocumentSessionData', 'RequestSchema')
  1843.        
  1844.         buildurl = origurl
  1845.         buildurl = buildurl + 'Request=DocPerm'
  1846.         for keys in burl:
  1847.             try:
  1848.                 buildurl = buildurl + '&' + keys + '=' + self.fileopen[keys]
  1849.             except:
  1850.                 continue
  1851.         if DEBUG_MODE == True: debugfile.write('1st url:'+'\n')
  1852.         if DEBUG_MODE == True: debugfile.write(buildurl+'\n\n')
  1853.         # custom user agent identification?
  1854.         if 'AGEN' in self.fileopen:
  1855.             useragent = self.fileopen['AGEN']
  1856.             urllib.URLopener.version = useragent
  1857.         # attribute doesn't exist - take the default user agent
  1858.         else:
  1859.             urllib.URLopener.version = self.osuseragent
  1860.         # try to open the url
  1861.         try:
  1862.             u = urllib.urlopen(buildurl)
  1863.             u.geturl()
  1864.             result = u.read()
  1865.         except:
  1866.             raise ADEPTError('No internet connection or a blocking firewall!')
  1867. ##        finally:
  1868. ##            u.close()
  1869.         # getting rid of the line feed
  1870.         if DEBUG_MODE == True: debugfile.write('1st preresult'+'\n')
  1871.         if DEBUG_MODE == True: debugfile.write(result+'\n\n')
  1872.         #get rid of unnecessary characters
  1873.         result = result.rstrip('\n')
  1874.         result = result.rstrip(chr(13))
  1875.         result = result.lstrip('\n')
  1876.         result = result.lstrip(chr(13))
  1877.         self.urlresult = {}
  1878.         for pair in result.split('&'):
  1879.             try:
  1880.                 key, value = pair.split('=',1)
  1881.                 self.urlresult[key] = value
  1882.             except:
  1883.                 pass
  1884. ##        if 'RequestSchema' in self.surlresult:
  1885. ##            self.fileopen['RequestSchema'] = self.urlresult['RequestSchema']
  1886.          #self.urlresult
  1887.         #result[0:8] == 'RetVal=1') or (result[0:8] == 'RetVal=2'):
  1888.         if ('RetVal' in self.urlresult and (self.urlresult['RetVal'] != '1' and \
  1889.                                             self.urlresult['RetVal'] != '2' and \
  1890.                                             self.urlresult['RetVal'] != 'Update' and \
  1891.                                             self.urlresult['RetVal'] != 'Answer')):
  1892.            
  1893.             if ('Reason' in self.urlresult and (self.urlresult['Reason'] == 'BadUserPwd'\
  1894.                 or self.urlresult['Reason'] == 'AskUnp')) or ('SwitchTo' in self.urlresult\
  1895.                     and (self.urlresult['SwitchTo'] == 'Dialog')):
  1896.                 if 'ServerSessionData' in self.urlresult:
  1897.                     self.fileopen['ServerSessionData'] = self.urlresult['ServerSessionData']
  1898.                 if 'DocumentSessionData' in self.urlresult:
  1899.                     self.fileopen['DocumentSessionData'] = self.urlresult['DocumentSessionData']        
  1900.                 buildurl = origurl
  1901.                 buildurl = buildurl + 'Request=DocPerm'
  1902.                 self.gen_pw_dialog()
  1903.                 # password not found - fallback
  1904.                 for keys in burl:
  1905.                     try:
  1906.                         buildurl = buildurl + '&' + keys + '=' + self.fileopen[keys]
  1907.                     except:
  1908.                         continue
  1909.                 if DEBUG_MODE == True: debugfile.write( '2ndurl:')
  1910.                 if DEBUG_MODE == True: debugfile.write( buildurl+'\n\n')
  1911.                 # try to open the url
  1912.                 try:
  1913.                     u = urllib.urlopen(buildurl)
  1914.                     u.geturl()
  1915.                     result = u.read()
  1916.                 except:
  1917.                     raise ADEPTError('No internet connection or a blocking firewall!')
  1918.                 # getting rid of the line feed
  1919.                 if DEBUG_MODE == True: debugfile.write( '2nd preresult')
  1920.                 if DEBUG_MODE == True: debugfile.write( result+'\n\n')
  1921.                 #get rid of unnecessary characters
  1922.                 result = result.rstrip('\n')
  1923.                 result = result.rstrip(chr(13))
  1924.                 result = result.lstrip('\n')
  1925.                 result = result.lstrip(chr(13))
  1926.                 self.urlresult = {}
  1927.                 for pair in result.split('&'):
  1928.                     try:
  1929.                         key, value = pair.split('=',1)
  1930.                         self.urlresult[key] = value
  1931.                     except:
  1932.                         pass
  1933.         # did it work?
  1934.         if ('RetVal' in self.urlresult and (self.urlresult['RetVal'] != '1' and \
  1935.                                                     self.urlresult['RetVal'] != '2' and
  1936.                                                     self.urlresult['RetVal'] != 'Update' and \
  1937.                                                     self.urlresult['RetVal'] != 'Answer')):
  1938.             raise ADEPTError('Decryption was not successfull.\nReason: ' + self.urlresult['Error'])
  1939.         # fix for non-standard-conform fileopen pdfs
  1940. ##        if self.fileopen['Length'] != 5 and self.fileopen['Length'] != 16:
  1941. ##            if self.fileopen['V'] == 1:
  1942. ##                self.fileopen['Length'] = 5
  1943. ##            else:
  1944. ##                self.fileopen['Length'] = 16
  1945.         # patch for malformed pdfs
  1946.         #print len(self.urlresult['Code'])
  1947.         #print self.urlresult['Code'].encode('hex')
  1948.         if 'code' in self.urlresult:
  1949.             self.urlresult['Code'] = self.urlresult['code']
  1950.         if 'Code' in self.urlresult:            
  1951.             if len(self.urlresult['Code']) == 5 or len(self.urlresult['Code']) == 16:
  1952.                 self.decrypt_key = self.urlresult['Code']
  1953.             else:
  1954.                 self.decrypt_key = self.urlresult['Code'].decode('hex')
  1955.         else:
  1956.             raise ADEPTError('Cannot find decryption key.')
  1957.         self.genkey = self.genkey_v2
  1958.         self.decipher = self.decrypt_rc4
  1959.         self.ready = True
  1960.         return
  1961.    
  1962.     def gen_pw_dialog(self, Username='Username', Password='Password', Title='User/Password Authentication',\
  1963.                       OK='Proceed', Text1='Authorization', Text2='Enter Required Data'):
  1964.         self.pwtk = Tkinter.Tk()
  1965.         self.pwtk.title(Title)
  1966.         self.pwtk.minsize(150, 0)
  1967.         self.label1 = Tkinter.Label(self.pwtk, text=Text1)
  1968.         self.label2 = Tkinter.Label(self.pwtk, text=Text2)
  1969.         self.label3 = Tkinter.Label(self.pwtk, text=Username)
  1970.         self.pwfieldreq = 1        
  1971.         self.gui = True
  1972.         self.un_entry = Tkinter.Entry(self.pwtk)
  1973.         # cursor here
  1974.         self.un_entry.focus()
  1975.         self.label4 = Tkinter.Label(self.pwtk, text=Password)
  1976.         self.pw_entry = Tkinter.Entry(self.pwtk, show="*")
  1977.         self.button = Tkinter.Button(self.pwtk, text=OK, command=self.fo_save_values)
  1978.         # widget layout, stack vertical
  1979.         self.label1.pack()
  1980.         self.label2.pack()
  1981.         self.label3.pack()        
  1982.         self.un_entry.pack()
  1983.         # create a password label and field
  1984.         if self.pwfieldreq == 1:
  1985.             self.label4.pack()
  1986.             self.pw_entry.pack()
  1987.         self.button.pack()
  1988.         self.pwtk.update()            
  1989.         # start the event loop
  1990.         self.pwtk.mainloop()
  1991.        
  1992.     # genkey functions
  1993.     def genkey_v2(self, objid, genno):
  1994.         objid = struct.pack('<L', objid)[:3]
  1995.         genno = struct.pack('<L', genno)[:2]
  1996.         key = self.decrypt_key + objid + genno
  1997.         hash = hashlib.md5(key)
  1998.         key = hash.digest()[:min(len(self.decrypt_key) + 5, 16)]
  1999.         return key
  2000.    
  2001.     def genkey_v3(self, objid, genno):
  2002.         objid = struct.pack('<L', objid ^ 0x3569ac)
  2003.         genno = struct.pack('<L', genno ^ 0xca96)
  2004.         key = self.decrypt_key
  2005.         key += objid[0] + genno[0] + objid[1] + genno[1] + objid[2] + 'sAlT'
  2006.         hash = hashlib.md5(key)
  2007.         key = hash.digest()[:min(len(self.decrypt_key) + 5, 16)]
  2008.         return key
  2009.  
  2010.     # aes v2 and v4 algorithm
  2011.     def genkey_v4(self, objid, genno):
  2012.         objid = struct.pack('<L', objid)[:3]
  2013.         genno = struct.pack('<L', genno)[:2]
  2014.         key = self.decrypt_key + objid + genno + 'sAlT'
  2015.         hash = hashlib.md5(key)
  2016.         key = hash.digest()[:min(len(self.decrypt_key) + 5, 16)]
  2017.         return key
  2018.  
  2019.     def decrypt_aes(self, objid, genno, data):
  2020.         key = self.genkey(objid, genno)
  2021.         ivector = data[:16]
  2022.         data = data[16:]
  2023.         plaintext = AES.new(key,AES.MODE_CBC,ivector).decrypt(data)
  2024.         # remove pkcs#5 aes padding
  2025.         cutter = -1 * ord(plaintext[-1])
  2026.         #print cutter
  2027.         plaintext = plaintext[:cutter]
  2028.         return plaintext
  2029.  
  2030.     def decrypt_aes256(self, objid, genno, data):
  2031.         key = self.genkey(objid, genno)
  2032.         ivector = data[:16]
  2033.         data = data[16:]
  2034.         plaintext = AES.new(key,AES.MODE_CBC,ivector).decrypt(data)
  2035.         # remove pkcs#5 aes padding
  2036.         cutter = -1 * ord(plaintext[-1])
  2037.         #print cutter
  2038.         plaintext = plaintext[:cutter]
  2039.         return plaintext
  2040.    
  2041.     def decrypt_rc4(self, objid, genno, data):
  2042.         key = self.genkey(objid, genno)
  2043.         return ARC4.new(key).decrypt(data)
  2044.  
  2045.     # fileopen user/password dialog    
  2046.     def fo_save_values(self):
  2047.         getout = 0
  2048.         username = 0
  2049.         password = 0
  2050.         username = self.un_entry.get()
  2051.         if self.pwfieldreq == 1:        
  2052.             password = self.pw_entry.get()
  2053.         un_length = len(username)
  2054.         if self.pwfieldreq == 1:                
  2055.             pw_length = len(password)
  2056.         if (un_length != 0):
  2057.             if self.pwfieldreq == 1:
  2058.                 if (pw_length != 0):
  2059.                     getout = 1
  2060.             else:
  2061.                 getout = 1
  2062.         if getout == 1:
  2063.             if 'SEMO' in self.fileopen and self.fileopen['SEMO'] == '1':
  2064.                 self.fileopen['Session'] = urllib.quote(username)
  2065.             else:
  2066.                 self.fileopen['UserName'] = urllib.quote(username)
  2067.             if self.pwfieldreq == 1:
  2068.                 self.fileopen['UserPass'] = urllib.quote(password)
  2069.             else:
  2070.                 pass
  2071.                 #self.fileopen['UserPass'] = self.fileopen['UserName']
  2072.             # doesn't always close the password window, who
  2073.             # knows why (Tkinter secrets ;=))
  2074.             self.pwtk.quit()
  2075.    
  2076.    
  2077.     def fo_setattributes(self):
  2078.         self.fileopen['Request']='DocPerm'
  2079.         self.fileopen['Mode']='CNR'
  2080.         self.fileopen['DocStrFmt']='ASCII'
  2081.         self.fileopen['Language']='ENU'
  2082.         self.fileopen['LngLCID']='ENU'
  2083.         self.fileopen['LngRFC1766']='en'
  2084.         self.fileopen['LngISO4Char']='en-us'
  2085.         self.fileopen['ProdVer']='1.8.7.9'
  2086.         self.fileopen['FormHFT']='Yes'
  2087.         self.fileopen['SelServer']='Yes'
  2088.         self.fileopen['AcroCanEdit']='Yes'
  2089.         self.fileopen['AcroPrefIDib']='Yes'
  2090.         self.fileopen['InBrowser']='Unk'
  2091.         self.fileopen['CliAppName']=''
  2092.         self.fileopen['DocIsLocal']='Yes'
  2093.         self.fileopen['FowpKbd']='Yes'
  2094.         self.fileopen['RequestSchema']='Default'
  2095.        
  2096.     # get nic mac address
  2097.     def get_linux_macaddress(self):
  2098.         try:
  2099.             for line in os.popen("/sbin/ifconfig"):
  2100.                 if line.find('Ether') > -1:
  2101.                     mac = line.split()[4]
  2102.                     break
  2103.             return mac.replace(':','')
  2104.         except:
  2105.             raise ADEPTError('Cannot find MAC address. Get forum help.')
  2106.  
  2107.     def get_win_macaddress(self):
  2108.         try:
  2109.             gasize = c_ulong(5000)
  2110.             p = create_string_buffer(5000)
  2111.             GetAdaptersInfo = windll.iphlpapi.GetAdaptersInfo
  2112.             GetAdaptersInfo(byref(p),byref(gasize))
  2113.             return p[0x194:0x19a].encode('hex')
  2114.         except:
  2115.             raise ADEPTError('Cannot find MAC address. Get forum help.')
  2116.        
  2117.     # custom conversion 5 bytes to 8 chars method
  2118.     def fo_convert5to8(self, edisk):
  2119.         # byte to number/char mapping table
  2120.         darray=[0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0x41,0x42,0x43,0x44,0x45,\
  2121.                 0x46,0x47,0x48,0x4A,0x4B,0x4C,0x4D,0x4E,0x50,0x51,0x52,0x53,0x54,\
  2122.                 0x55,0x56,0x57,0x58,0x59,0x5A]
  2123.         pdid = struct.pack('<I', int(edisk[0:4].encode("hex"),16))
  2124.         pdid = int(pdid.encode("hex"),16)
  2125.         outputhw = ''
  2126.         # disk id processing
  2127.         for i in range(0,6):
  2128.             index = pdid & 0x1f
  2129.             # shift the disk id 5 bits to the right
  2130.             pdid = pdid >> 5
  2131.             outputhw = outputhw + chr(darray[index])
  2132.         pdid = (ord(edisk[4]) << 2)|pdid
  2133.         # get the last 2 bits from the hwid + low part of the cpuid
  2134.         for i in range(0,2):
  2135.             index = pdid & 0x1f
  2136.             # shift the disk id 5 bits to the right
  2137.             pdid = pdid >> 5
  2138.             outputhw = outputhw + chr(darray[index])
  2139.         return outputhw
  2140.  
  2141.     # Linux processing
  2142.     def fo_linux_sethwids(self):
  2143.         # linux specific attributes
  2144.         self.fileopen['OSType']='Linux'
  2145.         self.fileopen['AcroProduct']='AcroReader'
  2146.         self.fileopen['AcroReader']='Yes'
  2147.         self.fileopen['AcroVersion']='9.101'
  2148.         self.fileopen['FSName']='ext3'    
  2149.         self.fileopen['Build']='878'
  2150.         self.fileopen['ProdVer']='1.8.5.1'
  2151.         self.fileopen['OSBuild']='2.6.33'        
  2152.         # write hardware keys
  2153.         hwkey = 0
  2154.         pmac = self.get_macaddress().decode("hex");
  2155.         self.fileopen['Disk'] = self.fo_convert5to8(pmac[1:])
  2156.         # get primary used default mac address
  2157.         self.fileopen['Machine'] = self.fo_convert5to8(pmac[1:])
  2158.         # get uuid
  2159.         # check for reversed offline handler 6AB83F4Ah + AFh 6AB83F4Ah
  2160.         if 'LILA' in self.fileopen:
  2161.             pass
  2162.         if 'Ident4ID' in self.fileopen:
  2163.             self.fileopen['User'] = getpass.getuser()
  2164.             self.fileopen['SaUser'] = getpass.getuser()
  2165.             try:
  2166.                 cuser = winreg.HKEY_CURRENT_USER
  2167.                 FOW3_UUID = 'Software\\Fileopen'
  2168.                 regkey = winreg.OpenKey(cuser, FOW3_UUID)
  2169.                 userkey = winreg.QueryValueEx(regkey, 'Fowp3Uuid')[0]
  2170. #                if self.genkey_cryptmach(userkey)[0:4] != 'ec20':
  2171.                 self.fileopen['Uuid'] = self.genkey_cryptmach(userkey)[4:]
  2172. ##                elif self.genkey_cryptmach(userkey)[0:4] != 'ec20':
  2173. ##                    self.fileopen['Uuid'] = self.genkey_cryptmach(userkey,1)[4:]
  2174. ##                else:
  2175.             except:
  2176.                 raise ADEPTError('Cannot find FowP3Uuid file')
  2177.         else:
  2178.             self.fileopen['Uuid'] = str(uuid.uuid1())
  2179.         # get time stamp
  2180.         self.fileopen['Stamp'] = str(time.time())[:-3]
  2181.         # get fileopen input pdf name + path
  2182.         self.fileopen['DocPathUrl'] = 'file%3a%2f%2f%2f'\
  2183.                                       + urllib.quote(os.path.normpath(INPUTFILEPATH))
  2184.         # clear the link
  2185.         #INPUTFILEPATH = ''
  2186. ##        # get volume name (urllib quote necessairy?) urllib.quote(
  2187. ##        self.fileopen['VolName'] = win32api.GetVolumeInformation("C:\\")[0]
  2188. ##        # get volume serial number
  2189. ##        self.fileopen['VolSN'] = str(win32api.GetVolumeInformation("C:\\")[1])
  2190.         return
  2191.  
  2192.     # Windows processing
  2193.     def fo_win_sethwids(self):
  2194.         # Windows specific attributes        
  2195.         self.fileopen['OSType']='Windows'
  2196.         self.fileopen['OSName']='Vista'
  2197.         self.fileopen['OSData']='Service%20Pack%204'        
  2198.         self.fileopen['AcroProduct']='Reader'
  2199.         self.fileopen['AcroReader']='Yes'    
  2200.         self.fileopen['OSBuild']='7600'
  2201.         self.fileopen['AcroVersion']='9.1024'
  2202.         self.fileopen['Build']='879'        
  2203.         # write hardware keys
  2204.         hwkey = 0
  2205.         # get the os type and save it in ostype
  2206.         try:
  2207.             import win32api
  2208.             import win32security
  2209.             import win32file
  2210.             import _winreg as winreg                
  2211.         except:
  2212.             raise ADEPTError('PyWin Extension (Win32API module) needed.\n'+\
  2213.                              'Download from http://sourceforge.net/projects/pywin32/files/ ')
  2214.         try:
  2215.             v0 = win32api.GetVolumeInformation('C:\\')
  2216.             v1 = win32api.GetSystemInfo()[6]
  2217.             # fix for possible negative integer (Python problem)
  2218.             volserial = v0[1] & 0xffffffff
  2219.             lowcpu = v1 & 255
  2220.             highcpu = (v1 >> 8) & 255
  2221.             # changed to int
  2222.             volserial = struct.pack('<I', int(volserial))
  2223.             lowcpu   = struct.pack('B', lowcpu)
  2224.             highcpu = struct.pack('B', highcpu)
  2225.             encrypteddisk = volserial + lowcpu + highcpu
  2226.             self.fileopen['Disk'] = self.fo_convert5to8(encrypteddisk)            
  2227.         except:
  2228.             # no c system drive available empty disk attribute
  2229.             self.fileopen['Disk'] = ''          
  2230.         # get primary used default mac address
  2231.         pmac = self.get_macaddress().decode("hex");
  2232.         self.fileopen['Machine'] = self.fo_convert5to8(pmac[1:])
  2233.         if 'LIFF' in self.fileopen:
  2234.             if 'Yes' in self.fileopen['LIFF']:
  2235.                 hostname = socket.gethostname()
  2236.                 self.fileopen['HostIsDomain']='Yes'
  2237.                 if '1' in self.fileopen['LIFF']:
  2238.                     self.fileopen['PhysHostname']= hostname
  2239.                     self.fileopen['LogiHostname']= hostname
  2240.                     self.fileopen['SaRefDomain']= hostname
  2241.         # default users
  2242.         self.user = win32api.GetUserName().lower()
  2243.         self.sauser = win32api.GetUserName()                      
  2244.         # get uuid
  2245.         # check for reversed offline handler
  2246.         if 'LILA' in self.fileopen and self.fileopen['LILA'] == 'Yes':
  2247. ##            self.fileopen['User'] = win32api.GetUserName().lower()
  2248. ##            self.fileopen['SaUser'] = win32api.GetUserName()
  2249.          
  2250.             # get sid / sasid
  2251.             try:
  2252.                 psid = win32security.LookupAccountName("",self.sauser)[0]
  2253.                 psid = win32security.ConvertSidToStringSid(psid)
  2254.                 self.fileopen['SaSID'] = psid
  2255.                 self.fileopen['User'] = urllib.quote(self.user)
  2256.                 self.fileopen['SaUser'] = urllib.quote(self.sauser)                
  2257.             # didn't work use a generic one
  2258.             except:
  2259.                 self.fileopen['SaSID'] = 'S-1-5-21-1380067357-584463869-1343024091-1000'
  2260.         #if 'Ident4d' in self.fileopen or 'LILA' in self.fileopen:
  2261.         # always calculate the right uuid
  2262.         userkey = []        
  2263.         try:
  2264.             cuser = winreg.HKEY_CURRENT_USER
  2265.             FOW3_UUID = 'Software\\Fileopen'
  2266.             regkey = winreg.OpenKey(cuser, FOW3_UUID)
  2267.             userkey.append(winreg.QueryValueEx(regkey, 'Fowp3Uuid')[0])
  2268.         except:
  2269.             pass
  2270.         try:
  2271.             fopath = os.environ['AppData']+'\\FileOpen\\'
  2272.             fofilename = 'Fowpmadi.txt'
  2273.             f = open(fopath+fofilename, 'rb')
  2274.             userkey.append(f.read()[0:40])
  2275.             f.close()
  2276.         except:
  2277.             pass
  2278.         if not userkey:
  2279.             raise ADEPTError('Cannot find FowP3Uuid in registry or file.\n'\
  2280.                                  +'Did Adobe (Reader) open the pdf file?')
  2281.         cresult = self.genkey_cryptmach(userkey)
  2282.         if cresult != False:
  2283.             self.fileopen['Uuid'] = cresult
  2284.         # kind of a long shot we'll see about it
  2285.         else:
  2286.             self.fileopen['Uuid'] = str(uuid.uuid1())
  2287. ##        else:
  2288. ##            self.fileopen['Uuid'] = str(uuid.uuid1())
  2289.         # get time stamp
  2290.         self.fileopen['Stamp'] = str(time.time())[:-3]
  2291.         # get fileopen input pdf name + path
  2292.         # print INPUTFILEPATH
  2293.         self.fileopen['DocPathUrl'] = 'file%3a%2f%2f%2f'\
  2294.                                       + urllib.quote(INPUTFILEPATH)
  2295.         # determine voltype
  2296.         voltype = ('Unknown', 'Invalid', 'Removable', 'Fixed', 'Remote', 'CDRom', 'RamDisk')
  2297.         dletter = os.path.splitdrive(INPUTFILEPATH)[0] + '\\'
  2298.         self.fileopen['VolType'] = voltype[win32file.GetDriveType(dletter)]        
  2299.         # get volume name (urllib quote necessairy?) urllib.quote(
  2300.         self.fileopen['VolName'] = urllib.quote(win32api.GetVolumeInformation(dletter)[0])
  2301.         # get volume serial number (fix for possible negative numbers)          
  2302.         self.fileopen['VolSN'] = str(win32api.GetVolumeInformation(dletter)[1])
  2303.         # no c volume so skip it
  2304.         self.fileopen['FSName'] = win32api.GetVolumeInformation(dletter)[4]
  2305.         # get previous mac address or disk handling
  2306.         userkey = []
  2307.         try:
  2308.             cuser = winreg.HKEY_CURRENT_USER
  2309.             FOW3_UUID = 'Software\\Fileopen'
  2310.             regkey = winreg.OpenKey(cuser, FOW3_UUID)
  2311.             userkey.append(winreg.QueryValueEx(regkey, 'Fowp3Madi')[0])
  2312.         except:
  2313.             pass
  2314.         try:
  2315.             fopath = os.environ['AppData']+'\\FileOpen\\'
  2316.             fofilename = 'Fowpmadi.txt'
  2317.             f = open(fopath+fofilename, 'rb')
  2318.             userkey.append(f.read()[40:])
  2319.             f.close()
  2320.         except:
  2321.             pass
  2322.         if not userkey:
  2323.             raise ADEPTError('Cannot find FowP3Madi in registry or file.\n'\
  2324.                              +'Did Adobe Reader open the pdf file?')
  2325.         cresult = self.genkey_cryptmach(userkey)
  2326.         if cresult != False:
  2327.             machdisk = self.genkey_cryptmach(userkey)
  2328.             machine = machdisk[:8]
  2329.             disk = machdisk[8:]
  2330.         # did not find the required information, false it
  2331.         else:
  2332.             machdisk = False
  2333.             machine = False
  2334.             disk = False
  2335.         if machine != self.fileopen['Machine'] and machdisk != False:
  2336.             self.fileopen['PrevMach'] = machine
  2337.         if disk != self.fileopen['Disk'] and machdisk != False:
  2338.             self.fileopen['PrevDisk'] = disk        
  2339.         return
  2340.  
  2341.     # decryption routine for the INFO area
  2342.     def genkey_fileopeninfo(self, data):
  2343.         input1 = struct.pack('L', 0xa4da49de)
  2344.         seed   = struct.pack('B', 0x82)
  2345.         key = input1[3] + input1[2] +input1[1] +input1[0] + seed
  2346.         hash = hashlib.md5()
  2347.         key = hash.update(key)
  2348.         spointer4 = struct.pack('<L', 0xec8d6c58)
  2349.         seed = struct.pack('B', 0x07)
  2350.         key = spointer4[3] + spointer4[2] + spointer4[1] + spointer4[0] + seed
  2351.         key = hash.update(key)
  2352.         md5 = hash.digest()
  2353.         key = md5[0:10]
  2354.         return ARC4.new(key).decrypt(data)
  2355.  
  2356.     def genkey_cryptmach(self, data):
  2357.         # nested subfunction
  2358.         def genkeysub(uname, mode=False):
  2359.             key_string = '37A4DA49DE82064939A60B1D8D7B5F0F8873B6D93E'.decode('hex')
  2360.             m = hashlib.md5()
  2361.             m.update(key_string[:3])
  2362.             m.update(uname[:13]) # max 13 characters 13 - sizeof(username)
  2363.             if (13 - len(uname)) > 0 and mode == True:
  2364.                 m.update(key_string[:(13-len(uname))])
  2365.             md5sum = m.digest()[0:16]
  2366.             # print md5sum.encode('hex')
  2367.             # normal ident4id calculation
  2368.             retval = []
  2369.             for sdata in data:
  2370.                 retval.append(ARC4.new(md5sum).decrypt(sdata))
  2371.             for rval in retval:
  2372.                 if rval[:4] == 'ec20':
  2373.                     return rval[4:]
  2374.             return False
  2375.         # start normal execution    
  2376.         # list for username variants
  2377.         unamevars = []
  2378.         # fill username variants list
  2379.         unamevars.append(self.user)
  2380.         unamevars.append(self.user + chr(0))
  2381.         unamevars.append(self.user.lower())
  2382.         unamevars.append(self.user.lower() + chr(0))
  2383.         unamevars.append(self.user.upper())
  2384.         unamevars.append(self.user.upper() + chr(0))
  2385.         # go through it
  2386.         for uname in unamevars:
  2387.             result = genkeysub(uname, True)
  2388.             if result != False:
  2389.               return result            
  2390.             result = genkeysub(uname)
  2391.             if result != False:
  2392.               return result
  2393.         # didn't find it, return false
  2394.         return False
  2395. ##        raise ADEPTError('Unsupported Ident4D Decryption,\n'+\
  2396. ##                             'report the bug to the ineptpdf script forum')                
  2397.                
  2398.     KEYWORD_OBJ = PSKeywordTable.intern('obj')
  2399.    
  2400.     def getobj(self, objid):
  2401.         if not self.ready:
  2402.             raise PDFException('PDFDocument not initialized')
  2403.         #assert self.xrefs
  2404.         if objid in self.objs:
  2405.             genno = 0
  2406.             obj = self.objs[objid]
  2407.         else:
  2408.             for xref in self.xrefs:
  2409.                 try:
  2410.                     (stmid, index) = xref.getpos(objid)
  2411.                     break
  2412.                 except KeyError:
  2413.                     pass
  2414.             else:
  2415.                 #if STRICT:
  2416.                 #    raise PDFSyntaxError('Cannot locate objid=%r' % objid)
  2417.                 return None
  2418.             if stmid:
  2419.                 if gen_xref_stm:
  2420.                     return PDFObjStmRef(objid, stmid, index)
  2421. # Stuff from pdfminer: extract objects from object stream
  2422.                 stream = stream_value(self.getobj(stmid))
  2423.                 if stream.dic.get('Type') is not LITERAL_OBJSTM:
  2424.                     if STRICT:
  2425.                         raise PDFSyntaxError('Not a stream object: %r' % stream)
  2426.                 try:
  2427.                     n = stream.dic['N']
  2428.                 except KeyError:
  2429.                     if STRICT:
  2430.                         raise PDFSyntaxError('N is not defined: %r' % stream)
  2431.                     n = 0
  2432.  
  2433.                 if stmid in self.parsed_objs:
  2434.                     objs = self.parsed_objs[stmid]
  2435.                 else:
  2436.                     parser = PDFObjStrmParser(stream.get_data(), self)
  2437.                     objs = []
  2438.                     try:
  2439.                         while 1:
  2440.                             (_,obj) = parser.nextobject()
  2441.                             objs.append(obj)
  2442.                     except PSEOF:
  2443.                         pass
  2444.                     self.parsed_objs[stmid] = objs
  2445.                 genno = 0
  2446.                 i = n*2+index
  2447.                 try:
  2448.                     obj = objs[i]
  2449.                 except IndexError:
  2450.                     raise PDFSyntaxError('Invalid object number: objid=%r' % (objid))
  2451.                 if isinstance(obj, PDFStream):
  2452.                     obj.set_objid(objid, 0)
  2453. ###
  2454.             else:
  2455.                 self.parser.seek(index)
  2456.                 (_,objid1) = self.parser.nexttoken() # objid
  2457.                 (_,genno) = self.parser.nexttoken() # genno
  2458.                 #assert objid1 == objid, (objid, objid1)
  2459.                 (_,kwd) = self.parser.nexttoken()
  2460.         # #### hack around malformed pdf files
  2461.         #        assert objid1 == objid, (objid, objid1)
  2462. ##                if objid1 != objid:
  2463. ##                    x = []
  2464. ##                    while kwd is not self.KEYWORD_OBJ:
  2465. ##                        (_,kwd) = self.parser.nexttoken()
  2466. ##                        x.append(kwd)
  2467. ##                    if x:
  2468. ##                        objid1 = x[-2]
  2469. ##                        genno = x[-1]
  2470. ##                
  2471.                 if kwd is not self.KEYWORD_OBJ:
  2472.                     raise PDFSyntaxError(
  2473.                         'Invalid object spec: offset=%r' % index)
  2474.                 (_,obj) = self.parser.nextobject()
  2475.                 if isinstance(obj, PDFStream):
  2476.                     obj.set_objid(objid, genno)
  2477.                 if self.decipher:
  2478.                     obj = decipher_all(self.decipher, objid, genno, obj)
  2479.             self.objs[objid] = obj
  2480.         return obj
  2481.  
  2482. # helper class for cookie retrival
  2483. class WinBrowserCookie():
  2484.     def __init__(self):
  2485.         pass
  2486.     def getcookie(self, cname, chost):
  2487.         # check firefox db
  2488.         fprofile =  os.environ['AppData']+r'\Mozilla\Firefox'
  2489.         pinifile = 'profiles.ini'
  2490.         fini = os.path.normpath(fprofile + '\\' + pinifile)
  2491.         try:
  2492.             with open(fini,'r') as ffini:
  2493.                 firefoxini =  ffini.read()
  2494.         # Firefox not installed or on an USB stick
  2495.         except:
  2496.             return None
  2497.         for pair in firefoxini.split('\n'):
  2498.             try:
  2499.                 key, value = pair.split('=',1)
  2500.                 if key == 'Path':
  2501.                     fprofile = os.path.normpath(fprofile+'//'+value+'//'+'cookies.sqlite')
  2502.                     break
  2503.             # asdf
  2504.             except:
  2505.                 continue
  2506.         if os.path.isfile(fprofile):
  2507.             try:
  2508.                 con = sqlite3.connect(fprofile,1)
  2509.             except:
  2510.                 raise ADEPTError('Firefox Cookie data base locked. Close Firefox and try again')
  2511.             cur = con.cursor()
  2512.             try:            
  2513.                 cur.execute("select value from moz_cookies where name=? and host=?", (cname, chost))
  2514.             except Exception:
  2515.                 raise ADEPTError('Firefox Cookie database is locked. Close Firefox and try again')
  2516.             try:
  2517.                 return cur.fetchone()[0]
  2518.             except Exception:
  2519.                 # sometimes is a dot in front of the host
  2520.                 chost = '.'+chost
  2521.                 cur.execute("select value from moz_cookies where name=? and host=?", (cname, chost))
  2522.                 try:
  2523.                     return cur.fetchone()[0]
  2524.                 except:
  2525.                     return None
  2526.                
  2527. class PDFObjStmRef(object):
  2528.     maxindex = 0
  2529.     def __init__(self, objid, stmid, index):
  2530.         self.objid = objid
  2531.         self.stmid = stmid
  2532.         self.index = index
  2533.         if index > PDFObjStmRef.maxindex:
  2534.             PDFObjStmRef.maxindex = index
  2535.  
  2536.    
  2537. ##  PDFParser
  2538. ##
  2539. class PDFParser(PSStackParser):
  2540.  
  2541.     def __init__(self, doc, fp):
  2542.         PSStackParser.__init__(self, fp)
  2543.         self.doc = doc
  2544.         self.doc.set_parser(self)
  2545.         return
  2546.  
  2547.     def __repr__(self):
  2548.         return '<PDFParser>'
  2549.  
  2550.     KEYWORD_R = PSKeywordTable.intern('R')
  2551.     KEYWORD_ENDOBJ = PSKeywordTable.intern('endobj')
  2552.     KEYWORD_STREAM = PSKeywordTable.intern('stream')
  2553.     KEYWORD_XREF = PSKeywordTable.intern('xref')
  2554.     KEYWORD_STARTXREF = PSKeywordTable.intern('startxref')
  2555.     def do_keyword(self, pos, token):
  2556.         if token in (self.KEYWORD_XREF, self.KEYWORD_STARTXREF):
  2557.             self.add_results(*self.pop(1))
  2558.             return
  2559.         if token is self.KEYWORD_ENDOBJ:
  2560.             self.add_results(*self.pop(4))
  2561.             return
  2562.        
  2563.         if token is self.KEYWORD_R:
  2564.             # reference to indirect object
  2565.             try:
  2566.                 ((_,objid), (_,genno)) = self.pop(2)
  2567.                 (objid, genno) = (int(objid), int(genno))
  2568.                 obj = PDFObjRef(self.doc, objid, genno)
  2569.                 self.push((pos, obj))
  2570.             except PSSyntaxError:
  2571.                 pass
  2572.             return
  2573.            
  2574.         if token is self.KEYWORD_STREAM:
  2575.             # stream object
  2576.             ((_,dic),) = self.pop(1)
  2577.             dic = dict_value(dic)
  2578.             try:
  2579.                 objlen = int_value(dic['Length'])
  2580.             except KeyError:
  2581.                 if STRICT:
  2582.                     raise PDFSyntaxError('/Length is undefined: %r' % dic)
  2583.                 objlen = 0
  2584.             self.seek(pos)
  2585.             try:
  2586.                 (_, line) = self.nextline()  # 'stream'
  2587.             except PSEOF:
  2588.                 if STRICT:
  2589.                     raise PDFSyntaxError('Unexpected EOF')
  2590.                 return
  2591.             pos += len(line)
  2592.             self.fp.seek(pos)
  2593.             data = self.fp.read(objlen)
  2594.             self.seek(pos+objlen)
  2595.             while 1:
  2596.                 try:
  2597.                     (linepos, line) = self.nextline()
  2598.                 except PSEOF:
  2599.                     if STRICT:
  2600.                         raise PDFSyntaxError('Unexpected EOF')
  2601.                     break
  2602.                 if 'endstream' in line:
  2603.                     i = line.index('endstream')
  2604.                     objlen += i
  2605.                     data += line[:i]
  2606.                     break
  2607.                 objlen += len(line)
  2608.                 data += line
  2609.             self.seek(pos+objlen)
  2610.             obj = PDFStream(dic, data, self.doc.decipher)
  2611.             self.push((pos, obj))
  2612.             return
  2613.        
  2614.         # others
  2615.         self.push((pos, token))
  2616.         return
  2617.  
  2618.     def find_xref(self):
  2619.         # search the last xref table by scanning the file backwards.
  2620.         prev = None
  2621.         for line in self.revreadlines():
  2622.             line = line.strip()
  2623.             if line == 'startxref': break
  2624.             if line:
  2625.                 prev = line
  2626.         else:
  2627.             raise PDFNoValidXRef('Unexpected EOF')
  2628.         return int(prev)
  2629.  
  2630.     # read xref table
  2631.     def read_xref_from(self, start, xrefs):
  2632.         self.seek(start)
  2633.         self.reset()
  2634.         try:
  2635.             (pos, token) = self.nexttoken()
  2636.         except PSEOF:
  2637.             raise PDFNoValidXRef('Unexpected EOF')
  2638.         if isinstance(token, int):
  2639.             # XRefStream: PDF-1.5
  2640.             if GEN_XREF_STM == 1:
  2641.                 global gen_xref_stm
  2642.                 gen_xref_stm = True
  2643.             self.seek(pos)
  2644.             self.reset()
  2645.             xref = PDFXRefStream()
  2646.             xref.load(self)
  2647.         else:
  2648.             if token is not self.KEYWORD_XREF:
  2649.                 raise PDFNoValidXRef('xref not found: pos=%d, token=%r' %
  2650.                                      (pos, token))
  2651.             self.nextline()
  2652.             xref = PDFXRef()
  2653.             xref.load(self)
  2654.         xrefs.append(xref)
  2655.         trailer = xref.trailer
  2656.         if 'XRefStm' in trailer:
  2657.             pos = int_value(trailer['XRefStm'])
  2658.             self.read_xref_from(pos, xrefs)
  2659.         if 'Prev' in trailer:
  2660.             # find previous xref
  2661.             pos = int_value(trailer['Prev'])
  2662.             self.read_xref_from(pos, xrefs)
  2663.         return
  2664.        
  2665.     # read xref tables and trailers
  2666.     def read_xref(self):
  2667.         xrefs = []
  2668.         trailerpos = None
  2669.         try:
  2670.             pos = self.find_xref()
  2671.             self.read_xref_from(pos, xrefs)
  2672.         except PDFNoValidXRef:
  2673.             # fallback
  2674.             self.seek(0)
  2675.             pat = re.compile(r'^(\d+)\s+(\d+)\s+obj\b')
  2676.             offsets = {}
  2677.             xref = PDFXRef()
  2678.             while 1:
  2679.                 try:
  2680.                     (pos, line) = self.nextline()
  2681.                 except PSEOF:
  2682.                     break
  2683.                 if line.startswith('trailer'):
  2684.                     trailerpos = pos # remember last trailer
  2685.                 m = pat.match(line)
  2686.                 if not m: continue
  2687.                 (objid, genno) = m.groups()
  2688.                 offsets[int(objid)] = (0, pos)
  2689.             if not offsets: raise
  2690.             xref.offsets = offsets
  2691.             if trailerpos:
  2692.                 self.seek(trailerpos)
  2693.                 xref.load_trailer(self)
  2694.                 xrefs.append(xref)
  2695.         return xrefs
  2696.  
  2697. ##  PDFObjStrmParser
  2698. ##
  2699. class PDFObjStrmParser(PDFParser):
  2700.  
  2701.     def __init__(self, data, doc):
  2702.         PSStackParser.__init__(self, StringIO(data))
  2703.         self.doc = doc
  2704.         return
  2705.  
  2706.     def flush(self):
  2707.         self.add_results(*self.popall())
  2708.         return
  2709.  
  2710.     KEYWORD_R = KWD('R')
  2711.     def do_keyword(self, pos, token):
  2712.         if token is self.KEYWORD_R:
  2713.             # reference to indirect object
  2714.             try:
  2715.                 ((_,objid), (_,genno)) = self.pop(2)
  2716.                 (objid, genno) = (int(objid), int(genno))
  2717.                 obj = PDFObjRef(self.doc, objid, genno)
  2718.                 self.push((pos, obj))
  2719.             except PSSyntaxError:
  2720.                 pass
  2721.             return
  2722.         # others
  2723.         self.push((pos, token))
  2724.         return
  2725.  
  2726. ###
  2727. ### My own code, for which there is none else to blame
  2728.  
  2729. class PDFSerializer(object):
  2730.     def __init__(self, inf, keypath):
  2731.         global GEN_XREF_STM, gen_xref_stm
  2732.         gen_xref_stm = GEN_XREF_STM > 1
  2733.         self.version = inf.read(8)
  2734.         inf.seek(0)
  2735.         self.doc = doc = PDFDocument()
  2736.         parser = PDFParser(doc, inf)
  2737.         doc.initialize(keypath)
  2738.         self.objids = objids = set()
  2739.         for xref in reversed(doc.xrefs):
  2740.             trailer = xref.trailer
  2741.             for objid in xref.objids():
  2742.                 objids.add(objid)
  2743.         trailer = dict(trailer)
  2744.         trailer.pop('Prev', None)
  2745.         trailer.pop('XRefStm', None)
  2746.         if 'Encrypt' in trailer:
  2747.             objids.remove(trailer.pop('Encrypt').objid)
  2748.         self.trailer = trailer
  2749.  
  2750.     def dump(self, outf):
  2751.         self.outf = outf
  2752.         self.write(self.version)
  2753.         self.write('\n%\xe2\xe3\xcf\xd3\n')
  2754.         doc = self.doc
  2755.         objids = self.objids
  2756.         xrefs = {}
  2757.         maxobj = max(objids)
  2758.         trailer = dict(self.trailer)
  2759.         trailer['Size'] = maxobj + 1
  2760.         for objid in objids:
  2761.             obj = doc.getobj(objid)
  2762.             if isinstance(obj, PDFObjStmRef):
  2763.                 xrefs[objid] = obj
  2764.                 continue
  2765.             if obj is not None:
  2766.                 try:
  2767.                     genno = obj.genno
  2768.                 except AttributeError:
  2769.                     genno = 0
  2770.                 xrefs[objid] = (self.tell(), genno)
  2771.                 self.serialize_indirect(objid, obj)
  2772.         startxref = self.tell()
  2773.  
  2774.         if not gen_xref_stm:
  2775.             self.write('xref\n')
  2776.             self.write('0 %d\n' % (maxobj + 1,))
  2777.             for objid in xrange(0, maxobj + 1):
  2778.                 if objid in xrefs:
  2779.                     # force the genno to be 0
  2780.                     self.write("%010d 00000 n \n" % xrefs[objid][0])
  2781.                 else:
  2782.                     self.write("%010d %05d f \n" % (0, 65535))
  2783.            
  2784.             self.write('trailer\n')
  2785.             self.serialize_object(trailer)
  2786.             self.write('\nstartxref\n%d\n%%%%EOF' % startxref)
  2787.  
  2788.         else: # Generate crossref stream.
  2789.  
  2790.             # Calculate size of entries
  2791.             maxoffset = max(startxref, maxobj)
  2792.             maxindex = PDFObjStmRef.maxindex
  2793.             fl2 = 2
  2794.             power = 65536
  2795.             while maxoffset >= power:
  2796.                 fl2 += 1
  2797.                 power *= 256
  2798.             fl3 = 1
  2799.             power = 256
  2800.             while maxindex >= power:
  2801.                 fl3 += 1
  2802.                 power *= 256
  2803.                    
  2804.             index = []
  2805.             first = None
  2806.             prev = None
  2807.             data = []
  2808.             # Put the xrefstream's reference in itself
  2809.             startxref = self.tell()
  2810.             maxobj += 1
  2811.             xrefs[maxobj] = (startxref, 0)
  2812.             for objid in sorted(xrefs):
  2813.                 if first is None:
  2814.                     first = objid
  2815.                 elif objid != prev + 1:
  2816.                     index.extend((first, prev - first + 1))
  2817.                     first = objid
  2818.                 prev = objid
  2819.                 objref = xrefs[objid]
  2820.                 if isinstance(objref, PDFObjStmRef):
  2821.                     f1 = 2
  2822.                     f2 = objref.stmid
  2823.                     f3 = objref.index
  2824.                 else:
  2825.                     f1 = 1
  2826.                     f2 = objref[0]
  2827.                     # we force all generation numbers to be 0
  2828.                     # f3 = objref[1]
  2829.                     f3 = 0
  2830.                
  2831.                 data.append(struct.pack('>B', f1))
  2832.                 data.append(struct.pack('>L', f2)[-fl2:])
  2833.                 data.append(struct.pack('>L', f3)[-fl3:])
  2834.             index.extend((first, prev - first + 1))
  2835.             data = zlib.compress(''.join(data))
  2836.             dic = {'Type': LITERAL_XREF, 'Size': prev + 1, 'Index': index,
  2837.                    'W': [1, fl2, fl3], 'Length': len(data),
  2838.                    'Filter': LITERALS_FLATE_DECODE[0],
  2839.                    'Root': trailer['Root'],}
  2840.             if 'Info' in trailer:
  2841.                 dic['Info'] = trailer['Info']
  2842.             xrefstm = PDFStream(dic, data)
  2843.             self.serialize_indirect(maxobj, xrefstm)
  2844.             self.write('startxref\n%d\n%%%%EOF' % startxref)
  2845.     def write(self, data):
  2846.         self.outf.write(data)
  2847.         self.last = data[-1:]
  2848.  
  2849.     def tell(self):
  2850.         return self.outf.tell()
  2851.  
  2852.     def escape_string(self, string):
  2853.         string = string.replace('\\', '\\\\')
  2854.         string = string.replace('\n', r'\n')
  2855.         string = string.replace('(', r'\(')
  2856.         string = string.replace(')', r'\)')
  2857.          # get rid of ciando id
  2858.         regularexp = re.compile(r'http://www.ciando.com/index.cfm/intRefererID/\d{5}')
  2859.         if regularexp.match(string): return ('http://www.ciando.com')
  2860.         return string
  2861.    
  2862.     def serialize_object(self, obj):
  2863.         if isinstance(obj, dict):
  2864.             # Correct malformed Mac OS resource forks for Stanza
  2865.             if 'ResFork' in obj and 'Type' in obj and 'Subtype' not in obj \
  2866.                    and isinstance(obj['Type'], int):
  2867.                 obj['Subtype'] = obj['Type']
  2868.                 del obj['Type']
  2869.             # end - hope this doesn't have bad effects
  2870.             self.write('<<')
  2871.             for key, val in obj.items():
  2872.                 self.write('/%s' % key)
  2873.                 self.serialize_object(val)
  2874.             self.write('>>')
  2875.         elif isinstance(obj, list):
  2876.             self.write('[')
  2877.             for val in obj:
  2878.                 self.serialize_object(val)
  2879.             self.write(']')
  2880.         elif isinstance(obj, str):
  2881.             self.write('(%s)' % self.escape_string(obj))
  2882.         elif isinstance(obj, bool):
  2883.             if self.last.isalnum():
  2884.                 self.write(' ')
  2885.             self.write(str(obj).lower())            
  2886.         elif isinstance(obj, (int, long, float)):
  2887.             if self.last.isalnum():
  2888.                 self.write(' ')
  2889.             self.write(str(obj))
  2890.         elif isinstance(obj, PDFObjRef):
  2891.             if self.last.isalnum():
  2892.                 self.write(' ')            
  2893.             self.write('%d %d R' % (obj.objid, 0))
  2894.         elif isinstance(obj, PDFStream):
  2895.             ### If we don't generate cross ref streams the object streams
  2896.             ### are no longer useful, as we have extracted all objects from
  2897.             ### them. Therefore leave them out from the output.
  2898.             if obj.dic.get('Type') == LITERAL_OBJSTM and not gen_xref_stm:
  2899.                     self.write('(deleted)')
  2900.             else:
  2901.                 data = obj.get_decdata()
  2902.                 self.serialize_object(obj.dic)
  2903.                 self.write('stream\n')
  2904.                 self.write(data)
  2905.                 self.write('\nendstream')
  2906.         else:
  2907.             data = str(obj)
  2908.             if data[0].isalnum() and self.last.isalnum():
  2909.                 self.write(' ')
  2910.             self.write(data)
  2911.    
  2912.     def serialize_indirect(self, objid, obj):
  2913.         self.write('%d 0 obj' % (objid,))
  2914.         self.serialize_object(obj)
  2915.         if self.last.isalnum():
  2916.             self.write('\n')
  2917.         self.write('endobj\n')
  2918.  
  2919. def cli_main(argv=sys.argv):
  2920.     progname = os.path.basename(argv[0])
  2921.     if RSA is None:
  2922.         print "%s: This script requires PyCrypto, which must be installed " \
  2923.               "separately.  Read the top-of-script comment for details." % \
  2924.               (progname,)
  2925.         return 1
  2926.     if len(argv) != 4:
  2927.         print "usage: %s KEYFILE INBOOK OUTBOOK" % (progname,)
  2928.         return 1
  2929.     keypath, inpath, outpath = argv[1:]
  2930.     with open(inpath, 'rb') as inf:
  2931.         serializer = PDFSerializer(inf, keypath)
  2932.         # hope this will fix the 'bad file descriptor' problem
  2933.         with open(outpath, 'wb') as outf:
  2934.         # help construct to make sure the method runs to the end
  2935.             serializer.dump(outf)
  2936.     return 0
  2937.  
  2938.  
  2939. class DecryptionDialog(Tkinter.Frame):
  2940.     def __init__(self, root):
  2941.         # debug mode debugging
  2942.         global DEBUG_MODE
  2943.         Tkinter.Frame.__init__(self, root, border=5)
  2944.         ltext='Select file for decryption\n(Ignore Password / Key file option for Fileopen/APS PDFs)'        
  2945.         self.status = Tkinter.Label(self, text=ltext)
  2946.         self.status.pack(fill=Tkconstants.X, expand=1)
  2947.         body = Tkinter.Frame(self)
  2948.         body.pack(fill=Tkconstants.X, expand=1)
  2949.         sticky = Tkconstants.E + Tkconstants.W
  2950.         body.grid_columnconfigure(1, weight=2)
  2951.         Tkinter.Label(body, text='Password\nor Key file').grid(row=0)
  2952.         self.keypath = Tkinter.Entry(body, width=30)
  2953.         self.keypath.grid(row=0, column=1, sticky=sticky)
  2954.         if os.path.exists('adeptkey.der'):
  2955.             self.keypath.insert(0, 'adeptkey.der')
  2956.         button = Tkinter.Button(body, text="...", command=self.get_keypath)
  2957.         button.grid(row=0, column=2)
  2958.         Tkinter.Label(body, text='Input file').grid(row=1)
  2959.         self.inpath = Tkinter.Entry(body, width=30)
  2960.         self.inpath.grid(row=1, column=1, sticky=sticky)
  2961.         button = Tkinter.Button(body, text="...", command=self.get_inpath)
  2962.         button.grid(row=1, column=2)
  2963.         Tkinter.Label(body, text='Output file').grid(row=2)
  2964.         self.outpath = Tkinter.Entry(body, width=30)
  2965.         self.outpath.grid(row=2, column=1, sticky=sticky)
  2966.         debugmode = Tkinter.Checkbutton(self, text = "Debug Mode (writable directory required)", command=self.debug_toggle, height=2, \
  2967.                  width = 40)            
  2968.         debugmode.pack()        
  2969.         button = Tkinter.Button(body, text="...", command=self.get_outpath)
  2970.         button.grid(row=2, column=2)
  2971.         buttons = Tkinter.Frame(self)
  2972.         buttons.pack()
  2973.  
  2974.  
  2975.         botton = Tkinter.Button(
  2976.             buttons, text="Decrypt", width=10, command=self.decrypt)
  2977.         botton.pack(side=Tkconstants.LEFT)
  2978.         Tkinter.Frame(buttons, width=10).pack(side=Tkconstants.LEFT)
  2979.         button = Tkinter.Button(
  2980.             buttons, text="Quit", width=10, command=self.quit)
  2981.         button.pack(side=Tkconstants.RIGHT)
  2982.          
  2983.  
  2984.     def get_keypath(self):
  2985.         keypath = tkFileDialog.askopenfilename(
  2986.             parent=None, title='Select ADEPT key file',
  2987.             defaultextension='.der', filetypes=[('DER-encoded files', '.der'),
  2988.                                                 ('All Files', '.*')])
  2989.         if keypath:
  2990.             keypath = os.path.normpath(os.path.realpath(keypath))
  2991.             self.keypath.delete(0, Tkconstants.END)
  2992.             self.keypath.insert(0, keypath)
  2993.         return
  2994.  
  2995.     def get_inpath(self):
  2996.         inpath = tkFileDialog.askopenfilename(
  2997.             parent=None, title='Select ADEPT or FileOpen-encrypted PDF file to decrypt',
  2998.             defaultextension='.pdf', filetypes=[('PDF files', '.pdf'),
  2999.                                                  ('All files', '.*')])
  3000.         if inpath:
  3001.             inpath = os.path.normpath(os.path.realpath(inpath))
  3002.             self.inpath.delete(0, Tkconstants.END)
  3003.             self.inpath.insert(0, inpath)
  3004.         return
  3005.  
  3006.     def debug_toggle(self):
  3007.         global DEBUG_MODE
  3008.         if DEBUG_MODE == False:
  3009.             DEBUG_MODE = True
  3010.         else:
  3011.             DEBUG_MODE = False
  3012.            
  3013.     def get_outpath(self):
  3014.         outpath = tkFileDialog.asksaveasfilename(
  3015.             parent=None, title='Select unencrypted PDF file to produce',
  3016.             defaultextension='.pdf', filetypes=[('PDF files', '.pdf'),
  3017.                                                  ('All files', '.*')])
  3018.         if outpath:
  3019.             outpath = os.path.normpath(os.path.realpath(outpath))
  3020.             self.outpath.delete(0, Tkconstants.END)
  3021.             self.outpath.insert(0, outpath)
  3022.         return
  3023.  
  3024.     def decrypt(self):
  3025.         global INPUTFILEPATH
  3026.         global KEYFILEPATH
  3027.         global PASSWORD
  3028.         keypath = self.keypath.get()
  3029.         inpath = self.inpath.get()
  3030.         outpath = self.outpath.get()
  3031.         if not keypath or not os.path.exists(keypath):
  3032.             # keyfile doesn't exist
  3033.             KEYFILEPATH = False
  3034.             PASSWORD = keypath            
  3035.         if not inpath or not os.path.exists(inpath):
  3036.             self.status['text'] = 'Specified input file does not exist'
  3037.             return
  3038.         if not outpath:
  3039.             self.status['text'] = 'Output file not specified'
  3040.             return
  3041.         if inpath == outpath:
  3042.             self.status['text'] = 'Must have different input and output files'
  3043.             return
  3044.         # patch for non-ascii characters
  3045.         INPUTFILEPATH = inpath.encode('utf-8')
  3046.         argv = [sys.argv[0], keypath, inpath, outpath]
  3047.         self.status['text'] = 'Processing ...'
  3048.         try:
  3049.             cli_main(argv)
  3050.         except Exception, a:
  3051.             self.status['text'] = 'Error: ' + str(a)
  3052.             return
  3053.         self.status['text'] = 'File successfully decrypted.\n'+\
  3054.                               'Close this window or decrypt another pdf file.'
  3055.         return
  3056.  
  3057. def gui_main():
  3058.     root = Tkinter.Tk()
  3059.     if RSA is None:
  3060.         root.withdraw()
  3061.         tkMessageBox.showerror(
  3062.             "INEPT PDF and FileOpen Decrypter",
  3063.             "This script requires PyCrypto, which must be installed "
  3064.             "separately.  Read the top-of-script comment for details.")
  3065.         return 1
  3066.     root.title('INEPT PDF Decrypter 8.4.48 (FileOpen/APS-Support)')
  3067.     root.resizable(True, False)
  3068.     root.minsize(370, 0)
  3069.     DecryptionDialog(root).pack(fill=Tkconstants.X, expand=1)
  3070.     root.mainloop()
  3071.     return 0
  3072.  
  3073.  
  3074. if __name__ == '__main__':
  3075.     if len(sys.argv) > 1:
  3076.         sys.exit(cli_main())
  3077.     sys.exit(gui_main())
Add Comment
Please, Sign In to add comment