Untitled

a guest
Feb 25th, 2018
class Solution(object):
    def isValid(self, code):
        # Cursor into `code`; kept on self so the nested helpers can advance it.
        self.pos = 0

        def eat_tag():
            # Consume up to and including the next '>' and return that slice.
            start_pos = self.pos
            while self.pos < len(code) and code[self.pos] != '>':
                self.pos += 1
            self.pos += 1
            tag = code[start_pos:self.pos]
            return tag

        def eat_cdata():
            # Skip the "<![CDATA[" prefix, then consume through the first "]]>".
            begin_len = len("<![CDATA[")
            end = "]]>"
            end_len = len(end)
            start_pos = self.pos
            self.pos += begin_len
            while self.pos + end_len < len(code) and code[self.pos:self.pos+end_len] != end:
                self.pos += 1
            self.pos += end_len
            cdata = code[start_pos:self.pos]
            return cdata

        def validate_open_tag(tag):
            # "<NAME>" where NAME is 1 to 9 upper-case letters.
            if len(tag) > len('<>') + 9 or len(tag) < len('<>') + 1:
                return False
            elif tag[0] != '<' or tag[-1] != '>':
                return False
            return all(ch.isupper() for ch in tag[1:-1])

        def validate_close_tag(tag):
            # "</NAME>" where NAME is 1 to 9 upper-case letters.
            if len(tag) > len('</>') + 9 or len(tag) < len('</>') + 1:
                return False
            elif tag[:2] != '</' or tag[-1] != '>':
                return False
            return all(ch.isupper() for ch in tag[2:-1])

        def validate_cdata(cdata):
            # The slice must start with "<![CDATA[" and end with "]]>".
            begin = "<![CDATA["
            end = "]]>"
            if len(cdata) < len(begin) + len(end):
                return False
            elif cdata[:len(begin)] != begin:
                return False
            elif cdata[len(cdata)-len(end):] != end:
                return False
            return True

        def validate_start_matches_end(start_tag, end_tag):
            # "<NAME>" is exactly one character shorter than "</NAME>".
            if len(start_tag) != len(end_tag)-1:
                return False
            return start_tag[1:-1] == end_tag[2:-1]

        # The code must begin with a valid start tag.
        first_tag = eat_tag()
        if not validate_open_tag(first_tag):
            return False
        tag_stack = [first_tag]
        # Stop as soon as the outermost tag closes; anything left over is invalid.
        while self.pos < len(code) and tag_stack:
            peek = code[self.pos:self.pos+2] if self.pos+1 < len(code) else None
            if peek == '<!':
                cdata = eat_cdata()
                if not validate_cdata(cdata):
                    return False
            elif peek == '</':
                close_tag = eat_tag()
                if not validate_close_tag(close_tag):
                    return False
                open_tag = tag_stack.pop()
                if not validate_start_matches_end(open_tag, close_tag):
                    return False
            elif code[self.pos] == '<':
                open_tag = eat_tag()
                if not validate_open_tag(open_tag):
                    return False
                tag_stack.append(open_tag)
            else:
                # Plain character inside tag content.
                self.pos += 1
        # Valid only if every tag was closed and the whole input was consumed.
        return not tag_stack and self.pos >= len(code)

"""
:type code: str
:rtype: bool

The code must be wrapped in a valid closed tag. Otherwise, the code is invalid.
A closed tag (not necessarily valid) has exactly the following format: <TAG_NAME>TAG_CONTENT</TAG_NAME>. Here <TAG_NAME> is the start tag and </TAG_NAME> is the end tag. The TAG_NAME in the start and end tags should be the same. A closed tag is valid if and only if both the TAG_NAME and the TAG_CONTENT are valid.
A valid TAG_NAME contains only upper-case letters and has a length in the range [1, 9]. Otherwise, the TAG_NAME is invalid.
A valid TAG_CONTENT may contain other valid closed tags, cdata and any characters (see note1) EXCEPT an unmatched <, unmatched start and end tags, and unmatched or closed tags with an invalid TAG_NAME. Otherwise, the TAG_CONTENT is invalid.
A start tag is unmatched if no end tag exists with the same TAG_NAME, and vice versa. However, you also need to consider unbalanced tags when tags are nested.
A < is unmatched if you cannot find a subsequent >. When you find a < or </, all the subsequent characters until the next > should be parsed as the TAG_NAME (not necessarily valid).
The cdata has the following format: <![CDATA[CDATA_CONTENT]]>. The range of CDATA_CONTENT is defined as the characters between <![CDATA[ and the first subsequent ]]>.
CDATA_CONTENT may contain any characters. The purpose of cdata is to prevent the validator from parsing CDATA_CONTENT, so even if it contains characters that could be parsed as a tag (valid or not), you should treat them as regular characters.
"""
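
The rules in the problem statement above can also be checked with a single flat loop and a stack of open tag names. What follows is a minimal sketch under stated assumptions, not the method used in the paste: the names is_valid_code, TAG_NAME, CDATA_OPEN and CDATA_CLOSE are introduced here for illustration only, and Python 3 is assumed (for the compiled pattern's fullmatch).

```python
import re

# Hypothetical helper names, introduced for this sketch only.
TAG_NAME = re.compile(r"[A-Z]{1,9}")   # 1 to 9 upper-case ASCII letters
CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"


def is_valid_code(code):
    """Return True if `code` is a single valid closed tag."""
    stack = []                 # names of start tags that are still open
    i, n = 0, len(code)
    while i < n:
        if i > 0 and not stack:
            return False       # something follows the outermost end tag
        if code.startswith(CDATA_OPEN, i):
            if not stack:
                return False   # cdata is only allowed inside a tag
            end = code.find(CDATA_CLOSE, i + len(CDATA_OPEN))
            if end == -1:
                return False   # cdata never terminated
            i = end + len(CDATA_CLOSE)
        elif code.startswith("</", i):
            end = code.find(">", i)
            if end == -1:
                return False   # unmatched '<'
            name = code[i + 2:end]
            # The stack only holds valid names, so a match implies validity.
            if not stack or stack.pop() != name:
                return False
            i = end + 1
        elif code[i] == "<":
            end = code.find(">", i)
            if end == -1:
                return False   # unmatched '<'
            name = code[i + 1:end]
            if not TAG_NAME.fullmatch(name):
                return False
            stack.append(name)
            i = end + 1
        else:
            if not stack:
                return False   # plain text before the first start tag
            i += 1
    # Non-empty input with every tag closed.
    return n > 0 and not stack


print(is_valid_code("<DIV>This is the first line <![CDATA[<div>]]></DIV>"))
print(is_valid_code("<A>  <B> </A>   </B>"))
```

The first call exercises the cdata rule (the lower-case <div> inside the cdata is never parsed as a tag); the second exercises the nesting rule, since </A> arrives while <B> is still open.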