Advertisement
Not a member of Pastebin yet?
Sign Up,
it unlocks many cool features!
- #meat
- Q: 找多个单词和 target 单词的编辑路径, 距离小于 k 的单词们.
- 思路: 用 Trie 存放重复的路径, DFS, 其余和 Edit Distance 一样.
- 类似 preorder 根须遍历. 先访问根节点, 再访问子节点.
- DFS 是单线程, 一条公共路径的子节点走完以后, 才会去下一条公共路径.
- 不用担心数据重用之前被修改.
- 递归定义: dfs(c, root, i, t, k, f, res):
- 递归拆解:
- 每层调用, 只比较一个字母, 即当前节点的字母.
- - 递推方程:
- for j in range(len(t)):
- upd = f[i - 1][j - 1] + 1
- if c == t[j - 1]: upd -= 1
- f[i][j] = min(f[i-1][j] + 1, f[i][j - 1] + 1, upd)
- - 找到一个编辑距离小于 k 的单词, 加进结果集.
- root.is_word and f[i][n] <= k:
- - 计算出 f[i][j] 以后, 遍历每个子节点.
- for c1 in root.child:
- self.dfs(c1, root.child[c1], i + 1, t, k, f, res)
- 数据结构:
- class TrieNode:
- f = [[float("inf")] * (n + 1) for _ in range(m + 1)]
- note.
- if j - 1 >= 0 and c == t[j - 1]: # 注意下标索引越界.
- .
- class TrieNode:
- def __init__(self,):
- self.child = {}
- self.is_word = False
- self.word = ""
- class Solution:
- def kDistance(self, words, target, k):
- # init
- self.root = TrieNode()
- n = len(target)
- m = 0
- for w in words:
- if len(w) - n > k:
- continue
- m = max(m, len(w))
- self.insert(w)
- f = [[float("inf")] * (n + 1) for _ in range(m + 1)]
- for i in range(m + 1):
- f[i][0] = i
- for i in range(n + 1):
- f[0][i] = i
- res = []
- self.dfs("", self.root, 0, target, k, f, res)
- return res
- def dfs(self, c, root, i, t, k, f, res):
- n = len(t)
- if i > 0:
- for j in range(n + 1):
- f[i][j] = min(f[i-1][j], f[i][j - 1], f[i-1][j-1]) + 1
- if j - 1 >= 0 and c == t[j - 1]:
- f[i][j] = min(f[i][j], f[i - 1][j - 1])
- if root.is_word:
- if f[i][n] <= k:
- res.append(root.word)
- for c1 in root.child:
- self.dfs(c1, root.child[c1], i + 1, t, k, f, res)
- def insert(self, word):
- root = self.root
- for c in word:
- if c not in root.child:
- root.child[c] = TrieNode()
- root = root.child[c]
- root.is_word = True
- root.word = word
- ["acc","abcd","ade","abbcd","ddeeff","fabcd","fabcde"]
- "abc"
- 2
- Output:
- ["abbcd","abcd","acc","ade","fabcd"]
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement