Word Break
- tags: [DP_Sequence]
Question
- leetcode: Word Break | LeetCode OJ
- lintcode: (107) Word Break
Given a string s and a dictionary of words dict, determine if s can be
segmented into a space-separated sequence of one or more dictionary words.
For example, given
s = "leetcode",
dict = ["leet", "code"].
Return true because "leetcode" can be segmented as "leet code".
题解
单序列(DP_Sequence) DP 题,由单序列动态规划的四要素可大致写出:
- State:
f[i]
表示前i
个字符能否根据词典中的词被成功分词。 - Function:
f[i] = or{f[j], j < i, letter in [j+1, i] can be found in dict}
, 含义为小于i
的索引j
中只要有一个f[j]
为真且j+1
到i
中组成的字符能在词典中找到时,f[i]
即为真,否则为假。具体实现可分为自顶向下或者自底向上。 - Initialization:
f[0] = true
, 数组长度为字符串长度 + 1,便于处理。 - Answer:
f[s.length]
考虑到单词长度通常不会太长,故在s
较长时使用自底向上效率更高。
Python
class Solution:
# @param s, a string
# @param wordDict, a set<string>
# @return a boolean
def wordBreak(self, s, wordDict):
if not s:
return True
if not wordDict:
return False
max_word_len = max([len(w) for w in wordDict])
can_break = [True]
for i in xrange(len(s)):
can_break.append(False)
for j in xrange(i, -1, -1):
# optimize for too long interval
if i - j + 1 > max_word_len:
break
if can_break[j] and s[j:i + 1] in wordDict:
can_break[i + 1] = True
break
return can_break[-1]
- reference
- TEL
class Solution:
# @param s, a string
# @param dict, a set of string
# @return a boolean
# @good coding!
def wordBreak(self, s, dict):
dp = [False for i in range(len(s)+1)]
dp[0] = True
for i in range(1, len(s)+1):
for k in range(i):
if dp[k] and s[k:i] in dict:
dp[i] = True
return dp[len(s)]
s = "lintcode"
dict = ["lint","code"]
sol=Solution()
sol.wordBreak(s,dict)
class Solution:
# @param s, a string
# @param dict, a set of string
# @return a boolean
def wordBreak(self, s, dict):
n = len(s)
dp = [False] * (n + 1)
dp[0] = True
for i in range(n):
if dp[i]:
for word in dict:
j = len(word)
if i + j <= n and s[i: i + j] == word:
dp[i + j] = True
return dp[n]
# debug
s = Solution()
print s.wordBreak('a', ['a'])
- reference
- 看一下