Word Break

  • tags: [DP_Sequence]

Question

Given a string s and a dictionary of words dict, determine if s can be
segmented into a space-separated sequence of one or more dictionary words.

For example, given
s = "leetcode",
dict = ["leet", "code"].

Return true because "leetcode" can be segmented as "leet code".

题解

单序列(DP_Sequence) DP 题,由单序列动态规划的四要素可大致写出:

  1. State: f[i] 表示前i个字符能否根据词典中的词被成功分词。
  2. Function: f[i] = or{f[j], j < i, letter in [j+1, i] can be found in dict}, 含义为小于i的索引j中只要有一个f[j]为真且j+1i中组成的字符能在词典中找到时,f[i]即为真,否则为假。具体实现可分为自顶向下或者自底向上。
  3. Initialization: f[0] = true, 数组长度为字符串长度 + 1,便于处理。
  4. Answer: f[s.length]

考虑到单词长度通常不会太长,故在s较长时使用自底向上效率更高。

Python

class Solution:
    # @param s, a string
    # @param wordDict, a set<string>
    # @return a boolean
    def wordBreak(self, s, wordDict):
        if not s:
            return True
        if not wordDict:
            return False

        max_word_len = max([len(w) for w in wordDict])
        can_break = [True]
        for i in xrange(len(s)):
            can_break.append(False)
            for j in xrange(i, -1, -1):
                # optimize for too long interval
                if i - j + 1 > max_word_len:
                    break
                if can_break[j] and s[j:i + 1] in wordDict:
                    can_break[i + 1] = True
                    break
        return can_break[-1]
class Solution:
    # @param s, a string
    # @param dict, a set of string
    # @return a boolean
    # @good coding!
    def wordBreak(self, s, dict):
        dp = [False for i in range(len(s)+1)]
        dp[0] = True
        for i in range(1, len(s)+1):
            for k in range(i):
                if dp[k] and s[k:i] in dict:
                    dp[i] = True
        return dp[len(s)]

s = "lintcode"
dict = ["lint","code"]
sol=Solution()
sol.wordBreak(s,dict)
class Solution:
    # @param s, a string
    # @param dict, a set of string
    # @return a boolean
    def wordBreak(self, s, dict):
        n = len(s)
        dp = [False] * (n + 1)
        dp[0] = True
        for i in range(n):
            if dp[i]:
                for word in dict:
                    j = len(word)
                    if i + j <= n and s[i: i + j] == word:
                        dp[i + j] = True
        return dp[n]

# debug
s = Solution()
print s.wordBreak('a', ['a'])

results matching ""

    No results matching ""