Friday, June 15, 2007

Word ladder

You form a word ladder by changing exactly one letter in the current "rung" to form the next. For example, the following is a ladder from 'paste' to 'carry':

  1. paste
  2. pasty
  3. party
  4. parry
  5. carry

Given starting and goal words, the Lisp code below performs a breadth-first search in the supplied dictionary. Using BFS has the interesting property of guaranteeing that the path found is the shortest between the two words.

I'm not really pleased with the result: there's not enough abstraction. I guess with Lisp, it's hard to be satisfied unless you write a couple of fly-daddy macros that reduce the meat of the code to about three lines.

(defparameter *default-dictionary*
  #P"/usr/dict/words"
)


(defvar *dictionary* (make-hash-table :test #'equal))

(defun ladder (start goal)
  (if (/= (length start) (length goal))
      (error "'~S' and '~S' have different lengths" start goal)
      (let* ((dict (read-dictionary))
             (candidates (ladder-candidates dict start))
             (seen (make-hash-table :test #'equal))
             (begat (make-hash-table :test #'equal))
             (todo (list start))
)

        (loop while todo do
              (let ((next '()))
                (loop for word in todo do
                      (let ((neighbors (neighbors word candidates seen)))
                        (loop for w in neighbors do
                              (setf (gethash w seen) 1
                                    (gethash w begat) word
)
)

                        (if (find goal neighbors :test #'equal)
                            (return-from ladder (rungs start goal begat))
                            (setf next (nconc next neighbors))
)
)
)

                (setf todo next)
)
)

        (error "No path from '~A' to '~A'" start goal)
)
)
)


(defun read-dictionary (&optional (path *default-dictionary*))
  (let ((cached (gethash path *dictionary*)))
    (if cached
        cached
        (let ((dict '()))
          (with-open-file (s path)
            (loop for word = (read-line s nil)
                  while word do
                  (push word dict)
)
)

          (setf (gethash path *dictionary*) dict)
)
)
)
)


(defun ladder-candidates (dict start)
  (let ((candidates '())
        (len (length start))
)

    (loop for w in dict
          when (and (= len (length w)) (string-not-equal w start)) do
          (push (string-downcase w) candidates)
)

    candidates
)
)


(defun one-hop-p (from to)
  (let ((hops 0))
    (loop for ff across from
          for tt across to
          when (char-not-equal ff tt) do
          (incf hops)
)

    (= hops 1)
)
)


(defun rungs (start goal begat)
  (let ((path (list goal)))
    (loop for word = (first path)
          for parent = (gethash word begat)
          while (string-not-equal word start) do
          (push parent path)
)

    path
)
)


(defun neighbors (word pool &optional (seen (make-hash-table)))
  (loop for w in pool
        when (and (one-hop-p word w)
                  (not (gethash w seen))
)

        collect w
)
)


(defun one-hop-neighbors (word)
  (neighbors word (ladder-candidates (read-dictionary) word))
)

10 comments:

Tordek said...

I've just looked over it, but I'll take a quick guess that it doesn't "feel" lispy, is because you're not using recursion...

I mean, for example:
(ladder start end) will call
(foreach (neighbor start) as rung
(ladder rung end))
until rung == end, and then return the shortest chain it finds.

Correct me if I'm wrong.

danb said...

I'm not really pleased with the result: there's not enough abstraction.

You don't need abstraction yet. BFS is naturally procedural, so recursion won't help much. And macros only help when you're going to use the code more than once. Look for any pattern you used more than once, and wrap that up in a function or macro.

If you want to clean the code up a little, you might make a separate hash containing only words of the right length. Then when you use a word, you can just delete it.

Tordek said...

(defun ladder-candidates (dict start)
(let ((candidates '())
(len (length start)))
(loop for w in dict
when (and (= len (length w)) (string-not-equal w start)) do
(push (string-downcase w) candidates))
candidates))

Any reason to use push there? You can use loop to do that for you:

(defun ladder-candidates (dict start)
(let ((len (length start)))
(loop for w in dict
when (and (= len (length w)) (string-not-equal w start))
collect (string-downcase w)))

There, you saved a line

Tordek said...

New thing: The only thing ladder-candidates uses is the length of the start word, so you can remove that let altogether, and pass the length as a parameter when calling.

(loop for w in neighbors do
(setf (gethash w seen) 1
(gethash w begat) word))

This is getting rid of all the neighbors (to get rid of a possible longer path, I assume) and stores only the last one; care to explain? Or I'm just not getting it...

Alan Crowe said...

(defun one-hop-p (from to)
(= (loop for ff across from
for tt across to
count (char-not-equal ff tt))
1))

or

(defun one-hop-p (from to)
(declare (string from to))
(= (count nil
(map 'vector
#'char-equal
from to))
1))

you don't need the declaration, but it serves as a comment.

Use an assertion instead of (if bad-shit (error ...)), eg.

(defun ladder (start goal)
(assert (= (length start)
(length goal))
()
"~S and ~S have different lengths"
start goal)
(print 'working)
'giving-up)

Eric Mertens said...

I think that you are making this problem harder than it needs to be... for example, here is the code for a simple iteratively deepening depth first search (less memory complexity than a breadth first search with a constant time factor speed penatly) I'm sure that some of these techniques could be used to improve the solution presented.

(line wrapping might make the code hard to read)

import Data.Set (fromList,member)
import Data.List (delete)

main = do content <- readFile "linux.words"
print $ head $ ladder (lines content) "paste" "carry"

ladder word_list from to = concat [dfs n from | n<- [0..]]
where
word_set = fromList [w | w <- ws, length from == length w]
dfs 0 x = [[x] | x == to]
dfs n x = [x:xs | w <- oneOff x, member w word_set, xs <- dfs (n-1) w]

oneOff [] = []
oneOff (x:xs) = map (:xs) (delete x ['a'..'z']) ++ map (x:) (oneOff xs)

Greg said...

Eric,

I had to rename word_list to ws before your code would compile. When I ran it (with GHC 6.6), it seemed to fall into an infinite loop.

Greg said...

Tordek,

Thanks for your comments and suggestions.

The word begat is uncommon in English, so that may have made it confusing. The Spanish engendró seems to be close in meaning.

The point is to trace the path once we've found a match. When we reach the goal, we recursively find each word's "parent" in begat until we hit the start word.

Tordek said...

Ah, okay.

Still, although your code works and all, I think there's a main factor here: Yes, you're using Lisp, which is not purely functional, but your approach relies on side effects to achieve a result... and that is sooo procedural ;P

I've posted a (mostly) functional version in my blog, if you care to check it out.

Greg said...

Thanks for the pointer to your blog entry (translated by BabelFish).