Monday, June 29, 2009

Cleaning up your Haskell imports

Explicit imports have a couple of benefits. For one, doing so reduces compile times with ghc. Another is giving a hand to your future self (or other maintainers) and especially to those who are reading your code to learn. We've all been there: scratching our heads wondering, ‘Where does that function live?’ Yes, ghci's :info command and Hoogle are your friends, but explicit imports right there in your code will give the answer in a snap.

Neil Mitchell calls explicit imports “needlessly verbose,” certainly a fair point in the context where he made it, so this is a matter of polish, not strict necessity. There's also a certain aspy-appeal to it.

The -ddump-minimal-imports option to ghc writes the cleaned-up list to M.imports, where M is the module being compiled. For example, consider the following code that finds anagrams in a dictionary file:

module Main where

import Control.Arrow
import Control.Monad
import Data.Char
import Data.List
import Data.Map hiding (filter, map)
import System.Environment
import System.Exit
import System.IO

usage :: IO a
usage = do
  me <- getProgName
  hPutStrLn stderr $ "Usage: " ++ me ++ " [ dictionary ]"
  exitWith (ExitFailure 1)

main :: IO ()
main =
  getPath >>= readFile >>= mapM_ (putStrLn . unwords) . sorted
  where sorted = sort . map sort . anagrams . lines

anagrams :: [String] -> [[String]]
anagrams words = filter ((>1) . length) equiv
  where equiv = elems $
                  fromListWith (++)
                    [ (normal w, [w]) | w <- words ]
        normal = sort . map toLower

getPath :: IO FilePath
getPath = getArgs >>= go
  where go [path] = return path
        go []     = return "/usr/share/dict/words"
        go _      = usage
To get the minimal set of imports:
$ ghc-6.10.3 -ddump-minimal-imports --make anagram.hs 
$ cat Main.imports
import System.IO(IO, FilePath, putStrLn, readFile, hPutStrLn,
import Data.Map(elems, fromListWith)
import Control.Arrow()    -- Instances only
import Control.Monad(Monad(return, (>>=)), mapM_)
import Data.Char(String, toLower)
import Data.List((++), filter, map, length, lines, unwords, sort)
import System.Environment(getArgs, getProgName)
import System.Exit(ExitCode(ExitFailure), exitWith)
Although nice, the result is less than satisfying. The cuddled lists are ugly. The imports are in an odd order. Having to do run a separate compilation by hand followed by copy-paste, as opposed to automatically à la Eclipse's organize imports for Java, is a bit of a pain.

Notice that although Control.Arrow is unnecessary, it remains in the “minimal” set with an empty import list. Its presence is an artifact of the list comprehension being equivalent to

map (normal &&& (:[])) words
Cool, yes. Readable, not so much.

Note also there's an open ticket against ghc concerning the interaction between -ddump-minimal-imports and qualified imports.


simonpj said...

As luck would have it, I cleaned up the badness with -ddump-minimal-imports last Friday. Expect a patch soon.

It still puts the results in M.imports, but now (a) they should be right, (b) they should come out in the same order as the original imports.

Pasting directly back into the source file seems dangerous to me. Maybe to stdout though?

Anyway, watch the ticket, and give it a whirl when we've committed the patch


Greg said...

I look forward to seeing your patch! Writing to stdout seems the better approach: it wouldn't litter source trees but does seem friendlier to source-munging tools, editor plugins, and so on.