Friday, October 31, 2008

Accountability, one element at a time

At the end of September and beginning of October, the U.S. House of Representatives voted on a proposed bailo"rescue" plan for poorly managed Wall Street firms. HR 3997 was the first vote, and it failed so back to the drawing board! In the words of Rep. Ron Paul of Texas, "It’s amazing, you take a very, very bad bill, appropriating $700 billion, you can’t get enough votes to pass it so you take it back out, you make it much worse and take it up to over $800 billion." The "much worse" version was the one that passed, so the obvious question is which of our public servants made this possible? This post is a literate Haskell program: copy-and-paste it into a file with the extension "lhs" (say, turncoats.lhs) to get a working program! First, a bit of front matter to import libraries that we'll be using.
> {-# LANGUAGE Arrows #-}

> module Main where
> import Control.Monad
> import Data.List (groupBy, intercalate, sort)
> import qualified Data.Map as M
> import System.Environment
> import Text.XML.HXT.Arrow
The House makes available on the web results of recorded votes:
> hr3997 = "http://clerk.house.gov/evs/2008/roll674.xml"
> hr1424 = "http://clerk.house.gov/evs/2008/roll681.xml"
Despite the way they look in your browser, the resources linked above are XML document instances — verify for yourself with View Source — that we can use for a little accountability. The agenda for our program is straightforward: pull the results of the votes, extract the votes from each, and output the flip-floppers. As a bit of lagniappe, we group the principled stalwarts into classes according to how they changed their votes.
> main :: IO ()
> main = do
>   a <- runX $ readDoc hr3997 >>> votes
>   b <- runX $ readDoc hr1424 >>> votes
>   let turncoats = flipFlops a b
>   forM_ (groupBy same (sort turncoats)) $
>     \ xs -> do
>       let (v,v',_) = head xs
>           n = show $ length xs
>       putStrLn $ v ++ " -> " ++ v' ++ ": (" ++ n ++ ")  "
>       putStrLn $ intercalate ", " (map name xs)
>       putStrLn ""
>   where
>     a `same` b = before a == before b && after a == after b
>     before (v,_,_) = v
>     after  (_,v,_) = v
>     name   (_,_,n) = n
>     readDoc = readDocument [(a_tagsoup, "1")]
We'll represent each vote by pairing a representative's name with his yea-or-nay:
> type Name = String
> type Vote = (Name, String)
For a baseline, we use HR 3997 to build a hash table whose keys are representative names and whose values are the corresponding votes. Then for each vote from HR 1424, we compare the latter vote against the former, making note of those members who changed their votes. As the type of flipFlops indicates, the result is a list of tuples of the form (former-vote, latter-vote, rep-name).
> flipFlops :: [Vote] -> [Vote] -> [(String, String, Name)]
> flipFlops before after =
>   let prev = M.fromList before
>   in after >>= ff prev
>   where
In cases where a member did not vote on the earlier issue, lookup produces an error value, which is Nothing inside the Maybe monad. In Haskell, we don't get NullPointerExceptions. The astute reader will note that flipFlops is not fully general: it doesn't report cases where representatives voted on the former question but not the latter.
>     ff prev (name, latter) =
>       case M.lookup name prev of
>         Just former -> if former == latter
>                          then []
>                          else [(former,   latter, name)]
>         _           ->        [("<none>", latter, name)]
These are the bits that worry about slogging through the XML, but XPath makes it straightforward: the expression below says we want all recorded-vote elements, and those are children of the vote-data element, which are children of the rollcall-vote element at the document root.
> votes :: ArrowXml a => a XmlTree Vote
> votes = getXPathTrees "/rollcall-vote/vote-data/recorded-vote" >>>
>   proc rv -> do
>     name <- getName -< rv
>     vote <- getVote -< rv
>     returnA -< (name, normalize vote)
Consider the structure of a recorded-vote element:
<recorded-vote>
    <legislator>Cramer</legislator>
    <vote>Aye</vote>
</recorded-vote>
So for each recorded-vote, we extract the inner-text of the legislator and vote child elements.
>   where
>     getName = getChildren >>>
>               isElem >>> hasName "legislator" >>>
>               xshow getChildren
>     getVote = getChildren >>>
>               isElem >>> hasName "vote" >>>
>               xshow getChildren
Due to supremely lovely irony, yea is not yea nor nay nay in the recorded votes, so we have to normalize.
>     normalize "Yea" = "Y"
>     normalize "Yes" = "Y"
>     normalize "Aye" = "Y"
>     normalize "Nay" = "N"
>     normalize "No"  = "N"
>     normalize v     = v

Finally the output:

N → Y: (58)
Abercrombie, Alexander, Baca, Barrett (SC), Berkley, Biggert, Boustany, Braley (IA), Buchanan, Carson, Cleaver, Coble, Conaway, Cuellar, Cummings, Dent, Edwards (MD), Fallin, Frelinghuysen, Gerlach, Giffords, Green, Al, Hirono, Hoekstra, Jackson (IL), Jackson-Lee (TX), Kilpatrick, Knollenberg, Kuhl (NY), Lee, Lewis (GA), Mitchell, Myrick, Ortiz, Pascrell, Pastor, Ramstad, Ros-Lehtinen, Rush, Schiff, Schmidt, Scott (GA), Shadegg, Shuster, Solis, Sullivan, Sutton, Terry, Thompson (CA), Thornberry, Tiberi, Tierney, Wamp, Watson, Welch (VT), Woolsey, Wu, Yarmuth

Not Voting → Y: (1)
Weller

Y → N: (1)
McDermott

Tuesday, October 07, 2008