Advent of Code day 10 “Syntax Scoring” provides us with invalid parenthesized expressions and asks us for a convoluted score depending on just how incorrect they were. This post is a literate Haskell program that starts with a few imports.
import Control.Arrow ((***))
import Control.Monad (void)
import Data.Either (partitionEithers)
import Data.Foldable (find)
import Data.List (sort)
import Data.List.NonEmpty (NonEmpty((:|)))
import Data.Void (Void)
import Text.Megaparsec hiding (chunk)
import Text.Megaparsec.Char (char)I’m reaching out to Megaparsec, renowned for its quality error
reporting. Matching a parenthesized expression is a simple matter of
invoking between while not getting confused between
some (matches one or more) and many (matches
zero or more). We’re never parsing valid expressions, so the parser’s
return type is actually irrelevant.
type Parser = Parsec Void String ()
inputLine,chunks,chunk :: Parser
inputLine = void (some chunk)
chunks = void (many chunk)
chunk = choice (pair <$> ["()", "[]", "{}", "<>"]) where
pair [a,b] = void (between (char a) (char b) chunks)
parser :: String -> Either (ParseErrorBundle String Void) ()
parser = parse inputLine "<stdin>"Now what happens when I feed the parser with an invalid expression from the sample input? I can get two families of errors.
Corrupted lines parse as an invalid (unexpected by the parser) character:
<stdin>:3:13:
|
3 | {([(<{}[<>[]}>{[]{[(<()>
| ^
unexpected '}'
expecting '(', '<', '[', ']', or '{'
Incomplete lines parse as an unexpected end of input:
<stdin>:1:25:
|
1 | [({(<(())[]>[[{[]{<()<>>
| ^
unexpected end of input
expecting '(', '<', '[', '{', or '}'
Let’s write a classifier using Megaparsec’s error return. It would have the following signature:
classify :: String -> ParseErrorBundle String Void -> Either Part1 Part2And we’d invoke it in a general loop as such:
main :: IO ()
main = interact $ show . (part1 *** part2) . partitionEithers .
map (\l -> either (classify l) undefined (parser l)) .
linesI use two semantic differing Eithers here:
- The
either (classify l) undefinedis there to apply to the lines in error; I afford to useundefinedwhere the valid lines would be because there shouldn’t be any as per the problem statement, and I’d definitely want to investigate it as a bug if it were to happen. - The
partitionEithersdownstream is there to distinguish between lines relevant to parts 1 and 2.
For part 1 specifically, we compute a checksum based on the first character in error. So let’s have our classifier return it.
type Part1 = CharThe checksum is a simple weighted sum, depending on the characters’ arbitrary values:
part1 :: [Part1] -> Int
part1 = sum . map syntaxScore where
syntaxScore ')' = 3
syntaxScore ']' = 57
syntaxScore '}' = 1197
syntaxScore '>' = 25137Identifying the relevant lines, namely the corrupted ones, is a
matter of drilling down the TrivialError structure
Megaparsec returns, and checking the unexpected element is a
Tokens and not an EndOfInput.
classify _ (ParseErrorBundle (TrivialError _ (Just (Tokens (c :| []))) _ :| []) _)In that case, the character in error is taken directy there and returned.
= Left cFor part 2, we want to autocomplete a valid ending in as few
characters as possible. So we drill down looking for an unexpected
EndfInput and examine its expected tokens counterpart:
classify l (ParseErrorBundle (TrivialError _ (Just EndOfInput) ts :| []) _) =To have the resulting string as short as possible, we extract the closing bracket among the list. There is necessarily one, else the expression would actually be valid.
let Just (Tokens (c :| [])) = find (`elem` (Tokens . pure <$> ")]}>")) tsWe can then append it to the original string and try again.
l' = l ++ [c]
in (c :) <$> either (classify l') (const (Right [])) (parser l')Note this time I cannot afford an
undefined: at some point the autocompleted expression is
going to be valid. I use it to return the list terminator.
type Part2 = [Char]So we extracted a Right […] out of the incomplete
expression. Now to munge it into the value AoC expects. The sorting and
taking the middle value is a disguised median, though I will actually
implement it just like that anyway.
The autocomplete score, on the other hand, is a numerical base-5 decoding. That happens not to have zeros.
part2 :: [Part2] -> Int
part2 = median . map (foldl (\a b -> 5*a + autocompleteScore b) 0)
where median xs = sort xs !! (length xs `div` 2)
autocompleteScore ')' = 1
autocompleteScore ']' = 2
autocompleteScore '}' = 3
autocompleteScore '>' = 4This concludes today’s solution. See you next time!