Advent of Code day 10 “Syntax Scoring” provides us with invalid parenthesized expressions and asks us for a convoluted score depending on just how incorrect they were. This post is a literate Haskell program that starts with a few imports.
import Control.Arrow ((***))
import Control.Monad (void)
import Data.Either (partitionEithers)
import Data.Foldable (find)
import Data.List (sort)
import Data.List.NonEmpty (NonEmpty((:|)))
import Data.Void (Void)
import Text.Megaparsec hiding (chunk)
import Text.Megaparsec.Char (char)
I’m reaching out to Megaparsec, renowned for its quality error
reporting. Matching a parenthesized expression is a simple matter of
invoking between
while not getting confused between
some
(matches one or more) and many
(matches
zero or more). We’re never parsing valid expressions, so the parser’s
return type is actually irrelevant.
type Parser = Parsec Void String ()
chunk :: Parser
inputLine,chunks,= void (some chunk)
inputLine = void (many chunk)
chunks = choice (pair <$> ["()", "[]", "{}", "<>"]) where
chunk = void (between (char a) (char b) chunks)
pair [a,b]
parser :: String -> Either (ParseErrorBundle String Void) ()
= parse inputLine "<stdin>" parser
Now what happens when I feed the parser with an invalid expression from the sample input? I can get two families of errors.
Corrupted lines parse as an invalid (unexpected by the parser) character:
<stdin>:3:13:
|
3 | {([(<{}[<>[]}>{[]{[(<()>
| ^
unexpected '}'
expecting '(', '<', '[', ']', or '{'
Incomplete lines parse as an unexpected end of input:
<stdin>:1:25:
|
1 | [({(<(())[]>[[{[]{<()<>>
| ^
unexpected end of input
expecting '(', '<', '[', '{', or '}'
Let’s write a classifier using Megaparsec’s error return. It would have the following signature:
classify :: String -> ParseErrorBundle String Void -> Either Part1 Part2
And we’d invoke it in a general loop as such:
main :: IO ()
= interact $ show . (part1 *** part2) . partitionEithers .
main map (\l -> either (classify l) undefined (parser l)) .
lines
I use two semantic differing Either
s here:
- The
either (classify l) undefined
is there to apply to the lines in error; I afford to useundefined
where the valid lines would be because there shouldn’t be any as per the problem statement, and I’d definitely want to investigate it as a bug if it were to happen. - The
partitionEithers
downstream is there to distinguish between lines relevant to parts 1 and 2.
For part 1 specifically, we compute a checksum based on the first character in error. So let’s have our classifier return it.
type Part1 = Char
The checksum is a simple weighted sum, depending on the characters’ arbitrary values:
part1 :: [Part1] -> Int
= sum . map syntaxScore where
part1 ')' = 3
syntaxScore ']' = 57
syntaxScore '}' = 1197
syntaxScore '>' = 25137 syntaxScore
Identifying the relevant lines, namely the corrupted ones, is a
matter of drilling down the TrivialError
structure
Megaparsec returns, and checking the unexpected element is a
Tokens
and not an EndOfInput
.
ParseErrorBundle (TrivialError _ (Just (Tokens (c :| []))) _ :| []) _) classify _ (
In that case, the character in error is taken directy there and returned.
= Left c
For part 2, we want to autocomplete a valid ending in as few
characters as possible. So we drill down looking for an unexpected
EndfInput
and examine its expected tokens counterpart:
ParseErrorBundle (TrivialError _ (Just EndOfInput) ts :| []) _) = classify l (
To have the resulting string as short as possible, we extract the closing bracket among the list. There is necessarily one, else the expression would actually be valid.
let Just (Tokens (c :| [])) = find (`elem` (Tokens . pure <$> ")]}>")) ts
We can then append it to the original string and try again.
= l ++ [c]
l' in (c :) <$> either (classify l') (const (Right [])) (parser l')
Note this time I cannot afford an
undefined
: at some point the autocompleted expression is
going to be valid. I use it to return the list terminator.
type Part2 = [Char]
So we extracted a Right […]
out of the incomplete
expression. Now to munge it into the value AoC expects. The sorting and
taking the middle value is a disguised median, though I will actually
implement it just like that anyway.
The autocomplete score, on the other hand, is a numerical base-5 decoding. That happens not to have zeros.
part2 :: [Part2] -> Int
= median . map (foldl (\a b -> 5*a + autocompleteScore b) 0)
part2 where median xs = sort xs !! (length xs `div` 2)
')' = 1
autocompleteScore ']' = 2
autocompleteScore '}' = 3
autocompleteScore '>' = 4 autocompleteScore
This concludes today’s solution. See you next time!