Code Jam 2021 round 1A’s problem C. “Hacked Exam”. Now this one was painful.
It was so painful it’s the primary, dare I say only reason there’s such a pause in my Code Jam postings. It was so painful I had to take a peek at the analysis. Then read it in full. Then implement it all. Then flip a table or two and give it a rest. Then come back to it a few times. Then re-write key parts of it. And again for performance. And check out a few contest solutions for inspiration. Then bite the bullet and fall back on the linear-time solution.
I’m not so good at probabilities in competitive programming.
So, fair warning, this post is mostly a re-wording of the analysis. But it’s the re-wording I finally (mostly) came to accept and understand, so it may help. And it comes with a solution in Haskell, which isn’t something you’re going to find on the website.1
So what’s the fuss all about? We’re cheating at an exam. An exam is a bitstring of length Q. An exam attempt2 is a bitstring of length Q. An attempt’s score is the number of bits it has in common with the exam. For a given exam, we’re given one to three attempts and their score, and are asked for an attempt and its expected score. Maximized.
Let’s start out with a few types.
type Exam = [Bool]
type Score = Int
type Exp = Rational
data Attempt a = Attempt
attemptScore :: !a
{ attemptAnswers :: [Bool]
, }
I’m making Attempt
parametric so its score can either be
an integer, for the input, or a rational, for the output.
Ok. The small test set is ten questions, two attempts. We can literally generate all possible 1024 exams, the “universe”, keep those who are compatible with the two attempts we have, then identify the most likely answer per question.
small :: [Attempt Score] -> Attempt Exp
= foldMap computeAnswer (transpose feasibleExams)
small as where
universe :: [Exam]
= mapM (const [False,True]) (attemptAnswers (head as))
universe
feasibleExams :: [Exam]
= filter (\truth -> all (isFeasible truth) as) universe
feasibleExams
isFeasible :: Exam -> Attempt Score -> Bool
=
isFeasible truth attempt == attemptScore attempt
evaluate truth (attemptAnswers attempt)
evaluate :: Exam -> Exam -> Score
= ((genericLength . filter (uncurry (==))) .) . zip
evaluate
computeAnswer :: [Bool] -> Attempt Exp
| 2*t >= l = Attempt ( t/l) [True]
computeAnswer bs | otherwise = Attempt (1-t/l) [False]
where t = genericLength (filter id bs)
= genericLength bs l
The most likely answer per question is simply the one that is most frequently found in the exams.
Attempts are generated using a monoid instance defined as such:
instance Num a => Semigroup (Attempt a) where
Attempt s1 a1 <> Attempt s2 a2 = Attempt (s1+s2) (a1++a2)
instance Num a => Monoid (Attempt a) where
mempty = Attempt 0 []
For the medium test case, we can have up to 40 questions, so 240 ≈ 1012 possible tests. Enumeration won’t complete in time.
The key observations is that even though the number of questions increased, they still form classes: there are only four possible combinations the two provided attempts could have answered a question. So all questions fall into one of four groups, and within a group they are indistinguishable, so must have the same probability of the correct answer being true or false.
That’s four unknown probabilities. But we only have two equations: the attempts’ binding to their score. The other insight is that a question’s correct value from the exam is bound to an answer pair just as indistinguishably as above. For example, a question to which both attempts answered true has the same probability of actually being true as a question to which both answered false has of being false. Conversely a question to which both answered differently has the same bias towards the first attempt being correct as one to which both also answered dirrerently, but with the opposite absolute answer. (This aspect of the problem took some time to sink in.)
Using the problem analysis’s notation, we call p a question’s probability of being true, and subdivide into pTT its probability of being true when both attempts answered true, pTF when the first attempt answered true and the second false, and so on. The symmetries described in the previous paragraph can then be summarized as pTT = 1 − pFF and pTF = 1 − pFT.3
To aggregate the score from attempts, I’ll whip up a little auxiliary record type:4
data Medium = M !Int {-^ const -} !Int {-^ TT -} !Int {-^ TF -}
instance Semigroup Medium where M a b c <> M d e f = M (a+d) (b+e) (c+f)
instance Monoid Medium where mempty = M 0 0 0
As it says on the tin, it holds a constant term, a count of TTs encountered, and a count of TFs encountered. The constant term is there so I can replace TFs with 1-FTs on the fly.
medium :: [Attempt Score] -> Attempt Exp
= medium [x,x]
medium [x] Attempt s1 as1,Attempt s2 as2] = answer
medium [where
M k1 stt1 stf1,M k2 stt2 stf2,answer) = mconcat $ zipWith combine as1 as2 (
Our combination function is where most of the
magictedium lies. Brace yourselves.
True True = (M 0 1 0,M 0 1 0, if pTT >= 0.5 then Attempt pTT [True] else Attempt (1 - pTT) [False])
combine True False = (M 0 0 1,M 1 0 (-1),if pTF >= 0.5 then Attempt pTF [True] else Attempt (1 - pTF) [False])
combine False True = (M 0 0 1,M 1 0 (-1),if pTF < 0.5 then Attempt (1 - pTF) [True] else Attempt pTF [False])
combine False False = (M 0 1 0,M 0 1 0, if pTT < 0.5 then Attempt (1 - pTT) [True] else Attempt pTT [False]) combine
It’s verbose, but it’s just what we said: convert FFs and FTs by symmetry and sum probabilities. Since this is Haskell, we can share the traversal to also build up the attempt: we know the question type, we know its probability of being true or false, so we pick the most likely and accumulate the probability of being right to our score’s expected value.
No, wait, we don’t know its probability of being true or false yet. We still need to actually solve the equation system. I was lazy, I used Cramer’s rule, which I won’t reproduce here. We’re using rationals, we won’t run into any numerical stability issues today.
= cramer2 (fromIntegral stt1) (fromIntegral stf1) (fromIntegral (s1-k1))
(pTT,pTF) fromIntegral stt2) (fromIntegral stf2) (fromIntegral (s2-k2)) (
So far, so good. Now for the large test set.
The major difference is that we now have three scored attempts at our disposal. We also have up to 120 questions per test, but with the current approach that oughtn’t change much of anything. Or ought it?
Let’s start by counting the questions per type. Sorry, per half-type, aggregating symmetries.
large :: [Attempt Score] -> Attempt Exp
= large [x,x,x]
large [x] = large [x,x,y]
large [x,y] Attempt s1 as1,Attempt s2 as2,Attempt s3 as3] =
large [let
tally :: Vector Int
= V.accum (+) (V.replicate 4 0)
tally 2*fromEnum (a2 /= a1) + fromEnum (a3 /= a1),1)
[ (| a1 <- as1 | a2 <- as2 | a3 <- as3 ]
tally
is a vector of length 4, whose elements are the
number of questions of types FFF/TTT, FFT/TTF, FTF/TFT, FTT/TFF
respectively.
We can split the population of exams along the number of correct answers for questions a given type, giving rise to Cnk distinct sub-exams. Now the same magic as for the medium set: knowing the number n of questions of a certain type, we can constrain its k to be between 0 and n; but choosing a specific k for any question type enforces specific ks for all other question types. It’s just a different phrasing of the three scores’ formulae, except this time we’re using question counts, which are enumerable, instead of probabilities.
But the core idea is the same: the score of attempt 2 (attempt 1 would be boring) is the sum of the ks of the question types where attempts 1 and 2 answer the same, plus the complements to n of the ks of the question types where attempts 1 and 2 answer differently. Repeat for attempts 1 and 3, we get three independent equations. Here’s my function to generate the full vector of ks from the first one:
nsFrom0 :: Int -> Maybe (Vector Int)
= do
nsFrom0 n0 let twoN1 = s1 + s2 - 2*n0 - tally!2 - tally!3
even twoN1)
guard (>= 0)
guard (twoN1 <= 2 * tally!1)
guard (twoN1 let twoN2 = s1 + s3 - 2*n0 - tally!1 - tally!3
even twoN2)
guard (>= 0)
guard (twoN2 <= 2 * tally!2)
guard (twoN2 let twoN3 = s1 - s2 + tally!2 + tally!3 - twoN2
even twoN3)
guard (>= 0)
guard (twoN3 <= 2 * tally!3)
guard (twoN3 pure $ V.fromList [n0, twoN1 `div` 2, twoN2 `div` 2, twoN3 `div` 2]
We have all feasible exam structures. So how do we compute their payoffs and construct a satisfactory attempt out of them?
construct :: Vector (Vector Int) -> Attempt Exp
= construct kss
We’ll work by question type. Given a question’s k, we know it awarded k points to attempts that put a T there and n − k to those that put an F instead. So a question’s score is the average value of those per-k awards, weighted by the structure’s frequency within the feasible exams. Let’s start easy: computing those weights.
let
-- | number of exams with a given structure (count of Ts per
-- question type)
structureCounts :: Vector Integer
= V.product . V.zipWith ncr tally <$> kss structureCounts
Next I’ll write the function to weight a question type’s awards.
-- | structure count -> k -> (score of T,score of F) per question type
weightedAwards :: Integer -> Vector Int -> Vector (Sum Integer,Sum Integer)
= V.zipWith (\n k -> ( Sum (c * fromIntegral (n-k))
weightedAwards c Sum (c * fromIntegral k) )) tally ,
And actually aggregate:
-- | (score of T,score of F) per exam structure
typeAwards :: Vector (Sum Integer,Sum Integer)
= V.foldl1' (V.zipWith (<>)) $
typeAwards V.zipWith weightedAwards structureCounts kss
This enables us (1) to compute the expected score, by always picking the answer with the biggest payoff.
-- | sum of scores over feasible exams
= sum $ uncurry max <$> typeAwards
typeScores
-- | expected score
= sum typeScores % sum structureCounts score
And (2) to generate an attempt bitstring. We’ll first compute the best answer per question.
-- | best choice per answer
= uncurry (<) <$> typeAwards typeAnswers
This one might need a bit of specifying. We’re computing a “desymmetrised” answer, so we get choose the meaning of the bit as meaning, in effect, “is the correct answer what attempt 1 answered?” Our aggregated tables are all folded such that attempt 1 consistently answered false.5 So given a (score of T,score of F) pair, we want to answer true if the best answer is false. Which simplifies to “true if T awards less than F”.
With that out of the way, we can generate our attempt’s bitstring by re-identifying the question types and looking up the correct choice.
-- | fetch an answer per type and unfold symmetry
=
genAnswer a1 a2 a3 ! (2*fromEnum (a2 /= a1)+fromEnum (a3 /= a1)) == a1
typeAnswers
-- | the attempt's bitstring
= zipWith3 genAnswer as1 as2 as3 answers
We may now construct the much awaited Attempt
record.
let score :: Exp
( answers :: [Bool]
in Attempt score answers )
And thread it all together for a large test set solver:
let tally :: Vector Int
( nsFrom0 :: Int -> Maybe (Vector Int)
construct :: Vector (Vector Int) -> Attempt Exp
in construct $ V.mapMaybe nsFrom0 $ V.enumFromTo 0 (tally!0) )
Whew.
Well I’m glad this one is over. Maybe I can get back to a normal posting schedule at last. As usual, the full code is on github.
I’m deliberately avoiding the problem statement’s vocabulary, that requires distinction between answers and correct answers.↩︎
It seems more natural when summarized this way then when trying to phrase it. And that’s… dangerous.↩︎
I wish Haskell had support for lexically-scoped types.↩︎
False, not wrong. (Object, not predicative.)↩︎