Interesting Bernoulli Urn Probability Problem [E. T. Jaynes]

Still on the topic of probability, I came across yet another interesting problem in the E. T. Jaynes book, and it goes as follows:



Suppose an urn contains N = \sum N_i balls, N_1 of color 1, N_2 of color 2, \dots, N_k of color k. We draw m balls without replacement; what is the probability that we have at least one of each color?


For a simple illustration, say we have an urn with 4 balls of two different colors, black and white. What is the probability that a draw of three balls contains at least one white and one black?
We want p(Y|X), where Y is the proposition that we draw at least one ball of each color and X is the current state of information about the urn. Then we have

    \begin{align*}p(Y|X) = p(WWB|X) + p(WBW|X) + p(BWW|X) + \\ p(BBW|X) + p(BWB|X) + p(WBB|X)\end{align*}

Evaluating each term is quite tedious, especially as the number of balls in the urn and the number of draws increase. But notice something the terms have in common: each sequence contains either two whites and one black or two blacks and one white, just in different arrangements. So all we need to do is find the probability of one representative ordering of each case and then multiply by a factor that accounts for the other arrangements.

We start with two whites: picking white on the first and second draws and black on the third. Applying the product rule twice gives

    \begin{align*}p(W_1 W_2 B_3 | X) = p( W_2 B_3 |W_1 X)p(W_1|X) \\ = p(B_3 | W_2 W_1 X)p( W_2 | W_1 X)p(W_1|X)\end{align*}


Now that we have the individual conditional probabilities, we can write the probability in terms of the state of information about the urn:

    \begin{align*}p(W_1 W_2 B_3 | X) = \dfrac{W}{N} \cdot \dfrac{W-1}{N-1} \cdot \dfrac{B}{N-2}\end{align*}


The equation above can be written in a more elegant form:

    \begin{align*}p(W_1 W_2 B_3 | X) = \dfrac{W}{N} \cdot \dfrac{W-1}{N-1} \cdot \dfrac{B}{N-2} = \dfrac{2! \mathlarger{\binom{W}{2}} \cdot 1!\mathlarger{\binom{B}{1}}}{3!\mathlarger{\binom{N}{3}}}\end{align*}
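
To see why this identity holds, expand the binomial coefficients; the factorials cancel exactly:

    \begin{align*}\dfrac{2! \mathlarger{\binom{W}{2}} \cdot 1!\mathlarger{\binom{B}{1}}}{3!\mathlarger{\binom{N}{3}}} = \dfrac{W(W-1) \cdot B}{N(N-1)(N-2)} = \dfrac{W}{N} \cdot \dfrac{W-1}{N-1} \cdot \dfrac{B}{N-2}\end{align*}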


This extends directly to the other case of two blacks and one white in the same order:

    \begin{align*}p(B_1 B_2 W_3 | X) = \dfrac{B}{N} \cdot \dfrac{B-1}{N-1} \cdot \dfrac{W}{N-2} = \dfrac{2! \mathlarger{\binom{B}{2}} \cdot 1!\mathlarger{\binom{W}{1}}}{3!\mathlarger{\binom{N}{3}}}\end{align*}


Now, we have established that the other probabilities are simply rearrangements of these two, and each case has \binom{3}{2} = 3 possible arrangements (the odd-colored ball can fall in any of the three draw positions). Then we can say that

    \begin{align*}P(Y|X) = \mathlarger{\binom{3}{2}}( p(B_1 B_2 W_3 | X) + p(W_1 W_2 B_3 | X)) \\ = \mathlarger{\binom{3}{2}} \left( \dfrac{2! \mathlarger{\binom{B}{2}} \cdot 1!\mathlarger{\binom{W}{1}}}{3!\mathlarger{\binom{N}{3}}} +\dfrac{2! \mathlarger{\binom{W}{2}} \cdot 1!\mathlarger{\binom{B}{1}}}{3!\mathlarger{\binom{N}{3}}}\right)\end{align*}


The equation then reduces to

    \begin{align*}P(Y|X) = \dfrac{\mathlarger{\binom{B}{2} \binom{W}{1} + \binom{W}{2} \binom{B}{1} } }{\mathlarger{\binom{N}{3}}}\end{align*}
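
As a quick sanity check, suppose the small urn from earlier holds two balls of each color (W = B = 2, N = 4). Then

    \begin{align*}P(Y|X) = \dfrac{\mathlarger{\binom{2}{2}\binom{2}{1} + \binom{2}{2}\binom{2}{1}}}{\mathlarger{\binom{4}{3}}} = \dfrac{2 + 2}{4} = 1\end{align*}

which is exactly right: with only two balls of each color, three draws can never all come out the same color, so a mixed draw is certain.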


The equation looks quite elegant. Now suppose that we had three colors instead: 3 black, 3 white, and 3 red balls, from which we select 4 balls at random. The probability becomes

    \begin{align*}P(Y|X) = \dfrac{\mathlarger{\binom{B}{2} \binom{W}{1} \binom{R}{1} +\binom{B}{1} \binom{W}{2} \binom{R}{1} +\binom{B}{1} \binom{W}{1} \binom{R}{2} } }{\mathlarger{\binom{N}{4}}}\end{align*}
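
Plugging in B = W = R = 3 and N = 9, each of the three terms equals \binom{3}{2}\binom{3}{1}\binom{3}{1} = 27, so

    \begin{align*}P(Y|X) = \dfrac{3 \cdot 27}{\mathlarger{\binom{9}{4}}} = \dfrac{81}{126} = \dfrac{9}{14} \approx 0.64\end{align*}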


Now suppose that, in the same scenario, we select 5 balls instead. The probability becomes

    \begin{align*}P(Y|X) = \dfrac{\mathlarger{\binom{B}{3} \binom{W}{1} \binom{R}{1} +\binom{B}{2} \binom{W}{2} \binom{R}{1} + \dots + \binom{B}{1} \binom{W}{1} \binom{R}{3} } }{\mathlarger{\binom{N}{5}}}\end{align*}
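
Here the sum runs over all six ways of splitting 5 draws among three colors: the three arrangements of (3,1,1) and the three arrangements of (2,2,1). With B = W = R = 3 and N = 9,

    \begin{align*}P(Y|X) = \dfrac{3\mathlarger{\binom{3}{3}\binom{3}{1}\binom{3}{1}} + 3\mathlarger{\binom{3}{2}\binom{3}{2}\binom{3}{1}}}{\mathlarger{\binom{9}{5}}} = \dfrac{27 + 81}{126} = \dfrac{6}{7}\end{align*}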


From the two scenarios above we see a general pattern. First, the m draws are split among the k colors into counts (m_1, \dots, m_k), where 1 \leq m_i \leq N_i and \sum_{i=1}^{k} m_i = m.
Now, given N balls with N_i of color i, we can write the probability of selecting m balls with at least one of each color as

    \begin{align*}P(Y|X) = \dfrac{\mathlarger{\sum\limits_{\substack{m_1 + \dots + m_k = m \\ m_i \geq 1}}\prod_{i=1}^{k}\binom{N_i}{m_i} } }{\mathlarger{\binom{N}{m}}}\end{align*}
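
To double-check the general formula, here is a minimal Haskell sketch of my own (the helpers choose, compositions, and fullSetProb are not part of the code at the end of this post) that sums over all valid splits directly, using exact rational arithmetic:

import Data.Ratio (Ratio, (%))

-- standalone sketch (my own helpers, not the blog's code below)

-- binomial coefficient C(n, k), via C(n,k) = C(n-1,k-1) * n / k
choose :: Integer -> Integer -> Integer
choose _ 0 = 1
choose 0 _ = 0
choose n k = choose (n-1) (k-1) * n `div` k

-- all splits (m_1, ..., m_k) with 1 <= m_i <= N_i and sum m_i = m
compositions :: [Integer] -> Integer -> [[Integer]]
compositions []     0 = [[]]
compositions []     _ = []
compositions (n:ns) m = [ mi : rest | mi   <- [1 .. min n m]
                                    , rest <- compositions ns (m - mi) ]

-- probability of at least one ball of each color when drawing m balls
-- from an urn with color counts ns = [N_1, ..., N_k]
fullSetProb :: [Integer] -> Integer -> Ratio Integer
fullSetProb ns m = numer % choose (sum ns) m
  where numer = sum [ product (zipWith choose ns ms) | ms <- compositions ns m ]

In ghci, fullSetProb [2,2] 3 returns 1 % 1 and fullSetProb [3,3,3] 4 returns 9 % 14, matching the worked examples above.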


PHEW!! That was quite interesting!
Now the second part of the question proceeds as follows:



Suppose that k = 5 and all N_i = 10. How many balls do we need to draw in order to have at least a 90% probability of getting a full set?


Stated mathematically, this simply means: find the minimum m such that

    \begin{align*}\dfrac{\mathlarger{\sum\limits_{\substack{m_1 + \dots + m_k = m \\ m_i \geq 1}}\prod_{i=1}^{k}\binom{N_i}{m_i} } }{\mathlarger{\binom{N}{m}}} \geq 0.9\end{align*}


where N_i = 10 for every color and k = 5.
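
With the fullSetProb sketch from earlier (again, my own helper, not the post's code), the search is a one-liner:

-- smallest m giving at least a 90% chance of a full set (k = 5, N_i = 10),
-- using the fullSetProb sketch above
minDraws :: Integer
minDraws = head [ m | m <- [5 .. 50], fullSetProb (replicate 5 10) m >= 9 % 10 ]
-- evaluates to 15, in agreement with the table below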

This problem lends itself better to a programming-based solution, since any closed-form shortcut would require more advanced combinatorics. For the solution, I used Haskell and got the following output (the full code is placed at the end of the blog post):

5 draws => 4.7197417357322205e-2
6 draws => 0.14159225207196663
7 draws => 0.26280380119418045
8 draws => 0.39045136177421097
9 draws => 0.5107424115832719
10 draws => 0.6164726091135438
11 draws => 0.7051322262819913
12 draws => 0.7770125811341809
13 draws => 0.8338240294168358
14 draws => 0.8778255150191284
15 draws => 0.9113323783701572
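
The first entry is easy to verify by hand: with exactly 5 draws, a full set forces every color to appear exactly once, so

    \begin{align*}P(Y|X) = \dfrac{\mathlarger{\binom{10}{1}^5}}{\mathlarger{\binom{50}{5}}} = \dfrac{100000}{2118760} \approx 0.0472\end{align*}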

Therefore our answer is 15 draws.

The Haskell code is given below [git – https://github.com/igbokwedaniel/bernoulli_urn]:


import qualified Data.List as List
import qualified Data.Map as Map
import qualified Data.Maybe as May

-- binomial coefficient nCr = n!/(r!(n-r)!), computed recursively
-- via the identity C(n,k) = C(n-1,k-1) * n / k
choose' :: (Integral a) => a -> a -> a
choose' _ 0      = 1
choose' 0 _      = 0
choose' n k      = choose' (n-1) (k-1) * n `div` k

-- the numerator of the probability: a sum, over all ways of splitting
-- the m draws among the colors, of products of binomial coefficients.
-- I'm explicitly assuming k = 5 colors here (via getAllSums')
sumFinder' :: (Integral a) => a -> a -> Maybe a
sumFinder' 0 _         = Nothing
sumFinder' _ 0         = Nothing
sumFinder' n m         = Just $ Map.foldrWithKey f 0 (getMap' n m)
                            where
                            f part count acc = acc + (folder' part n) * count

-- map from each sorted split of m to the number of orderings it has
getMap' :: (Integral a) => a -> a -> Map.Map [a] a
getMap' n m         = Map.fromListWith (+) $ zip (getAllSums' m n) (repeat 1)


-- all 5-way splits [m1,...,m5] of m with 1 <= mi <= n, sorted so that
-- rearrangements of the same split collapse to one key in the map
getAllSums' :: (Integral a) => a -> a -> [[a]]
getAllSums' m n = [ List.sort [a,b,c,d,e] | a <- [1..n], b <- [1..n], c <- [1..n], d <- [1..n], e <- [1..n], a+b+c+d+e == m ]

-- product of C(n, mi) over one split
folder' :: (Integral a) => [a] -> a -> a
folder' arr n = product $ map (choose' n) arr

-- unwrap the Maybe and convert to a Fractional type for division
toFractional' :: (Num b, Integral a) => Maybe a -> b
toFractional' m = fromIntegral (May.fromJust m)

-- probability of a full set in m draws, with 5 colors of n balls each
probability' :: (Integral a, Fractional b) => a -> a -> b
probability' n m  = x/y
                    where
                    x = toFractional' (sumFinder' n m)
                    y = fromIntegral (choose' (5*n) m)
                    
-- probabilities for m = 5, 6, ..., stopping once the limit is reached
getAllProbabilities' :: (Integral a, Fractional b, Ord b) => a -> b -> [(a,b)]
getAllProbabilities' n lim = genString' n [5..(n*5)] lim

genString' :: (Integral a, Fractional b, Ord b) => a -> [a] -> b -> [(a,b)]
genString' _ [] _ = []
genString' n (x:xs) lim
  | probability' n x >= lim = [(x, probability' n x)]
  | otherwise               = (x, probability' n x) : genString' n xs lim
  

-- print one line per draw count, in the format shown above
main :: IO ()
main = mapM_ (\(m, p) -> putStrLn (show m ++ " draws => " ++ show p)) (getAllProbabilities' 10 0.9)
