# An alternative solution 

This notebook describes a more detailed, though possibly less performant, solution for the function
```get_prob_for_being_multiple_of_9``` from the tutorial (written by Oliver).

This notebook introduces you to Einstein summation, a notation that replaces nested loops with a single expression and allows for more efficient computation via parallelization.
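
As a warm-up, here is a minimal sketch of what Einstein summation does, using a small matrix-vector product. The arrays `A` and `v` are made-up toy data, not part of the tutorial. The index `j` appears in the inputs but not in the output, so `einsum` sums over it, exactly like the inner loop does.

```python
import numpy as np

# Hypothetical toy data, not from the tutorial.
A = np.arange(6, dtype=np.float32).reshape(2, 3)
v = np.array([1.0, 2.0, 3.0], dtype=np.float32)

# Explicit loops: y[i] = sum_j A[i, j] * v[j]
y_loop = np.zeros(2, dtype=np.float32)
for i in range(2):
    for j in range(3):
        y_loop[i] += A[i, j] * v[j]

# The same contraction as a single einsum call.
y_einsum = np.einsum('ij,j->i', A, v)
print(y_loop, y_einsum)
```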

In [0]:
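import numpy as np
import numpy  # both spellings (np and numpy) are used below
import tensorflow as tf
xrange = range  # Python 2/3 compatibility
tf.enable_eager_execution()  # TensorFlow 1.x API; eager execution is the default in TensorFlow 2.x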

Given below are the probabilities that 4 consecutive input images belong to the respective classes. The tensor is indexed as [batch, place, digit], e.g.
probability_by_digit_and_place[0, 1, 6] is the probability that the second image of the first batch entry is a 6.

In [0]:
# would read as '1008'.
t_ddd_ex = tf.constant(numpy.array([
    [ -1.0, 9.0, -1.0, -1.0, -1.0,  -1.0, -1.0, -1.0, -1.0, -1.0],
    [ 9.0, -1.0, -1.0, -1.0, -1.0,  -1.0, -1.0, -1.0, -1.0, -1.0],
    [ 9.0, -1.0, -1.0, -1.0, -1.0,  -1.0, -1.0, -1.0, -1.0, -1.0],
    [-1.0,  -1.0, -1.0, -1.0, -1.0,  -1.0, -1.0, -1.0, 9.0, -1.0]],
    dtype=numpy.float32))
logs_with_batch = tf.reshape(t_ddd_ex, (1, 4, 10))
probability_by_digit_and_place = tf.nn.softmax(logs_with_batch)
probability_by_digit_and_place #
p = probability_by_digit_and_place

Let's now find the probability that a given combination is present. We write p = probability_by_digit_and_place. With this notation p(0, 0, 3) is the probability that the first image is a 3. If we assume independence, we get:

```
p(0, 0, X) * p(0, 1, M)  * p(0, 2, C)  * p(0, 3, I)
```

for the probability that the first image is X, the second is M, the third is C and the fourth is I.
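
A minimal NumPy sketch of this product for the example '1008' follows. The helper `softmax` and the names `logits`, `p_np` and `prob_1008` are introduced here for illustration only; the logits mirror the example above (9.0 for the intended digit, -1.0 elsewhere).

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

digits = [1, 0, 0, 8]  # the combination we ask about

# Shape (batch=1, place=4, digit=10), same layout as in the notebook.
logits = np.full((1, 4, 10), -1.0)
for place, digit in enumerate(digits):
    logits[0, place, digit] = 9.0
p_np = softmax(logits)

# Independence: multiply the per-place probabilities.
prob_1008 = np.prod([p_np[0, place, d] for place, d in enumerate(digits)])
print(prob_1008)
```

This is the probability of the single combination 1008; the total probability of a multiple of 9 also includes all other qualifying combinations and is therefore slightly larger.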


### Total probability that the 4 images are a multiple of 9 
To get the probability that the 4 images form a multiple of 9, we have to sum over all combinations of digits that are divisible by 9. A number is divisible by 9 exactly when its digit sum is, so it suffices to test (X + M + C + I) % 9. Let's try the simple way first:
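
The code in this notebook tests the digit sum rather than the number itself. The short check below confirms, for every 4-digit string from 0000 to 9999, that both tests agree. The variable `count` is only introduced for this illustration.

```python
# Compare "digit sum divisible by 9" with "number divisible by 9"
# for all 4-digit strings 0000..9999.
count = 0
for n in range(10000):
    digit_sum = sum(int(ch) for ch in f"{n:04d}")
    assert (digit_sum % 9 == 0) == (n % 9 == 0)
    if n % 9 == 0:
        count += 1
print(count)  # number of qualifying combinations, including 0000
```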

In [0]:
#Numpy with loops and if
def get_prob_for_being_multiple_of_9_np_with_if(p):
  p0 = 0.0
  for X in xrange(10):
    for M in xrange(10):
      for C in xrange(10):
        for I in xrange(10):
          k = (X + M + C + I) % 9
          if (k == 0):
            p0 = p0 + p[0,0,X] * p[0,1,M] * p[0,2,C] * p[0,3,I]
  return p0
print('Probability for the remainder to be k=0 (with if)', get_prob_for_being_multiple_of_9_np_with_if(probability_by_digit_and_place)) #

Probability for the remainder to be k=0 (with if) tf.Tensor(0.99845773, shape=(), dtype=float32)


We don't want the if statement, since data-dependent branching does not map well onto vectorized GPU code. Therefore we introduce an indicator variable, which is 1 if a given combination is divisible by 9 and 0 otherwise.

In [0]:
Ixmci = np.zeros([10, 10, 10, 10], dtype=np.float32)
for X in xrange(10):
  for M in xrange(10):
    for C in xrange(10):
      for I in xrange(10):
        if (X + M + C + I) % 9 == 0:
          Ixmci[X,M,C,I] = 1.0
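
The indicator can also be built without Python loops. The following is a sketch using `np.indices`; the names `gx`, `gm`, `gc`, `gi` and `Ixmci_vec` are made up here to avoid clashing with the notebook's variables.

```python
import numpy as np

# np.indices returns one index grid per axis, each of shape (10, 10, 10, 10).
gx, gm, gc, gi = np.indices((10, 10, 10, 10))

# 1.0 where the digit sum is divisible by 9, 0.0 elsewhere.
Ixmci_vec = ((gx + gm + gc + gi) % 9 == 0).astype(np.float32)
print(Ixmci_vec.shape, Ixmci_vec.sum())
```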

In [0]:
#Numpy with loops
def get_prob_for_being_multiple_of_9_with_indicator(p):
  p0 = 0
  for X in xrange(10):
    for M in xrange(10):
      for C in xrange(10):
        for I in xrange(10):
          p0 = p0 + p[:,0,X] * p[:,1,M] * p[:,2,C] * p[:,3,I] * Ixmci[X,M,C,I]
  return p0
print('Probability for the remainder to be k=0 (w/o if)', get_prob_for_being_multiple_of_9_with_indicator(p)) #

Probability for the remainder to be k=0 (w/o if) tf.Tensor([0.99845773], shape=(1,), dtype=float32)


### Translating the loops to sums

Translating the 4 loops from above into a sum, we get the probability $P_b$ that, for a given batch entry $b$, the digit sum is divisible by 9:

$$
    P_b = \sum_x\sum_m\sum_c\sum_i p_{bxmci} \cdot I_{xmci}
$$

with the probability factorizing (as assumed above) we get
$$
  p_{bxmci} = p_{bx} \cdot p_{bm} \cdot p_{bc} \cdot p_{bi}   
$$

and hence

$$
P_b = \sum_x\sum_m\sum_c\sum_i  p_{bx} \cdot p_{bm} \cdot p_{bc} \cdot p_{bi} \cdot  I_{xmci}
$$

### Enter Einstein
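In Einstein notation the summation signs are dropped: every index that appears on the right-hand side but not on the left (here $x$, $m$, $c$, $i$) is summed over, while $b$ stays free.
$$
P_b = I_{xmci} \cdot p_{bxmci} = I_{xmci} \cdot p_{bx} \cdot p_{bm} \cdot p_{bc} \cdot p_{bi}
$$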

To wrap up:

We have a tensor of order 4 with the indices xmci, indicating whether the 4 digits form a number divisible by 9, and another tensor with the indices bxmci, holding the probability that batch entry b shows the digits xmci. Summing over x, m, c and i gives the probability $P_b$ for each batch entry (each group of 4 images).
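
To see that the formula and the loops agree for more than one batch entry, here is a NumPy sketch with a random batch of 3 entries. The arrays `p_rand` and `indicator` are hypothetical stand-ins for the notebook's `p` and `Ixmci`.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical random batch: 3 entries, 4 places, 10 digit probabilities each.
p_rand = rng.random((3, 4, 10))
p_rand /= p_rand.sum(axis=-1, keepdims=True)

gx, gm, gc, gi = np.indices((10, 10, 10, 10))
indicator = ((gx + gm + gc + gi) % 9 == 0).astype(np.float64)

# Explicit quadruple sum, vectorized over the batch index b only.
P_loop = np.zeros(3)
for x in range(10):
    for m in range(10):
        for c in range(10):
            for i in range(10):
                P_loop += (p_rand[:, 0, x] * p_rand[:, 1, m] *
                           p_rand[:, 2, c] * p_rand[:, 3, i] *
                           indicator[x, m, c, i])

# The same sum as one einsum call; b is kept, x, m, c, i are summed over.
P_einsum = np.einsum('xmci,bx,bm,bc,bi->b', indicator,
                     p_rand[:, 0], p_rand[:, 1], p_rand[:, 2], p_rand[:, 3])
print(P_loop, P_einsum)
```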


In [0]:
np.einsum('xmci,bx,bm,bc,bi->b',Ixmci,p[:,0],p[:,1],p[:,2],p[:,3])

array([0.99845773], dtype=float32)

In [0]:
np.einsum('bx,bm,bc,bi,xmci->b',p[:,0],p[:,1],p[:,2],p[:,3],Ixmci)

array([0.99845773], dtype=float32)

In [0]:
tf.einsum('xmci,bx,bm,bc,bi->b',tf.constant(Ixmci),tf.constant(p[:,0]),tf.constant(p[:,1]),tf.constant(p[:,2]),tf.constant(p[:,3]))

<tf.Tensor: id=242360, shape=(1,), dtype=float32, numpy=array([0.99845785], dtype=float32)>

###### Alternatively 
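We first build the joint probability tensor, which is of order 4 per batch entry (order 5 including the batch index), and then contract it with the indicator.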

In [0]:
p_ = tf.constant(p)
pb1234_ = tf.einsum('bx,bm,bc,bi->bxmci',
                            p_[:, 0],
                            p_[:, 1],
                            p_[:, 2],
                            p_[:, 3]
                    )
print("pb1234_", tf.shape(pb1234_)) # (B=1, 10, 10, 10, 10)

Ixmci_ = tf.constant(Ixmci)
print(tf.shape(pb1234_))
tf.einsum('xmci,bxmci->b',Ixmci_,pb1234_)

pb1234_ tf.Tensor([ 1 10 10 10 10], shape=(5,), dtype=int32)
tf.Tensor([ 1 10 10 10 10], shape=(5,), dtype=int32)


<tf.Tensor: id=242430, shape=(1,), dtype=float32, numpy=array([0.9984578], dtype=float32)>
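
A quick sanity check on the two-step variant, sketched in NumPy with a hypothetical random batch `p_rand`: the joint tensor should sum to 1 for every batch entry, since it is a probability distribution over all 10^4 digit combinations. Note that it holds 10^4 values per batch entry, so the one-step einsum above is likely the more memory-friendly choice for large batches.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical random batch: 2 entries, 4 places, 10 digit probabilities each.
p_rand = rng.random((2, 4, 10))
p_rand /= p_rand.sum(axis=-1, keepdims=True)

# Outer product over the four places, keeping the batch index b.
joint = np.einsum('bx,bm,bc,bi->bxmci',
                  p_rand[:, 0], p_rand[:, 1], p_rand[:, 2], p_rand[:, 3])

# Sum over all digit combinations for each batch entry.
total = joint.sum(axis=(1, 2, 3, 4))
print(joint.shape, total)
```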

### A replacement for the function as provided in the code

In [0]:
Ixmci = numpy.zeros([10, 10, 10, 10], dtype=np.float32)
for X in xrange(10):
  for M in xrange(10):
    for C in xrange(10):
      for I in xrange(10):
        if (X + M + C + I) % 9 == 0:
          Ixmci[X,M,C,I] = 1.0

Ixmci_ = tf.constant(Ixmci)

def get_prob_for_being_multiple_of_9(logit_by_digit_and_place):
  p = tf.nn.softmax(logit_by_digit_and_place)
  return tf.einsum('xmci,bx,bm,bc,bi->b',Ixmci_,p[:,0],p[:,1],p[:,2],p[:,3])

In [0]:
# would read as '1008'.
t_ddd_ex = tf.constant(numpy.array([
    [ -1.0, 9.0, -1.0, -1.0, -1.0,  -1.0, -1.0, -1.0, -1.0, -1.0],
    [ 9.0, -1.0, -1.0, -1.0, -1.0,  -1.0, -1.0, -1.0, -1.0, -1.0],
    [ 9.0, -1.0, -1.0, -1.0, -1.0,  -1.0, -1.0, -1.0, -1.0, -1.0],
    [-1.0,  -1.0, -1.0, -1.0, -1.0,  -1.0, -1.0, -1.0, 9.0, -1.0]],
    dtype=numpy.float32))

print('Probability',
      get_prob_for_being_multiple_of_9(tf.reshape(t_ddd_ex, (1, 4, 10))))

Probability tf.Tensor([0.99845785], shape=(1,), dtype=float32)
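
For readers without TensorFlow at hand, the following is a NumPy sketch of the same function applied to a batch of two inputs, '1008' and '1009'. The helpers `logits_for` and `multiple_of_9_prob_np` are hypothetical names introduced here and are not part of the tutorial code.

```python
import numpy as np

def logits_for(digits, high=9.0, low=-1.0):
    # Logits of shape (4, 10): `high` for the intended digit, `low` elsewhere.
    out = np.full((len(digits), 10), low)
    out[np.arange(len(digits)), digits] = high
    return out

def multiple_of_9_prob_np(logits):
    # Softmax over the digit axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    p = np.exp(shifted)
    p /= p.sum(axis=-1, keepdims=True)
    # Indicator: 1.0 where the digit sum is divisible by 9.
    gx, gm, gc, gi = np.indices((10, 10, 10, 10))
    ind = ((gx + gm + gc + gi) % 9 == 0).astype(p.dtype)
    return np.einsum('xmci,bx,bm,bc,bi->b',
                     ind, p[:, 0], p[:, 1], p[:, 2], p[:, 3])

# Batch of two: '1008' (digit sum 9) and '1009' (digit sum 10).
batch = np.stack([logits_for([1, 0, 0, 8]), logits_for([1, 0, 0, 9])])
probs = multiple_of_9_prob_np(batch)
print(probs)
```

The first entry should be close to the TensorFlow result above, while the second should be close to zero because 1009 is not a multiple of 9.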
