# String matching on random text.

NAIVE-STRING-MATCHER(T,P) 1. $n\leftarrow length[T]$ 2. $n\leftarrow length[P]$ 3. for $s\leftarrow 0$ to $n-m$ 4.    if $P[1\ldots m] = T[s+1\ldots s+m]$ 5.      then print "pattern occurs at shift $s$" 
Ex. from CLRS. The Naïve String matching algorithm is optimal for uniform random text $T[]$ of length $n$ and uniform random pattern $P[]$ of length $m$.

Proof. Assume an alphabet of size $d\geq 2$.
Let $p(k)$ be the probability of doing $k\leq m$ comparisons at line 4.
If $k then $p(k) = \frac{1}{d^{k-1}}(1-\frac{1}{d})=\frac{d-1}{d^k}$.
Otherwise $k=m$ and $p(k) = \frac{1}{d^{m-1}}$.

Then the expected number of comparisons is,
$E_p = \sum_{k=0}^{m} kp(k) = \sum_{k=0}^{m-1}k\frac{d-1}{d^k} + \frac{m}{d^{m-1}} =$
$= (d-1)\sum_{k=0}^{m} \frac{k}{d^k} + \frac{m}{d^{m}}$.

Recall that,
$\sum_{k=0}^m kx^k =$
$= x \frac{d}{dx} (\sum_{k=0}^{m} x^k) = x(-(m+1) x^m (1-x) +1-x^{m+1})/(1-x)^2 =$
$= x(mx^{m+1} -mx^{m}-x^m+1)/(1-x)^2 =$
$= \frac{x}{(1-x)^2}x^m(m(x-1) -1+\frac{1}{x^m})$.

Then,
$E_p = (\frac{1-x}{x})\frac{x}{(1-x)^2}x^m(m(x-1) -1 + \frac{1}{x^m}) + mx^m = x^m(\frac{1}{x^m(1-x)}-\frac{1}{1-x}) =$
$= \frac{1}{1-x}(1-x^m) = \frac{1-x^m}{1-x} =$
$= \frac{1-\frac{1}{d^m}}{1-\frac{1}{d}}\leq \frac{1}{1/2} = 2$

Thus, the expected number of total comparisons is optimal for uniform random input,
$E^{\text{total}}_p = E_p(n-m+1)\leq 2(n-m+1) = O(m+n)$
Q.E.D.