Not-Square words are Context-Free and Not-Regular.

Ex. from Shallit (nice!).
Let \mathbb{CF} be the class of context-free languages and \mathbb{REG} be the class of regular languages. Let \overline{\textbf{SQ}}:=\{t \in \{0,1\}^* : \forall w\in\{0,1\}^* \text{ s.t. } t \neq ww \} be the language of not-square words. Then \overline{\textbf{SQ}} \in \mathbb{CF}\setminus \mathbb{REG}.
To prove that \overline{\textbf{SQ}} is not regular, it is sufficient to prove that \textbf{SQ}:=\{t \in \{0,1\}^* : \exists w\in\{0,1\}^* \text{ s.t. } t = ww\} is not regular, since the regular languages are closed under complement. Suppose \textbf{SQ} is regular, then by the pumping lemma for regular languages there exists n\geq 1 such that for every word t\in\textbf{SQ} with |t|\geq n there exist x,y,z\in\{0,1\}^* satisfying t=xyz, y\neq\epsilon, |xy|\leq n s.t. xy^i z\in\textbf{SQ} for every i\geq 0. Now we consider the square t:=(0^n1^n)^2. Then for any pumping decomposition t=xyz s.t. |xy|\leq n, it must be y=0^{|y|}. Then xy^0 z = xz = 0^m 1^n 0^n 1^n for some m<n, hence xz is clearly not a square. This contradicts the pumping lemma so \textbf{SQ} is not regular. Hence \overline{\textbf{SQ}} is not regular.

In order to prove that \overline{\mathbf{SQ}} is context-free consider the following grammar:
S\rightarrow AB \,|\, BA \,|\, A \,|\, B
A \rightarrow 0A0 \,|\, 0A1 \,|\, 1A0 \,|\, 1A1 \,|\, 0
B \rightarrow 0B0 \,|\, 0B1 \,|\, 1B0 \,|\, 1B1 \,|\, 1

We claim \overline{\mathbf{SQ}} \subseteq L(G). Let t\in\overline{\mathbf{SQ}} be not-square.
For b\in\{0,1\} define X_b:=\begin{cases} A , & \mbox{ if } b=0 \\ B, & \mbox{ if } b=1 \end{cases}.
Now we pose a condition on the parity of |t|.
If |t|=2n+1 for some n\geq 0, then t=xby for some x,y\in\{0,1\}^*\wedge  b\in\{0,1\} s.t. |x|=|y|. Consider the following production

\displaystyle S \Rightarrow X_b \Rightarrow x[1]X_by[n] \Rightarrow x[1]x[2]X_b y[n-1]y[n] \Rightarrow \cdots \Rightarrow xX_b y = t.
Then t \in L(G).

Otherwise |t|=2n for some n\geq 1, then t=xy for some x,y\in\{0,1\}^* s.t. |x|=|y|. Since t is not a square, there exists the least 1\leq i\leq n s.t. x[i]\neq y[i]. Now we write t=ab for

a:=t[1\cdots i-1]\cdot (x[i]=t[i])\cdot t[i+1\cdots 2i-1] and
b:=t[2i\cdots n+i-1]\cdot (y[i] = t[n+i])\cdot t[n+i+1\cdots 2n].

Then S\Rightarrow X_{x[i]} \Rightarrow \cdots \Rightarrow a and S\Rightarrow X_{y[i]} \Rightarrow \cdots \Rightarrow b. Hence S\Rightarrow X_{x[i]}X_{y[i]} \Rightarrow \cdots \Rightarrow t, so t \in L(G) and \overline{\mathbf{SQ}} \in L(G) follows.

Now we claim and prove that L(G) \subseteq \overline{\mathbf{SQ}}. Let t\in L(G). Without loss of generality we assume the productions to be of the form S\Rightarrow AB | BA \Rightarrow \cdots \Rightarrow t, since otherwise |t| is odd and therefore t is not a square. Let w.l.o.g. to be S\Rightarrow AB \Rightarrow \cdots \Rightarrow t. Let i be the height of the parse sub-tree of t starting from A and let x be the word parsed by the same tree, similarly let j be the height of the parse sub-tree of t starting from B and y be the word produced. It must be that x[i] = 0 \wedge y[j] = 1 \wedge t=xy.
Now let n:=\frac{|t|}{2}.
We claim that y[j] = t[n+i] since 2i-1+2j-1=2n\iff i+j-1=n imply n+i = 2i-1 + j.
Then t[i] = x[i] \neq y[j] = t[n+i].
This proves that t is not a square. Hence t\in\overline{\mathbf{SQ}}. This proves L(G)\subseteq \overline{\mathbf{SQ}}.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s