# Not-Square words are Context-Free and Not-Regular.

Ex. from Shallit (nice!).
Let $\mathbb{CF}$ be the class of context-free languages and $\mathbb{REG}$ be the class of regular languages. Let $\overline{\textbf{SQ}}:=\{t \in \{0,1\}^* : \forall w\in\{0,1\}^* \text{ s.t. } t \neq ww \}$ be the language of not-square words. Then $\overline{\textbf{SQ}} \in \mathbb{CF}\setminus \mathbb{REG}$.
Proof.
To prove that $\overline{\textbf{SQ}}$ is not regular, it is sufficient to prove that $\textbf{SQ}:=\{t \in \{0,1\}^* : \exists w\in\{0,1\}^* \text{ s.t. } t = ww\}$ is not regular, since the regular languages are closed under complement. Suppose $\textbf{SQ}$ is regular, then by the pumping lemma for regular languages there exists $n\geq 1$ such that for every word $t\in\textbf{SQ}$ with $|t|\geq n$ there exist $x,y,z\in\{0,1\}^*$ satisfying $t=xyz, y\neq\epsilon, |xy|\leq n$ s.t. $xy^i z\in\textbf{SQ}$ for every $i\geq 0$. Now we consider the square $t:=(0^n1^n)^2$. Then for any pumping decomposition $t=xyz$ s.t. $|xy|\leq n$, it must be $y=0^{|y|}$. Then $xy^0 z = xz = 0^m 1^n 0^n 1^n$ for some $m, hence $xz$ is clearly not a square. This contradicts the pumping lemma so $\textbf{SQ}$ is not regular. Hence $\overline{\textbf{SQ}}$ is not regular.

In order to prove that $\overline{\mathbf{SQ}}$ is context-free consider the following grammar:
$G:$
$S\rightarrow AB \,|\, BA \,|\, A \,|\, B$
$A \rightarrow 0A0 \,|\, 0A1 \,|\, 1A0 \,|\, 1A1 \,|\, 0$
$B \rightarrow 0B0 \,|\, 0B1 \,|\, 1B0 \,|\, 1B1 \,|\, 1$

We claim $\overline{\mathbf{SQ}} \subseteq L(G)$. Let $t\in\overline{\mathbf{SQ}}$ be not-square.
For $b\in\{0,1\}$ define $X_b:=\begin{cases} A , & \mbox{ if } b=0 \\ B, & \mbox{ if } b=1 \end{cases}$.
Now we pose a condition on the parity of $|t|$.
If $|t|=2n+1$ for some $n\geq 0$, then $t=xby$ for some $x,y\in\{0,1\}^*\wedge b\in\{0,1\}$ s.t. $|x|=|y|$. Consider the following production

$\displaystyle S \Rightarrow X_b \Rightarrow x[1]X_by[n] \Rightarrow x[1]x[2]X_b y[n-1]y[n] \Rightarrow \cdots \Rightarrow xX_b y = t$.
Then $t \in L(G)$.

Otherwise $|t|=2n$ for some $n\geq 1$, then $t=xy$ for some $x,y\in\{0,1\}^*$ s.t. $|x|=|y|$. Since $t$ is not a square, there exists the least $1\leq i\leq n$ s.t. $x[i]\neq y[i]$. Now we write $t=ab$ for

$a:=t[1\cdots i-1]\cdot (x[i]=t[i])\cdot t[i+1\cdots 2i-1]$ and
$b:=t[2i\cdots n+i-1]\cdot (y[i] = t[n+i])\cdot t[n+i+1\cdots 2n]$.

Then $S\Rightarrow X_{x[i]} \Rightarrow \cdots \Rightarrow a$ and $S\Rightarrow X_{y[i]} \Rightarrow \cdots \Rightarrow b$. Hence $S\Rightarrow X_{x[i]}X_{y[i]} \Rightarrow \cdots \Rightarrow t$, so $t \in L(G)$ and $\overline{\mathbf{SQ}} \in L(G)$ follows.

Now we claim and prove that $L(G) \subseteq \overline{\mathbf{SQ}}$. Let $t\in L(G)$. Without loss of generality we assume the productions to be of the form $S\Rightarrow AB | BA \Rightarrow \cdots \Rightarrow t$, since otherwise $|t|$ is odd and therefore $t$ is not a square. Let w.l.o.g. to be $S\Rightarrow AB \Rightarrow \cdots \Rightarrow t$. Let $i$ be the height of the parse sub-tree of $t$ starting from $A$ and let $x$ be the word parsed by the same tree, similarly let $j$ be the height of the parse sub-tree of $t$ starting from $B$ and $y$ be the word produced. It must be that $x[i] = 0 \wedge y[j] = 1 \wedge t=xy$.
Now let $n:=\frac{|t|}{2}$.
We claim that $y[j] = t[n+i]$ since $2i-1+2j-1=2n\iff i+j-1=n$ imply $n+i = 2i-1 + j$.
Then $t[i] = x[i] \neq y[j] = t[n+i]$.
This proves that $t$ is not a square. Hence $t\in\overline{\mathbf{SQ}}$. This proves $L(G)\subseteq \overline{\mathbf{SQ}}$.
Q.E.D.