We propose a new criterion for randomness of a word
$x_{1}x_{2}\cdots x_{n}\in \mathbb{A}^{n}$
over a finite alphabet
$\mathbb{A}$
defined by
$$\begin{eqnarray}\unicode[STIX]{x1D6EF}^{n}(x_{1}x_{2}\cdots x_{n})=\mathop{\sum }_{\unicode[STIX]{x1D709}\in \mathbb{A}^{+}}\unicode[STIX]{x1D713}(|x_{1}x_{2}\cdots x_{n}|_{\unicode[STIX]{x1D709}}),\end{eqnarray}$$
where
$\mathbb{A}^{+}=\bigcup _{k=1}^{\infty }\mathbb{A}^{k}$
is the set of non-empty finite words over
$\mathbb{A}$
, for
$\unicode[STIX]{x1D709}\in \mathbb{A}^{k}$
,
$$\begin{eqnarray}|x_{1}x_{2}\cdots x_{n}|_{\unicode[STIX]{x1D709}}=\#\{i;~1\leq i\leq n-k+1,~x_{i}x_{i+1}\cdots x_{i+k-1}=\unicode[STIX]{x1D709}\},\end{eqnarray}$$
and for
$t\geq 0$
,
$\unicode[STIX]{x1D713}(0)=0$
and
$\unicode[STIX]{x1D713}(t)=t\log t~(t>0)$
. This value represents how random the word
$x_{1}x_{2}\cdots x_{n}$
is from the viewpoint of the block frequency. In fact, we define a randomness criterion as
$$\begin{eqnarray}Q(x_{1}x_{2}\cdots x_{n})=(1/2)(n\log n)^{2}/\unicode[STIX]{x1D6EF}^{n}(x_{1}x_{2}\cdots x_{n}).\end{eqnarray}$$
Then,
$$\begin{eqnarray}\lim _{n\rightarrow \infty }(1/n)Q(X_{1}X_{2}\cdots X_{n})=h(X)\end{eqnarray}$$
holds with probability 1 if
$X_{1}X_{2}\cdots \,$
is an ergodic, stationary process over
$\mathbb{A}$
either with a finite energy or
$h(X)=0$
, where
$h(X)$
is the entropy of the process. Another criterion for randomness using
$t^{2}$
instead of
$t\log t$
has already been proposed in Kamae and Xue [An easy criterion for randomness.
Sankhya A
77(1) (2015), 126–152]. In comparison, our new criterion provides a better fit with the entropy. We also claim that our criterion not only represents the entropy asymptotically but also gives a good representation of the randomness of fixed finite words.