Finitely generated subgroups of free groups as formal languages and their cogrowth

For finitely generated subgroups $H$ of a free group $F_m$ of finite rank $m$, we study the language $L_H$ of reduced words that represent elements of $H$, which is a regular language. Using the (extended) core of the Schreier graph of $H$, we construct the minimal deterministic finite automaton that recognizes $L_H$. Then we characterize the f.g. subgroups $H$ for which $L_H$ is irreducible and, for such groups, explicitly construct an ergodic automaton that recognizes $L_H$. This construction gives us an efficient way to compute the cogrowth series $L_H(z)$ of $H$ and the entropy of $L_H$. Several examples illustrate the method, and a comparison is made with the method of calculating $L_H(z)$ based on a Nielsen system of generators of $H$.


Introduction
In [9,10,11], the notion of cogrowth of a subgroup H of a free group F m was introduced and a cogrowth criterion for amenability of the factor group F m /H (in case H is normal) was proved. The concept of cogrowth was used to construct counterexamples concerning various versions of the von Neumann conjecture about the existence of an invariant mean on groups and homogeneous spaces [1,10,20].
In [9,10,11], it was observed that when H ≤ F m is a finitely generated (f.g.) subgroup, the cogrowth series
$$H(z) = \sum_{n=0}^{\infty} |H_n| z^n = \sum_{w \in H} z^{|w|}$$
is a rational function, where |w| denotes the length of w ∈ H and H n denotes the set of the reduced elements of length n in H with respect to a fixed basis of F m .
Also, by [9], for H ≤ F m , the following formula holds: where R(x, y) is a rational function explicitly determined in [9], and where P (n) 1,1 is the probability of return to the identity 1 ∈ G = F m /H in the simple random walk on G that starts at the identity. This formula shows a close relation between analytic properties of functions H(z) and u(z). In particular, the algebraicity of u(z) is equivalent to the algebraicity of H(z). If H is finitely generated, then the fact that H(z) is rational was proven in [11] using Nielsen system of generators for H. The topics related to growth and cogrowth have gotten a lot of attention and popularity, and are widely presented in the literature. See, for example, [8].
Among various open questions related to cogrowth, the authors suggest the conjecture that H(z) is rational if and only if H is finitely generated. Note that when H ⊴ F m is a normal subgroup, the conjecture follows from the result of D. Kouksov [16].
An alternative approach to proving the rationality of H(z) is via the theory of formal languages. Recall that the classical Chomsky hierarchy of languages begins with the class of regular (also called rational) languages, that is, languages recognizable by finite-automata acceptors [13]. Already in the 1950s and 1960s, Chomsky and Schützenberger were aware that the rationality of a language L ⊂ Σ * (where Σ is a finite alphabet and Σ * denotes the set of finite words over Σ) implies rationality of the growth series
$$L(z) = \sum_{n=0}^{\infty} |B_n(L)| z^n,$$
where B n (L) is the set of words in L of length n. In fact, back then it was more popular to consider the following noncommutative version of the growth series,
$$\sum_{w \in L} w,$$
the rationality of which is equivalent to the rationality of L [3,21]. A concept closely related to cogrowth is the entropy of languages, which is defined as
$$h(L) = \limsup_{n \to \infty} \frac{\log |B_n(L)|}{n}.$$
For a fixed free basis A = {a 1 , · · · , a m } of F m , we denote by L H the set of all reduced finite words in (A ∪ A −1 ) * that represent an element of H ≤ F m . The fact that regularity of the language L H is equivalent to the finite generation of H was observed by Anissimov and Seifert [2]. Since the intersection of two regular languages is regular (see [13]), a direct consequence of the theorem of Anissimov and Seifert is that the intersection of two finitely generated subgroups of F m is again finitely generated. The last observation is in fact a well-known theorem of Howson from 1954 [14]. A proof of the theorem of Anissimov and Seifert, based on the ideas of geometric group theory, is presented in [7]. In fact, there are several ways to prove that if H ≤ F m is finitely generated, then L H is a regular language.

Observe that the number of equivalence classes of R L is at most the number of states of A , which is finite. Now we recall a version of the Myhill-Nerode Theorem.
Theorem 2.1. (Theorem 3.9 and Theorem 3.10 of [13]) Let L ⊆ Σ * be a regular language. Then the relation R L defines a DFA A = (Q , Σ, δ , {q 0 }, F ) for L whose states correspond to the equivalence classes of R L . Moreover, this is the unique minimal DFA for L (up to isomorphism), where Q = {[w] | w ∈ Σ * } and q 0 is the equivalence class of the empty word.
, then we will ignore the signs and by deg(v) we denote the degree of the vertex v. For further details on the theory of finite automata, we refer the reader to [13] and [21].
3. The Schreier graph and the core associated with a subgroup H of F m
This section is devoted to the Schreier graph and the core of an f.g. subgroup H of F m . We also discuss the procedure for obtaining the core of H using Stallings foldings, and some important properties of the core.
3.1. The Schreier graph of subgroup H of F m . We define two versions of the Schreier graph associated with H ≤ F m , which we denote by Γ and Γ, respectively. The set of vertices of Γ and Γ is the same and is the set V = {H g | g ∈ F m } of right cosets. The set of edges E of Γ is the set E = {(H g , H ga ) | g ∈ F m , a ∈ A} consisting of pairs e = (H g , H ga ) of cosets. The edges are oriented and H g is the origin o(e) of e while H ga is the terminus t(e) of e. Moreover, such an edge has the label µ(e) = a. Each vertex in Γ has m outgoing edges whose labels constitute the set A. The graph Γ is obtained from Γ by adding edges from the set E = {e | e ∈ E} where e = (H ga , H g ) if e = (H g , H ga ) and the label if e ∈ E and µ(ē) = µ(e) −1 ifē ∈ E. Each vertex of Γ has 2m outgoing edges and 2m incoming edges, whose labels constitute the set Σ = A ∪ A −1 . We call Γ the Schreier graph and Γ the extended Schreier graph of H. The vertex v 0 = H 1 = H is the distinguished vertex, so in fact Γ and Γ are rooted graphs with root v 0 . Observe that in fact, according to the standard terminology in graph theory, Γ and Γ are multigraphs as they may have loops and multiple edges. We will use the obvious notion of path p in directed graph Γ or Γ and its label µ(p) ∈ A * or µ(p) ∈ Σ * , respectively. A branch of a k-regular tree is a subtree which has one degree 1 vertex, which we call the root of the branch and all the other vertices have degree k. Such a branch is uniquely determined by its stem, which is the oriented edge going from the root to the interior of the branch, see Figure ( 1). A subgraph of the Schreier graph Γ isomorphic to a branch in the Cayley graph X m of F m (with its labeling) is called a hanging branch. The Cayley graph X m = Cay(F m , Σ) is a homogeneous tree of degree |Σ|. The core of the Schreier graph Γ can be obtained also by removing the hanging branches. 
Moreover, if the core ∆ H is known, then the graph Γ can be obtained from ∆ H by filling the deficient valencies of the vertices of ∆ H with maximal hanging branches (so that every vertex of the resulting graph has degree 2m). Thus, since the Schreier graph Γ is connected, its core ∆ H is also connected. We refer the reader to [12] for the descriptions of the Hopf decomposition of the boundary in terms of Γ, ∆ H and the collection of hanging branches.
Let where for each e ∈ E connecting vertex v to va j , we have where p is an admissible path of A ∆ H . Notice that the admissible paths p in A ∆ H may or may not be reduced. Hence not all words in the language L(A ∆ H ) are reduced. We denote by L H the language of reduced elements of an f.g. subgroup H of F m . Theorem 5.1 from [15] can be read as

Figure 2. Construction of the core ∆ H of H = ⟨aba −1 , aca −1 ⟩. In (2b), we start with a bouquet of two loops attached to a base vertex v 0 . We split each loop into 3 oriented edges with positive labels a, b, a and a, c, a, respectively. Observe that the first two edges of each loop have clockwise orientation, whereas the third edge in both loops has the reverse orientation. In Figures (2c) and (2d), we use Stallings foldings (see 2a) to obtain the required ∆ H (see 2e).

Stallings foldings.
Let H be generated by elements w 1 , · · · , w k . We identify w i with freely reduced words in the alphabet Σ. Then there is a simple combinatorial procedure to obtain the core ∆ H . This procedure is based on the topological idea of folding developed by J. Stallings in [22]. Roughly the procedure can be described as follows.
Start with a bouquet of k circles glued together along a vertex v 0 . Split the i-th circle, 1 ≤ i ≤ k, into |w i | edges which are oriented and labeled by the letters from the set A ∪ A −1 so that the label of the i-th circle (as read from v 0 to v 0 ) is precisely the word w i . Reverse the edges with (negative) label x −1 from the set A −1 and assign them the (positive) label x from the set A (see Figure 2). Suppose e 1 , e 2 are edges of this graph with a common origin and the same label x ∈ A. Then, informally, folding the graph at e 1 , e 2 means identifying e 1 and e 2 into a single new edge labeled by x.
At the first step, fold the graph at the edges that originate at the root vertex v 0 . After performing these foldings, fold the graph at the edges that originate at the other vertices, and continue the process. As we assumed that the subgroup H is finitely generated, the process stops after finitely many steps (i.e. when no more folding is possible). The resulting graph, up to isomorphism of labeled graphs, does not depend on the sequence of foldings performed, and it is isomorphic to the core ∆ H (see [15,22]). Moreover, the algorithm based on the above procedure has polynomial time complexity (see [15]).
In fact, the above procedure also works for an infinitely generated group H = ⟨w 1 , · · · , w n , · · · ⟩. One just has to begin by folding the bouquet of the two loops labeled by w 1 and w 2 ; after the process stops, add to the obtained graph a new loop labeled by w 3 , apply folding, then add w 4 , etc. The process converges to the infinite folded (i.e. no more folding is possible) graph ∆ H with root vertex v 0 .
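The folding procedure for a finite generating set can be sketched in a few lines of Python. This is only an illustrative sketch, not the algorithm of [15] or [22]: the function name `stallings_fold`, the union-find bookkeeping, and the convention that a lowercase letter denotes a generator and the matching uppercase letter its inverse are all assumptions made here. The sketch identifies two edges with the same label whenever they share an origin or share a terminus; the second case corresponds to folding the inverse edges of the extended graph. The generators are assumed to be nonempty freely reduced words.

```python
def stallings_fold(words):
    """Fold the bouquet of loops labelled by `words`; return (vertices, edges).

    Convention (an assumption of this sketch): a lowercase letter x is a
    generator and the uppercase letter X is its inverse. Vertex 0 is the
    root v0; every edge (u, x, v) carries a positive label x.
    """
    parent = [0]  # union-find forest over vertex ids

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Build the bouquet: one loop per generator, negative edges reversed.
    edges = []
    for w in words:
        cur = 0
        for k, c in enumerate(w):
            if k == len(w) - 1:
                nxt = 0  # the last edge closes the loop at the root
            else:
                nxt = len(parent)
                parent.append(nxt)
            edges.append((cur, c, nxt) if c.islower() else (nxt, c.lower(), cur))
            cur = nxt

    # Repeat passes until no pair of edges can be folded any more.
    changed = True
    while changed:
        changed = False
        out_edge, in_edge = {}, {}
        for u, x, v in edges:
            u, v = find(u), find(v)
            for table, key, val in ((out_edge, (u, x), v), (in_edge, (v, x), u)):
                other = find(table.setdefault(key, val))
                val = find(val)
                if other != val:
                    # Merge towards the smaller id so the root stays vertex 0.
                    parent[max(other, val)] = min(other, val)
                    changed = True

    folded = sorted({(find(u), x, find(v)) for u, x, v in edges})
    vertices = sorted({find(v) for v in range(len(parent))})
    return vertices, folded


vertices, edges = stallings_fold(["abA", "acA"])
print(vertices, edges)
```

For the generators aba −1 and aca −1 of Figure 2, the sketch returns two vertices, one edge labeled a leaving the root, and two loops labeled b and c at the other vertex.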
The following lemma lists some of the well-known properties of the graph ∆ H of H ≤ F m .  Proof. For the proofs of above properties (1)-(4) we refer the reader to Propositions 3.8 and 8.3, and Theorem 8.14 of [15]. Property (5) follows immediately from the definition of the extended core graph ∆ H . Property (6) follows from Lemma 3.9 of [15].

The Nielsen system of generators
Let H be a subgroup of F m generated by the set S = {w i } k i=1 , 1 ≤ k ≤ ∞, of freely reduced words over Σ. We further assume that S is a Nielsen basis. Recall that a set S of freely reduced words from Σ * has the Nielsen property if the following two conditions hold:
(1) $|uv|_\Sigma \geq \max\{|u|_\Sigma, |v|_\Sigma\}$ for all $u, v \in S \cup S^{-1}$ with $uv \neq 1$;
(2) $|uvw|_\Sigma > |u|_\Sigma - |v|_\Sigma + |w|_\Sigma$ for all $u, v, w \in S \cup S^{-1}$ with $uv \neq 1$ and $vw \neq 1$;
where by |w| Σ we mean the length of the reduced word w over Σ. Condition (1) means that not more than half of u and not more than half of v freely cancels in the product u · v. Condition (2) means that, assuming (1), after free cancellation in the product u · v · w at least one letter of v remains uncancelled. See [19]. From (1) and (2) it follows that S is a free basis of the subgroup ⟨S⟩ of F m .
Nielsen was the first to prove that every non-trivial subgroup of F m has a set of generators with properties (1) and (2). His argument was quite involved. A simpler proof is given in [19]. In Theorem 3.4 of [19], it is shown that any minimal Schreier system of generators of a subgroup H of F m has the Nielsen property. See also Proposition 6.7 in [15] or [18].
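For a finite set S, the two length conditions can be tested by brute force. The sketch below assumes the usual formulation of the Nielsen property, namely |uv| ≥ max(|u|, |v|) whenever uv ≠ 1, and |uvw| > |u| − |v| + |w| whenever uv ≠ 1 and vw ≠ 1, for u, v, w ∈ S ∪ S −1 . The helper names `reduce_word`, `inverse` and `is_nielsen_reduced`, and the convention that an uppercase letter is the inverse of the matching lowercase generator, are hypothetical choices made for this illustration.

```python
def reduce_word(w):
    """Freely reduce a word; uppercase X is the inverse of lowercase x."""
    out = []
    for c in w:
        if out and out[-1] == c.swapcase():
            out.pop()  # cancel the adjacent pair x X or X x
        else:
            out.append(c)
    return "".join(out)


def inverse(w):
    """Inverse of a word: reverse it and invert every letter."""
    return w[::-1].swapcase()


def is_nielsen_reduced(S):
    """Check the two length conditions over S together with its inverses."""
    base = [reduce_word(w) for w in S]
    if any(w == "" for w in base):
        return False
    sym = base + [inverse(w) for w in base]

    def length(*factors):
        return len(reduce_word("".join(factors)))

    for u in sym:
        for v in sym:
            if v == inverse(u):
                continue  # uv = 1 is excluded
            if length(u, v) < max(len(u), len(v)):
                return False  # condition (1) fails
            for w in sym:
                if w == inverse(v):
                    continue  # vw = 1 is excluded
                if length(u, v, w) <= len(u) - len(v) + len(w):
                    return False  # condition (2) fails
    return True


print(is_nielsen_reduced(["a", "b"]))   # the standard basis of F_2
print(is_nielsen_reduced(["a", "ab"]))  # a^{-1} * ab cancels too much
```

The second call fails condition (1), because the product a −1 · ab = b is shorter than the factor ab.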
Let us recall briefly how to get a Nielsen system geometrically. Let H be a subgroup of F m and Γ be the corresponding Schreier graph. Recall that labels of edges of Γ belong to the set A = {a 1 , · · · , a m }. Let T be a spanning tree in Γ. The set of vertices of T is the same as the set of vertices V of Γ. The tree T is obtained from Γ by deletion of some edges. Let E be the set of deleted edges. With each edge e ∈ E we associate an element w e of H which is the word µ(p), where p is the unique path in Γ as described in Figure (  It is obvious that w e belongs to H. It is less obvious, but a result of Schreier, that the set S = {w e } e∈E is a free basis of H. Such a basis is called a Schreier basis. The choice of a spanning tree is not canonical, and usually there are plenty of such choices (hence plenty of choices for a Schreier system of generators). Some of the choices of T are better than others.

A. Darbinyan, R. Grigorchuk, and A. Shaikh
Vol. 13:2

A spanning tree T is geodesic (or minimal) with respect to the root v 0 if for any vertex v ∈ V the combinatorial distance from v to v 0 in Γ is the same as the distance from v to v 0 in T . Here we assume that we convert both graphs Γ and T into non-oriented graphs (by forgetting the direction of each edge). In this case the combinatorial distance (i.e. the number of edges in a shortest path connecting two vertices) is a metric. It is known that a spanning tree in a connected locally finite graph always exists, and there is an effective procedure to find such a tree if the graph itself is defined in an effective way. It is well-known that for any spanning tree the corresponding Schreier system S = {w e } e∈E satisfies the Nielsen properties (1) and (2) In this case, observe that the subtree T Γ \T ∆ H is disconnected and the corresponding connected components are the hanging branches of Γ. Therefore, similarly to the finite index case, if [F m : H] = ∞, then in order to find the Nielsen generating set it is sufficient to consider the

The construction of D H and D H , their properties and consequences
The first goal of this section is to construct the minimal DFA D H that recognizes the language L H of reduced words of a finitely generated subgroup H ≤ F m . We shall approach this construction by defining an unambiguous automaton A H as the (Cartesian) product of two automata. Then we obtain the minimal DFA D H as the essential part of A H . At the end we obtain the multi-initial state automaton D H by replacing the initial state of D H and provide the complete description when D H is ergodic. In Theorem 5.18, we utilize the ergodicity property of D H to obtain an entropy formula for L H . Recall that a finite automaton A is essential if in its Moore diagram G A every vertex (hence, also every edge) belongs to some path connecting an initial state to a final state, i.e. to an admissible path. The following lemma will be used later.
Lemma 5.1. Every finite automaton A has an essential part. If A is an unambiguous finite automaton, then it has only one essential part, called the essential part of A .
Proof. Define A to be the subautomaton of A such that G A is the union of all admissible paths of G A . Then, clearly, L(A ) = L(A ) and A is essential, hence A is an essential part of A .
If A is unambiguous, then no proper subautomaton of A generates the language L(A ). Also, if a vertex in G A does not belong to some admissible path in G A , then, by definition, it will not belong to any essential subautomaton of A . Thus, if A is unambiguous, then A is the only essential part of A .

5.1. The automaton A H . One of the important closure properties of regular languages is that the intersection of two regular languages is regular. See Theorem 3.3 on page 59 of [13]. A DFA recognizing the intersection of two regular languages can be constructed as follows. Let L 1 , L 2 ⊆ Σ * be regular languages. Also, let To construct a DFA A H that recognizes the language L H of reduced words of H, we shall consider first the automaton We take a second automaton where the set of states and of final states are both equal to See Figure 4 for the Moore diagram of A F 2 . We define A H = A ∆ H × A Fm . Namely,

Remark 5.2. A H inherits from A Fm and A ∆ H the property of being deterministic and having only one initial state. In particular, A H is an unambiguous automaton.
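The product construction can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the construction used in the text: the core automaton is taken to be the folded graph read with labels in Σ, rooted at v 0 and with v 0 as the only final state; the automaton for reduced words is assumed to remember only the last letter read and to have all states final; only the accessible part of the product is built. The names `core_automaton`, `free_group_automaton`, `product_dfa` and `accepts`, and the uppercase-for-inverse convention, are hypothetical.

```python
from collections import deque


def core_automaton(edges, root):
    """Partial DFA of the core: positive edges plus their formal inverses."""
    delta = {}
    for u, x, v in edges:
        delta[(u, x)] = v
        delta[(v, x.upper())] = u
    return delta, root, {root}


def free_group_automaton(letters):
    """Partial DFA accepting freely reduced words; state = last letter read."""
    alphabet = letters + letters.upper()
    states = [""] + list(alphabet)  # "" plays the role of the initial state
    delta = {(q, x): x for q in states for x in alphabet if q != x.swapcase()}
    return (delta, "", set(states)), alphabet


def product_dfa(d1, d2, alphabet):
    """Accessible part of the Cartesian product of two partial DFAs."""
    (delta1, s1, f1), (delta2, s2, f2) = d1, d2
    start = (s1, s2)
    delta, finals, seen = {}, set(), {start}
    queue = deque([start])
    while queue:
        p, q = queue.popleft()
        if p in f1 and q in f2:
            finals.add((p, q))
        for x in alphabet:
            if (p, x) in delta1 and (q, x) in delta2:
                target = (delta1[(p, x)], delta2[(q, x)])
                delta[((p, q), x)] = target
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
    return delta, start, finals


def accepts(dfa, word):
    delta, state, finals = dfa
    for x in word:
        if (state, x) not in delta:
            return False
        state = delta[(state, x)]
    return state in finals


# Core of H = <a b a^-1, a c a^-1> as in Figure 2 (vertex 0 is the root).
core = core_automaton([(0, "a", 1), (1, "b", 1), (1, "c", 1)], root=0)
reduced, sigma = free_group_automaton("abc")
a_h = product_dfa(core, reduced, sigma)
print(accepts(a_h, "abcA"), accepts(a_h, "aA"))
```

The word aa −1 labels an admissible path of the core automaton but is not reduced, so the product rejects it, while abca −1 is accepted.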
Recall that L H ⊆ Σ * denotes the set of reduced words that represent elements of H ≤ F m .   Figure 4. The Moore diagram of A F 2

Definition and main properties of
In other words, Q D H is the set of accessible states of A H . Proof.
Then, by definition of A H , the path is well defined in A H and has the label w. Also, by definition of Q D H , (v l , q l j l ) ∈ Q D H . (2) Now assume that (v, q i ) ∈ Q D H . Then, by definition, there exists e ∈ E such that t(e) = v andμ(e) = a i . Assume that v = o(e) = v 0 . Then, since by Lemma 3.2, the number of incoming edges for v is at least two, there exists e ∈ E such that t(e ) = v and its label a i is different from a i . Therefore, (v , q i ) ∈ Q D H . Note that, by Lemma 5, ∆ H has an admissible path p = p ep with reduced label such that t(p ) = v and o(p ) = v. As the discussion in the proof of part (1) shows, p corresponds to a pathp in A H with the same label as p. Now, if the label of p is w 1 and the label of p is w 2 , then the label ofp is w 1 a i w 2 . By part (1) of the lemma, the sub-path ofp with label w 1 a i will terminate at v. Therefore, by part (1) of the lemma, Proof. The fact that D H is essential follows directly from Corollary 5.5. Also, from Lemma 5.4 and the fact that the set of reduced words in L(A ∆ H ) coincides with L H it follows that L H = L(D H ). Combining this with Lemma 5.1 and the fact that A H is unambiguous, we get that D H is the essential part of A H .
Remark 5.7. D H inherits from A H the properties of being deterministic and unambiguous.

Automaton presentation of D H . For expository reasons, in the sequel we replace the notation q i for a state of D H by a i , and denote the initial state (v 0 , q 0 ) simply by q * . Thus D H will have the following presentation: We write u ≡ v if the following holds: for all w ∈ Σ * , δ(u, w) is a final state if and only if δ(v, w) is a final state. Notice that the relations R L (see (2.1)) and ≡ express exactly the same idea, i.e. wR L w′ ⇐⇒ δ(q 0 , w) ≡ δ(q 0 , w′). (5.3) The states u and v of A are equivalent if u ≡ v. When the states u and v are not equivalent, we say that they are distinguishable. That is, there exists at least one word w ∈ Σ * such that exactly one of δ(u, w) and δ(v, w) is an accepting state.
Proof. In order to show that the DFA D H is minimal, according to the Myhill-Nerode Theorem, it is enough to show that any two distinct states (u, a i ) and (v, a ν j ) of D H are distinguishable. We prove this by contradiction.
Assume that (u, , w is a final state. Let us fix one such w (the existence of w follows from D H being essential.) Since D H is an essential automaton, there are w 1 and w 2 in Σ * such that δ D H (q * , w 1 ) = (u, a i ) and δ D H (q * , w 2 ) = (v, a ν i ). Therefore, w 1 w, w 2 w ∈ L(D H ) = L H , the suffix of w 1 is a i , the suffix of w 2 is a ν j . Then w 1 (w 2 ) −1 = (w 1 w)(w 2 w) −1 ∈ H, which implies that Hw 1 = Hw 2 , and hence u = v. Therefore, it must be that a i = a ν j , which, in particular, implies that w 1 (w 2 ) −1 ∈ L H . As the states (u, a i ) and (v, a ν j ) are equivalent, under the assumption that (u, a i ) and (v, a ν j ) are R L -equivalent, w 1 (w 2 ) −1 ∈ L H implies w 2 (w 2 ) −1 ∈ L H , which is not true -a contradiction.

5.3.
Definition of D H . In this subsection we introduce the automaton D H , which is obtained from D H by simply removing its initial state q * and proclaiming the final states F D H of D H at the same time the initial and final states of D H . Thus the automaton D H would have the following presentation: and ∃e ∈ E s.t. a i = µ(e), t(e) = v , The main motivation behind consideration of the automaton D H is that it will define the language L H and be ergodic whenever L H is irreducible, and at the same time will provide a convenient way of computing the entropy of L H . All these will be revealed in detail in the next section. See, for example, Theorems 5.14, 5.18 and Proposition 5.17.
To describe the main properties of D H , first we need the following definition.  Proof. The fact that D H is deterministic follows straightforwardly from the definition of D H . The fact that D H has homogeneous ambiguity deg(v 0 ) − 1 follows from the observations that D H is a DFA (hence, is unambiguous) and the states that can be attained from q * in D H are precisely the states that can be attained from precisely |I D H | − 1 initial states of D H (note that except the initial states, the states of D H and D H coincide). Therefore, D H has homogeneous ambiguity |I D H | − 1 = deg(v 0 ) − 1.

5.4.
Ergodicity of D H . The main goal of this subsection is to give a complete description of when D H is ergodic for finitely generated subgroups H of F m . In Subsection 5.5 we discuss some of its consequences that are connected with the entropy of L H . Everywhere in this section we assume that H is a non-trivial finitely generated subgroup of the free group F m .

A finite state automaton is said to be ergodic if its Moore diagram, regarded as a directed graph, is strongly connected, that is, for each ordered pair of vertices of the Moore diagram there is a path from the first to the second vertex. An alternative characterization of ergodicity of a finite state automaton is that the formal language L ⊆ Σ * that it defines is irreducible, which means that for each w 1 , w 2 ∈ L there exists w ∈ Σ * such that w 1 ww 2 ∈ L. See [4]. Yet another characterization of ergodicity is that the adjacency matrix M of the Moore diagram of the automaton is irreducible: if M is of size m × m, then for each 1 ≤ i, j ≤ m, there exists n ∈ N such that the (i, j)-th coefficient of M n is strictly positive. The last property of ergodic automata of bounded ambiguity can be used to compute the entropy of the corresponding language, which we discuss later in Subsection 5.5.

Proof. First, assume that L H satisfies the property from the statement. Then it follows from Proposition 5.10 that H is conjugacy reduced. Now, by contradiction assume that H is not cyclic. Then, let a −1 w 1 b be the shortest word in L H ending with b and let a −1 w 2 b ∈ L H be the shortest word ending with b that is not contained in the maximal cyclic subgroup of H containing a −1 w 1 b ∈ H. Since H is cyclically reduced, a = b −1 . Then However, since by the assumption no word in L H starts with b −1 and ends with b, it must be that after the free cancellation of b −1 w −1 2 w 1 b either the prefix or the suffix of b −1 w −1 2 w 1 b cancels out.
In the first case we end up with a word that is shorter than a −1 w 1 b, in the second case we end up with a word that is shorter than a −1 w 2 b. Therefore, we end up with a contradiction because of the minimality condition on the lengths of a −1 w 1 b and a −1 w 2 b. Thus H is cyclic. Now assume that H is cyclic and conjugacy reduced. Then there exists w ∈ H that generates H. In other words, L H = {w n | n ∈ Z}. Since H is conjugacy reduced, the prefix of w is not the inverse of the suffix. Therefore, the prefix corresponds to a −1 , the suffix corresponds to b, and the "only if" part of the first statement of the proposition follows as well. Now let us show the second statement. The condition that all words of L H are of the form (a −1 wb) ±1 implies that the root vertex v 0 of ∆ H has exactly two incoming edges with corresponding labels a and b, and it has exactly two outgoing edges with the same labels. This means that Moreover, since by definition of D H there is no outgoing edge of (v 0 , a) with label a −1 in the Moore diagram, we have that the label of any accepted path that starts at (v 0 , a) starts with b −1 and hence ends with a −1 , hence is a loop. The same statement in regard with (v 0 , b) is true as well. Therefore, the second assertion of the proposition follows as well.
Theorem 5.14. Let H be a finitely generated subgroup of F m . Then D H is an ergodic automaton if and only if H is conjugacy reduced and non cyclic.
Proof. First, assume that H is conjugacy reduced and non cyclic. By Lemma 5.11, D H is essential, which means that any state of D H is on some admissible path. Therefore, since I D H = F D H , in order to show that D H is strongly connected, it is enough to show that for any ordered pair of different initial (=final) states of D H , there is a directed path connecting the first state to the second one. (Note that, by Proposition 5.10, D H has at least two different initial states.) Let (v 0 , a), (v 0 , b), a ≠ b ∈ Σ, be an arbitrary pair of initial states. We want to show that there exists a path from (v 0 , a) to (v 0 , b), which is equivalent to the existence of a word w ∈ L H which does not start with a −1 and ends with b, as in such case δ D H ((v 0 , a), w) = (v 0 , b).
The fact that (v 0 , b) ∈ I D H means that there exists u ∈ L H that ends with the letter b. Since by our assumption H is not cyclic, there exists 1 ≠ v ∈ L H such that ⟨u⟩ ∩ ⟨v⟩ = {1}. Then, for large enough n ∈ N, the word v̄ := red(u −n vu n ) ∈ L H starts with b −1 and ends with b. Since a ≠ b, the word v̄ does not start with a −1 . Therefore, δ D H ((v 0 , a), v̄) = (v 0 , b). Thus we showed that if H is conjugacy reduced and non cyclic, then D H is ergodic. Now assume that H is not conjugacy reduced. Then, by Proposition 5.10, D H has exactly one initial state, which is isolated. Therefore, in such case D H is not ergodic.
Finally, assume that H is conjugacy reduced but cyclic. Then, by Proposition 5.13, D H has exactly two initial states which are not connected one to the other. Therefore, in such case D H is not ergodic either. Thus the theorem is proved.
Combining Theorem 5.14 with Propositions 5.10 and 5.13, we get the following corollary. Proof. The 'if' part of the statement follows from Theorem 5.14.
Now assume that H is not conjugacy reduced. Then, for some a ∈ Σ, the elements of L H are (reduced) words of the form a −1 wa. Since the concatenation of any such words is not reduced, L H is not irreducible in this case.
Finally, assume that H = ⟨w⟩ is cyclic. Then, w, w −1 ∈ L H . However, there is no v ∈ L H such that wvw −1 is also in L H . Therefore, L H is not irreducible in this case either. Thus the proposition is proved.

5.5. Entropy of L H . Recall that for a formal language L its entropy, h(L), which is a fundamental numerical invariant of a language, is defined as
$$h(L) = \limsup_{n \to \infty} \frac{\log |B_n(L)|}{n},$$
where B n (L) is the subset of L of words of length n. If L is a language generated by an automaton A , then by h(A ) we denote h(L(A )). For H ≤ F m , the entropy of H is h(H) = h(L H ).
One important feature of ergodicity of D H is that the adjacency matrix M D H of D H is irreducible and is of bounded ambiguity (as follows from Theorem 5.14 and Proposition 5.12). Therefore, one can apply the Perron-Frobenius theory to obtain the following theorem on entropy of L H . (For discussion on Perron-Frobenius theory see Chapter 4 in [17].) Theorem 5.18. If H ≤ F m is a conjugacy reduced and non cyclic finitely generated group, then the entropy of L H is equal to log λ, where λ is the maximal eigenvalue of the adjacency matrix M D H of D H . Remark 5. 19. If H is cyclic, then its entropy is equal to 0. If H is not conjugacy reduced, then there is g ∈ F m such that gHg −1 ≤ F m is conjugacy reduced, and h(H) = h(gHg −1 ).
Definition 5.20 (Base automaton). Let A be a finite automaton and let (V, E) be its underlying graph. Then the base automatonȂ of A is the automaton with the same underlying graph (V, E) and such that (1) all states ofȂ are at the same time initial and final states, (2) all edges of its Moore diagram are labeled with different labels. To be more specific, we assume that each e ∈ E is labeled by the letter x e .
To prove Theorem 5.18, we need the following fact. Additionally, by Theorem 5.14, D H is an ergodic automaton, which implies thatD H is ergodic as well. Moreover, the adjacency matrices of D H andD H coincide and are irreducible. Therefore, the proof of Theorem 5.18 follows from Perron-Frobenius theory. See, for example, Theorem 4.4.4. in [17].
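Numerically, the maximal eigenvalue in Theorem 5.18 can be approximated by power iteration. The following Python sketch is only an illustration and makes several assumptions: the adjacency matrix is given as a list of lists and is irreducible, the function names `perron_eigenvalue` and `entropy` are hypothetical, and the iteration is run on M + I instead of M (a swapped-in device, used because M + I is primitive whenever M is irreducible, so the iteration converges even for periodic graphs; its Perron eigenvalue is λ + 1).

```python
import math


def perron_eigenvalue(M, iterations=2000):
    """Approximate the maximal eigenvalue of an irreducible 0/1 matrix M."""
    n = len(M)
    x = [1.0] * n
    mu = 1.0
    for _ in range(iterations):
        # y = (M + I) x, computed row by row
        y = [x[i] + sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        mu = max(y)               # max-norm of y; tends to lambda + 1
        x = [t / mu for t in y]   # renormalise so that max(x) = 1
    return mu - 1.0


def entropy(M):
    """Entropy log(lambda) of the language counted by the matrix M."""
    return math.log(perron_eigenvalue(M))


# Example: H = F_2. The states are the four letters a, A, b, B (uppercase
# denotes the inverse), and x -> y is an edge unless y is the inverse of x.
letters = "aAbB"
M = [[0 if y == x.swapcase() else 1 for y in letters] for x in letters]
print(perron_eigenvalue(M), entropy(M))
```

For H = F 2 every row of the matrix has three ones, so the maximal eigenvalue is 3 and the entropy is log 3 = log(2m − 1) with m = 2.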
Proof of Proposition 5.21. Assuming that A is an essential automaton, for every vertex v ∈ V in its underlying graph (V, E), there exist a path that connects an initial state to v and a path that connects v to a final state. For each v let us fix a pair $(\overleftarrow{v}, \overrightarrow{v})$ of such paths, where $\overleftarrow{v}$ is a path that connects an initial state to v and $\overrightarrow{v}$ is a path that connects v to a final state. Also, for each path p in (V, E) let us denote by p − its origin and by p + its terminus, by φ(p) its label in the Moore diagram of A , and by ψ(p) its label in the Moore diagram of Ȃ . Note that for each path p, ψ(p) ∈ L(Ȃ ). Then we have the following map Λ : L(Ȃ ) → L(A ): for each path p in (V, E), define $\Lambda(\psi(p)) = \varphi(\overleftarrow{p_-}\, p\, \overrightarrow{p_+})$.
Proof. The left inequality is obvious. For the right inequality, one can take Proof. The left inequality of (5.8) is trivial. For the right one, note that for all n ∈ N, by Lemma 5.23 and by (5.7), we get Therefore, one can take c n in (5.8) to be such that From Lemma 5.24 we immediately get that there exists a constant C > 0 and a sequence Proposition 5.21 follows immediately from the last inequality.

Computing cogrowth series of H
With an arbitrary subgroup H of F m one associates the growth function n ↦ |H n |, where H n is the set of elements of H of length n with respect to the basis A of F m , i.e. the length of an element is the length of the reduced word over Σ = A ∪ A −1 representing it. Also, following [11], we introduce the cogrowth series
$$H(z) = \sum_{n=0}^{\infty} |H_n| z^n. \tag{6.1}$$
The upper limit
$$\alpha_H = \limsup_{n \to \infty} \sqrt[n]{|H_n|}$$
is called the growth rate of H with respect to the basis A of F m . The radius of convergence of the series (6.1) is $R = 1/\alpha_H$. The cogrowth series (6.1) represents a function of the complex variable z ∈ C, analytic in the disc around z = 0 of radius R ≥ 1/(2m − 1). The major questions are: under what conditions is H(z) rational, algebraic, or a member of a distinguished class of analytic functions, such as the class of D-finite functions studied in [6]? Let us recall the second author's argument from [11] on the rationality of H(z) (when H is finitely generated) using a Nielsen system of generators of H.

6.1. The Nielsen basis approach. Let H = ⟨w 1 , · · · , w k ⟩ and let {w i } k i=1 be a Nielsen system of generators. Given two reduced words u, v ∈ Σ * , denote by β(u, v) the number of a-symbols (by a-symbols we mean elements of Σ) that cancel when reducing the product u · v. For each i, 1 ≤ i ≤ k, and ε ∈ {−1, 1}, let $H^{i,\varepsilon}_n$ be the set of words of length n in H that can be presented by an S-reduced product of generators from S ∪ S −1 , S = {w 1 , · · · , w k }, ending with $w_i^{\varepsilon}$. That is, $u \in H^{i,\varepsilon}_n \iff u = \mathrm{red}(w_{i_1}^{\varepsilon_1} \cdots w_{i_l}^{\varepsilon_l} w_i^{\varepsilon})$ for some i 1 , · · · , i l and ε 1 , · · · , ε l ∈ {−1, 1}, where in the last product no factor $w_{i_j}^{\varepsilon_j}$ is inverse to the previous or next factor. Let us recall the corollary of Statement 3.6 of [11].
Theorem 6.1. If H is a finitely generated subgroup of F m , then H(z) is rational. Then The equation (3.5) from [11] shows that the functions H i (z) satisfy the system of linear equations.
Taking the summation term to the left, we can rewrite the above system as follows where B is a 2k × 2k matrix with (i, ), (j, )-th entry where χ i (j) = 1 if i = j 0 otherwise Y and Z are 2k × 1 column vectors whose i-th entries are respectively. Observe that, when z ∈ R and |z| < 1, the determinant of the matrix B is non-zero. Hence the system (6.3) has a unique solution. Solving this system by standard methods (for instance using Cramer's rule) we get a solution.
where P i , Q i are polynomials, and hence we get the rational expression for H(z)

6.2. The finite automata approach. In the previous section we have seen the construction of the DFA D H . It is a very old observation, going back to Chomsky and Schützenberger and even to A. Kolmogorov in view of his theory of finite Markov chains, that the growth series L H (z) of the language L(D H ) = L H , which by Proposition 5.6 from the previous sections coincides with H(z), is rational. Moreover, it can be computed using a standard method which is often called the transfer matrix method. See page 573 of [23] or Section V.5 and in particular Proposition V.6 of [5]. In this section, using the DFA D H , we compute the cogrowth series H(z). At the end, we present the computations of growth using a Nielsen set of generators.

Let M be the adjacency matrix of a labelled directed graph G with t vertices. Assume that, for every vertex of the graph G, all outgoing edges carry distinct labels. Let M n be the nth power of M . It is well known that the (i, j)th entry of M n , which we denote by M n (i, j), is just the number of paths of length n from the ith vertex to the jth vertex of G.
To evaluate the M n (i, j), we shall use the transfer matrix method. This method uses linear algebra to analyze the behavior of the M n (i, j). Following [23], we define the growth series (also known as the generating function) of paths in G from i to j:
$$\gamma_{ij}(G, z) = \sum_{n=0}^{\infty} M^n(i, j) z^n.$$
Observe that γ ij (G, z) is the (i, j)th entry of the matrix
$$\sum_{n=0}^{\infty} z^n M^n = (I - zM)^{-1},$$
where I is the identity matrix of dimension t × t. In order to compute the (i, j)th entry γ ij (G, z), we recall Theorem 4.7.2 of [23].
Theorem 6.2. The growth series γ ij (G, z) is given by
$$\gamma_{ij}(G, z) = \frac{(-1)^{i+j} \det(I - zM : j, i)}{\det(I - zM)},$$
where (B : j, i) denotes the matrix obtained by removing the jth row and ith column of B. Thus in particular γ ij (G, z) is a rational function of z whose degree is strictly less than the multiplicity n 0 of 0 as an eigenvalue of M.
Let   where I is the identity matrix of order |Q D H | × |Q D H |.
Proof. Recall that L(D H ) = L H = set of reduced elements of H which implies that H(z) = L H (z). For every w ∈ L H of length n ≥ 0 we have a unique admissible path p in the Moore diagram G D H of D H such that w = l(p). Therefore, we write the growth series L H (z) as Remark 6.4. Observe that from (5.6) and (6.2) one deduces where h(L H ) is the entropy of L H .
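The coefficients |H n | of the cogrowth series can also be computed directly by counting admissible paths, which is the path-counting fact behind the transfer matrix method. The Python sketch below is an illustration under assumptions: the core is given as a folded list of positively labeled edges (u, x, v), an uppercase letter denotes the inverse of the matching lowercase generator, and the function name `cogrowth_counts` is hypothetical. The states (vertex, last letter) used in the dynamic programme play the role of the states of D H , and each step is one multiplication by its adjacency matrix.

```python
def cogrowth_counts(edges, root, n_max):
    """Return [|H_0|, ..., |H_{n_max}|] for the subgroup with the given core.

    `edges` must describe a folded graph, so that every reduced word labels
    at most one path starting at `root`.
    """
    # Moves available at each vertex in the extended core (labels in Sigma).
    step = {}
    for u, x, v in edges:
        step.setdefault(u, []).append((x, v))
        step.setdefault(v, []).append((x.upper(), u))

    counts = [1]                 # the empty word represents the identity
    layer = {(root, ""): 1}      # (vertex, last letter) -> number of paths
    for _ in range(n_max):
        nxt = {}
        for (u, last), c in layer.items():
            for letter, v in step.get(u, []):
                if last and letter == last.swapcase():
                    continue     # x followed by its inverse is not reduced
                nxt[(v, letter)] = nxt.get((v, letter), 0) + c
        layer = nxt
        # Reduced words in H of this length are the paths that end at the root.
        counts.append(sum(c for (v, _), c in layer.items() if v == root))
    return counts


# H = F_2: the core is a bouquet of two loops at the root.
print(cogrowth_counts([(0, "a", 0), (0, "b", 0)], root=0, n_max=3))
```

For H = F 2 the sketch returns 1, 4, 12, 36, in agreement with |H n | = 2m(2m − 1) n−1 for n ≥ 1.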

Examples
Let H be a nontrivial finitely generated subgroup of F m . In this section we shall compute the cogrowth H(z) of H.   Applying formula (6.6) we get and α H = 5.

7.2.
Computations of cogrowth using a Nielsen basis of H. In each of the examples below, we compute the growth series by solving the system (6.3) explained in Section 6. We consider the same set of examples that we discussed in the previous section (7.1).
(1) Let H be a finite index subgroup of F m .
(a) H = ⟨a 2 , b, c, aba −1 , aca −1 ⟩ is a subgroup of F 3 . Solving the system