# The Open Mathematics, Statistics and Probability Journal

The Open Statistics & Probability Journal

(Discontinued)

ISSN: 2666-1489 ― Volume 10, 2020
RESEARCH ARTICLE

# An Entropy Rate Theorem for a Hidden Inhomogeneous Markov Chain

Yao Qi-feng1, Dong Yun2, Wang Zhong-Zhi1, *
1 School of Mathematics and Physics, Anhui University of Technology, Ma'anshan, 243002, P.R. China
2 School of Mathematics, Maanshan Teachers' College, Ma'anshan, 243041, P.R. China

## Abstract

### Objective:

The main object of our study is to extend some entropy rate theorems to a Hidden Inhomogeneous Markov Chain (HIMC) and establish an entropy rate theorem under some mild conditions.

### Introduction:

A hidden inhomogeneous Markov chain contains two different stochastic processes; one is an inhomogeneous Markov chain whose states are hidden and the other is a stochastic process whose states are observable.

### Materials and Methods:

The proof of theorem requires some ergodic properties of an inhomogeneous Markov chain, and the flexible application of the properties of norm and the bounded conditions of series are also indispensable.

### Results:

This paper presents an entropy rate theorem for an HIMC under some mild conditions and two corollaries for a hidden Markov chain and an inhomogeneous Markov chain.

### Conclusion:

Under some mild conditions, the entropy rates of an inhomogeneous Markov chains, a hidden Markov chain and an HIMC are similar and easy to calculate.

Keywords: Hidden inhomogeneous Markov chains, Entropy rate, Stochastic process, Cesaro average, Ergodicily, Norm.

### Article Information

#### Identifiers and Pagination:

Year: 2017
Volume: 8
First Page: 19
Last Page: 26
Publisher Id: TOSPJ-8-19
DOI: 10.2174/1876527001708010019

#### Article History:

Acceptance Date: 16/08/2017
Electronic publication date: 30/09/2017
Collection year: 2017

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: https://creativecommons.org/licenses/by/4.0/legalcode. This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

This research is supported in part by the NNSF of China (Grant No.11571142,11601191), the NNSF of Anhui Province (Grant No.1408085MA04,1608085QA03) and the RP of Anhui Provincial Department of Education (KJ2017A851) and the RIP for College Graduates of Anhui University of Technology (2015128).
* Address correspondence to this author at the School of Mathematics and Physics, Anhui University of Technology, Ma'anshan, 243002, P.R. China: Tel: +13855507696; E-mails: wzz30@ahut.edu.cn; zhongzhiw@126.com

## 1. INTRODUCTION

A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states. Hidden Markov models are especially known for their application in temporal pattern recognition such as speech, handwriting, gesture recognition [1T. Starner, and A. Pentland, "Real-Time American Sign Language Recognition from Video Using Hidden Markov Models", International Symposium on Computer Vision, pp. 265-265, 1995.
[http://dx.doi.org/10.1109/ISCV.1995.477012]
], part-of-speech tagging, musical score following [2B. Pardo, and W. Birmingham, "Modeling Form for On-line Following of Musical Performances", Proceedings of the Twentieth National Conference on Artificial Intelligence, vol. 2, no. 2, pp. 9-13, 2005.], partial [3L. Satish, and B.I. Gururaj, "Use of hidden Markov models for partial discharge pattern classification", IEEE Trans. Elect. Insulation, vol. 28, no. 2, pp. 172-182, 1993.
[http://dx.doi.org/10.1109/14.212242]
] and bioinformatics. Hidden Markov Chain (HMC) is derived from context mentioned hidden Markov model (HMM).

The entropy rate or source information rate of a stochastic process is, informally, the time density of the average information in a stochastic process and it plays great roles in information theory. Therefore, in recent years, many approaches have been adopted to try to improve theoretical integrity about the entropy rate of a hidden Markov chain. For instance, Ordentlich et al.[4E. Ordentlich, and T. Weiman, "On the optimality of symbol by symbol filtering and denoising", IEEE Trans. Theory, vol. 52, no. 1, pp. 19-40, 2006.
[http://dx.doi.org/10.1109/TIT.2005.860432]
] used Blackwell’s measure to compute the entropy rate and Egner et al.[5S. Egner, V. Balakirsky, L. Tolhuizen, S. Baggen, and H. Hollmann, "On the entropy rate of a hidden Markov model", International Symposium on Information Theory, p. 12, 2004.
[http://dx.doi.org/10.1109/ISIT.2004.1365047]
] introduced variation bounds. Ordentlich et al.[4E. Ordentlich, and T. Weiman, "On the optimality of symbol by symbol filtering and denoising", IEEE Trans. Theory, vol. 52, no. 1, pp. 19-40, 2006.
[http://dx.doi.org/10.1109/TIT.2005.860432]
], Jacquet et al.[6P. Jacquet, G. Seroussi, and W. Szpankowski, "On the entropy of a hidden Markov process", Conference on Data Compression, pp. 2-3, 2004.], Zuk et al.[7O. Zuk, I. Kanter, and E. Domany, "The entropy of a binary hidden Markov process", J. Stat. Phys., vol. 121, no. 3-4, pp. 343-360, 2005.
[http://dx.doi.org/10.1007/s10955-005-7576-y]
] etc studied the variation of the entropy rate as parameters of the underlying Markov chain. Liu et al.[8W. Liu, and W.G. Yang, "An extension of Shannon-McMillan theorem and some limit properties for nonhomogeneous Markov chains", Stochastic Process. Appl., vol. 61, no. 1, pp. 129-145, 1996.
[http://dx.doi.org/10.1016/0304-4149(95)00068-2]
] have given some limit properties of relative entropy and relative entropy density and Shannon-McMillan theorem for inhomogeneous Markov chains.

Motivated by the work above, the main object of our study is to extend some results mentioned above to HIMC and establish an entropy rate theorem under some mild conditions. For the definitions and methods, we learn from W.G. Yang et al.[9W.G. Yang, and J.F. Han, "The Convergence of the Cesaro Averages of Nonhomogeneous Markov Chains", J. Eng. Math., vol. 40, no. 4, pp. 537-544, 1997. [in Chinese].] and G.Q. Yang et al.[10G.Q. Yang, W.G. Yang, and X.T Wu, "The strong laws of large numbers for countable nonhomogeneous hidden Markov models. Communications in Statistics-Theory and Methods", pp. 1-12, 2016.
[http://dx.doi.org/10.1080/03610926.2016.1193203]
]. W.G. Yang et al.[9W.G. Yang, and J.F. Han, "The Convergence of the Cesaro Averages of Nonhomogeneous Markov Chains", J. Eng. Math., vol. 40, no. 4, pp. 537-544, 1997. [in Chinese].] proved a convergence theorem for the Cesaro averages for inhomogeneous Markov chains, gave a limit theorem of one functional of inhomogeneous Markov chains and discussed the application of this limit theorem on the Markovian decision process and the information theory. G.Q. Yang et al.[11D. Isaacson, and R. Madsen, Markov Chains Theory and Applications, Wiley: New York, 1976] introduced the notion of countable hidden inhomogeneous Markov models, obtained some properties for those Markov models and established two strong laws of large numbers for countable hidden inhomogeneous Markov models and its corollaries.

The remainder of the paper is organized as follows: Section 2 provides a brief description of the HIMC and related Lemmas. Section 3 gives the main results and proofs.

## 2. PRELIMINARIES

In this section we have some fundamental definitions and related preliminaries that are needed in the next section.

Let ζ = (ξ, η) be random vector, where ξ = (ξ 0, ξ1, ···), η= (η 0,η1, ···) are two different stochastic processes, η is hidden (η takes values in Y = {ω 0, ω1, ···}), and ξ is observable (ξ takes values in set X = {θ 0, θ1, ···}).

We first recall the definition of a hidden inhomogeneous Markov chain (HIMC) ζ = (ξ, η) = with hidden chain and observable process .

Definition 1. [11D. Isaacson, and R. Madsen, Markov Chains Theory and Applications, Wiley: New York, 1976] The process ζ = (ξ, η) is called an HIMC if it follows the following form and conditions:

1. Assume that a given time inhomogeneous Markov chain takes value in state space Y and its starting distribution be

 (2.1)

and transition matrices

 (2.2)

Where

2. For any n,

 (2.3)

Some necessary and sufficient conditions for (2.3) have been proved by G.Q. Yang et al [11D. Isaacson, and R. Madsen, Markov Chains Theory and Applications, Wiley: New York, 1976].

a. A necessary and sufficient condition for (2.3) holds that for any n, we have

 (2.4)

b. ζ = (ξ, η) is a inhomogeneous hidden Markov model if and only if n > 0

 (2.5)

c. ζ = (ξ, η) is a inhomogeneous hidden Markov model if and only if n > 0

 (2.6)

 (2.7)

Definition 2. Let h = (h 0, h1, ···) be a vector, define the norm of h by

Let A = (aij) be a square matrix, defining the norm of A by

Definition 3. [11D. Isaacson, and R. Madsen, Markov Chains Theory and Applications, Wiley: New York, 1976] Let Q be a transition matrix of a homogeneous Markov chain. We call Q strongly ergodic, if there exists a probability distribution π = (π 0, π1, π2, ···) on Y which satisfies that

 (2.8)

where h(0) is a starting vector, Obviously, (2.8) implies π Q = π, and we call π the stationary distribution determined by Q.

Definition 4. [9W.G. Yang, and J.F. Han, "The Convergence of the Cesaro Averages of Nonhomogeneous Markov Chains", J. Eng. Math., vol. 40, no. 4, pp. 537-544, 1997. [in Chinese].] Let be a hidden inhomogeneous Markov chain defined as above and H (ξ0 , η0 , ···, ξn, ηn) be the entropy of ζ. The entropy rate of ζ is defined by

For simplicity, we use the natural logarithm here, thus the entropy is measured by NATS. From the definitions of entropy and HIMC, we have

Lemma 1. [10G.Q. Yang, W.G. Yang, and X.T Wu, "The strong laws of large numbers for countable nonhomogeneous hidden Markov models. Communications in Statistics-Theory and Methods", pp. 1-12, 2016.
[http://dx.doi.org/10.1080/03610926.2016.1193203]
] Let η = (η 0, η1, ···) be an inhomogeneous Markov chain with transition matrices {Qn, n≥1}. Let Q be a periodic strong ergodic random matrix. Let c= (c1, c2, ···) be a left eigenvector of Q and the unique solution of equations cQ = c and jcj= 1. Let B be a constant random matrix, where each row of B is c. If

 (2.9)

then

 (2.10)

holds for any m N, where (I is the identical matrix).

Lemma 2.Let {gn, n ≥ 1} and g be column vectors with real number entries, if

 (2.11)

then there exists a subsequence {gnk, k ≥ 1} of {gn, n ≥ 1} such that

 (2.12)

Proof. Assume (2.11) holds, We have

 (2.13)

Hence

 (2.14)

holds for any m N.

choose Xk > 0 with Xk ↓ 0, by (2.14), M N , such that

 (2.15)

Therefore, there exists n1M such that ||gn1g|| ≤ X1, by eq. (2.14), we have

 (2.16)

Therefore, there exists n2n1 such that ||gn2g|| ≤ X2. Generally, we can get a subsequence {gnk, k ≥ 1} of {gn, n ≥ 1} such that || gnk − g || ≤ Xk. (2.12) follows immediately.

## 3. MAINSTREAM

Theorem 1.Let be an HIMC defined as above, Q = (q(ωi, ωj)) be another transition matrix and assume that Q is periodic strongly ergodic. Let

 (3.1)

 (3.2)

where, gn (i), g(i) are the i th-coordinate of column vectors gn and g resp., {||gn||, n ≥ 1} are bounded. If

 (3.3)

and

 (3.4)

then the entropy rate of ζ =(ξ, η) exists and

 (3.5)

where, π = (π 0, π1, π2···) is the unique stationary distribution determined by Q.

Proof. Let h(k−1) be a row vector with the ith coordinate P (ηk−1 = ωi). Hence, using the properties and the definition of gk, we have

 (3.6)

Just take B to be a constant random matrix whose rows are equal to π. Note that π = h(0)B, where h(0) is a starting distribution of the Markov chain. Since

 (3.7)

where, Q(0, k−1) = Q1···Qk−1, Q(0, 0) = I (I is the identical matrix), By Eqs. (3.3) and (3.7) and

Lemma 1, we have

 (3.8)

By Eq. (3.4) and Lemma 2, there exists a subsequence of such that

 (3.9)

hence ||g|| is finite. By Eq. (3.8) and the entropy property, we have

This completes the proof of Theorem 1.

Remark. Theorem 1 gives a method to compute the entropy rate of an HIMC under some mild conditions.

Corollary 1.Let be a hidden homogeneous Markov chain with periodic strongly ergodic transition matrix Q(q(ωi, ωj)) and transition probability p(θlj), where ωi, ωj Y, θl X. Let

 (3.10)

where, g(i) is the i th-coordinate of column vectors g, ||g|| is bounded. Then the entropy rate ofζ = (ξ, η) exists and

 (3.11)

where, π = (π 0, π1, π2···) is the unique stationary distribution determined by Q.

As a corollary we can get the following theorem of W.G. Yang et al.[9W.G. Yang, and J.F. Han, "The Convergence of the Cesaro Averages of Nonhomogeneous Markov Chains", J. Eng. Math., vol. 40, no. 4, pp. 537-544, 1997. [in Chinese].] for an inhomogeneous Markov chain.

Corollary 2. [9W.G. Yang, and J.F. Han, "The Convergence of the Cesaro Averages of Nonhomogeneous Markov Chains", J. Eng. Math., vol. 40, no. 4, pp. 537-544, 1997. [in Chinese].] Let be an inhomogeneous Markov chain, Q = (q(ωi, ωj)) be another transition matrix and assume that Q is periodic strongly ergodic. Let

where, gn(i), g(i) are the i th-coordinate of column vectors gn and g resp., {||gn||, n≥1} are bounded. If

and

then the entropy rate of η exists and

where, π = (π 0, π1, π2···) is the unique stationary distribution determined by Q.

## CONCLUSION

We quote the definition of HIMC, norm, stationary distribution and entropy rate which contain rich content. Lemma 1 gives a limit property of state transition matrices of an inhomogeneous Markov chain. Lemma 2 gives a limit property of vector series. The process of proof references the situation of inhomogeneous Markov chains. Key to the proof is that Cesaro average of state distribution of a Markov chain takes stationary distribution as the limit. Under some mild conditions, the entropy rates of an inhomogeneous Markov chain, a hidden Markov chains and a HIMC are similar and easy to calculate.

Not applicable.

### CONFLICT OF INTEREST

The authors declare no conflict of interest, financial or otherwise.

## ACKNOWLEDGEMENTS

The authors are thankful to the referees for their valuable suggestions and the editor for encouragement.

## REFERENCES

 [1] T. Starner, and A. Pentland, "Real-Time American Sign Language Recognition from Video Using Hidden Markov Models", International Symposium on Computer Vision, pp. 265-265, 1995.[http://dx.doi.org/10.1109/ISCV.1995.477012] [2] B. Pardo, and W. Birmingham, "Modeling Form for On-line Following of Musical Performances", Proceedings of the Twentieth National Conference on Artificial Intelligence, vol. 2, no. 2, pp. 9-13, 2005. [3] L. Satish, and B.I. Gururaj, "Use of hidden Markov models for partial discharge pattern classification", IEEE Trans. Elect. Insulation, vol. 28, no. 2, pp. 172-182, 1993.[http://dx.doi.org/10.1109/14.212242] [4] E. Ordentlich, and T. Weiman, "On the optimality of symbol by symbol filtering and denoising", IEEE Trans. Theory, vol. 52, no. 1, pp. 19-40, 2006.[http://dx.doi.org/10.1109/TIT.2005.860432] [5] S. Egner, V. Balakirsky, L. Tolhuizen, S. Baggen, and H. Hollmann, "On the entropy rate of a hidden Markov model", International Symposium on Information Theory, p. 12, 2004.[http://dx.doi.org/10.1109/ISIT.2004.1365047] [6] P. Jacquet, G. Seroussi, and W. Szpankowski, "On the entropy of a hidden Markov process", Conference on Data Compression, pp. 2-3, 2004. [7] O. Zuk, I. Kanter, and E. Domany, "The entropy of a binary hidden Markov process", J. Stat. Phys., vol. 121, no. 3-4, pp. 343-360, 2005.[http://dx.doi.org/10.1007/s10955-005-7576-y] [8] W. Liu, and W.G. Yang, "An extension of Shannon-McMillan theorem and some limit properties for nonhomogeneous Markov chains", Stochastic Process. Appl., vol. 61, no. 1, pp. 129-145, 1996.[http://dx.doi.org/10.1016/0304-4149(95)00068-2] [9] W.G. Yang, and J.F. Han, "The Convergence of the Cesaro Averages of Nonhomogeneous Markov Chains", J. Eng. Math., vol. 40, no. 4, pp. 537-544, 1997. [in Chinese]. [10] G.Q. Yang, W.G. Yang, and X.T Wu, "The strong laws of large numbers for countable nonhomogeneous hidden Markov models. Communications in Statistics-Theory and Methods", pp. 1-12, 2016. [http://dx.doi.org/10.1080/03610926.2016.1193203] [11] D. Isaacson, and R. Madsen, Markov Chains Theory and Applications, Wiley: New York, 1976

### Endorsements

"Open access will revolutionize 21st century knowledge work and accelerate the diffusion of ideas and evidence that support just in time learning and the evolution of thinking in a number of disciplines."

Daniel Pesut
(Indiana University School of Nursing, USA)

"It is important that students and researchers from all over the world can have easy access to relevant, high-standard and timely scientific information. This is exactly what Open Access Journals provide and this is the reason why I support this endeavor."

Jacques Descotes
(Centre Antipoison-Centre de Pharmacovigilance, France)

"Publishing research articles is the key for future scientific progress. Open Access publishing is therefore of utmost importance for wider dissemination of information, and will help serving the best interest of the scientific community."

Patrice Talaga
(UCB S.A., Belgium)

"Open access journals are a novel concept in the medical literature. They offer accessible information to a wide variety of individuals, including physicians, medical students, clinical investigators, and the general public. They are an outstanding source of medical and scientific information."

Jeffrey M. Weinberg
(St. Luke's-Roosevelt Hospital Center, USA)

"Open access journals are extremely useful for graduate students, investigators and all other interested persons to read important scientific articles and subscribe scientific journals. Indeed, the research articles span a wide range of area and of high quality. This is specially a must for researchers belonging to institutions with limited library facility and funding to subscribe scientific journals."

Debomoy K. Lahiri
(Indiana University School of Medicine, USA)

"Open access journals represent a major break-through in publishing. They provide easy access to the latest research on a wide variety of issues. Relevant and timely articles are made available in a fraction of the time taken by more conventional publishers. Articles are of uniformly high quality and written by the world's leading authorities."

Robert Looney

"Open access journals have transformed the way scientific data is published and disseminated: particularly, whilst ensuring a high quality standard and transparency in the editorial process, they have increased the access to the scientific literature by those researchers that have limited library support or that are working on small budgets."

Richard Reithinger
(Westat, USA)

"Not only do open access journals greatly improve the access to high quality information for scientists in the developing world, it also provides extra exposure for our papers."

J. Ferwerda
(University of Oxford, UK)

"Open Access 'Chemistry' Journals allow the dissemination of knowledge at your finger tips without paying for the scientific content."

Sean L. Kitson
(Almac Sciences, Northern Ireland)

"In principle, all scientific journals should have open access, as should be science itself. Open access journals are very helpful for students, researchers and the general public including people from institutions which do not have library or cannot afford to subscribe scientific journals. The articles are high standard and cover a wide area."

Hubert Wolterbeek
(Delft University of Technology, The Netherlands)

"The widest possible diffusion of information is critical for the advancement of science. In this perspective, open access journals are instrumental in fostering researches and achievements."

Alessandro Laviano
(Sapienza - University of Rome, Italy)

"Open access journals are very useful for all scientists as they can have quick information in the different fields of science."

Philippe Hernigou
(Paris University, France)

"There are many scientists who can not afford the rather expensive subscriptions to scientific journals. Open access journals offer a good alternative for free access to good quality scientific information."

Fidel Toldrá
(Instituto de Agroquimica y Tecnologia de Alimentos, Spain)

"Open access journals have become a fundamental tool for students, researchers, patients and the general public. Many people from institutions which do not have library or cannot afford to subscribe scientific journals benefit of them on a daily basis. The articles are among the best and cover most scientific areas."

M. Bendandi
(University Clinic of Navarre, Spain)

"These journals provide researchers with a platform for rapid, open access scientific communication. The articles are of high quality and broad scope."

Peter Chiba
(University of Vienna, Austria)

"Open access journals are probably one of the most important contributions to promote and diffuse science worldwide."

Jaime Sampaio
(University of Trás-os-Montes e Alto Douro, Portugal)

"Open access journals make up a new and rather revolutionary way to scientific publication. This option opens several quite interesting possibilities to disseminate openly and freely new knowledge and even to facilitate interpersonal communication among scientists."

Eduardo A. Castro
(INIFTA, Argentina)

"Open access journals are freely available online throughout the world, for you to read, download, copy, distribute, and use. The articles published in the open access journals are high quality and cover a wide range of fields."

Kenji Hashimoto
(Chiba University, Japan)

"Open Access journals offer an innovative and efficient way of publication for academics and professionals in a wide range of disciplines. The papers published are of high quality after rigorous peer review and they are Indexed in: major international databases. I read Open Access journals to keep abreast of the recent development in my field of study."

Daniel Shek
(Chinese University of Hong Kong, Hong Kong)

"It is a modern trend for publishers to establish open access journals. Researchers, faculty members, and students will be greatly benefited by the new journals of Bentham Science Publishers Ltd. in this category."

Jih Ru Hwu
(National Central University, Taiwan)