# The Hamiltonian Cycle Problem and Markov Decision Processes

We consider the famous Hamiltonian cycle problem (HCP) embedded in a Markov decision process (MDP). More speciﬁcally, we consider a moving object on a graph G where, at each vertex, a controller may select an arc emanating from that vertex according to a probabilistic decision rule. A stationary policy is simply a control where these decision rules are time invariant. Such a policy induces a Markov chain on the vertices of the graph. Therefore, HCP is equivalent to a search for a stationary policy that induces a 0−1 probability transition matrix whose non-zero entries trace out a Hamiltonian cycle in the graph. A consequence of this embedding is that we may consider the problem over a number of, alternative, convex - rather than discrete - domains. These include: (a) the space of stationary policies, (b) the more restricted but, very natural, space of doubly stochastic matrices induced by the graph, and (c) the associated spaces of so-called “occupational measures”.

This approach to the HCP has led to both theoretical and algorithmic approaches to the underlying HCP problem. In this presentation, we outline a selection of results generated by this line of research.