As with regular MDPs, $\varepsilon$-stationary MDPs can also be generalized with general environment and agent operators. The resulting model inherits the advantages of both generalizations: a broad class of decision problems can be treated simultaneously, while the underlying environment is allowed to change over time as well. This family of MDPs will be called generalized $\varepsilon$-stationary MDPs, or $\varepsilon$-MDPs for short.

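Given a prescribed $\varepsilon > 0$, a *generalized $\varepsilon$-MDP* is defined by the tuple $\langle X, A, R, \{\bigoplus_t\}, \{\bigotimes_t\} \rangle$, with $\bigoplus_t : (X \times A \times X \to \mathbb{R}) \to (X \times A \to \mathbb{R})$ and $\bigotimes_t : (X \times A \to \mathbb{R}) \to (X \to \mathbb{R})$, $t = 1, 2, 3, \ldots$, if there exists a generalized MDP $\langle X, A, R, \bigoplus, \bigotimes \rangle$ such that $\limsup_{t \to \infty} \| T_t - T \| \le \varepsilon$. Note that the last assumption requires that the asymptotic distance between the corresponding dynamic-programming operator sequence $T_t$ and $T$ is small, namely at most $\varepsilon$.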

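Note also that the given definition is indeed a generalization of both concepts: setting $\varepsilon = 0$, $\bigoplus_t = \bigoplus$ and $\bigotimes_t = \bigotimes$ for all $t$ yields a generalized MDP, while setting $(\bigoplus_t g)(x, a) = \sum_y P_t(x, a, y)\, g(x, a, y)$ and $(\bigotimes_t f)(x) = \max_a f(x, a)$ for all $t$ simplifies to an $\varepsilon$-stationary MDP.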
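
The closeness condition can be checked numerically in the $\varepsilon$-stationary special case. The following is a minimal sketch under stated assumptions, not the construction used in this work: the environment operator is taken to be an expected value under a time-varying transition model, the agent operator is a maximum over actions, rewards depend only on the state-action pair, and the operator distance is read as the smallest constant $c$ with $\| T_t V - T V \|_\infty \le c \, \| V \|_\infty$. All names (`bellman`, `perturb`, `GAMMA`, `delta`) are hypothetical. Under these assumptions, if every row of the perturbed model stays within L1 distance `delta` of the base model, the distance is at most `GAMMA * delta`.

```python
import random

GAMMA = 0.9  # discount factor (an assumption; the text does not fix a value)


def bellman(P, R, V, gamma=GAMMA):
    """Apply (T V)(x) = max_a [ R[x][a] + gamma * sum_y P[x][a][y] * V[y] ]."""
    return [
        max(
            R[x][a] + gamma * sum(p * v for p, v in zip(P[x][a], V))
            for a in range(len(P[x]))
        )
        for x in range(len(P))
    ]


def random_distribution(n, rng):
    """Draw a random probability vector of length n."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def perturb(P, delta, rng):
    """Return a model whose rows lie within L1 distance delta of the rows of P."""
    w = delta / 2.0  # mixing weight: ||(1-w)p + wq - p||_1 = w * ||q - p||_1 <= 2w
    return [
        [
            [(1 - w) * p + w * q
             for p, q in zip(row, random_distribution(len(row), rng))]
            for row in rows
        ]
        for rows in P
    ]


def sup_dist(U, V):
    """Supremum-norm distance between two value functions."""
    return max(abs(u - v) for u, v in zip(U, V))


rng = random.Random(0)
n_states, n_actions = 4, 2
P = [[random_distribution(n_states, rng) for _ in range(n_actions)]
     for _ in range(n_states)]
R = [[rng.uniform(-1.0, 1.0) for _ in range(n_actions)] for _ in range(n_states)]

delta = 0.1
eps = GAMMA * delta  # bound on ||T_t V - T V|| / ||V|| under the assumptions above

# Largest observed ratio over random perturbed models P_t and test functions V.
worst = 0.0
for t in range(200):
    P_t = perturb(P, delta, rng)
    V = [rng.uniform(-1.0, 1.0) for _ in range(n_states)]
    norm_V = max(abs(v) for v in V)
    ratio = sup_dist(bellman(P_t, R, V), bellman(P, R, V)) / norm_V
    worst = max(worst, ratio)

print("largest observed ratio:", worst, "bound:", eps)
```

The bound holds because the maximum over actions is a non-expansion, so the difference of the two operators is controlled by the difference of the two expectations alone. Random sampling only gives a lower estimate of the true operator distance; it illustrates the inequality rather than proving it.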