The time complexity for decoding and evaluating canonical forms of the hidden Markov model for N states and T observations is O(N2T). The training of HMM using the Baum-Welch algorithm is O(N2TM), where M is the number of iterations.
There are several options to improve the performance of HMM:
The training of the linear chain conditional random fields is implemented using the same dynamic programming techniques as HMM implementation (Viterbi, forward-backward passes). Its time complexity for training T data sequence, N labels y, and M weights/features is O(MTN2). The time complexity of the training of a CRF can be reduced by distributing the computation of the log likelihood and gradient over multiple nodes [7:13].