LLMs are Markov chains in the following sense: states are vectors of context-length many tokens, and the model describes a transition matrix: for a given
context-length sized vector of tokens, it gives you the probabilities for the next context-length sized vector of tokens.
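Written out (using p_theta for the model's next-token distribution, a notation not in the original), the transition probabilities of this chain would look like:

P(s' \mid s) =
\begin{cases}
p_\theta(x \mid s_1, \dots, s_n) & \text{if } s' = (s_2, \dots, s_n, x) \text{ for some token } x,\\
0 & \text{otherwise,}
\end{cases}

where s = (s_1, \dots, s_n) is the current window of n tokens: the new state is the old one shifted by one position, with the sampled token appended at the end.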
The context length is the length of the input in tokens. For the simple case where tokens are just single characters, an LLM does nothing but take a string of length n, the context length, and calculate, for each character in the alphabet, the probability that this character is the next character following the input. It then picks one character at random according to that distribution, outputs it as the first character of the response, appends it to the input, and discards the first character of the input to bring it back to length n. Repeating the entire process produces the next character of the response, and so on; a sketch of this loop follows below.
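A minimal sketch of that loop in Python, assuming a hypothetical function next_char_distribution(window) that returns the model's probability for each character in the alphabet given the current window of n characters (this function and its name are placeholders, not part of any real library):

```python
import random

def generate(prompt, n, num_chars, next_char_distribution):
    """Character-level autoregressive generation with a fixed window of n characters."""
    window = prompt[-n:]      # the state: only the last n characters matter
    output = []
    for _ in range(num_chars):
        dist = next_char_distribution(window)   # dict: char -> P(char | window)
        chars, probs = zip(*dist.items())
        next_char = random.choices(chars, weights=probs, k=1)[0]  # sample one char
        output.append(next_char)
        window = (window + next_char)[-n:]      # append it, drop the oldest char
    return "".join(output)
```

Note that the loop never looks at anything outside the current window, which is exactly why the process is Markovian over windows of length n.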