Formal language theory is the study of sets of words over finite alphabets. A word in a language can be accepted by a device (an automaton) or generated by a grammar. The four language classes of the Chomsky hierarchy (regular, context-free, context-sensitive and recursively enumerable languages) are the ones typically studied.
A context-free grammar (CFG) is a formal grammar in which every production rule is of the form V → w
where V is a single non-terminal symbol and w is a (possibly empty) string of terminals and/or non-terminals. The term "context-free" comes from the fact that the non-terminal V can always be replaced by w, regardless of the context in which it occurs. Context-free languages are exactly those accepted by pushdown automata.
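As an illustration, consider the grammar with the two rules S → aSb and S → ε, which generates the language of words a^n b^n. The sketch below is a minimal example, not a general parser: the function names derive and accepts_anbn are chosen here for illustration. The first function lists the sentential forms of a derivation, and the second recognises the same language using a single stack, in the manner of a pushdown automaton.

```python
def derive(n):
    """Return the sentential forms of a derivation of a^n b^n from S."""
    form = "S"
    steps = [form]
    for _ in range(n):
        form = form.replace("S", "aSb")  # apply S -> aSb
        steps.append(form)
    form = form.replace("S", "")  # apply S -> ε
    steps.append(form)
    return steps


def accepts_anbn(word):
    """Recognise { a^n b^n : n >= 0 } with one stack."""
    stack = []
    seen_b = False
    for ch in word:
        if ch == "a":
            if seen_b:  # an 'a' after a 'b' can never be matched
                return False
            stack.append("A")  # remember one unmatched 'a'
        elif ch == "b":
            seen_b = True
            if not stack:  # more b's than a's
                return False
            stack.pop()  # match this 'b' with an earlier 'a'
        else:
            return False  # symbol outside the alphabet {a, b}
    return not stack  # every 'a' must have been matched


print(derive(2))
print(accepts_anbn("aabb"), accepts_anbn("aab"))
```

In the derivation, S is rewritten wherever it occurs without inspecting its neighbours, which is what "context-free" refers to.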
A context-sensitive grammar is a formal grammar such that all its rules are of the form αAβ → αγβ
where A is a nonterminal, α and β are (possibly empty) strings of nonterminals and terminals, and γ is a nonempty string of nonterminals and terminals. The name "context-sensitive" comes from α and β, which form the context of A and determine whether A can be replaced with γ. Context-sensitive languages are exactly those accepted by linear bounded automata.
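A standard example is the language of words a^n b^n c^n with n ≥ 1, which is context-sensitive but not context-free. The sketch below uses one well-known grammar for it in which every rule rewrites a single nonterminal inside a fixed context; the helper names successors and generates are chosen here for illustration. Since none of these rules shortens a sentential form, a membership test only needs to explore forms no longer than the input, so the search always terminates. This is a brute-force sketch, not an efficient recogniser.

```python
from collections import deque

# Each rule (left, right) replaces one nonterminal and keeps its context.
# For example "CB" -> "CZ" rewrites B as Z only when a C stands to its left.
RULES = [
    ("S", "aBC"), ("S", "aSBC"),
    ("CB", "CZ"), ("CZ", "WZ"), ("WZ", "WC"), ("WC", "BC"),
    ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc"),
]


def successors(form):
    """Yield every sentential form reachable from `form` in one step."""
    for left, right in RULES:
        start = form.find(left)
        while start != -1:
            yield form[:start] + right + form[start + len(left):]
            start = form.find(left, start + 1)


def generates(word, start_symbol="S"):
    """Breadth-first search over sentential forms no longer than `word`."""
    seen = {start_symbol}
    queue = deque([start_symbol])
    while queue:
        form = queue.popleft()
        if form == word:
            return True
        for nxt in successors(form):
            # Rules never shrink a form, so longer forms can be discarded.
            if len(nxt) <= len(word) and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


print(generates("aabbcc"), generates("aabbc"))
```

The length bound in the search plays the same role as the tape bound of a linear bounded automaton: the work space never exceeds the size of the input.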
A recursively enumerable language is a formal language for which there exists a Turing machine (or an equivalent model of computation) that halts and accepts when given any string in the language as input, but may either halt and reject or loop forever when given a string not in the language.
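The asymmetry between accepting and rejecting can be sketched with an enumerator, that is, a procedure that lists the words of the language one after another; a language is recursively enumerable exactly when such a procedure exists. In the sketch below, enumerate_squares is a hypothetical stand-in enumerator (its language happens to be decidable, which keeps the example small), and semi_decide and its max_steps parameter are names introduced here for illustration. Without a step budget the search would run forever on a word outside the language.

```python
from itertools import count


def enumerate_squares():
    """Hypothetical enumerator: yields the decimal squares 0, 1, 4, 9, ..."""
    for n in count():
        yield str(n * n)


def semi_decide(word, enumerator, max_steps=None):
    """Return True once `word` shows up in the enumeration.

    With max_steps=None the loop never ends if `word` is not in an
    infinite language. With a budget, None means "no answer yet".
    """
    for step, candidate in enumerate(enumerator()):
        if candidate == word:
            return True
        if max_steps is not None and step + 1 >= max_steps:
            return None  # budget exhausted: membership is still unknown
    return False  # only reached when the enumeration is finite


print(semi_decide("49", enumerate_squares))
print(semi_decide("50", enumerate_squares, max_steps=1000))
```

A result of None is not a rejection: running the search longer might still find the word, and in general no budget is guaranteed to be large enough.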
A grammar is regular (more precisely, right-regular) if and only if its rules are of the form X → a or X → aY, where X and Y are nonterminals and a is a terminal. Regular languages can be accepted by finite state automata.
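The link between such grammars and finite state automata is direct: each nonterminal can be read as a state, a rule X → aY as a transition from X to Y on input a, and a rule X → a as a transition into an accepting state. The sketch below simulates a small grammar this way; the grammar (for words over {a, b} that end in "ab") and the names GRAMMAR, ACCEPT and accepts are hypothetical choices made for this example.

```python
# S -> aS | bS | aT,  T -> b
# A target of None encodes a rule X -> a with no following nonterminal.
GRAMMAR = {
    "S": [("a", "S"), ("b", "S"), ("a", "T")],
    "T": [("b", None)],
}

ACCEPT = object()  # extra accepting state entered by rules X -> a


def accepts(word, grammar=GRAMMAR, start="S"):
    """Run the grammar as a nondeterministic finite automaton."""
    states = {start}
    for ch in word:
        next_states = set()
        for state in states:
            # The accepting state has no rules, so .get returns [] for it.
            for terminal, target in grammar.get(state, []):
                if terminal == ch:
                    next_states.add(ACCEPT if target is None else target)
        states = next_states
    return ACCEPT in states


print(accepts("bbab"), accepts("aba"))
```

The simulation tracks the set of all states reachable so far, which handles the nondeterminism caused by S having two rules that begin with a.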
Regular languages may also be defined using regular expressions, which denote sets of strings over a finite alphabet built up using the operations of union, concatenation and Kleene closure.
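As a small illustration, Python's re module writes union as |, concatenation as juxtaposition and Kleene closure as *. The sketch below restricts itself to these three operations (the module also offers extensions that go beyond regular languages, which are not used here); the names ENDS_IN_AB and matches are chosen for this example.

```python
import re

# (a|b)*ab : union inside the parentheses, Kleene closure of that group,
# then concatenation with the fixed suffix "ab".
ENDS_IN_AB = re.compile(r"(a|b)*ab")


def matches(word):
    """True if the whole word belongs to the language of the expression."""
    return ENDS_IN_AB.fullmatch(word) is not None


print(matches("bbab"), matches("aba"))
```

fullmatch is used rather than search so that the entire word, not just a substring, must belong to the language.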