Computability and Complexity Notes
Introduction
This is the course notes for Computability and Complexity by Prof. Dr. Peter Zaspel at Jacobs University Bremen.
Computability Theory
The ChurchTuring Thesis
There exists
 an intuitive idea of an algorithm and
 a formalization via DTM/NTM/RAM
The hypothesis that both are equal is called ChurchTuring Thesis.
Turing Machines

Formal definition of a TM:
 A Turing machine is a 7tuple, $M=(Q,\Sigma,\Gamma,\delta,q_0,q_{accept},q_{reject})$, where $Q,\Sigma,\Gamma$ are all finite sets and
 $Q$ is the set of states
 $\Sigma$ is the input alphabet not containing the blank symbol $\sqcup$
 $\Gamma$ is the tape alphabet, where $\sqcup \in \Gamma$ and $\Sigma \subseteq \Gamma$
 $\delta :Q\times \Gamma \rightarrow Q\times \Gamma \times \{L,R\}$ is the transition function
 $q_0 \in Q$ is the start state
 $q_{accept} \in Q$ is the accept state, where $q_{accept} \neq q_{reject}$
 $q_{reject} \in Q$ is the reject state.
 A Turing machine is a 7tuple, $M=(Q,\Sigma,\Gamma,\delta,q_0,q_{accept},q_{reject})$, where $Q,\Sigma,\Gamma$ are all finite sets and

Configuration of a TM:
 Let M be a TM, $u,v \in \Gamma^*$ and $q \in Q$. The setting $uqv$ is a configuration of the TM, where
 $u$ is the current tape content to the left of the head
 $q$ is the current state
 $v$ is the current tape content below ($v_1$) and to the right of the head (being terminated by blanks $\sqcup$)
 Let M be a TM, $u,v \in \Gamma^*$ and $q \in Q$. The setting $uqv$ is a configuration of the TM, where

A configuration $C_1$ yields a configuration $C_2$ if M can legally move from $C_1$ to $C_2$ in a single step.
 Leftward move: $uaq_i bv$ yields $uq_j acv$ iff $\delta(q_i, b) = (q_j, c, L)$
 Rightward move: $uaq_i bv$ yields $uacq_j v$ iff $\delta(q_i, b) = (q_j, c, R)$
 Lmove, left tape end: $q_i bv$ yields $q_j cv$ iff $\delta(q_i, b) = (q_j, c, L)$
 Rmove, left tape end: $q_i bv$ yields $cq_j v$ iff $\delta(q_i, b) = (q_j, c, R)$

Characterization of configuration:
 Start configuration: $u = \epsilon$, i.e. head is at leftmost end of tape
 Accepting configuration: $q=q_{accept}$
 Rejecting configuration: $q=q_{reject}$
 Halting configuration: accepting or rejecting configuration
 Remark: there’s exactly one accepting/rejecting state, but there can be many accepting/rejecting configurations.

Accepted input and recognized language:
 M accepts $w$ if a sequence of configurations $C_1 , C_2 , …, C_k$ exists, such that
 $C_1$ is the start configuration
 $C_i$ yields $C_{i+1}$
 $C_k$ is the accepting configuration
 The language of M is $L(M) = \{w \in \Sigma^*  M \ \text{accepts} \ w\}$. We say L is recognized by M.
 M accepts $w$ if a sequence of configurations $C_1 , C_2 , …, C_k$ exists, such that

Turingrecognizable or recursively enumerable language, if some TM recognizes it.

M halts on $w$ if a sequence of configurations $C_1 , C_2 , …, C_k$ exists, such that
 $C_1$ is the start configuration
 $C_i$ yields $C_{i+1}$
 $C_k$ is the halting configuration

Decider: if it halts on all inputs $w \in \Sigma^*$. A decider M that recognizes L is said to decide L.

Decidable/Recursive/Turingdecidable language: if there exists a TM (i.e. a decider) that decides it.
Variants of Turing Machines
 All variants have the same power as a standard TM.
Mutlitape TM

The definition differs from the standard TM that: $\delta :Q\times \Gamma^k \rightarrow Q\times \Gamma^k \times \{L,R\}^k$

For every multitape TM M, there exits a singletape TM S that recognizes the same language.

Turingrecognizable iff some multitape TM recognizes it.
Nondeterministic TM

The definition differs from the standard TM that: $\delta :Q\times \Gamma \rightarrow \mathcal{P}(Q\times \Gamma \times \{L,R\})$

For definition of yielding configurations, it differs from the standard TM that we need to change from $=$ to $\in$. (One config can yield several or no config.)

Exactly the same definition for accepted input and recognized language.

For every NTM N, there exists a deterministic (singletape) TM S that recognizes the same language.

Turingrecognizable iff some NTM recognizes it.

Nondeterministic decider: if all branches halt on all inputs.
Random Access Machine

k  command counter; a  accumulator; c  register; x  input; $h_r$  position on input; y  output.

Some commands:
 read: $a = x_{h_r}; \ h_r = h_r + 1; k = k + 1$
 print a: $y=ay; k = k + 1$

Input tape: readonly and read once, can only move to the right

Program: finite sequence of commands.

There exists a RAM that terminates and outputs 1 whenever a TM M accepts. The RAM terminates and outputs 0 whenever M rejects.

Let R be an RAM. For an arbitrary input x it outputs y(x). There exists a multitape TM that accepts the input x after writing y(x) on one of its tapes.

TMs and RAMs have the same expressive power.
Decidability
Decidable Languages

The language $A_{DFA}=\{\langle B,w \rangle \text{B is a DFA that accepts input string } w \}$ is decidable. (The problem of testing whether a given DFA accepts a given string is decidable.)

The language $A_{NFA}=\{\langle B,w \rangle \text{B is a NFA that accepts input string } w \}$ is decidable. (Convert to NFA to DFA.)

The language $A_{REX}=\{\langle B,w \rangle \text{B is a regular expression that generates string } w \}$ is decidable. (Convert R to DFA.)

The language $E_{DFA}=\{\langle A \rangle \text{A is a DFA and L(A) = } \emptyset \}$ is decidable. (No accept states are marked.)

The language $EQ_{DFA}=\{\langle A,B \rangle \text{A and B are DFAs and L(A)=L(B) } \}$ is decidable. (Build C that’s empty iff A=B. Use $E_{DFA}$ on C.)

The language $A_{CFG}=\{\langle G,w \rangle \text{A is a CFG that generates string } w \}$ is decidable. (Convert G to CFG in Chomsky Normal Form; enumerate all derivations within 2n1 steps.)

The language $E_{CFG}=\{\langle A \rangle \text{A is a CFG and L(G) = } \emptyset \}$ is decidable. (Mark all terminal symbols. Follow the rules until no new variables are marked. If the start symbol is not marked, accept.)

But $EQ_{CFG}$ is not decidable.

Every contextfree language L is decidable. (Build a TM with CFG G with $A_{CFG}$)
Undecidability of $A_{TM}$

The language $A_{TM}=\{\langle M,w \rangle \text{M is a TM and M accepts } w \}$ is undecidable. (Proof by diagonalization. A TM D($\langle M\rangle$) that accept if M doesn’t accept $\langle M\rangle$, which leads to contradiction.)

The language $A_{TM}=\{\langle M,w \rangle \text{M is a TM and M accepts } w \}$ is Turingrecognizable. (Build TM U. Simulate M on w. If M accepts, U accepts. If M rejects, U rejects.) (U is an instance of Universal TM) (U can run forever.)

Sets:
 A and B have the same size/cardinality if there exists a bijective function f with $f: A \rightarrow B$.
 A is countable if it’s either finite or has the same size as $\mathbb{N}$. Otherwise it’s uncountable.
 The set $\mathbb{Q}$ of rational numbers is countable.
 The set $\mathbb{R}$ of real numbers is uncountable. (Proof by diagnolization.)

There exist languages that are not Turingrecognizable.
 Fixing a non empty alphabet $\Sigma$, the set of all TMs over $\Sigma$ is countable. (M can be mapped to binary strings, which are countable (enumerate).)
 The set $\mathcal{B}$ of all infinite binary sequences is uncountable. (Proof by diagnolization.)
 Fixing a nonempty alphabet $\Sigma$, the set of all languages over $\Sigma$ is uncountable. (Construct a bijection between the set of all languages of $\Sigma$ and $\mathcal{B}$ to have the characteristic string. The latter is uncountable, the former is also uncountable.)
 The set of TMs is countable but the set of languages is uncountable. So there are languages not described by TMs.
A TuringUnrecognizable Language

L is coTuringrecognizable if $\overline L$ is Turingrecognizable.

A language is decidable iff it’s Turingrecognizable and coTuringrecognizable. (Run two TMs in parallel.)

The language $\overline{A_{TM}}$ is not Turing recognizable. (L is undecidable $\Rightarrow$ L not Turing recognizable $\vee$ $\overline L$ not Turing recognizable $\Rightarrow$ $\overline L$ not Turing recognizable)
Reducibility

The language $HALT_{TM}=\{\langle M,w \rangle \text{M is a TM and M halts on input } w \}$ is undecidable. (Proof by contradiction. Assume there’s a decider R for $HALT_{TM}$. Construct a decider for $A_{TM}$ (“R rejects, reject” eliminates the loop case.))

A Computable function f $\Rightarrow$ there exists some TM M, on every input w, halts with just f(w) on its tape.

A is mapping reducible to B ($A \leq_{m} B$) if there’s a computable function f between them. f is called a reduction from A to B.

If $A \leq_{m} B$ and B is decidable, A is decidable.

If $A \leq_{m} B$ and A is undecidable, B is undecidable.

We can also prove $HALT_{TM}$ to be undecidable by finding a reduction from $A_{TM}$. ($w \in A_{TM} \Leftrightarrow f(w) \in HALT_{TM}$)

The language $REGULAR_{TM}=\{\langle M \rangle \text{M is a TM and L(M) is a regular language} \}$ is undecidable.

If $A \leq_{m} B$ and B is Turingrecognizable, A is Turingrecognizable.

If $A \leq_{m} B$ and A is not Turingrecognizable, B is not Turingrecognizable.

The language $EQ_{TM}=\{\langle M_1, M_2 \rangle M_1 \ M_2 \text{ are TMs and } L(M_1) = L(M_2) \}$ is not Turingrecognizable. (Proof by $A_{TM} \leq_{m} \overline{EQ_{TM}}$ ($M_1$ rejects everything and $M_2$ accepts if M accepts w.))
Complexity Theory
Time Complexity
Measuring Complexity

The running time or time complexity of M is the function f(n)’s maximum number of steps that M uses on any input of length n. (Worst case.)

Big O and small o:
 Big O  there exists a number c and $n_0$
 Small o  for any c > 0 there exists $n_0$

Polynomial bound and exponential bound:
 Polynomial bound  $f(n) = O(n^p)$
 Exponential bound  $f(n) = 2^{(O(n^\delta))}$

The time complexity class TIME(t(n)) is the set of all languages that are decidable by an O(t(n)) time singletape DTM.

For every t(n) time deterministic multitape TM there exists an equivalent $O(t^2(n))$ time deterministic singletape TM.
 This only gives existence. But there can be better implementations.

The running time of nondeterministic decider N is the maximum number of steps f(n) uses on any branch.

For every t(n) NTM there exists an equivalent $2^{O(t(n))}$ time DTM. (By BFS.)
The Class P

$P = \bigcup_{k \in \mathbb{N}_0} TIME(n^k)$

The PATH problem/language is The language $PATH=\{\langle G,s,t \rangle \text{G is a directed graph that has a path from s to t} \}$. $PATH \in P$ (BFS)

$RELPRIME =\{\langle x,y \rangle \text{x and y are relatively prime} \}$. $RELPRIME \in P$ (Euclidean algorithm to find the gcd (log depth, unary > O(n)).)

Every contextfree language is in P. (DP.)
The Class NP

The nondeterministic time complexity class NTIME(t(n)) is the set of all languages that are decidable by an O(t(n)) time NTM.

$NP = \bigcup_{k \in \mathbb{N}_0} NTIME(n^k)$

$P \subset NP$

A verifier for a language A is a deterministic decider $V_A$ that takes as input $\langle w,c \rangle$ such that it holds: $w \in A$ iff there exists c such that $\langle w,c \rangle$ is accepted by $V_A$. c is called certificate.

$V_A$ is a polynomial time verifier if it has a polynomial running time in the size of its input. A is called polynomially verifiable if there exists a polynomial time verifier for it.

An alternative definition for NP: NP is the class of languages for which there exist polynomial time verifiers.

There exists a polynomial time verifier for A iff there exists a nondeterministic polynomial time TM that recognizes A.

A Hamiltonian path in G is a directed path that goes through each node exactly once. $HAMPATH \in NP$.

A clique is a subgraph such that there exist edges between all paris of nodes. A kclique is a clique with k vertices. $CLIQUE \in NP$

$SUBSETSUM \in NP$
Polynomial Time Reducibility

A is polynomial time (mapping) reducible to B ($A \leq_P B$) if there’s a polynomial time computable function such that $w \in A$ iff $f(w) \in B$.

If $A \leq_P B$ and $B \in P$, then $A \in P$.
NPcompleteness

A language B is NPcomplete if:
 $B \in NP$ and
 $A \leq_P B$ for all $A \in NP$

If B is NPcomplete and $B \in P$, then $P = NP$.

If B is NPcomplete and $B \leq_P C$ for a $C \in NP$, then C is also NPcomplete.

F is satisfiable if F has at least one model.

CookLevin: $SAT =\{\langle F \rangle \text{F is a satisfiable Boolean formula}\}$ is NPcomplete. The proof:
 $F = F_{cell} \wedge F_{start} \wedge F_{move} \wedge F_{accept}$

$3SAT =\{\langle F \rangle \text{F is in 3CNF and is satisfiable} \}$ is NPcomplete.

For all boolean formulas F in CNF there exists a boolean formula $F'$ in 3CNF such that F is satisfiable iff $F'$ is satisfiable. $F'$ can be built from F in polynomial time.

CLIQUE is NPcomplete. (Proof $3SAT \leq_P CLIQUE$)
Hierarchy Theorems

Time constructible: f(n) at least $O(n \log n)$. f is time constructible if a computable function exists that maps the string $1^n$ (the unary representation) to the binary representation of f(n) in time $O(f(n))$. (output could also be unary.) (normal functions are all time constructible)

Let f be a timeconstructible function. There exists a language A that’s decidable in $O(f(n))$ but not decidable in time $o(f(n)/\log f(n))$. (By increasing the time complexity by more than a logarithm of it, we increase the size of the time complexity class.)

Let $f_1 \ f_2$ be timeconstructible and $f_1(n) = o(f_2(n)/\log f_2(n))$, then $TIME(f_1(n)) \subsetneq TIME(f_2(n))$

For $1 \leq \epsilon_1 < \epsilon_2$ and $n^{\epsilon_1}$ at least in $O(n\log n)$, it holds $TIME(n^{\epsilon_1}) \subsetneq TIME(n^{\epsilon_2})$

$EXPTIME = \bigcup_{k \in \mathbb{N}_0} TIME(2^{n^k})$.

$P \subsetneq EXPTIME$. (Even if P=NP, there are problems of exponential runtime.)
Space Complexity
Measuring Space Complexity

Space complexity is the maximum number of tape cells that are scanned by M for an arbitrary input o length n.

SPACE(f(n)) is the set of all languages decided by $O(f(n))$ DTM.

$SAT \in SPACE(n)$.

NSAPCE is the set of all languages decided by $O(f(n))$ NTM.

The language $ALL_{NFA}=\{\langle A \rangle \text{A is a NFA and } L(A) = \Sigma^*\}$. $\overline{ALL_{NFA}} \in NSPACE(n)$

f(n) at least $O(\log n)$. F is space constructible if a computable function exists that maps the string $1^n$ (the unary representation) to the binary representation of f(n) in space $O(f(n))$.

Savitch’s Theorem: $NSPACE(f(n)) \subseteq SPACE(f^2(n))$.
 Proof: Instead of calculating all branches, ask whether start config yields accepting config and check recursively, which leads to space reuse.
PSPACE and Related Classes

$PSPACE = \bigcup_{k} SPACE(n^k)$ (deterministic)

$NPSPACE = \bigcup_{k} SPACE(n^k)$ (nondeterministic)

$PSPACE = NPSPACE$

$P \subseteq PSPACE$ and $NP \subseteq NPSPACE$ and $NP \subseteq PSPACE$.

PSPACEcomplete if:
 $B \in PSPACE$ and
 $A \leq_p B$ for all $A \in PSPACE$ (only polytime reduction; using polyspace would be as hard as the full class)

The language $TQBF=\{\langle F \rangle \text{F is a true fully quantified Boolean formula}\}$ is PSPACEcomplete.

$P \subseteq NP \subseteq PSPACE = NPSPACE \subseteq EXPTIME$ (one of this subsets have to be a proper subset due to $P \subsetneq EXPTIME$)
References

Introduction to the Theory of Computation by Michael Sipser

Course slides/notes from Prof. Dr. Peter Zaspel at Jacobs University Bremen.