# Computability and Complexity Notes

Contents

## Introduction

This is the course notes for Computability and Complexity by Prof. Dr. Peter Zaspel at Jacobs University Bremen.

## Computability Theory

### The Church-Turing Thesis

There exists

1. an intuitive idea of an algorithm and
2. a formalization via DTM/NTM/RAM

The hypothesis that both are equal is called Church-Turing Thesis.

#### Turing Machines

1. Formal definition of a TM:

• A Turing machine is a 7-tuple, $M=(Q,\Sigma,\Gamma,\delta,q_0,q_{accept},q_{reject})$, where $Q,\Sigma,\Gamma$ are all finite sets and
1. $Q$ is the set of states
2. $\Sigma$ is the input alphabet not containing the blank symbol $\sqcup$
3. $\Gamma$ is the tape alphabet, where $\sqcup \in \Gamma$ and $\Sigma \subseteq \Gamma$
4. $\delta :Q\times \Gamma \rightarrow Q\times \Gamma \times \{L,R\}$ is the transition function
5. $q_0 \in Q$ is the start state
6. $q_{accept} \in Q$ is the accept state, where $q_{accept} \neq q_{reject}$
7. $q_{reject} \in Q$ is the reject state.
2. Configuration of a TM:

• Let M be a TM, $u,v \in \Gamma^*$ and $q \in Q$. The setting $uqv$ is a configuration of the TM, where
• $u$ is the current tape content to the left of the head
• $q$ is the current state
• $v$ is the current tape content below ($v_1$) and to the right of the head (being terminated by blanks $\sqcup$)
3. A configuration $C_1$ yields a configuration $C_2$ if M can legally move from $C_1$ to $C_2$ in a single step.

• Leftward move: $uaq_i bv$ yields $uq_j acv$ iff $\delta(q_i, b) = (q_j, c, L)$
• Rightward move: $uaq_i bv$ yields $uacq_j v$ iff $\delta(q_i, b) = (q_j, c, R)$
• L-move, left tape end: $q_i bv$ yields $q_j cv$ iff $\delta(q_i, b) = (q_j, c, L)$
• R-move, left tape end: $q_i bv$ yields $cq_j v$ iff $\delta(q_i, b) = (q_j, c, R)$
4. Characterization of configuration:

• Start configuration: $u = \epsilon$, i.e. head is at leftmost end of tape
• Accepting configuration: $q=q_{accept}$
• Rejecting configuration: $q=q_{reject}$
• Halting configuration: accepting or rejecting configuration
• Remark: there’s exactly one accepting/rejecting state, but there can be many accepting/rejecting configurations.
5. Accepted input and recognized language:

• M accepts $w$ if a sequence of configurations $C_1 , C_2 , …, C_k$ exists, such that
1. $C_1$ is the start configuration
2. $C_i$ yields $C_{i+1}$
3. $C_k$ is the accepting configuration
• The language of M is $L(M) = \{w \in \Sigma^* | M \ \text{accepts} \ w\}$. We say L is recognized by M.
6. Turing-recognizable or recursively enumerable language, if some TM recognizes it.

7. M halts on $w$ if a sequence of configurations $C_1 , C_2 , …, C_k$ exists, such that

1. $C_1$ is the start configuration
2. $C_i$ yields $C_{i+1}$
3. $C_k$ is the halting configuration
8. Decider: if it halts on all inputs $w \in \Sigma^*$. A decider M that recognizes L is said to decide L.

9. Decidable/Recursive/Turing-decidable language: if there exists a TM (i.e. a decider) that decides it.

#### Variants of Turing Machines

1. All variants have the same power as a standard TM.
##### Mutlitape TM
1. The definition differs from the standard TM that: $\delta :Q\times \Gamma^k \rightarrow Q\times \Gamma^k \times \{L,R\}^k$

2. For every multitape TM M, there exits a single-tape TM S that recognizes the same language.

3. Turing-recognizable iff some multitape TM recognizes it.

##### Nondeterministic TM
1. The definition differs from the standard TM that: $\delta :Q\times \Gamma \rightarrow \mathcal{P}(Q\times \Gamma \times \{L,R\})$

2. For definition of yielding configurations, it differs from the standard TM that we need to change from $=$ to $\in$. (One config can yield several or no config.)

3. Exactly the same definition for accepted input and recognized language.

4. For every NTM N, there exists a deterministic (single-tape) TM S that recognizes the same language.

5. Turing-recognizable iff some NTM recognizes it.

6. Nondeterministic decider: if all branches halt on all inputs.

##### Random Access Machine
1. k - command counter; a - accumulator; c - register; x - input; $h_r$ - position on input; y - output.

2. Some commands:

• read: $a = x_{h_r}; \ h_r = h_r + 1; k = k + 1$
• print a: $y=ay; k = k + 1$
3. Input tape: read-only and read once, can only move to the right

4. Program: finite sequence of commands.

5. There exists a RAM that terminates and outputs 1 whenever a TM M accepts. The RAM terminates and outputs 0 whenever M rejects.

6. Let R be an RAM. For an arbitrary input x it outputs y(x). There exists a multi-tape TM that accepts the input x after writing y(x) on one of its tapes.

7. TMs and RAMs have the same expressive power.

### Decidability

#### Decidable Languages

1. The language $A_{DFA}=\{\langle B,w \rangle| \text{B is a DFA that accepts input string } w \}$ is decidable. (The problem of testing whether a given DFA accepts a given string is decidable.)

2. The language $A_{NFA}=\{\langle B,w \rangle| \text{B is a NFA that accepts input string } w \}$ is decidable. (Convert to NFA to DFA.)

3. The language $A_{REX}=\{\langle B,w \rangle| \text{B is a regular expression that generates string } w \}$ is decidable. (Convert R to DFA.)

4. The language $E_{DFA}=\{\langle A \rangle| \text{A is a DFA and L(A) = } \emptyset \}$ is decidable. (No accept states are marked.)

5. The language $EQ_{DFA}=\{\langle A,B \rangle| \text{A and B are DFAs and L(A)=L(B) } \}$ is decidable. (Build C that’s empty iff A=B. Use $E_{DFA}$ on C.)

6. The language $A_{CFG}=\{\langle G,w \rangle| \text{A is a CFG that generates string } w \}$ is decidable. (Convert G to CFG in Chomsky Normal Form; enumerate all derivations within 2n-1 steps.)

7. The language $E_{CFG}=\{\langle A \rangle| \text{A is a CFG and L(G) = } \emptyset \}$ is decidable. (Mark all terminal symbols. Follow the rules until no new variables are marked. If the start symbol is not marked, accept.)

8. But $EQ_{CFG}$ is not decidable.

9. Every context-free language L is decidable. (Build a TM with CFG G with $A_{CFG}$)

#### Undecidability of $A_{TM}$

1. The language $A_{TM}=\{\langle M,w \rangle| \text{M is a TM and M accepts } w \}$ is undecidable. (Proof by diagonalization. A TM D($\langle M\rangle$) that accept if M doesn’t accept $\langle M\rangle$, which leads to contradiction.)

2. The language $A_{TM}=\{\langle M,w \rangle| \text{M is a TM and M accepts } w \}$ is Turing-recognizable. (Build TM U. Simulate M on w. If M accepts, U accepts. If M rejects, U rejects.) (U is an instance of Universal TM) (U can run forever.)

3. Sets:

• A and B have the same size/cardinality if there exists a bijective function f with $f: A \rightarrow B$.
• A is countable if it’s either finite or has the same size as $\mathbb{N}$. Otherwise it’s uncountable.
• The set $\mathbb{Q}$ of rational numbers is countable.
• The set $\mathbb{R}$ of real numbers is uncountable. (Proof by diagnolization.)
4. There exist languages that are not Turing-recognizable.

• Fixing a non empty alphabet $\Sigma$, the set of all TMs over $\Sigma$ is countable. (M can be mapped to binary strings, which are countable (enumerate).)
• The set $\mathcal{B}$ of all infinite binary sequences is uncountable. (Proof by diagnolization.)
• Fixing a non-empty alphabet $\Sigma$, the set of all languages over $\Sigma$ is uncountable. (Construct a bijection between the set of all languages of $\Sigma$ and $\mathcal{B}$ to have the characteristic string. The latter is uncountable, the former is also uncountable.)
• The set of TMs is countable but the set of languages is uncountable. So there are languages not described by TMs.

#### A Turing-Unrecognizable Language

1. L is co-Turing-recognizable if $\overline L$ is Turing-recognizable.

2. A language is decidable iff it’s Turing-recognizable and co-Turing-recognizable. (Run two TMs in parallel.)

3. The language $\overline{A_{TM}}$ is not Turing recognizable. (L is undecidable $\Rightarrow$ L not Turing recognizable $\vee$ $\overline L$ not Turing recognizable $\Rightarrow$ $\overline L$ not Turing recognizable)

### Reducibility

1. The language $HALT_{TM}=\{\langle M,w \rangle| \text{M is a TM and M halts on input } w \}$ is undecidable. (Proof by contradiction. Assume there’s a decider R for $HALT_{TM}$. Construct a decider for $A_{TM}$ (“R rejects, reject” eliminates the loop case.))

2. A Computable function f $\Rightarrow$ there exists some TM M, on every input w, halts with just f(w) on its tape.

3. A is mapping reducible to B ($A \leq_{m} B$) if there’s a computable function f between them. f is called a reduction from A to B.

4. If $A \leq_{m} B$ and B is decidable, A is decidable.

5. If $A \leq_{m} B$ and A is undecidable, B is undecidable.

6. We can also prove $HALT_{TM}$ to be undecidable by finding a reduction from $A_{TM}$. ($w \in A_{TM} \Leftrightarrow f(w) \in HALT_{TM}$)

7. The language $REGULAR_{TM}=\{\langle M \rangle| \text{M is a TM and L(M) is a regular language} \}$ is undecidable.

8. If $A \leq_{m} B$ and B is Turing-recognizable, A is Turing-recognizable.

9. If $A \leq_{m} B$ and A is not Turing-recognizable, B is not Turing-recognizable.

10. The language $EQ_{TM}=\{\langle M_1, M_2 \rangle| M_1 \ M_2 \text{ are TMs and } L(M_1) = L(M_2) \}$ is not Turing-recognizable. (Proof by $A_{TM} \leq_{m} \overline{EQ_{TM}}$ ($M_1$ rejects everything and $M_2$ accepts if M accepts w.))

## Complexity Theory

### Time Complexity

#### Measuring Complexity

1. The running time or time complexity of M is the function f(n)’s maximum number of steps that M uses on any input of length n. (Worst case.)

2. Big O and small o:

• Big O - there exists a number c and $n_0$
• Small o - for any c > 0 there exists $n_0$
3. Polynomial bound and exponential bound:

• Polynomial bound - $f(n) = O(n^p)$
• Exponential bound - $f(n) = 2^{(O(n^\delta))}$
4. The time complexity class TIME(t(n)) is the set of all languages that are decidable by an O(t(n)) time single-tape DTM.

5. For every t(n) time deterministic multi-tape TM there exists an equivalent $O(t^2(n))$ time deterministic single-tape TM.

• This only gives existence. But there can be better implementations.
6. The running time of nondeterministic decider N is the maximum number of steps f(n) uses on any branch.

7. For every t(n) NTM there exists an equivalent $2^{O(t(n))}$ time DTM. (By BFS.)

#### The Class P

1. $P = \bigcup_{k \in \mathbb{N}_0} TIME(n^k)$

2. The PATH problem/language is The language $PATH=\{\langle G,s,t \rangle| \text{G is a directed graph that has a path from s to t} \}$. $PATH \in P$ (BFS)

3. $RELPRIME =\{\langle x,y \rangle| \text{x and y are relatively prime} \}$. $RELPRIME \in P$ (Euclidean algorithm to find the gcd (log depth, unary -> O(n)).)

4. Every context-free language is in P. (DP.)

#### The Class NP

1. The nondeterministic time complexity class NTIME(t(n)) is the set of all languages that are decidable by an O(t(n)) time NTM.

2. $NP = \bigcup_{k \in \mathbb{N}_0} NTIME(n^k)$

3. $P \subset NP$

4. A verifier for a language A is a deterministic decider $V_A$ that takes as input $\langle w,c \rangle$ such that it holds: $w \in A$ iff there exists c such that $\langle w,c \rangle$ is accepted by $V_A$. c is called certificate.

5. $V_A$ is a polynomial time verifier if it has a polynomial running time in the size of its input. A is called polynomially verifiable if there exists a polynomial time verifier for it.

6. An alternative definition for NP: NP is the class of languages for which there exist polynomial time verifiers.

7. There exists a polynomial time verifier for A iff there exists a nondeterministic polynomial time TM that recognizes A.

8. A Hamiltonian path in G is a directed path that goes through each node exactly once. $HAMPATH \in NP$.

9. A clique is a subgraph such that there exist edges between all paris of nodes. A k-clique is a clique with k vertices. $CLIQUE \in NP$

10. $SUBSETSUM \in NP$

#### Polynomial Time Reducibility

1. A is polynomial time (mapping) reducible to B ($A \leq_P B$) if there’s a polynomial time computable function such that $w \in A$ iff $f(w) \in B$.

2. If $A \leq_P B$ and $B \in P$, then $A \in P$.

#### NP-completeness

1. A language B is NP-complete if:

1. $B \in NP$ and
2. $A \leq_P B$ for all $A \in NP$
2. If B is NP-complete and $B \in P$, then $P = NP$.

3. If B is NP-complete and $B \leq_P C$ for a $C \in NP$, then C is also NP-complete.

4. F is satisfiable if F has at least one model.

5. Cook-Levin: $SAT =\{\langle F \rangle| \text{F is a satisfiable Boolean formula}\}$ is NP-complete. The proof:

• $F = F_{cell} \wedge F_{start} \wedge F_{move} \wedge F_{accept}$
6. $3SAT =\{\langle F \rangle| \text{F is in 3CNF and is satisfiable} \}$ is NP-complete.

7. For all boolean formulas F in CNF there exists a boolean formula $F'$ in 3CNF such that F is satisfiable iff $F'$ is satisfiable. $F'$ can be built from F in polynomial time.

8. CLIQUE is NP-complete. (Proof $3SAT \leq_P CLIQUE$)

#### Hierarchy Theorems

1. Time constructible: f(n) at least $O(n \log n)$. f is time constructible if a computable function exists that maps the string $1^n$ (the unary representation) to the binary representation of f(n) in time $O(f(n))$. (output could also be unary.) (normal functions are all time constructible)

2. Let f be a time-constructible function. There exists a language A that’s decidable in $O(f(n))$ but not decidable in time $o(f(n)/\log f(n))$. (By increasing the time complexity by more than a logarithm of it, we increase the size of the time complexity class.)

3. Let $f_1 \ f_2$ be time-constructible and $f_1(n) = o(f_2(n)/\log f_2(n))$, then $TIME(f_1(n)) \subsetneq TIME(f_2(n))$

4. For $1 \leq \epsilon_1 < \epsilon_2$ and $n^{\epsilon_1}$ at least in $O(n\log n)$, it holds $TIME(n^{\epsilon_1}) \subsetneq TIME(n^{\epsilon_2})$

5. $EXPTIME = \bigcup_{k \in \mathbb{N}_0} TIME(2^{n^k})$.

6. $P \subsetneq EXPTIME$. (Even if P=NP, there are problems of exponential runtime.)

### Space Complexity

#### Measuring Space Complexity

1. Space complexity is the maximum number of tape cells that are scanned by M for an arbitrary input o length n.

2. SPACE(f(n)) is the set of all languages decided by $O(f(n))$ DTM.

3. $SAT \in SPACE(n)$.

4. NSAPCE is the set of all languages decided by $O(f(n))$ NTM.

5. The language $ALL_{NFA}=\{\langle A \rangle| \text{A is a NFA and } L(A) = \Sigma^*\}$. $\overline{ALL_{NFA}} \in NSPACE(n)$

6. f(n) at least $O(\log n)$. F is space constructible if a computable function exists that maps the string $1^n$ (the unary representation) to the binary representation of f(n) in space $O(f(n))$.

7. Savitch’s Theorem: $NSPACE(f(n)) \subseteq SPACE(f^2(n))$.

• Proof: Instead of calculating all branches, ask whether start config yields accepting config and check recursively, which leads to space reuse.
1. $PSPACE = \bigcup_{k} SPACE(n^k)$ (deterministic)

2. $NPSPACE = \bigcup_{k} SPACE(n^k)$ (non-deterministic)

3. $PSPACE = NPSPACE$

4. $P \subseteq PSPACE$ and $NP \subseteq NPSPACE$ and $NP \subseteq PSPACE$.

5. PSPACE-complete if:

• $B \in PSPACE$ and
• $A \leq_p B$ for all $A \in PSPACE$ (only poly-time reduction; using poly-space would be as hard as the full class)
6. The language $TQBF=\{\langle F \rangle| \text{F is a true fully quantified Boolean formula}\}$ is PSPACE-complete.

7. $P \subseteq NP \subseteq PSPACE = NPSPACE \subseteq EXPTIME$ (one of this subsets have to be a proper subset due to $P \subsetneq EXPTIME$)