Computability and Complexity Notes

Contents

This is the course notes for Computability and Complexity by Prof. Dr. Peter Zaspel at Jacobs University Bremen.

There exists

  1. an intuitive idea of an algorithm and
  2. a formalization via DTM/NTM/RAM

The hypothesis that both are equal is called Church-Turing Thesis.

  1. Formal definition of a TM:

    • A Turing machine is a 7-tuple, M=(Q,Σ,Γ,δ,q0,qaccept,qreject)M=(Q,\Sigma,\Gamma,\delta,q_0,q_{accept},q_{reject}), where Q,Σ,ΓQ,\Sigma,\Gamma are all finite sets and
      1. QQ is the set of states
      2. Σ\Sigma is the input alphabet not containing the blank symbol \sqcup
      3. Γ\Gamma is the tape alphabet, where Γ\sqcup \in \Gamma and ΣΓ\Sigma \subseteq \Gamma
      4. δ:Q×ΓQ×Γ×{L,R}\delta :Q\times \Gamma \rightarrow Q\times \Gamma \times \{L,R\} is the transition function
      5. q0Qq_0 \in Q is the start state
      6. qacceptQq_{accept} \in Q is the accept state, where qacceptqrejectq_{accept} \neq q_{reject}
      7. qrejectQq_{reject} \in Q is the reject state.
  2. Configuration of a TM:

    • Let M be a TM, u,vΓu,v \in \Gamma^* and qQq \in Q. The setting uqvuqv is a configuration of the TM, where
      • uu is the current tape content to the left of the head
      • qq is the current state
      • vv is the current tape content below (v1v_1) and to the right of the head (being terminated by blanks \sqcup)
  3. A configuration C1C_1 yields a configuration C2C_2 if M can legally move from C1C_1 to C2C_2 in a single step.

    • Leftward move: uaqibvuaq_i bv yields uqjacvuq_j acv iff δ(qi,b)=(qj,c,L)\delta(q_i, b) = (q_j, c, L)
    • Rightward move: uaqibvuaq_i bv yields uacqjvuacq_j v iff δ(qi,b)=(qj,c,R)\delta(q_i, b) = (q_j, c, R)
    • L-move, left tape end: qibvq_i bv yields qjcvq_j cv iff δ(qi,b)=(qj,c,L)\delta(q_i, b) = (q_j, c, L)
    • R-move, left tape end: qibvq_i bv yields cqjvcq_j v iff δ(qi,b)=(qj,c,R)\delta(q_i, b) = (q_j, c, R)
  4. Characterization of configuration:

    • Start configuration: u=ϵu = \epsilon, i.e. head is at leftmost end of tape
    • Accepting configuration: q=qacceptq=q_{accept}
    • Rejecting configuration: q=qrejectq=q_{reject}
    • Halting configuration: accepting or rejecting configuration
    • Remark: there’s exactly one accepting/rejecting state, but there can be many accepting/rejecting configurations.
  5. Accepted input and recognized language:

    • M accepts ww if a sequence of configurations C1,C2,,CkC_1 , C_2 , …, C_k exists, such that
      1. C1C_1 is the start configuration
      2. CiC_i yields Ci+1C_{i+1}
      3. CkC_k is the accepting configuration
    • The language of M is L(M)={wΣM accepts w}L(M) = \{w \in \Sigma^* | M \ \text{accepts} \ w\}. We say L is recognized by M.
  6. Turing-recognizable or recursively enumerable language, if some TM recognizes it.

  7. M halts on ww if a sequence of configurations C1,C2,,CkC_1 , C_2 , …, C_k exists, such that

    1. C1C_1 is the start configuration
    2. CiC_i yields Ci+1C_{i+1}
    3. CkC_k is the halting configuration
  8. Decider: if it halts on all inputs wΣw \in \Sigma^*. A decider M that recognizes L is said to decide L.

  9. Decidable/Recursive/Turing-decidable language: if there exists a TM (i.e. a decider) that decides it.

  1. All variants have the same power as a standard TM.
  1. The definition differs from the standard TM that: δ:Q×ΓkQ×Γk×{L,R}k\delta :Q\times \Gamma^k \rightarrow Q\times \Gamma^k \times \{L,R\}^k

  2. For every multitape TM M, there exits a single-tape TM S that recognizes the same language.

  3. Turing-recognizable iff some multitape TM recognizes it.

  1. The definition differs from the standard TM that: δ:Q×ΓP(Q×Γ×{L,R})\delta :Q\times \Gamma \rightarrow \mathcal{P}(Q\times \Gamma \times \{L,R\})

  2. For definition of yielding configurations, it differs from the standard TM that we need to change from == to \in. (One config can yield several or no config.)

  3. Exactly the same definition for accepted input and recognized language.

  4. For every NTM N, there exists a deterministic (single-tape) TM S that recognizes the same language.

  5. Turing-recognizable iff some NTM recognizes it.

  6. Nondeterministic decider: if all branches halt on all inputs.

  1. k - command counter; a - accumulator; c - register; x - input; hrh_r - position on input; y - output.

  2. Some commands:

    • read: a=xhr; hr=hr+1;k=k+1a = x_{h_r}; \ h_r = h_r + 1; k = k + 1
    • print a: y=ay;k=k+1y=ay; k = k + 1
  3. Input tape: read-only and read once, can only move to the right

  4. Program: finite sequence of commands.

  5. There exists a RAM that terminates and outputs 1 whenever a TM M accepts. The RAM terminates and outputs 0 whenever M rejects.

  6. Let R be an RAM. For an arbitrary input x it outputs y(x). There exists a multi-tape TM that accepts the input x after writing y(x) on one of its tapes.

  7. TMs and RAMs have the same expressive power.

  1. The language ADFA={B,wB is a DFA that accepts input string w}A_{DFA}=\{\langle B,w \rangle| \text{B is a DFA that accepts input string } w \} is decidable. (The problem of testing whether a given DFA accepts a given string is decidable.)

  2. The language ANFA={B,wB is a NFA that accepts input string w}A_{NFA}=\{\langle B,w \rangle| \text{B is a NFA that accepts input string } w \} is decidable. (Convert to NFA to DFA.)

  3. The language AREX={B,wB is a regular expression that generates string w}A_{REX}=\{\langle B,w \rangle| \text{B is a regular expression that generates string } w \} is decidable. (Convert R to DFA.)

  4. The language EDFA={AA is a DFA and L(A) = }E_{DFA}=\{\langle A \rangle| \text{A is a DFA and L(A) = } \emptyset \} is decidable. (No accept states are marked.)

  5. The language EQDFA={A,BA and B are DFAs and L(A)=L(B) }EQ_{DFA}=\{\langle A,B \rangle| \text{A and B are DFAs and L(A)=L(B) } \} is decidable. (Build C that’s empty iff A=B. Use EDFAE_{DFA} on C.)

  6. The language ACFG={G,wA is a CFG that generates string w}A_{CFG}=\{\langle G,w \rangle| \text{A is a CFG that generates string } w \} is decidable. (Convert G to CFG in Chomsky Normal Form; enumerate all derivations within 2n-1 steps.)

  7. The language ECFG={AA is a CFG and L(G) = }E_{CFG}=\{\langle A \rangle| \text{A is a CFG and L(G) = } \emptyset \} is decidable. (Mark all terminal symbols. Follow the rules until no new variables are marked. If the start symbol is not marked, accept.)

  8. But EQCFGEQ_{CFG} is not decidable.

  9. Every context-free language L is decidable. (Build a TM with CFG G with ACFGA_{CFG})

  1. The language ATM={M,wM is a TM and M accepts w}A_{TM}=\{\langle M,w \rangle| \text{M is a TM and M accepts } w \} is undecidable. (Proof by diagonalization. A TM D(M\langle M\rangle) that accept if M doesn’t accept M\langle M\rangle, which leads to contradiction.)

  2. The language ATM={M,wM is a TM and M accepts w}A_{TM}=\{\langle M,w \rangle| \text{M is a TM and M accepts } w \} is Turing-recognizable. (Build TM U. Simulate M on w. If M accepts, U accepts. If M rejects, U rejects.) (U is an instance of Universal TM) (U can run forever.)

  3. Sets:

    • A and B have the same size/cardinality if there exists a bijective function f with f:ABf: A \rightarrow B.
    • A is countable if it’s either finite or has the same size as N\mathbb{N}. Otherwise it’s uncountable.
      • The set Q\mathbb{Q} of rational numbers is countable.
      • The set R\mathbb{R} of real numbers is uncountable. (Proof by diagnolization.)
  4. There exist languages that are not Turing-recognizable.

    • Fixing a non empty alphabet Σ\Sigma, the set of all TMs over Σ\Sigma is countable. (M can be mapped to binary strings, which are countable (enumerate).)
    • The set B\mathcal{B} of all infinite binary sequences is uncountable. (Proof by diagnolization.)
    • Fixing a non-empty alphabet Σ\Sigma, the set of all languages over Σ\Sigma is uncountable. (Construct a bijection between the set of all languages of Σ\Sigma and B\mathcal{B} to have the characteristic string. The latter is uncountable, the former is also uncountable.)
    • The set of TMs is countable but the set of languages is uncountable. So there are languages not described by TMs.
  1. L is co-Turing-recognizable if L\overline L is Turing-recognizable.

  2. A language is decidable iff it’s Turing-recognizable and co-Turing-recognizable. (Run two TMs in parallel.)

  3. The language ATM\overline{A_{TM}} is not Turing recognizable. (L is undecidable \Rightarrow L not Turing recognizable \vee L\overline L not Turing recognizable \Rightarrow L\overline L not Turing recognizable)

  1. The language HALTTM={M,wM is a TM and M halts on input w}HALT_{TM}=\{\langle M,w \rangle| \text{M is a TM and M halts on input } w \} is undecidable. (Proof by contradiction. Assume there’s a decider R for HALTTMHALT_{TM}. Construct a decider for ATMA_{TM} (“R rejects, reject” eliminates the loop case.))

  2. A Computable function f \Rightarrow there exists some TM M, on every input w, halts with just f(w) on its tape.

  3. A is mapping reducible to B (AmBA \leq_{m} B) if there’s a computable function f between them. f is called a reduction from A to B.

  4. If AmBA \leq_{m} B and B is decidable, A is decidable.

  5. If AmBA \leq_{m} B and A is undecidable, B is undecidable.

  6. We can also prove HALTTMHALT_{TM} to be undecidable by finding a reduction from ATMA_{TM}. (wATMf(w)HALTTMw \in A_{TM} \Leftrightarrow f(w) \in HALT_{TM})

  7. The language REGULARTM={MM is a TM and L(M) is a regular language}REGULAR_{TM}=\{\langle M \rangle| \text{M is a TM and L(M) is a regular language} \} is undecidable.

  8. If AmBA \leq_{m} B and B is Turing-recognizable, A is Turing-recognizable.

  9. If AmBA \leq_{m} B and A is not Turing-recognizable, B is not Turing-recognizable.

  10. The language EQTM={M1,M2M1 M2 are TMs and L(M1)=L(M2)}EQ_{TM}=\{\langle M_1, M_2 \rangle| M_1 \ M_2 \text{ are TMs and } L(M_1) = L(M_2) \} is not Turing-recognizable. (Proof by ATMmEQTMA_{TM} \leq_{m} \overline{EQ_{TM}} (M1M_1 rejects everything and M2M_2 accepts if M accepts w.))

  1. The running time or time complexity of M is the function f(n)’s maximum number of steps that M uses on any input of length n. (Worst case.)

  2. Big O and small o:

    • Big O - there exists a number c and n0n_0
    • Small o - for any c > 0 there exists n0n_0
  3. Polynomial bound and exponential bound:

    • Polynomial bound - f(n)=O(np)f(n) = O(n^p)
    • Exponential bound - f(n)=2(O(nδ))f(n) = 2^{(O(n^\delta))}
  4. The time complexity class TIME(t(n)) is the set of all languages that are decidable by an O(t(n)) time single-tape DTM.

  5. For every t(n) time deterministic multi-tape TM there exists an equivalent O(t2(n))O(t^2(n)) time deterministic single-tape TM.

    • This only gives existence. But there can be better implementations.
  6. The running time of nondeterministic decider N is the maximum number of steps f(n) uses on any branch.

  7. For every t(n) NTM there exists an equivalent 2O(t(n))2^{O(t(n))} time DTM. (By BFS.)

  1. P=kN0TIME(nk)P = \bigcup_{k \in \mathbb{N}_0} TIME(n^k)

  2. The PATH problem/language is The language PATH={G,s,tG is a directed graph that has a path from s to t}PATH=\{\langle G,s,t \rangle| \text{G is a directed graph that has a path from s to t} \}. PATHPPATH \in P (BFS)

  3. RELPRIME={x,yx and y are relatively prime}RELPRIME =\{\langle x,y \rangle| \text{x and y are relatively prime} \}. RELPRIMEPRELPRIME \in P (Euclidean algorithm to find the gcd (log depth, unary -> O(n)).)

  4. Every context-free language is in P. (DP.)

  1. The nondeterministic time complexity class NTIME(t(n)) is the set of all languages that are decidable by an O(t(n)) time NTM.

  2. NP=kN0NTIME(nk)NP = \bigcup_{k \in \mathbb{N}_0} NTIME(n^k)

  3. PNPP \subset NP

  4. A verifier for a language A is a deterministic decider VAV_A that takes as input w,c\langle w,c \rangle such that it holds: wAw \in A iff there exists c such that w,c\langle w,c \rangle is accepted by VAV_A. c is called certificate.

  5. VAV_A is a polynomial time verifier if it has a polynomial running time in the size of its input. A is called polynomially verifiable if there exists a polynomial time verifier for it.

  6. An alternative definition for NP: NP is the class of languages for which there exist polynomial time verifiers.

  7. There exists a polynomial time verifier for A iff there exists a nondeterministic polynomial time TM that recognizes A.

  8. A Hamiltonian path in G is a directed path that goes through each node exactly once. HAMPATHNPHAMPATH \in NP.

  9. A clique is a subgraph such that there exist edges between all paris of nodes. A k-clique is a clique with k vertices. CLIQUENPCLIQUE \in NP

  10. SUBSETSUMNPSUBSETSUM \in NP

  1. A is polynomial time (mapping) reducible to B (APBA \leq_P B) if there’s a polynomial time computable function such that wAw \in A iff f(w)Bf(w) \in B.

  2. If APBA \leq_P B and BPB \in P, then APA \in P.

  1. A language B is NP-complete if:

    1. BNPB \in NP and
    2. APBA \leq_P B for all ANPA \in NP
  2. If B is NP-complete and BPB \in P, then P=NPP = NP.

  3. If B is NP-complete and BPCB \leq_P C for a CNPC \in NP, then C is also NP-complete.

  4. F is satisfiable if F has at least one model.

  5. Cook-Levin: SAT={FF is a satisfiable Boolean formula}SAT =\{\langle F \rangle| \text{F is a satisfiable Boolean formula}\} is NP-complete. The proof:

    • F=FcellFstartFmoveFacceptF = F_{cell} \wedge F_{start} \wedge F_{move} \wedge F_{accept}
  6. 3SAT={FF is in 3CNF and is satisfiable}3SAT =\{\langle F \rangle| \text{F is in 3CNF and is satisfiable} \} is NP-complete.

  7. For all boolean formulas F in CNF there exists a boolean formula FF’ in 3CNF such that F is satisfiable iff FF’ is satisfiable. FF’ can be built from F in polynomial time.

  8. CLIQUE is NP-complete. (Proof 3SATPCLIQUE3SAT \leq_P CLIQUE)

  1. Time constructible: f(n) at least O(nlogn)O(n \log n). f is time constructible if a computable function exists that maps the string 1n1^n (the unary representation) to the binary representation of f(n) in time O(f(n))O(f(n)). (output could also be unary.) (normal functions are all time constructible)

  2. Let f be a time-constructible function. There exists a language A that’s decidable in O(f(n))O(f(n)) but not decidable in time o(f(n)/logf(n))o(f(n)/\log f(n)). (By increasing the time complexity by more than a logarithm of it, we increase the size of the time complexity class.)

  3. Let f1 f2f_1 \ f_2 be time-constructible and f1(n)=o(f2(n)/logf2(n))f_1(n) = o(f_2(n)/\log f_2(n)), then TIME(f1(n))TIME(f2(n))TIME(f_1(n)) \subsetneq TIME(f_2(n))

  4. For 1ϵ1<ϵ21 \leq \epsilon_1 < \epsilon_2 and nϵ1n^{\epsilon_1} at least in O(nlogn)O(n\log n), it holds TIME(nϵ1)TIME(nϵ2)TIME(n^{\epsilon_1}) \subsetneq TIME(n^{\epsilon_2})

  5. EXPTIME=kN0TIME(2nk)EXPTIME = \bigcup_{k \in \mathbb{N}_0} TIME(2^{n^k}).

  6. PEXPTIMEP \subsetneq EXPTIME. (Even if P=NP, there are problems of exponential runtime.)

  1. Space complexity is the maximum number of tape cells that are scanned by M for an arbitrary input o length n.

  2. SPACE(f(n)) is the set of all languages decided by O(f(n))O(f(n)) DTM.

  3. SATSPACE(n)SAT \in SPACE(n).

  4. NSAPCE is the set of all languages decided by O(f(n))O(f(n)) NTM.

  5. The language ALLNFA={AA is a NFA and L(A)=Σ}ALL_{NFA}=\{\langle A \rangle| \text{A is a NFA and } L(A) = \Sigma^*\}. ALLNFANSPACE(n)\overline{ALL_{NFA}} \in NSPACE(n)

  6. f(n) at least O(logn)O(\log n). F is space constructible if a computable function exists that maps the string 1n1^n (the unary representation) to the binary representation of f(n) in space O(f(n))O(f(n)).

  7. Savitch’s Theorem: NSPACE(f(n))SPACE(f2(n))NSPACE(f(n)) \subseteq SPACE(f^2(n)).

    • Proof: Instead of calculating all branches, ask whether start config yields accepting config and check recursively, which leads to space reuse.
  1. PSPACE=kSPACE(nk)PSPACE = \bigcup_{k} SPACE(n^k) (deterministic)

  2. NPSPACE=kSPACE(nk)NPSPACE = \bigcup_{k} SPACE(n^k) (non-deterministic)

  3. PSPACE=NPSPACEPSPACE = NPSPACE

  4. PPSPACEP \subseteq PSPACE and NPNPSPACENP \subseteq NPSPACE and NPPSPACENP \subseteq PSPACE.

  5. PSPACE-complete if:

    • BPSPACEB \in PSPACE and
    • ApBA \leq_p B for all APSPACEA \in PSPACE (only poly-time reduction; using poly-space would be as hard as the full class)
  6. The language TQBF={FF is a true fully quantified Boolean formula}TQBF=\{\langle F \rangle| \text{F is a true fully quantified Boolean formula}\} is PSPACE-complete.

  7. PNPPSPACE=NPSPACEEXPTIMEP \subseteq NP \subseteq PSPACE = NPSPACE \subseteq EXPTIME (one of this subsets have to be a proper subset due to PEXPTIMEP \subsetneq EXPTIME)