Data Structures Lesson 16

Professor Abdul-Quader

Topological Sorting

HW

  • Build eight element heap in eight comparisons?
    • Hint: tournaments?

Probability

  • Directed graph with \(N\) nodes, labeled \(0, 1, \ldots, N - 1\).
  • Fixed probability \(p\) such that: between any two \(i\), \(j\), the probability of an edge \((i, j)\) is \(p\)>
  • What is the probability that there is a path from \(0\) to \(N - 1\)?

Examples

  • If \(N = 2\), then the probability is just \(p\).
  • If \(N = 3\), the probability is \(p + (1-p)p^2\). (the probability that there is an edge from \(0\) to \(2\), plus the probability that there is no edge from \(0\) to 2, but there is an edge from \(0\) to 1, and an edge from \(1\) to \(2\))
  • If \(N = 4\), the formula gets very complicated, very quickly.

Video

Simulation

Instead of computing this mathematically, let’s try to write a simulation.

  • Implement the hasAPath method. Starter code
  • Play around with different values of \(p\) and \(size\).
  • Check if your implementation works correctly by calling printGraph (and checking by hand if there is a path).

Simulation

  • Implement the simulate method and play around with \(size\) and \(prob\):
  • If \(p = 0.5\), for what values of \(size\) will the probability of a path be at least \(90\%\)?
  • If \(p = 0.25\), same question.
  • If \(p = 0.001\), for what values of \(size\) will the probability of a path be at least \(50\%\)?

Dependencies

Need to take many courses. Some courses have others as prerequisites. Suppose we can only take one per semester. What would be a valid order that we can take the courses in?

Example

  • MATH 1202
  • MATH 2010
  • MATH 3007: prerequisite MATH 1202
  • MATH 4010: prerequisites MATH 1202, MATH 2010
  • MATH 4020: prerequisites MATH 4010, MATH 3007
  • MATH 4040: prerequisites MATH 1202, MATH 2010
  • MATH 4060: prerequisite MATH 3007

Topological Sort

More general problem: given a directed graph, a topological sort is an ordering of all the vertices \(v_1, v_2, \ldots, v_n\) such that if there is an edge from \(v_i\) to \(v_j\), then \(i < j\).

Think about how to represent the course prerequisite problem from the previous slide as a directed graph, and this definition should make sense.

Topological Sort

  • Note: this means that we cannot find a topological sort on a graph that has a cycle. (Why?)
  • A directed grpah with no cycles is called a directed acyclic graph, or DAG.
  • Every DAG (including disconnected ones) can be topologically sorted. Proof?

Algorithm

  • Describe an algorithm which, given a directed graph, outputs a valid topological sort.
  • Implement this algorithm in the RandomDAG class (on replit)

Pseudocode

int[] inDegree
List<Integer> topologicalOrdering
Queue<Integer> queue
for (int vertex : vertices)
    calculateInDegree(vertex)
    if (inDegree[vertex] == 0)
        queue.enqueue(vertex)

while (!queue.isEmpty())
    int vertex = queue.remove()
    topologicalOrdering.add(vertex)
    for (int neighbor : getNeighbors(vertex))
        inDegrees[neighbor]--
        if (inDegree[neighbor] == 0)
            queue.add(neighbor)

return topologicalOrdering

DFS

What ordering do we get if we do a depth-first search?

void dfs(int node, boolean[] visited) {
    print(node)
    visited[node] = true

    for (int vertex : getNeighbors(node)) {
        if (!visited[vertex]) {
            dfs(vertex, visited)
        }
    }
}

Alternative

Two options for DFS:

  • Visit a node before we visit its child.
  • Visit a node after we visit its child.
  • Do we get a topological ordering in either case?

Scheduling Problem

Project 3: Given a list of courses, each of which just contains a list of student IDs (of students taking that course), find a schedule for these courses.

Restrictions

  • A student can be in only one room at a time: so two courses which share a student should not be scheduled at the same time.
  • Ideally, we would like the schedule to have the minimal number of time slots.
  • How can we represent this as a graph theory problem?

Graph Coloring Problem

A 3 coloring of the Petersen graph (Wikimedia Commons)

Given an undirected graph \(G = (V, E)\), a coloring is an assignment of colors to the vertices such that no two vertices which share an edge are given the same color. A minimal coloring is a coloring that uses the least amount of colors.

Question: how are these two problems related?

Eulerian Path / Circuit Problem

Seven Bridges of Konisgberg

Euler (1736): is there a way to cross every bridge in Königsberg exactly once? (Treat the picture as a (multi)graph. What are the nodes / edges?)

Eulerian Path / Circuit Problem

  • More generally: given a graph, is there a path which crosses each edge exactly once? (Eulerian path)
  • Given a graph, is there a path which crosses every edge exactly once and starts and ends at the same vertex? (Eulerian circuit)

Solution? Notice: every time you enter a vertex, you must also leave it (with the exception of the start / end vertices if it’s a path and not a circuit).

Similar Problems

  • Hamiltonian path: a path which visits every vertex exactly once.
  • Hamiltonian circuit: start and end at the same vertex (so visit that one twice), and visit all the others exactly once.
  • Traveling Salesman Problem: given a complete weighted graph (with non-negative weights), and an integer \(k\), is there a Hamiltonian cycle which has cost \(\leq k\)?

P vs NP

  • Formally: a problem is in the class P if there is an algorithm which, for any input of length \(n\), runs in polynomial time in terms of \(n\):

    \[P = \bigcup\limits_{i = 0}^{\infty} O(n^i)\]

  • These are easy to solve.

Class NP

  • A problem is in NP if there is an algorithm which checks a solution in polynomial time. (easy to check)
  • Example: composites. Given a number \(x\) of length \(N\) (in binary), determine if \(x\) is composite.
  • Verifier: on input \(x\) and \(c\), check if \(c | x\) (can be done in polynomial time in \(N\)). If so, output true. If not false.
  • Solver? On input \(x\), determine if any \(c\) divides \(x\)?
    • Naive: exponential in the length of \(x\). Why?

Examples

  1. Given a representationn (adjacency matrix or list) of a graph \(G = (V, E)\) and an assignment of colors to the vertices, check if it is a valid coloring.
  2. The Eulerian path/cycle problem is in P. (The algorithm we talked about is a proof of this fact.)
  3. Hamiltonian cycle problem? TSP?
  4. Non-Hamiltonian cycle? Give a graph \(G = (V, E)\), determine if the graph does not have a Hamiltonian cycle.

P vs NP

In the year 2000, the Clay Mathematics Institute posed a list a 7 Millenium problems, offering a $1M cash prize for a solution to any of these. The P vs NP question is one of them:

Problem: Prove or disprove that every problem in NP is also in P.

NP-Completeness

  • NP-complete: problems which, if solved, can be used to solve (in polynomial time) any other problem in the class NP.
  • Examples:
    • Hamiltonian circuit problem
    • Traveling Salesman Problem
    • Graph Coloring problem

Not known

Many other problems are known to be in NP, but not known to be NP-complete:

  • Integer factorization,
  • Discrete logarithm (both of these are used in encryption algorithms)
  • Graph isomorphism

Upcoming

  • Presentations (Thursday and Monday)
  • Quiz 2 (Thursday)
  • HW due Friday
  • Project 3 posted soon
  • Final project: group project (exploring a topic related to data structures in depth / beyond the curriculum)
// reveal.js plugins