Data Structures Lesson 23

Professor Abdul-Quader

Concurrent Data Structures

Last time

  • Threads, Executors, ThreadPools, etc
  • Thread Safety
    • Main concern: shared mutable state.
    • Race conditions
  • Synchronization issues: deadlocks

Synchronized

public synchronized void add(T item) {
  ...
}
public synchronized T get(int i) {
  ...
}
  • Thread 1: calls “add”
  • Thread 2: calls “get”
  • What happens?

Vectors

A Vector is like an ArrayList, with all of its methods synchronized. Is the following code thread-safe?

// ensures that there is exactly one "item" object in the vector v
public void addIfNotFound(Vector<T> v, T item) {
    if (!v.contains(item)) {
        v.add(item);
    }
}

List Race Condition

See ListRaceCondition. What happens when we run the code? Play around with the type of “Executor” we use (single vs multithreaded).

Question: why do we see a NullPointerException? What is the race condition here?

LinkedList Code

add calls the following:

void linkLast(E e) {
    final Node<E> l = last;
    final Node<E> newNode = new Node<>(l, e, null);
    last = newNode;
    if (l == null)
        first = newNode;
    else
        l.next = newNode;
    size++;
    modCount++;
}
  • This is extremely non-thread safe.
  • Not meant to be!
  • Draw out race condition.

Video

  • Issue with Vector class
  • LinkedList race condition

Synchronized Collections

  • Vector, Hashtable, Stack are all synchronized
  • Collections class: synchronizedList, synchronizedMap, synchronizedSet, etc.
  • Two issues:
    • compound operations
    • iteration

Compound Operations

list = Collections.synchronizedList(originalList);
if (!list.contains(0)) {
  list.add(0);
}

Moral: may need to synchronize yourself.

Iteration

// Thread 1:

for (int i = 0; i < 10000; i++) {
  list.add(i);
}

// Thread 2:
for (int num : list) {
  System.out.println(num);
}

What could go wrong?

Iteration Issues

  • Iterators are fail-fast.
  • throw ConcurrentModificationException if they detect changes in another thread while iterating
  • Not fool proof!
  • If you need to, make sure to synchronize around the entire loop.
  • Watch out for “hidden iterators”: calls to containsAll, toString, and other methods may trigger an iteration through the entire collection.
  • Locking during iteration can be bad! (Starvation, deadlocks, contention)

Cloning

Intead of locking: you can clone the collection first.

ArrayList<Integer> list2;
synchronized(list) { // synchronize while cloning
  list2 = new ArrayList<>(list);
}
for (int num : list2) {
  System.out.println(num);
}
  • Downside?
  • Why does this work? Why don’t we need to synchronize the iteration?

Video

Concurrent Collections

  • Difference between synchronized and concurrent collections.
  • Synchronized: only one thread can call a method at a time
  • Concurrent?

Examples

  • CopyOnWriteArrayList
  • ConcurrentHashMap, ConcurrentSkipListMap
  • ConcurrentLinkedQueue
  • ArrayBlockingQueue, LinkedBlockingQueue

Copy on Write

  • Each “modificaction” (add, remove) creates a new copy of the underlying array.
  • No synchronization needed (why not?)
    • What happens if one thread changes the list while another thread is reading?
  • Example: event notification systems
    • Asynchronous “event” comes in.
    • Notify a large, static list of event handlers.
    • Only when we “register” a new event handler will we need to update it.

Scenario 1

CopyOnWriteArrayList<Integer> l; // initialized somewhere
// Thread 1:
for (int i = 0; i < 1000; i++) {
    l.add(i);
}
// Thread 2:
for (int num : l) {
    System.out.println(num);
}

What gets printed?

Scenario 2

// Thread 1 is the same as before
// Thread 2:
Iterator<Integer> it = l.iterator();
while (it.hasNext()) {
    int num = it.next();
    it.remove();
}

What happens?

Scenario 3

CopyOnWriteArrayList<Integer> l;
// l is initialized somewhere and shared
if (!l.contains(0)) {
    l.add(0);
}

Compound operations?

Video

HashMap

Recall: a HashMap uses “separate chaining” (not probing). What happens as we add more things to a separate chaining HashMap?

Re-sizing

  • Re-sizing (rehashing)! There is a particularly nasty race condition in the resize method of a HashMap.
    • Causes an infinite loop!
  • HashMaps are not designed to be thread-safe and you should not use them as if they were: even if you don’t care about data consistency.
  • Read: A Beautiful Race Condition

ConcurrentHashMap

  • Again: HashMaps use “separate chaining”: basically a large table of LinkedLists
  • ConcurrentHashMap uses “lock striping”: instead of one lock for the entire table, separate the table into “buckets”, each of which has its own lock.
  • Reads (get / iterators over the keySet or entrySet) do not contend with each other, or with writes (put).
  • Limited number of “writes” can be done concurrently.

Considerations

  • No need to lock on iteration: gives a “view” of the elements of the table as they existed at the time you iterate.
    • Does not necessarily update that “view” if there are modifications to the underlying table concurrently.
  • Atomic compound operations: putIfAbsent, computeIfAbsent, etc.
  • Resizing creates a separate table, locks on each “bucket” of the original table as it transfers

“The semantics of methods that operate on the entire Map, such as size and isEmpty, have been slightly weakened to reflect the concurrent nature of the collection. Since the result of size could be out of date by the time it is computed, it is really only an estimate, so size is allowed to return an approximation instead of an exact count. While at first this may seem disturbing, in reality methods like size and isEmpty are far less useful in concurrent environments because these quantities are moving targets. So the requirements for these operations were weakened to enable performance optimizations for the most important operations . . .”

Video

Reminders

  • Final Projects: choose topic ASAP!!!
  • Presentations will all be next Monday 5/5.
  • Papers due following Monday 5/12.
// reveal.js plugins