Teaching‎ > ‎

CS350 Spring 2009 TF

This is the teaching fellow's page. I will post some supplemental material here from time to time, allowing both students and the professor to keep track of what happened in discussion sessions.

General Information

You're welcome to make appointment by e-mail; checking my schedule first.

Cache Replacement Testing

posted Apr 30, 2009, 5:36 PM by Site Admin Role   [ updated Apr 30, 2009, 5:54 PM ]

Regarding the programming assignment, I've attached two simplified workloads that should help you test whether you implemented cache replacement correctly. This is what you should get when running 1 process on 1 core for 1,000,000 references with wrap-around.

cache lines used: 4, unused: 65532
 pid        hits      misses    eviction  trace file
   1      999996           4           0  data/simple-cache-addrs

cache lines used: 40, unused: 65496
 pid        hits      misses    eviction  trace file
   1      999936          64          24  data/synthetic-l2-cache-addrs

The simple-cache-addrs should be easy enough to see what's going on. The four addresses are mapped to four distinct sets, so there should be no eviction. The synthetic workload is slightly more involved. It is generated using the following for-loop:

for (int msb = 0; msb < 16; msb++)
  for (int index = 0; index < 4; index++)
    printf("%#x\n", (msb << 18 | index << 6));

The purpose is that, for each of the index 0 through 3, each of which map to a cache set, we want to populate that cache set 16 times. Since cache replacement is uniform random, eviction can occur quite often.


posted Apr 24, 2009, 11:01 AM by Site Admin Role

  • Making two threads take turns to print out "Giraffe", "Zebra", "Giraffe", "Zebra", ... (see attachment turns.c below).
    • Thread execution is concurrent by nature.
    • pthread_yield() and sched_yield() only works when another thread is blocked. On uniprocessor system, they can achieve the interleaving effect, but not on multiprocessor system.
    • Don't bother with sleep.
    • Use the "turns" critical section.
  • Array out of bounds access memory corruption (see corrupt.c below).
    • We read a line that is longer than what the buffer could hold, causing array out of bounds access.
    • Some memory is corrupted after the buffer, but that doesn't show during execution.
    • The corruption causes segmentation fault at seemingly unrelated spot in code.

Pthread in C++

posted Apr 23, 2009, 10:16 AM by Site Admin Role   [ updated Apr 23, 2009, 10:42 AM ]

Some people like to wrap everything into C++ classes, including threads. There is some intricate danger in wrapping a resource around a C++ object because the C++ object gives you the impression that the resource can be aliased (i.e. it is okay to have multiple pointers/references to the same object) when it is not. Also, since you're not allowed to peek inside the internal state of the object, the user may call object methods in an order that violates object consistency and produce undefined result. In the worst case, it may cause leakage of scarce resources like processes, file descriptors, and network sockets. Object-oriented encapsulation takes away your ability to reason about resource usage.

For example, suppose you write a C++ Mutex class to wrap around pthread mutex. What if the destructor is called when the mutex is still locked?

In the code I attach below, the Thread class represents a pthread. It has a start() method that begins execution of some threaded code and a join() method to wait until the thread finishes. What if start() is called twice on the same object? (Answer: the previous thread could be orphaned and we lose a reference to it while it's still running). What if the C++ object is destructed while a thread is still running? (Answer: the operating system wouldn't be able to properly clean-up the thread, such as freeing its run-time stack, which can be as large as 4MB). The example code I attach below tries to deal with these issues by doing lots of run-time checks. The run-time checks is what gives the impression that C++ is bloated.

Having warned you with all that, the code I attach below is tested to work on g++ (The C++ compiler in GCC) with -Wall.


posted Apr 17, 2009, 11:01 AM by Site Admin Role

  • Reading a line and rewinding file pointer in stdio.h; see rrcat.c (round-robin cat) below. Given a list of files in the command line, read a line from each of them in round robin order and continues indefinitely. If a file hits the end of file, rewind to the first line.
  • Basics on extracting bits (see shift.c below).


posted Apr 10, 2009, 11:03 AM by Site Admin Role

  • Singly linked list based stack, single thread.
  • Creating 32 producer and 32 consumer threads. Use pthread_create() to start threads. Each thread routine must look like this: void *thread_main(void *arg) {...}, i.e. takes one void * argument and returns a void * value.
  • Use pthread_join() to wait until another thread finishes.
  • What is causing the double-free error?
  • Establishing critical section using pthread mutex.
    • Entry code: pthread_mutex_lock()
    • Exit code: pthread_mutex_unlock()
    • Mutex needs to be initialized using pthread_mutex_init() before use, and destroyed with pthread_mutex_destroy() when no longer needed.
  • The "volatile" keyword for variables prevents compiler optimization from erroneously retaining value of memory location when it could be changed outside of the current control flow. Dekker's and Peterson's algorithm all need to use the volatile keyword in order to compile correctly.
  • See that the multi-threaded program actually saturates CPU on an 8 processor AMD Opteron machine.


posted Mar 10, 2009, 5:14 PM by Site Admin Role

Getting familiar with C programming. Let the class participated in brainstorming ideas for implementing problem 6. Went over some C language constructs that would be required for that problem.

Students are encouraged to do other problems on their own time as well. Unit tests for selected exercises are also included in exercises.tar.gz attached below. When you do problem 2 (strlen.c), problem 4 (assoc.c) and problem 5 (palin.c), you could modify the corresponding source file and do a "make test", which would create a unittest_main executable. Run the executable to see if your implementation works. You are encouraged to come up with a unit testing strategy for problem 3 as well.


posted Feb 27, 2009, 12:45 PM by Site Admin Role   [ updated Feb 27, 2009, 1:51 PM ]

  • (session 1 only) Given a PRNG that generates samples Xi with a known mean μ and variance σ2, we can construct a PRNG that generates samples Xi' with a specified mean μ' and variance σ'2 by letting Xi' = a + b * Xi. We get μ' = a + b * μ, and σ' = bσ. Solve for a and b.
  • (session 2 only) Went over some C language constructs (enum, union) that might be helpful for describing events (see event.h below).
  • Implemented a simple integer queue based on singly linked list (see queue.h, queue.c below)
  • Unit testing the queue (see unittest_main.c below).
Instruction for downloading and installing "check", the unit testing framework that I used:
  • Download the source package (check-0.9.6.tar.gz).
  • Unpack and change into source directory: tar zxf check-0.9.6.tar.gz && cd check-0.9.6
  • Configure: ./configure --prefix=$HOME/local
    Feel free to substitute $HOME/local to anywhere you see fit. This is the destination that "check" will be installed to. Be sure to modify the Makefile to tell it the correct prefix to use.
  • Make and make install: make && make install
    From this point on, you can remove the source directory and the tar.gz package, but keep the "prefix" directory (i.e. the destination).
  • Clean up sources: cd .. && rm -rf check-0.9.6
  • Clean up downloaded package: rm check-0.9.6.tar.gz
A complete transcription of these steps can be found in "installing-check.txt" below.

To make sure that unit testing works, try to break code in queue.c and see how the test fails. First comment out the lines commented by marker 1, make and run the unit test. You should see that the deq_null test fails. Restore queue.c to its original content. Now, comment out the line commented by marker 2, make and run the unit test. You should see that enq_deq_enq_deq test fails.

Homework 2

posted Feb 23, 2009, 5:15 PM by Site Admin Role   [ updated Feb 23, 2009, 5:16 PM ]

 A summary of formulas that might be helpful for homework 2 is attached.


posted Feb 20, 2009, 11:14 AM by Site Admin Role   [ updated Feb 20, 2009, 11:56 AM ]

  • How to read Zα/2 from the table in lecture notes.
  • How many more samples do we need to get in order to reach a certain confidence interval? See more_samples.c attached.
  • Sampling a pseudo random number generator and see how many samples we need in order to reach the target error. See confidence.c attached.
    • Let zed = 1.96 for 95% confidence interval.
    • First use drand48(), uniform distribution [0.0, 1.0), S2 = 1/12.
      • E = 0.005, expected N = 12805
      • E = 0.004, expected N = 20008
      • E = 0.003, expected N = 35570
      • E = 0.002, expected N = 80033
    • Then use dexprand48() (hw1 problem 5), exponential distribution λ = 1.0, S2 = 1/(λ2) = 1.0
      • E = 0.005, expected N = 153664
    • In general, if you want to halve the error, you need to quadruple the number of samples.


posted Feb 14, 2009, 3:16 PM by Site Admin Role

Just went over homework 1 today. Also did an M/M/1 example.

1-10 of 14