CS239-1 Homework Assignment #4 - May 3, 2007

Homework is due at the beginning of class on Thursday, May 10, 2007.

  1. In the 2^2r experiment design for Time Warp discussed in class, we started with a 2^2 experiment based on the first set of run times, which were 820, 217, 776, and 197 seconds. Analyze a 2^2 experiment using the second set of values from the 2^2r experiment, which were 822, 228, 798, and 180 seconds. Use the analysis method based on solving the four equations in four unknowns. Analyze the allocation of variation for the variables.
  2. Repeat the previous question, but use the sign table method instead of solving the simultaneous equations, and use the third set of data instead of the second. This set is 813, 215, 750, and 220 seconds. Analyze the allocation of variation for the variables.
  3. Compare the three regression models for 2^2 experiments we have built for this data, the one in the lecture and the two you calculated above. Also compare them to the 2^2r regression model for the full set of 4 replications performed in the lecture. How much difference is there between the models? How does each of them compare to the 2^2r model built in class?
  4. Active networks are a technology in which nodes in the middle of a network can perform complex and perhaps even arbitrary operations on packets passing through them on the way from the source to the destination. Such operations are sometimes called adaptations, since they are typically intended to adapt properties of the data flow to local network conditions. Panda was a piece of middleware we built to help deploy multiple adapters in an active network to support adaptive handling of data flows. For example, video streams could be filtered before passing through a limited link to reduce the bandwidth used, compression or decompression could be applied, or encryption could be performed to add security to a wireless link.

    The basic idea was that data would flow from the source node S through n intermediate nodes, I1 through In, before being delivered to destination node D. Adapters of different types could be placed at some, all, or none of the intermediate nodes. More than one adapter could be placed on an intermediate node. We have chosen a particular configuration with 5 intermediate nodes for testing. We have the ability to control the bandwidth, delay, and loss rate on each link. We will use a high-bandwidth, high-resolution video stream as our test workload. We will assume that all nodes are the same kind of machine and all have the same hardware and software configuration. We have adapters that perform lossless compression and decompression, adapters that drop video frames, adapters that can buffer data, adapters that perform encryption and decryption, and null adapters that perform no transformation on the data and incur only the minimal cost of placing an adapter in the data stream.

    What metrics should we use to evaluate whether Panda is a useful system? What factors might be of concern in measuring the system? Which factors would you choose as primary factors for the experiments? If you think you need to know something not specified here to make a proper choice, describe what it is and why it would be important to know it.

    Describe a 2^k design for an experiment using the primary factors you have chosen. Include the values you selected as levels for each factor.
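
As a reminder of the mechanics behind the sign table method named in question 2 and the 2^k design asked for above, here is a minimal Python sketch. It is not the lecture's code. The function names (sign_table, effects, variation) are hypothetical, and the run order is an assumption: the four run times are taken to be listed as (A low, B low), (A high, B low), (A low, B high), (A high, B high). Check that ordering against your lecture notes before relying on the signs. The data used is the first set of run times, which was already analyzed in class.

```python
from itertools import combinations, product
from math import prod


def sign_table(k):
    """Build the sign table for a 2^k design.

    Returns (labels, rows). Columns are I, the main effects, and all
    interactions. Rows are runs, with factor A varying fastest
    (an assumed run order, not taken from the assignment).
    """
    names = [chr(ord("A") + i) for i in range(k)]
    subsets = [c for r in range(k + 1) for c in combinations(range(k), r)]
    labels = ["I"] + ["".join(names[i] for i in s) for s in subsets[1:]]
    rows = []
    for levels in product((-1, 1), repeat=k):
        levels = levels[::-1]  # reverse so that A changes on every run
        # Each interaction column is the product of its factor columns;
        # the empty product (column I) is 1.
        rows.append([prod(levels[i] for i in s) for s in subsets])
    return labels, rows


def effects(y, k):
    """Effect for each column: dot(column, y) divided by the number of runs."""
    labels, rows = sign_table(k)
    n = len(y)
    return {lab: sum(row[j] * y[i] for i, row in enumerate(rows)) / n
            for j, lab in enumerate(labels)}


def variation(q, n):
    """Fraction of total variation explained by each non-mean effect."""
    sst = n * sum(v * v for lab, v in q.items() if lab != "I")
    return {lab: n * v * v / sst for lab, v in q.items() if lab != "I"}


# First set of run times from the lecture, in the assumed run order.
first_set = [820, 217, 776, 197]
q = effects(first_set, 2)
fractions = variation(q, len(first_set))
print(q)
print(fractions)
```

Under the assumed run order, this gives a mean of 502.5, effects of -295.5 for A, -16 for B, and 6 for AB, with A accounting for roughly 99.7% of the variation. The same functions accept larger k, which may help when laying out the runs of the 2^k design for the Panda question.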