CS239-1 Homework Assignment #5 - May 10, 2007

Homework is due at the beginning of class, Thursday, May 17, 2007

  1. In 2^{k-r} fractional factorial designs, there are usually several possible designs. Jain describes two possible designs for a 2^{4-1} experiment on pages 317-320. For his sample system, assume it is known that D has practically no interaction with any other factor, but the other three factors may or may not interact among themselves. Given that assumption, which design is better: the design shown in Table 19.5 of Jain's example, or the design described by the table below? Show the confoundings for the design below, and explain your reasoning for deciding which is better.

    Experiment No.		A	B	D	C	AD	BD	ABD
    	1		-1	-1	-1	 1	 1	 1	-1
    	2		 1	-1	-1	-1	-1	 1	 1
    	3		-1	 1	-1	-1	 1	-1 	 1
    	4		 1	 1	-1	 1	-1	-1	-1
    	5		-1	-1	 1	 1	-1	-1	 1
    	6		 1	-1	 1	-1	 1	-1	-1
    	7		-1	 1	 1	-1	-1	 1	-1
    	8		 1	 1	 1	 1	 1 	 1	 1
    


    20 points
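
    Confoundings can be listed mechanically: multiply each effect label by every word of the defining relation, cancelling any letter that appears twice. The following is a minimal Python sketch of that bookkeeping. The helper names `multiply` and `aliases` are invented for this illustration, and the generator D = ABC is an example only, not a statement about either design in this question.

```python
from itertools import combinations


def multiply(term_a: str, term_b: str) -> str:
    """Multiply two effect labels; a squared letter cancels (A*A = I)."""
    letters = set(term_a.replace("I", "")) ^ set(term_b.replace("I", ""))
    return "".join(sorted(letters)) or "I"


def aliases(factors: str, defining_words: list[str]) -> dict[str, list[str]]:
    """Map every effect to the effects it is confounded with.

    defining_words must hold the full defining relation (for a half
    fraction that is a single word).
    """
    table = {}
    for size in range(1, len(factors) + 1):
        for combo in combinations(factors, size):
            effect = "".join(combo)
            table[effect] = [multiply(effect, word) for word in defining_words]
    return table


# Example generator only: D = ABC, so the defining relation is I = ABCD.
alias_table = aliases("ABCD", ["ABCD"])
for effect, confounded in alias_table.items():
    print(effect, "=", " = ".join(confounded))
```

    To use the sketch on another design, replace the word list with that design's own defining relation.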

  2. THE SITUATION:

    You have been hired as a consultant by Huge, Inc., to recommend the configuration for 10,000 workstations to be distributed to office employees. The company wants to be able to run Windows software, so you are limited to PC-compatible boxes. Because of budget constraints, you cannot buy the top of the line for every component, but the company is very concerned about performance. The primary work performed by the employees will involve word processing (using Microsoft Word), spreadsheets (Microsoft Excel), mail reading (Microsoft Outlook), web browsing (Internet Explorer), and presentations (Microsoft PowerPoint).

    You have decided to run initial benchmarks using two of the five applications, Excel and Powerpoint. Besides the workload, you have identified 5 other factors that will affect the purchase. A preliminary test on what you believe to be a slow configuration reveals that running a single benchmark may take an hour or more. This means that you would have to allocate at least 64 hours (2^6) for running a full factorial design. Worse, because you wish to present reliable results, you have decided that you would like to run each benchmark 5 times. However, you do not have the 320 hours that this would take.

    To save time, you decide instead to use a fractional factorial design. By switching to a 2^{6-2}r design, you can run only 80 experiments while still determining the major effects.
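
    The run counts quoted above follow directly from the design sizes. As a quick check, here is a small Python sketch; the variable names are chosen only for this illustration.

```python
k = 6  # number of two-level factors
p = 2  # number of factors defined by generators in the fractional design
r = 5  # replications of each configuration

full_runs = 2 ** k * r              # 64 configurations, 5 runs each
fractional_runs = 2 ** (k - p) * r  # 16 configurations, 5 runs each

print(full_runs, fractional_runs)
```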

    The six factors you will investigate are:

          Factor    Description             -1 level            +1 level
    
            A       Workload                PowerPoint          Excel
            B       Operating System        Windows XP          Windows Vista
            C       CPU type                Dual Core Intel     Multicore Intel Xeon
            D       Memory Size             2 GB                8 GB
            E       Disk drive              40 GB, 5400 RPM     200 GB, 7200 RPM
            F       Graphics accelerator    Verto Gforce        nVidia Quadro
    

    Because of your knowledge of the problem and operating systems, you believe that the size and speed of the disk drive will not interact significantly with the other factors.

    THE PROBLEM:

    A. Design a 2^{6-2}r experiment with r = 5 replications. Show the sign table and generator polynomial, and justify your choice of confoundings. Calculate the resolution of your design.
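
    A sign table for a fractional design can be built by enumerating the base factors in full and deriving each remaining column as a product of base columns. The resolution is the length of the shortest word in the full defining relation. The Python sketch below shows both steps. The function names are invented here, and the generators E = ABC and F = BCD are placeholders chosen only so the sketch runs; they are not a recommended answer.

```python
from itertools import product


def multiply(term_a, term_b):
    """Multiply two effect labels; a squared letter cancels (A*A = I)."""
    letters = set(term_a.replace("I", "")) ^ set(term_b.replace("I", ""))
    return "".join(sorted(letters)) or "I"


def sign_table(base, generators):
    """Return one dict of -1/+1 levels per run, first base factor varying fastest."""
    rows = []
    for levels in product((-1, 1), repeat=len(base)):
        row = dict(zip(reversed(base), levels))
        for new_factor, word in generators.items():
            sign = 1
            for letter in word:
                sign *= row[letter]
            row[new_factor] = sign
        rows.append(row)
    return rows


def defining_relation(generators):
    """All words obtained by multiplying non-empty subsets of the generator words."""
    words = [multiply(new, word) for new, word in generators.items()]
    relation = set()
    for mask in range(1, 2 ** len(words)):
        term = "I"
        for i, word in enumerate(words):
            if mask >> i & 1:
                term = multiply(term, word)
        relation.add(term)
    return relation


def resolution(generators):
    """Length of the shortest word in the defining relation."""
    return min(len(word) for word in defining_relation(generators))


# Placeholder generators, for illustration only.
placeholder = {"E": "ABC", "F": "BCD"}
columns = "ABCD" + "".join(placeholder)
print(" ".join(f"{c:>3}" for c in columns))
for run in sign_table("ABCD", placeholder):
    print(" ".join(f"{run[c]:>3}" for c in columns))
print("defining relation:", sorted(defining_relation(placeholder)))
print("resolution:", resolution(placeholder))
```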

    B. The following link leads to a file containing the raw data for all possible combinations of the 6 factors, giving 5 output values for each combination. The output values are run times in minutes. Select the 80 lines corresponding to your fractional design and perform all appropriate analysis, as summarized in Chapter 19 of Jain. DO NOT ANALYZE ALL 320 COMBINATIONS. YOU SHOULD NOT NEED TO EVEN LOOK AT THE DATA FOR THE 240 COMBINATIONS YOU DID NOT CHOOSE TO EXPERIMENT WITH. I suggest using "grep" or a similar tool to extract the combinations you care about.
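
    As an alternative to "grep", the selection can be scripted. The Python sketch below rests on a loud assumption: it supposes each data line starts with the six factor levels coded as -1 or 1, separated by whitespace, followed by the measured values. The actual file layout is not described here, so the parsing will likely need adjusting. The function name `select_runs` and the sample lines (with dummy 0.0 values) are invented for the illustration.

```python
def select_runs(lines, wanted):
    """Keep only the lines whose first six fields match a wanted level combination.

    ASSUMPTION: fields 1-6 are the levels of factors A-F coded as -1 or 1.
    """
    selected = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # blank or malformed line
        try:
            levels = tuple(int(field) for field in fields[:6])
        except ValueError:
            continue  # header or other non-numeric line
        if levels in wanted:
            selected.append(line)
    return selected


# Dummy lines in the assumed layout; the 0.0 values are not real measurements.
sample_lines = [
    "A B C D E F time",
    "-1 -1 -1 -1 -1 -1 0.0",
    "1 -1 -1 -1 -1 -1 0.0",
    "1 1 -1 -1 1 -1 0.0",
]
wanted_combinations = {(-1, -1, -1, -1, -1, -1), (1, 1, -1, -1, 1, -1)}
for kept in select_runs(sample_lines, wanted_combinations):
    print(kept)
```

    In practice the set of wanted combinations would come from the rows of your own sign table, and the lines would be read from the downloaded data file.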

    CONDITIONS:

    For this problem, you may use any automated tool at all, so long as you show the sign table and intermediate results at the level generally shown in Jain. (You do not have to show both the Total and the Total/n lines.)


    50 points

  3. You are analyzing a subsystem of a ubiquitous computing environment that determines the physical location of a mobile computer and, based on that location, enrolls it into a suitable wireless network for that location. The system has three alternative location determination mechanisms: GPS, localization using scene analysis of the signal strengths of nearby wireless networks, or Placelab's technique of mapping observed wireless beacon IDs to known maps. Enrollment can be performed either with no security, with a simple exchange of credentials that allows secure sharing of a key, or with a full-fledged negotiation protocol that, in addition to setting up a key, allows flexible methods of identifying the user to the network and vice versa.

    The primary concern in this experiment is how long it takes to get the user up and working in the wireless network, which requires completion of both localization and enrollment. Based on the designs of the various systems, it is expected that the chosen localization component has no effect on the performance of the enrollment component, and vice versa.

    You have run a two-factor full factorial experiment of this situation, with the following results, in seconds:

    				No security	Credentials	Negotiation
    
    	GPS			20.7		 23.3		25.0
    
    	Scene Analysis		12.8		 15.8		18.1
    
    	Placelab		9.7		 11.8		14.6
    

    Perform the analysis to determine the effects for the experiment. Estimate the experimental errors and describe the allocation of variation for the effects and errors. Perform an analysis of the variance. Plot the quantile-quantile plot and residuals vs. predicted response, and comment on them.
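
    The arithmetic for a two-factor full factorial design without replications can be organized as in the Python sketch below, which uses the additive model (grand mean plus row effect plus column effect plus error) on the table above. The variable names are chosen for this illustration. The sketch covers only the effects, the sums of squares, and the F ratios; it does not look up critical F values or draw the plots.

```python
locations = ["GPS", "Scene Analysis", "Placelab"]
enrollments = ["No security", "Credentials", "Negotiation"]
times = [
    [20.7, 23.3, 25.0],
    [12.8, 15.8, 18.1],
    [9.7, 11.8, 14.6],
]

n_loc, n_enr = len(locations), len(enrollments)
grand_mean = sum(map(sum, times)) / (n_loc * n_enr)

# Effects are row and column means measured from the grand mean.
loc_effects = [sum(row) / n_enr - grand_mean for row in times]
enr_effects = [
    sum(times[i][j] for i in range(n_loc)) / n_loc - grand_mean
    for j in range(n_enr)
]

# Residuals: observed minus predicted (grand mean + row effect + column effect).
residuals = [
    [times[i][j] - (grand_mean + loc_effects[i] + enr_effects[j])
     for j in range(n_enr)]
    for i in range(n_loc)
]

ss_loc = n_enr * sum(e ** 2 for e in loc_effects)
ss_enr = n_loc * sum(e ** 2 for e in enr_effects)
sse = sum(e ** 2 for row in residuals for e in row)
sst = sum((y - grand_mean) ** 2 for row in times for y in row)

# Mean squares and F ratios for the analysis of variance.
mse = sse / ((n_loc - 1) * (n_enr - 1))
f_loc = (ss_loc / (n_loc - 1)) / mse
f_enr = (ss_enr / (n_enr - 1)) / mse

print(f"grand mean: {grand_mean:.3f}")
for name, effect in zip(locations + enrollments, loc_effects + enr_effects):
    print(f"effect of {name}: {effect:.3f}")
for name, ss in [("localization", ss_loc), ("enrollment", ss_enr), ("error", sse)]:
    print(f"{name}: SS = {ss:.3f} ({100 * ss / sst:.2f}% of variation)")
print(f"F(localization) = {f_loc:.1f}, F(enrollment) = {f_enr:.1f}")
```

    The `residuals` list and the predicted values (observed minus residual) are the inputs needed for the quantile-quantile plot and the residuals vs. predicted response plot.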


    30 points