Enough responses are in to start a full-scale analysis of the Star Trek quiz. (The Babylon 5 quiz needs to be extended with more difficult questions before it can be effectively analyzed.) Details follow.

In the previous version, the only modelling was that of a single classroom. This one is far more involved. The questions were modelled with a two-parameter Rasch fit (strictly speaking, the two-parameter logistic generalization of the Rasch model), which means each question has two assigned parameters: difficulty and discrimination. Unlike the single classroom analysis of the last version, these definitions are driven by the ability of the student answering the question.

If a student has ability level x, then there is a certain probability of answering a question correctly. If the difficulty of the question has the same value x as the testee’s ability, then there is a 50% probability that the student answers correctly. Students with lower ability have a lower probability of answering correctly, while students with higher ability have a greater probability of answering correctly. The rate at which these probabilities change is related to the item’s discrimination value. A higher discrimination means the change is more sudden.
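As a minimal sketch of that relationship, the curve below assumes the standard logistic form used for two-parameter fits. The post does not spell out the exact formula, so the function name p_correct and the logistic shape are assumptions for illustration, not necessarily the spreadsheet formula used here.

```python
import math

def p_correct(ability, difficulty, discrimination):
    """Probability of a correct answer under an assumed two-parameter logistic curve."""
    # When ability == difficulty the exponent is zero, giving exactly 0.5.
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))

# Illustrative values only: an item of difficulty 42 with discrimination 0.03.
for ability in (0, 42, 84):
    print(ability, round(p_correct(ability, 42, 0.03), 3))
```

With this form, raising the discrimination steepens the curve around the difficulty value, so the probability moves away from 50% faster as ability departs from the item's difficulty.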

148 people responded to the quiz, and only 18 of them got 100%. 17 of the 20 questions here can be assigned values on this scale. The remaining three questions (“Who played Spock,” “James T. Kirk’s middle name” and “bad shirt colour”) were too easy to be effectively modelled. (In fact, everybody answered “who played Spock” correctly.) The rest were modelled with a basic structure as follows.

First, I needed to define the parameters of the normal distribution. These are extremely preliminary, as 12% of the testing population got 100%. When a testee scores 100%, his or her ability cannot be measured. At best, a confident lower limit can be assigned. In the long term, I intend to add enough difficult questions that nobody gets 100% on the test, but every question is answered correctly by at least one testee. Still, there appeared to be enough spread that the 18 testees who scored 100% could be removed and the remaining scores formed a decent normal (bell, Gaussian) distribution. Furthermore, that distribution’s upper tail would account for 14% of the population, which is close enough to the 12% that I went ahead with the fit using this data.

To complete the fit, one must arbitrarily choose the mean and standard deviation for the distribution. Both of these values are taken to be 42. Once that has been established, rudimentary fits can be performed to determine the difficulty and discrimination of the questions. Done properly, I would refit student performance, update the norms, and keep going back and forth with Newton-Raphson or Runge-Kutta methods until the fit is optimal for all parameters. That is the way this will be handled once I start programming the test software. While using Google Forms to administer the test, my analysis tools amount to spreadsheets, so I’m only doing a first-generation fit at this point. Because the mean and standard deviation are both 42, approximately 2/3 of testees (those within one standard deviation of the mean) fall between 0 and 84 for their overall performance.
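To give a flavour of the Newton-Raphson step mentioned above, here is a rough sketch that estimates one testee's ability while holding the item parameters fixed. It assumes the standard logistic response curve, and the function names (p_correct, score_gradient, estimate_ability) and the response pattern are hypothetical; only the five difficulty and discrimination pairs are taken from the results table (questions 1, 4, 6, 7 and 9). A full fit would alternate this with refitting the items, which is not shown.

```python
import math

def p_correct(ability, difficulty, discrimination):
    """Assumed two-parameter logistic response curve."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))

def score_gradient(ability, items, responses):
    """First derivative of the log-likelihood with respect to ability."""
    return sum(a * (u - p_correct(ability, b, a))
               for (b, a), u in zip(items, responses))

def estimate_ability(items, responses, start=42.0, tol=1e-6, max_iter=50):
    """Newton-Raphson maximum-likelihood estimate of a single testee's ability."""
    if all(responses) or not any(responses):
        # A perfect (or zero) score has no finite maximum-likelihood ability.
        raise ValueError("ability cannot be measured for this response pattern")
    ability = start
    for _ in range(max_iter):
        gradient = score_gradient(ability, items, responses)
        # Negative second derivative of the log-likelihood (always positive).
        curvature = sum(a * a * p * (1.0 - p)
                        for p, a in ((p_correct(ability, b, a), a) for b, a in items))
        step = gradient / curvature
        ability += step
        if abs(step) < tol:
            break
    return ability

# (difficulty, discrimination) pairs for questions 1, 4, 6, 7 and 9.
items = [(-52, 0.03), (-57, 0.04), (-47, 0.03), (119, 0.05), (33, 0.04)]
# Hypothetical testee: right on everything except question 7.
print(estimate_ability(items, [1, 1, 1, 0, 1]))
```

The error raised for an all-correct pattern mirrors the point made earlier: a testee who scores 100% has no measurable ability on this scale, only a lower limit.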

The actual test parameters came out as follows:

Question | Difficulty | Discrimination
1 – Series creator | -52 | 0.03
2 – Who played Spock? | Lower than we can measure | Indeterminate
3 – T. in “James T. Kirk” | Can’t be statistically measured. (146 out of 148 answered this correctly.) | Indeterminate
4 – Ship’s engineer | -57 | 0.04
5 – Last name only? | -110 – This is a very flaky fit. 129 out of 148 answered correctly. | 9
6 – Serial number of ship | -47 | 0.03
7 – First captain | 119 | 0.05
8 – Shirt colour | Too easy to measure. (144 out of 148 answered correctly.) | Indeterminate
9 – First through Guardian of Forever | 33 | 0.04
10 – Real name of Leo Walsh | 46 | 0.035
11 – When was Kirk surgically altered? | 91 | 0.015
12 – Theme music composer | 92 | 0.025
13 – Art director / production designer reference on TNG? | 1 | 0.06
14 – Final episode? | 122 | 0.02
15 – Cyrano Jones profession | -3 | 0.03
16 – What is quadrotriticale? | 56 | 0.03
17 – Which episode was referenced in the 30 year anniversary? | 1 | 0.05
18 – Species that experiences Pon Farr? | -34 | 0.05
19 – Episode without nasty kids? | 36 | 0.02
20 – Episode with no sequel? | 20 | 0.02

Our DC comics quiz is complete, and our Marvel comics quiz is in progress. Expect to see the first of these on Monday.