The Force Concept Inventory (FCI) has influenced the development of many research-based pedagogies. However, no data exists on the FCI’s internal consistency or test-retest reliability. The FCI was administered twice to one hundred students during the first week of classes in an electricity and magnetism course with no review of mechanics between test administrations. High Kuder–Richardson reliability coefficient values, which estimate the average correlation of scores obtained on all possible halves of the test, suggest strong internal consistency. However, 31% of the responses changed from test to retest, suggesting weak reliability for individual questions. A chi-square analysis shows that change in responses was neither consistent nor completely random. The puzzling conclusion is that although individual FCI responses are not reliable, the FCI total score is highly reliable.
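As an illustration of the two quantities discussed above, the following minimal sketch computes a Kuder–Richardson coefficient and the fraction of responses that change between two administrations. It assumes the KR-20 form of the coefficient (the abstract does not say which Kuder–Richardson formula was used), dichotomously scored items (1 = correct, 0 = incorrect), and population variances. The function names `kr20` and `fraction_changed` and the toy data are hypothetical and are not taken from the study.

```python
from statistics import pvariance


def kr20(scores):
    """KR-20 for a list of students, each a list of 0/1 item scores.

    Assumes KR-20 with population variances; other conventions exist.
    """
    n_students = len(scores)
    n_items = len(scores[0])
    # Proportion of students answering each item correctly.
    p = [sum(row[i] for row in scores) / n_students for i in range(n_items)]
    # Sum of item variances p * (1 - p).
    item_var_sum = sum(pi * (1 - pi) for pi in p)
    # Variance of the students' total scores.
    total_var = pvariance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - item_var_sum / total_var)


def fraction_changed(test, retest):
    """Fraction of individual responses that differ between test and retest.

    Each argument is a list of students, each a list of chosen answers.
    """
    pairs = [(a, b) for r1, r2 in zip(test, retest) for a, b in zip(r1, r2)]
    return sum(a != b for a, b in pairs) / len(pairs)


# Made-up toy data (4 students, 3 items), for illustration only.
toy_scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
toy_test = [["A", "B"], ["C", "D"]]
toy_retest = [["A", "C"], ["C", "D"]]

print(kr20(toy_scores))
print(fraction_changed(toy_test, toy_retest))
```

Because KR-20 depends only on the item difficulties and the spread of total scores, it can stay high even when many individual answers change between administrations, which is consistent with the contrast the abstract describes.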