First Published in Nature (1907), No. 1949, Vol. 75, 450-451.
In these democratic days, any investigation into the trustworthiness and peculiarities of popular judgments is of interest. The material about to be discussed refers to a small matter, but is much to the point.
A weight-judging competition was carried on at the annual show of the West of England Fat Stock and Poultry Exhibition recently held at Plymouth. A fat ox having been selected, competitors bought stamped and numbered cards, for 6d. each, on which to inscribe their respective names, addresses, and estimates of what the ox would weigh after it had been slaughtered and "dressed." Those who guessed most successfully received prizes. About 800 tickets were issued, which were kindly lent me for examination after they had fulfilled their immediate purpose. These afforded excellent material.
The judgments were unbiased by passion and uninfluenced by oratory and the like. The sixpenny fee deterred practical joking, and the hope of a prize and the joy of competition prompted each competitor to do his best. The competitors included butchers and farmers, some of whom were highly expert in judging the weight of cattle; others were probably guided by such information as they might pick up, and by their own fancies.
The average competitor was probably as well fitted for making a just estimate of the dressed weight of the ox, as an average voter is of judging the merits of most political issues on which he votes, and the variety among the voters to judge justly was probably much the same in either case. After weeding thirteen cards out of the collection, as being defective or illegible, there remained 787 for discussion. I arrayed them in order of the magnitudes of the estimates, and converted the cwt., quarters, and lbs., in which they were made, into lbs., under which form they will be treated.
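The conversion of cwt., quarters, and lbs. into lbs. described above can be sketched as follows. This is a minimal illustration, not Galton's own working: it assumes the imperial hundredweight of 112 lb. and the quarter of 28 lb., and the function name `to_pounds` and the example card are hypothetical.

```python
# Imperial (avoirdupois) units assumed for the competition cards.
LB_PER_QUARTER = 28   # 1 quarter = 28 lb.
LB_PER_CWT = 112      # 1 hundredweight (cwt.) = 4 quarters = 112 lb.


def to_pounds(cwt: int, quarters: int, lbs: int) -> int:
    """Convert an estimate written as cwt., quarters, and lbs. into lbs."""
    return cwt * LB_PER_CWT + quarters * LB_PER_QUARTER + lbs


# Hypothetical card reading "10 cwt. 3 qr. 3 lb." (not taken from the cards
# themselves); it was chosen because it totals 1207 lb., the middlemost
# estimate reported in the article.
example_lb = to_pounds(10, 3, 3)
print(example_lb)
```

Reducing every card to a single unit is what allows the estimates to be arrayed in order of magnitude.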
Distribution of the estimates of the dressed weight of a particular living ox, made by 787 different persons.
According to the democratic principle of "one vote one value," the middlemost estimate expresses the vox populi, every other estimate being condemned as too low or too high by a majority of the voters (for fuller explanation see "One Vote, One Value," NATURE, February 28, p. 414). Now the middlemost estimate is 1207 lb., and the weight of the dressed ox proved to be 1198 lb.; so the vox populi was in this case 9 lb., or 0.8 per cent of the whole weight too high. The distribution of the estimates about their middlemost value was of the usual type, so far that they clustered closely in its neighbourhood and became rapidly more sparse as the distance from it increased.
Diagram from the tabular values.
But they were not scattered symmetrically. One quarter of them deviated more than 45 lb. above the middlemost (3.7 per cent.), and another quarter deviated more than 29 lb. below it (2.4 per cent.), therefore the range of the two middle quarters, that is, of the middlemost half, lay within those limits.
It would be an equal chance that the estimate written on any card picked at random out of the collection lay within or without those limits. In other words, the "probable error" of a single observation may be reckoned as 1/2 (45+29), or 37 lb. (3.1 per cent.). Taking this for the p.e. of the normal curve that is best adapted for comparison with the observed values, the results are obtained which appear in above table, and graphically in the diagram.
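The percentages quoted so far can be checked with a few lines of arithmetic. The sketch below uses only figures stated in the text (1207 lb., 1198 lb., and the quartile deviations of 45 lb. and 29 lb.); the variable names are illustrative. It assumes that the 0.8 per cent is reckoned on the actual weight and the other percentages on the middlemost estimate, which is the reading that reproduces the printed figures.

```python
median_lb = 1207        # middlemost estimate
actual_lb = 1198        # dressed weight of the ox
dev_a, dev_b = 45, 29   # quartile deviations from the middlemost, in lb.

# Error of the vox populi, as a percentage of the actual weight.
median_error_lb = median_lb - actual_lb
median_error_pct = 100 * median_error_lb / actual_lb

# "Probable error" of a single observation: half the sum of the two
# quartile deviations.
probable_error_lb = (dev_a + dev_b) / 2


def pct_of_median(deviation_lb: float) -> float:
    """Express a deviation in lb. as a percentage of the middlemost estimate."""
    return 100 * deviation_lb / median_lb


print(median_error_lb, round(median_error_pct, 1))
print(round(pct_of_median(dev_a), 1), round(pct_of_median(dev_b), 1))
print(probable_error_lb, round(pct_of_median(probable_error_lb), 1))
```

Rounded to one decimal place these give 0.8, 3.7, 2.4, and 3.1 per cent, matching the article.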
The abnormality of the distribution of the estimates now becomes manifest, and is of this kind. The competitors may be imagined to have erred normally in the first instance, and then to have magnified all errors that were negative and to have minified all those that were positive. The lower half of the "observed" curve agrees for a large part of its range with a normal curve having the p.e.=45, and the upper half with one having its p.e.=29. I have not sufficient knowledge of the mental methods followed by those who judge weights to offer a useful opinion as to the cause of this curious anomaly. It is partly a psychological question, in answering which the various psychophysical investigations of Fechner and others would have to be taken into account. Also the anomaly may be partly due to the use of a small variety of different methods, or formulae, so that the estimates are not homogeneous in that respect.
It appears then, in this particular instance, that the vox populi is correct to within 1 per cent of the real value, and that the individual estimates are abnormally distributed in such a way that it is an equal chance whether one of them, selected at random, falls within or without the limits of -3.7 per cent and +2.4 per cent of their middlemost value.
This result is, I think, more creditable to the trustworthiness of a democratic judgment than might have been expected.
The authorities of the more important cattle shows might do service to statistics if they made a practice of preserving the sets of cards of this description, that they may obtain on future occasions, and loaned them under proper restrictions, as those have been, for statistical discussion. The fact of the cards being numbered makes it possible to ascertain whether any given set is complete.
A facsimile of the original article is available here. Galton's piece resulted in a few letters to the editor of Nature that were printed (along with Galton's responses) a few weeks later. The letters can be seen here and here in their original format but are reproduced below. In them Galton reveals that he had also calculated the mean answer, as opposed to just the median given in the article.
Nature March 28 1907, NO. 1952, VOL. 75
LETTERS TO THE EDITOR.
IN reference to the weight-judging competition, Mr. Galton says that "the average competitor was probably as well fitted for making a just estimate of the dressed weight of the ox as an average voter is of judging the merits of most political issues on which he votes." These competitions are very popular in Cornwall; but I do not think that Mr. Galton at all realises how large a percentage of the voters - the great majority, I should suspect - are butchers, farmers, or men otherwise occupied with cattle. To these men the ability to estimate the meat-equivalent weight of a living animal is an essential part of their business; and, as an instance of their training, I may mention that one of the butchers here has a son under thirteen years of age who is an adept at this work, and is already, I am told, one of the best weight-judges in the district. This boy has been trained to it by his father, and already surpasses his instructor. Moreover, many of the competitors doubtlessly compete frequently, compare notes afterwards, and correct future estimates by past experience. Now the point of all this is that, in so far as this state of things prevails, we have to deal with, not a vox populi, but a vox expertorum. I am afraid that the majority of such competitors know far more of their business, are far better trained, and are better fitted to form a judgment, than are the majority of voters of any party, and of either the uneducated or the so-called "educated" classes. I heartily wish that the case were otherwise.
F. H. PERRY-COSTE.
Polperro, Cornwall, March 21.
I INFERRED that many non-experts were among the competitors, (1) because they were too numerous (about 800) to be mostly experts; (2) because of the abnormally wide vagaries of judgment at either end of the scale; (3) because of the prevalence of a sporting instinct, such as leads persons who know little about horses to bet on races. But I have no facts whereby to test the truth of my inference. It would be of service in future competitions if a line headed "Occupation" were inserted in the cards, after those for the address.
MR. HOOKER, in NATURE of March 21, seems not to have quite appreciated my principal contention in the letters "One Vote, One Value" and "Vox Populi" of February 28 and March 7 respectively. It was to show that the verdict given by the ballot-box must be the Median estimate, because every other estimate is condemned in advance by a majority of the voters. This being the case, I examined the votes in a particular instance according to the most appropriate method for dealing with medians, quartiles, &c. I had no intention of trespassing into the technical and much-discussed question of the relative merits of the Median and of the several kinds of Mean, and beg to be excused from not doing so now except in two particulars. First, that it may not be sufficiently realised that the suppression of any one value in a series can only make the difference of one half-place to the median, whereas if the series be small it may make a great difference to the mean; consequently, I think my proposal that juries should openly adopt the median when estimating damages, and councils when estimating money grants, has independent merits of its own, besides being in strict accordance with the true theory of the ballot-box. Secondly, Mr. Hooker's approximate calculation from my scanty list of figures, of what the mean would be of all the figures, proves to be singularly correct; he makes it 1196 lb. (which is the mean of the deviates at 5°, 15°, 95°), whereas it should have been 1197 lb. This shows well that a small orderly sample is as useful for calculating means as a very much larger random sample, and that the compactness of a table of centiles is no hindrance to their wider use. I regret to be unable to learn the proportion of the competitors who were farmers, butchers, or non-experts. It would be well in future competitions to have a line on the cards for "occupation."
Certainly many non-experts competed, like those clerks and others who have no expert knowledge of horses, but who bet on races, guided by newspapers, friends, and their own fancies.
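The first of Galton's two particulars, that suppressing one value shifts the median by only half a place while it may greatly alter the mean of a small series, can be illustrated with a short sketch. The six estimates below are hypothetical and are not taken from the competition; the last is a deliberately wild value.

```python
from statistics import mean, median

# Hypothetical small series of estimates in lb. (not Galton's data);
# the final value is a deliberate outlier.
estimates = [1150, 1180, 1200, 1210, 1230, 1900]

# Suppress one value: here, the outlier.
without_outlier = estimates[:-1]

# How far each summary moves when that single value is removed.
mean_shift = mean(estimates) - mean(without_outlier)
median_shift = median(estimates) - median(without_outlier)

print(round(mean_shift, 1), median_shift)
```

With six values the median lies between the 3rd and 4th; with five it is the 3rd, so it moves by half a place (5 lb. here), whereas the mean moves by more than 100 lb.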
However, it should be noted that in another letter a few weeks previously Galton had not shown much faith in the mean as a useful measure of collective judgement.
Nature, Volume 75, Issue 1948, p. 414 (1907).
ONE VOTE, ONE VALUE
A CERTAIN class of problems do not as yet appear to be solved according to scientific rules, though they are of much importance and of frequent recurrence. Two examples will suffice. (1) A jury has to assess damages. (2) The council of a society has to fix on a sum of money, suitable for some particular purpose. Each voter, whether of the jury or of the council, has equal authority with each of his colleagues. How can the right conclusion be reached, considering that there may be as many different estimates as there are members? That conclusion is clearly not the average of all the estimates, which would give a voting power to "cranks" in proportion to their crankiness. One absurdly large or small estimate would leave a greater impress on the result than one of reasonable amount, and the more an estimate diverges from the bulk of the rest, the more influence would it exert. I wish to point out that the estimate to which least objection can be raised is the middlemost estimate, the number of votes that it is too high being exactly balanced by the number of votes that it is too low. Every other estimate is condemned by a majority of voters as being either too high or too low, the middlemost alone escaping this condemnation. The number of voters may be odd or even. If odd, there is one middlemost value; thus in 11 votes the middlemost is the 6th; in 99 votes the middlemost is the 50th. If the number of voters be even, there are two middlemost values, the mean of which must be taken; thus in 12 votes the middlemost lies between the 6th and the 7th; in 100 votes between the 50th and the 51st. Generally, in 2n-1 votes the middlemost is the nth; in 2n votes it lies between the nth and the (n + 1)th.
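The rule for locating the middlemost value, and the claim that every other estimate is condemned by a majority, can be sketched as below. The function names `middlemost` and `condemned_by_majority` and the eleven votes are hypothetical, introduced only for illustration; a voter is taken to call a value "too high" when his own estimate is lower, and "too low" when it is higher.

```python
def middlemost(estimates):
    """Return the nth of 2n-1 votes, or the mean of the nth and (n+1)th of 2n."""
    ordered = sorted(estimates)
    count = len(ordered)
    if count == 0:
        raise ValueError("at least one estimate is required")
    half = count // 2
    if count % 2 == 1:
        return ordered[half]                        # single middlemost value
    return (ordered[half - 1] + ordered[half]) / 2  # mean of the two middlemost


def condemned_by_majority(value, estimates):
    """True if more than half the voters judge `value` too high, or too low."""
    says_too_high = sum(1 for e in estimates if e < value)
    says_too_low = sum(1 for e in estimates if e > value)
    majority = len(estimates) / 2
    return says_too_high > majority or says_too_low > majority


# Hypothetical votes: ten moderate estimates and one "crank".
votes = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 5000]
average = sum(votes) / len(votes)

print(middlemost(votes), condemned_by_majority(middlemost(votes), votes))
print(round(average, 1), condemned_by_majority(average, votes))
```

Here the middlemost is the 6th of the 11 votes and escapes condemnation, while the average, dragged upward by the single large vote, is judged too high by ten of the eleven voters.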
I suggest that the process for a jury on their retirement should be (1) to discuss and interchange views; (2) for each juryman to write his own independent estimate on a slip of paper; (3) for the foreman to arrange the slips in order of the values written on them; (4) to take the average of the 6th and 7th as the verdict, which might finally be approved by a substantive proposition. Similarly as regards the resolutions of councils, having regard to the above (2n - 1) and 2n remarks.