dissenting.
I agree with the majority’s conclusion that the opinion of the magistrate, who tried the case below by consent, seriously fails to consider much of the extensive and sophisticated statistical evidence presented at trial, and partly fails to evaluate the various kinds of evidence, statistical and non-statistical, in the proper framework. I also agree, for the most part, with the majority’s explication of the proper framework for considering classwide and subsidiary individual claims of disparate treatment. I disagree, however, with its conclusions that under the proper analysis several of the plaintiffs’ claims of discrimination were clearly established. I conclude instead that these claims must fail, though on grounds at times somewhat different from those relied on by the district court, or at most should be remanded for further findings under the proper mode of analysis lest we usurp the role of factfinder. In the discussion below I restate the class action disparate treatment framework in order to clarify a point argued by the plaintiffs and left unclear in the majority opinion, and then turn to the particular claims of discrimination.
I. Legal Standards
As the majority notes, ante at 4, disparate treatment claims, whether brought on behalf of an individual or a class, begin with the requirement that the plaintiff establish a prima facie case of discrimination. In McDonnell Douglas Corp. v. Green, 411 U.S. 792, 802, 93 S.Ct. 1817, 1824, 36 L.Ed.2d 668 (1973), the Supreme Court specified the elements of a prima facie case of individual, racially motivated hiring discrimination: that the plaintiff belongs to a racial minority, applied and was qualified for an open position, was rejected, that the position remained open, and that the employer continued to seek similarly qualified applicants. Because these particular elements would make no sense if applied in other discrimination contexts, such as racial discrimination against non-minority-group members or discrimination in areas of employment other than hiring, the Court has emphasized that “[t]he facts necessarily will vary in Title VII cases, and the specification above of the prima facie proof required from [the plaintiff] is not necessarily applicable in every respect to differing factual situations.” Id. at 802 n. 13, 93 S.Ct. at 1824 n. 13; see also Texas Department of Community Affairs v. Burdine, 450 U.S. 248, 253 n. 6, 101 S.Ct. 1089, 1093 n. 6, 67 L.Ed.2d 207 (1981); Furnco Construction Corp. v. Waters, 438 U.S. 567, 575-76, 98 S.Ct. 2943, 2948-49, 57 L.Ed.2d 957 (1978); International Brotherhood of Teamsters v. United States, 431 U.S. 324, 358, 97 S.Ct. 1843, 1866, 52 L.Ed.2d 396 (1977). The critical characteristic of the prima facie case, no matter what form particular circumstances require it to take, is that it demonstrate that some decision was made and eliminate the “most common legitimate reasons on which an employer might rely” in making the employment decision at issue, so as, “absent other explanation, to create an inference that the decision was a discriminatory one.” Teamsters, 431 U.S. at 358 n. 44, 97 S.Ct. at 1866 n. 44. 
This principle, although first articulated in McDonnell Douglas, an individual disparate treatment case, applies with equal force in class actions. Id. at 358-59, 97 S.Ct. at 1866.
McDonnell Douglas also established that a plaintiff who succeeds in establishing a prima facie case must prevail unless the defendant dispels the inference of discrimination by “articulat[ing] some legitimate, nondiscriminatory reason” for its action, 411 U.S. at 802, 93 S.Ct. at 1824, and that even if the defendant meets this burden of production, the plaintiff may prevail by overcoming the defendant’s proffered explanation, id. at 804, 93 S.Ct. at 1825. The Supreme Court has made clear that in individual disparate treatment cases the ultimate burden of persuasion always remains with the plaintiff. Burdine, 450 U.S. at 253, 101 S.Ct. at 1093. If the defendant never elicits or presents evidence suggesting a legitimate reason for its actions, this burden is met by the prima facie case alone, because the prima facie case constitutes “the establishment of a legally mandatory, rebuttable presumption” rather than “enough evidence to permit the trier of fact to infer the fact at issue.” Id. at 254 n. 7, 101 S.Ct. at 1094 n. 7. The purpose of the presumption created by this initial burden is to “flush out” the relevant evidence, by giving the parties in whose control it lies an incentive to produce it expeditiously. Id. at 253, 255 n. 8, 101 S.Ct. at 1093, 1094 n. 8. Once the defendant meets its burden of production, this purpose is discharged, and “the presumption drops from the case,” id. at 255 n. 10, 101 S.Ct. at 1095 n. 10, leaving the plaintiff with the “ultimate burden” (which he or she retained all along, and met initially because of the presumption) “of persuading the trier of fact that the defendant intentionally discriminated against the plaintiff,” id. at 253, 101 S.Ct. at 1093.
Indeed, the Court recently emphasized that if the employer responds to the plaintiff’s case by presenting evidence, whether the parties satisfied their intermediate burdens becomes irrelevant, and the factfinder should simply consider the ultimate issue whether the plaintiff’s explanation is persuasive. United States Postal Service Board of Governors v. Aikens, 460 U.S. 711, 103 S.Ct. 1478, 1482, 75 L.Ed.2d 403 (1983). Aikens thus makes clear that the “shifting burdens” approach of McDonnell Douglas is not a prescription for refereeing carefully timed volleys of proof and counterproof, but rather “ ‘merely a sensible, orderly way to evaluate the evidence in light of common experience as it bears on the critical question of discrimination.’ ” Id., quoting Furnco, 438 U.S. at 577, 98 S.Ct. at 2949; see also Holden v. Commission Against Discrimination, 671 F.2d 30, 35-36 (1st Cir.) (allocation of burdens goes to analysis rather than timing of proof), cert. denied, 459 U.S. 843, 103 S.Ct. 97, 74 L.Ed.2d 88 (1982); Nulf v. International Paper Co., 656 F.2d 553, 560 (10th Cir.1981) (same); Sime v. Trustees of the California State University & Colleges, 526 F.2d 1112, 1114 (9th Cir.1975) (same).
The applicability of the McDonnell Douglas analysis to classwide claims of discrimination is an issue challenged by the plaintiffs in this case and ambiguously resolved in the majority opinion. Title VII class actions and pattern-or-practice cases brought by the government on behalf of a class of employees, see 42 U.S.C. § 2000e-6 (1976), are generally bifurcated into a liability phase, in which the classwide discrimination is sought to be proved, and a remedial phase, in which individuals’ membership in the class and the extent of their injuries are at issue. Teamsters, 431 U.S. at 361, 97 S.Ct. at 1867. What must be proved in phase I is that the employer engaged in a consistent rather than sporadic pattern of discrimination. Id. at 336, 97 S.Ct. at 1855 (pattern-or-practice case); Franks v. Bowman Co., 424 U.S. 747, 751, 772, 96 S.Ct. 1251, 1257, 1268, 47 L.Ed.2d 444 (1976) (class action). Teamsters and Franks explicitly hold that if a plaintiff class or the government prevails at phase I the burden of disproving discrimination against individuals who seek relief at phase II shifts to the employer without more. Teamsters, 431 U.S. at 359 & n. 45, 97 S.Ct. at 1867 & n. 45; Franks, 424 U.S. at 772-73 & n. 32, 96 S.Ct. at 1268 & n. 32. The confusion that I perceive in this area stems from the manner in which the Court described this relationship between phase I and phase II: Prevailing at phase I is said to establish the prima facie case for each individual class member at phase II, with the added twist that the burden of persuasion rather than production on the individual claim then shifts to the employer. Teamsters, 431 U.S. at 359, 97 S.Ct. at 1866, quoting Franks, 424 U.S. at 772, 96 S.Ct. at 1268 (“By ‘demonstrating the existence of a discriminatory hiring pattern and practice’ the plaintiffs had made out a prima facie case of discrimination against individual class members; the burden therefore shifted to the employer ‘to prove that individuals who reapply were not in fact victims of previous hiring discrimination.’ ”); id. 431 U.S. at 360, 97 S.Ct. at 1867 (“At the initial, ‘liability’ stage of a pattern-or-practice suit the Government is not required to offer evidence that each person for whom it will ultimately seek relief was a victim of the employer’s discriminatory policy. Its burden is to establish a prima facie case that such a policy existed.”). The plaintiffs interpret this formulation to mean that establishment of a prima facie case of a classwide pattern of discrimination at phase I, rather than ultimately prevailing on that issue, ends the liability phase and hurtles the case into phase II individual remedial proceedings, shifting the burden of persuasion to the defendant. PB 6. This interpretation conflates the prima facie case at phase I and the prima facie case at phase II. Teamsters makes clear, however, that, as in individual cases, the plaintiff bears the ultimate burden of persuasion at phase I. 431 U.S. at 336, 97 S.Ct. at 1855 (“As the plaintiff, the Government bore the initial burden of making out a prima facie case of discrimination [citing McDonnell Douglas]. And, because it alleged a systemwide pattern or practice of resistance to the full enjoyment of Title VII rights, the Government ultimately had to prove more than the mere occurrence of isolated or ‘accidental’ or sporadic discriminatory acts. It had to establish by a preponderance of the evidence that racial discrimination was the company’s standard operating procedure— the regular rather than the unusual practice.”) (emphasis added) (citations and footnote omitted); id. at 361, 97 S.Ct. at 1867 (“[T]he question of individual relief does not arise until it has been proved that the employer has followed an employment policy of unlawful discrimination.”). The employer must have an opportunity to answer the classwide prima facie case, see Hazelwood School District v. United States, 433 U.S. 299, 309-10, 97 S.Ct. 2736, 2742-43, 53 L.Ed.2d 768 (1977), before it can be considered a “proved wrongdoer,” Teamsters, 431 U.S. at 359, 97 S.Ct. at 1866 n. 45, so as to justify shifting to it the burden of persuasion on the individual claims for relief, id. at 361-62, 97 S.Ct. at 1867-68.
I believe the majority agrees with this analysis, for it speaks of proof of a pattern or practice of discrimination constituting the prima facie case for individual class members. Ante at 470 n. 7. I consider the majority opinion ambiguous only because it rejects the McDonnell Douglas mode of analyzing the evidence in phase I. Ante at 470 n. 7. If this rejection is meant to indicate that the “proof” required in the liability phase of a class case is “lower” than that required in an individual case under McDonnell Douglas and Burdine, I disagree. The stated reason for rejecting the McDonnell Douglas approach is that its burden-shifting “minuet,” see Vuyanich v. Republic National Bank, 521 F.Supp. 656, 661 (N.D.Tex.1981), vacated and remanded, 723 F.2d 1195 (5th Cir.1984), is an unsuitable way to organize complex statistical proof. But, as I have argued above, the burden-shifting approach does not require that the evidence be organized and submitted in responsive volleys. Aikens establishes that the McDonnell Douglas/Burdine approach means no more than that when employers present no evidence the plaintiffs prevail if their own evidence passes a threshold of proof, and that when employers do present evidence the plaintiffs prevail if their own evidence is sufficient and more persuasive. 103 S.Ct. at 1482. This approach is not more unsuitable for class claims than for individual claims, because the approach concerns burdens rather than timing. There is no reason why the plaintiff’s burden in phase I of a class case should be higher or lower than in an individual case; indeed, by citing McDonnell Douglas, Teamsters indicates that the relative burdens between plaintiff and defendant are the same in both kinds of cases. 431 U.S. at 336, 97 S.Ct. at 1855. What will vary among different types of cases is the formulation of what proof is sufficient to meet these burdens; but the Court has interpreted the McDonnell Douglas principle to encompass these variations. Id. at 358, 97 S.Ct. at 1866 (“The importance of McDonnell Douglas lies, not in its specification of the discrete elements of proof there required, but in its recognition of the general principle that any Title VII plaintiff must carry the initial burden of offering evidence adequate to create an inference that an employment decision was based on a discriminatory criterion illegal under the Act.”). It therefore makes no sense to hold the principle inapplicable in class actions. I conclude that the McDonnell Douglas assignment of burdens, though not of course its particular formulation of what there constituted a prima facie case, applies in phase I of this case.
With these qualifications I agree with the majority’s analysis of the applicable legal standards. I hasten to add that the application of these standards in this case is simple in some ways and complex in others. It is simple in that phase I shifting burdens need not be considered at all: Because the employer did present evidence to counter the plaintiffs’, under Aikens the court should simply consider whether the plaintiffs’ evidence was persuasive. It is complex in that the bifurcation between phase I and phase II, like the shifting of burdens under McDonnell Douglas, may partly be more analytical than temporal. For example, evidence of individual instances of discrimination may be used in phase I as part of the proof of classwide discrimination, as well as in phase II either in establishing an individual claim (if no class claim is proved) or in answering the employer’s evidence (if success on the class claim shifts the burden of proof); likewise, the employer's evidence rebutting the inference of discrimination in individual instances may be used to help counter the classwide case in phase I, see, e.g., Paxton v. Union National Bank, 688 F.2d 552, 587 (8th Cir.1982), cert. denied, — U.S. -, 103 S.Ct. 1772, 76 L.Ed.2d 345 (1983), or to meet its burden on the individual claims. Despite these complexities, the framework is useful for organizing and evaluating the evidence.
II. Specific Claims
Background
St. Cloud State University is one of seven public institutions governed by the Minnesota State University Board (formerly the Minnesota State College Board). St. Cloud was once purely a teachers’ college, and was called St. Cloud State College until 1975, when it assumed its current name in recognition of the growth of its more broadly based educational programs. V.15.56, V.16.73. It now offers a few two-year associate’s degree programs, a broad range of four-year bachelor’s degree programs, and some graduate programs. V.16.11-12. Currently the university is divided into several “colleges”: the College of Business (containing the departments of Accounting, Business Education and Office Administration, Management and Finance, Marketing and General Business, and Quantitative Methods); the College of Education (containing the departments of Educational Administration/Leadership; Health, Physical Education and Recreation; Psychology; and Special Education); the College of Fine Arts (containing the departments of Art, Music, and Theatre); the College of Industry (containing the departments of Industrial Education and Technology); and the College of Liberal Arts and Sciences (containing the departments of Biological Science; Chemistry; Earth Sciences; Economics; English; Foreign Languages and Literature; Geography; History; Interdisciplinary Studies; Mass Communication; Mathematics and Computer Science; Philosophy; Physics and Astronomy; Political Science; Sociology, Anthropology, and Social Work; Speech Communication; and Speech Science, Pathology and Audiology). Px. 268. Each department is headed by a chairperson, who is a member of the faculty budgeted for that department rather than a member of the university administration. Each college is headed by a dean. The deans answer to the Vice President for Academic Affairs, who together with the Vice Presidents for Student Life and Development, Administrative Affairs, and University Relations answers to the President. The deans, vice presidents, and President, together with various assistants and associates, are members of the administration. Grachek II at 60-66; Px. 266; V.15.154-55. The State University Board appoints the presidents of the seven state universities, and delegates most authority for appointing other administrators and faculty to the universities themselves. V.15.71-74.
The teaching faculty at St. Cloud are employed under three kinds of contracts and at four (formerly five) ranks. They may be employed as fixed-term, probationary, or tenured employees. Fixed-term appointments, which ordinarily may extend no more than two years, are typically used to fill temporary or last-minute vacancies, or to fill positions necessitated by “enrollment bulges,” for which permanent funding is unavailable. Probationary or tenure-track appointments are year-to-year appointments of longer expected duration during which faculty members are reviewed as candidates for tenure. Tenure is an employment status automatically granted to any probationary employee retained more than six years, but which may be granted earlier, which provides the expectation of indefinite employment and some actual guarantees of job security, such as a requirement of notice prior to termination. V.3.4-6; V.15.95-96; V.16.19-22. In addition, faculty members hold various ranks: professor (I), associate professor (II), assistant professor (III), and instructor (IV). A fifth rank, assistant instructor, was merged with rank IV in 1975. V.3.11; V.15.13-14; Dx. 69. Faculty members are assigned a rank when they are initially hired, and thereafter may be promoted to higher ranks. They may be nominated or request to be considered for promotion or tenure, and are then evaluated by their department chairperson and college dean, the chairperson of an evaluation committee, the Vice President for Academic Affairs, and the President. The President has the ultimate authority to grant or deny the requested change in status. V.3.38-39; V.19.86, 90; Px. 271b at 27; Px. 271c at 25-26.
Until 1971 “minimum academic achievement” standards for assignment to rank, with specified exceptions, were established by state regulations promulgated by the State College Board. Dx. 6, § (o). In 1971 the Board authorized the seven individual institutions to set their own standards and promulgated an interim policy, to be effective until the new standards were formally adopted. Dx. 7. Both the original and interim policies used attainment of or progress toward an academic degree as the sole criterion for rank determinations. Between 1971 and 1972, at the request of the Vice President for Academic Affairs, St. Cloud’s deans formulated more stringent and considerably more specific guidelines, using additional criteria such as teaching ability, scholarship, and amount of experience, to guide and regularize their recommendations for initial appointments and promotions. These “deans’ guidelines,” meant to help standardize the evaluation process even though they could not dictate the final decisions, were distributed to the faculty in March 1972. Dx. 15; Px. 260d; V.16.31-36. The interim State College Board Operating Policy, which remained in effect despite the existence of the deans’ guidelines, was superseded in 1974 by the official adoption of A.P.T. (appointment, promotion, and tenure committee) Guidelines for Appointment and Promotion, which were similar to the deans’ guidelines. Dx. 16; V.16.41-42. In 1976 the State University Board entered a collective bargaining agreement with the Inter-Faculty Organization/Minnesota Education Association (“IFO/MEA”), the exclusive bargaining agent of the faculty members of all seven state universities. Px. 271a; V.15.46-51. Article XI, section B of the agreement, and of successor agreements effective in 1977 and 1979, Px. 271b, 271c, set new minimal criteria for appointment to rank, consisting of degree and experience requirements less detailed than the deans’ or A.P.T. guidelines; provided that the parties would meet and confer about qualifications and criteria upon request; and vested in the employer the discretion to assign a faculty member to a rank lower than that for which he or she was minimally qualified and to determine what degrees or experience satisfied the generalized criteria. New detailed Guidelines for Retention, Tenure, and Promotion, similar to but progressively more comprehensive than the deans’ and A.P.T. guidelines, were adopted by the faculty senate at St. Cloud in 1976 and 1977 pursuant to the meet-and-confer provision. Dx. 17, 18; V.16.42-44, 46-49. These guidelines were not re-promulgated under the third collective bargaining agreement, but continued to be used in evaluating promotion recommendations. V.16.48-49. Article XXIV of the second and XXV of the third collective bargaining agreements list seven general criteria (such as teaching ability, scholarly achievement, length of service, and experience) to be used in making promotion decisions, but do not specify what weight the criteria should be assigned or how the requirements should vary from rank to rank.
The State College Board regulations discussed above contained a salary schedule that prescribed minimum and maximum salaries for each rank, with some overlap between ranks. Dx. 6, § (p). The minimum/maximum salary range approach was retained under the three collective bargaining agreements, Px. 271a, art. XI; Px. 271b, art. XI; Px. 271c, art. XI; V.16.103, but was changed in the course of the third agreement to a more precise step-and-lane approach, Px. 271c, app.; V.15.52. To understand how particular salaries were set under these systems requires some background explanation of how the universities are funded. The state university system is funded by biennial appropriations by the Minnesota legislature to the State University Board, which in turn allocates the available money among the seven universities. V.15.20. Before 1977 the legislature appropriated one average faculty salary for every nineteen “full-time equivalent” students (i.e., counting part-time students proportionately to their course load), plus additional money to be used for salary increases; since 1977 it has used the 1977 enrollment base, and has funded subsequent growth in the size of the faculties solely from increased tuition revenues. V.15.44-46. The specific salary within the salary range that a faculty member was awarded on initial appointment depended to some degree on the available appropriation for that year. V.16.105. This initial salary became a base salary that could be increased in subsequent years in five ways: by across-the-board increases, by promotion in rank or degree completion, by performance adjustments, by market factor adjustments, and by equity adjustments. The legislature typically appropriated funds equal to a specific percentage of current faculty salaries to be used for increases, and the State College Board/State University Board specified how the funds it provided to each university would be allocated between flat sum or percentage across-the-board increases, merit/performance increases, and other adjustments. Dx. 21, ex. 6a, 7a, 8a, 10a; Dx. 23 at 4. Each year the university determined a uniform increase to accompany promotions to each rank or completion of a degree (e.g., $600 for promotion to rank I, $400 for promotion to rank II, $300 for promotion to rank III, and $500 for completion of a doctoral degree, see id.), although the amounts varied from year to year, and occasionally a shortage of allocated funds required the university to award “dry promotions,” or promotions without increase in salary, V.15.29; V.16.103-06. Performance increases were awarded in each year that there was sufficient funding, to faculty members who applied, upon the department’s or dean’s recommendation and with the approval of the Vice President for Academic Affairs and the President; except that no increase could be awarded if it would push a faculty member’s salary above the maximum allowed for his or her rank. Prior to the adoption of the step-and-lane schedules the increases were either a flat amount or divided into two steps in each year, although the amounts varied from year to year. V.16.106-13. Scarcity of and competition for a particular faculty member’s skills, either generally or because of the state of the job market in a particular year, always played a part in setting his or her initial salary within the permissible range, but only in 1979 when step-and-lane salary schedules were adopted did the universities begin awarding some market factor increases (the number depending on available funds) of one step in the schedule to existing faculty in seven specified high-competition, scarce market fields, such as economics and computer science. V.19.145-50. Each of the foregoing types of adjustment is an alteration of the base salary rather than a one-time payment. Because market and funding circumstances played such a great role in the determination both of initial salaries and of several of the types of subsequent increases, and because the accidental variations in those amounts, depending on the year in which the determinations were made, carried over from year to year, unwarranted differences between the salaries of similar faculty members were not uncommon. To counter this inevitable but unintended consequence of the funding structure, the universities historically set aside a portion of the salary-increase funds for equity adjustments, to be added, like the other kinds of increases, to the base salary. V.15.28-30, 172; V.16.103, 114-15. The step-and-lane salary schedules adopted in 1979 were intended to prevent the problem of distortion that equity adjustments had been used to remedy. V.16.113.
In 1971 the State College Board created a commission of women to study the status of women in the state colleges. V.15.15-16. In 1972 the commission issued two reports that compiled and analyzed data on such things as the number, rank, responsibilities, salaries, tenure status, age, and degree of job satisfaction of women as compared to men on the faculties, and made a series of recommendations stressing that women should be sought out and encouraged to apply for faculty, administrative, committee, and other positions, and that the procedures used to grant benefits such as salary, promotion, tenure, and sabbatical leave be scrutinized to ensure equal treatment of men and women. Px. 302a, 302b. Partly in response to these recommendations the State College Board created a Human Rights Compliance Commission to develop an affirmative action program for the state college system. V.15.16-17. On August 28, 1972, it adopted such a plan, Px. 272b at 13; V.15.17-18, although the state college system was not a federal contractor required by Executive Order No. 11,246, § 202(1), 30 Fed.Reg. 12,319 (1965) (as amended), reprinted in 42 U.S.C. at § 2000e (1976), to have an affirmative action plan, see Px. 275, and was not required by Minnesota state law to have such a plan until two years later, see Px. 272c; no state or federal agency has ever found probable cause to believe that St. Cloud has discriminated, so as to justify a remedial affirmative action order, see V.16.79, and universities were not even required to complete self-evaluations of policies and practices in this area until one year from March 24, 1972, the effective date of the Education Amendments of 1972, see 34 C.F.R. § 106.3(c) (1983).
One of the requirements of the 1972 affirmative action plan was that each of the state colleges conduct an “immediate inquiry into the salaries of minorities, women and men with like degrees, comparable years of service, and rank or position to ascertain where salary inequities exist or have existed.” Px. 272b at 14-15. The State College Board requested the Minnesota legislature to supplement the 1973 biennial appropriation to cover the costs of correcting any such inequities, but the legislature declined. V.15.19-20. The Board therefore directed the state colleges to make affirmative action equity adjustments before making any of the other types of salary adjustments with the funds allocated to salary increases. V.15.20-21; Dx. 21, ex. 9a-9c. To help in determining the extent of salary discrepancies the Chancellor of the State College Board invited the technical assistance of the Wage and Hour Division of the U.S. Department of Labor. V.15.21-23; Px. 208; D.R. 87. The Wage and Hour Division proposed to calculate the discrepancies by comparing women and minority faculty members to other faculty with similar degrees and experience, without regard to rank. The state colleges accepted this proposal, and made the adjustments in the 1973/74 academic year. V.15.22-23; V.16.116-18; Dx. 22; Px. 209-14. Subsequently, a class action was filed on behalf of women at three of the seven state colleges under the equal pay provision of the Fair Labor Standards Act, 29 U.S.C. § 206(d) (1976), seeking further wage adjustments retroactive to the 1972/73 academic year. This case was settled in 1976 by the defendants’ payment, without admission of liability, to seventy-five women, including the named plaintiff and two of the intervenors in this case, of specified sums stipulated to be a “complete and final” adjustment of all such wage claims to date. Dx. 10; V.2.230-32. At the same time an equal pay action brought by the Secretary of Labor against the State of Minnesota was dismissed upon the stipulation that the defendant, again without admission of liability, had made the payments noted above and was currently in compliance with the Fair Labor Standards Act. Dx. 11; see V.15.78-79.
The present action was filed on June 16, 1976, following plaintiff Mary Craik’s unsuccessful bid for a chairpersonship. She alleged classwide and individual discrimination against women in recruiting, hiring, appointment to rank, promotion, chairpersonship appointments, compensation, benefits, and work atmosphere, and sought relief for all past, current, or future women employees and applicants for employment at St. Cloud under Title VII of the Civil Rights Act of 1964 and the fourteenth amendment to the United States Constitution. D.R. 1. On August 28, 1978, the district court declined to certify so broad a class because of problems of typicality and adequacy of representation, but certified a narrower class of all past or current women members of the teaching faculty at St. Cloud, conditional on the intervention of additional plaintiffs who claimed “injuries ... more representative of the class allegations.” D.R. 23. Julie Andrzejewski, Joan Hemmer, and June Goemer thereafter intervened. D.R. 30, 50. On September 29, 1978, the parties stipulated to the dismissal with prejudice of the claims of discrimination in insurance and retirement benefits, and of discrimination in compensation based on an equal pay theory through the 1975/76 academic year. D.R. 32, 37. A motion to expand the class to include past or current administrators who were not members of the teaching faculty was denied. D.R. 60. The remaining claims were tried before a United States magistrate by the parties’ consent. D.R. 59.
Chairpersonships
The plaintiffs offered three kinds of evidence to support their claim that St. Cloud systematically discriminates against women in the appointment of chairpersons: statistics showing that the percentage of past and present chairpersons who are women is much smaller than the percentage of women on the faculty; evidence of failure to comply with affirmative action policies, used to show bad intent; and an individual case of alleged discrimination. Before discussing this evidence, I shall sketch briefly the method by which chairpersons are appointed at St. Cloud.
Starting in 1976 such appointments have been made under procedures prescribed by the collective bargaining agreement. Article XX of the agreement and of successor agreements specifies that the President, after consultation with the department faculty, shall determine whether candidates shall be sought outside as well as within the university, and that the department shall select a nominee from among the candidates by a secret-ballot election. This nominee is presented to the President, who must appoint within ten working days, or reject the nomination and be prepared to justify the decision before the department. If the nominee is rejected, the department must conduct a second secret-ballot election and nominate a different candidate, who must be appointed or rejected in the same manner as the first nominee. The President has the power to appoint, in consultation with the department, a short-term chairperson to fill a temporary vacancy caused by leave of absence, failure to appoint the department’s second nominee, or the like. The term of a chairperson, other than that of a temporary chairperson appointed under this procedure, is three years.
The plaintiffs assert in their brief, PB 12, without citation to the record, that before 1976 the President had unfettered discretion to appoint chairpersons. My examination of the voluminous record, though not exhaustive, indicates that this is true only in the sense it is true for appointments after 1976: the President had the ultimate power of appointment. Both the plaintiffs’ and the defendants’ witnesses testified that the power to appoint was exercised only in consultation with the faculty. Charles Graham, the President of St. Cloud, when asked on cross-examination whether before 1976 he “could have picked anyone out of the sky virtually ... [w]ho was on the faculty,” testified:
St. Cloud has never used a system for selecting Chairpersons of the President picking somebody out of the sky. There has always been involvement of the departmental members in that process. Sometimes a formal election, always a formal consultation process.
V.17B.31. See also V.15.134 (testimony of Garry D. Hays, Chancellor of the State University Board);1 V.12.157-58 (testimony of Robert Becker, special assistant to President Graham);2 Knutson at 41 (deposition of Jack Knutson, former chairperson of psychology department, regarding pre-collective-bargaining departmental interview process). In fact, Craik herself testified that the last vacancy in the chairpersonship of her own department prior to collective bargaining was filled under procedures identical to those codified by the collective bargaining agreement in 1976. V.5.111-12.3 It therefore appears that during the entire period under scrutiny in this case the President wielded no initiative in exercising the ultimate authority to appoint chairpersons, but relied on the initiative of the affected department.
The plaintiffs’ first category of proof, statistical evidence, showed that three of seventy-one chairpersons, or 4.2%, who served between 1970/71 and 1979/80 were women, at a time when women’s representation on the faculty ranged from 18% to 21%. Px. 268, 44. They argue that this low rate of representation justifies an inference of discrimination. That inference would be unwarranted, however, if the applicant pool contained disproportionately few women, the reason for this underrepresentation was nondiscriminatory, and women were selected in proportion to their representation in the pool; in fact, each of these propositions finds support in the record. The 1972 report of The Commission on the Status of Women in the State College System, offered into evidence by the plaintiffs, stated that “part of the explanation for women not holding chairmanship positions [in the seven state colleges] is their own reluctance to seek such positions[, which] is made clear by data showing that while 44.9 per cent of the men state that they would be willing to accept a department chairmanship at their own or another school, only 28.4 per cent of the women would be willing.” Px. 302b at 7. No specific data on how many women sought chairpersonships at St. Cloud prior to collective bargaining exist in the university’s records. Px. 340 at 4. Data on elections conducted under the collective bargaining agreement have been preserved, and show that of fifty-three elections held between 1975/76 and 1979/80, twenty-four were contested; in two of the contested elections women ran only against other women; and in five of the contested elections women ran against men, and men won and were appointed in each. Dx. 77, 78. The plaintiffs argue, PB 16, that the women candidates’ loss in all five of these elections constitutes the “inexorable zero” found so damning in Teamsters, 431 U.S. at 342 n. 23, 97 S.Ct. at 1858 n. 23.
But zero is not always “inexorable.” The Teamsters zero was observed after hundreds of hiring decisions, id. at 341 n. 21, 97 S.Ct. at 1857 n. 21, a result that was surely statistically significant. The disparity observed in this case, by contrast, is not statistically significant. The candidates in the five elections included fifty men and eight women. Dx. 77.4 If all of these candidates were pooled and chance were the sole determinant of outcome we would expect women to win .69 of the five positions; the standard deviation is 2.63; and the observed outcome, zero, is .26 standard deviations from the expected outcome. The probability that the observed outcome occurred purely by chance is about 40% under a one-tailed test, or about 80% under a two-tailed test, far higher than the level at which a social scientist would become suspicious so as to deem the result statistically significant.5 In reality, of course, there was not simply a single pool of fifty-eight candidates for five positions; in some of the individual contests the women candidates were not as outnumbered as in others. Even if we assume, however, that each of the elections was a contest between one man and one woman the results would not be startling. In that case we would expect women to win 2.5 of the positions; the standard deviation would be 1.58; and the observed outcome would be 1.58 standard deviations from the expected. The probability of the observed outcome occurring by chance would be between 5% and 10% under a one-tailed test, or between 10% and 20% under a two-tailed test.6 Thus, even if we focus only on the contested elections in which women competed against men, rather than on all contested elections, as the magistrate did, M. 68, and as the majority finds improper, ante at 15, we do not observe enough statistical disparity to make us suspect discrimination.
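The arithmetic in the preceding paragraph can be retraced with a short Python sketch. It is an illustration only, not part of the record. The function names are hypothetical, and the standard deviation formula is an assumption: it is the square root of the pool size times p times (1 - p), chosen because it reproduces the figures quoted above (2.63 and 1.58). The tail probabilities use a normal approximation, which is also an assumption about how the quoted percentages were obtained.

```python
import math


def upper_tail(z):
    """Upper-tail probability of a standard normal variable at z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def summarize(pool, women, positions, observed=0):
    """Return (expected, sd, z, one_tailed, two_tailed) for `observed` wins by women.

    ASSUMPTION: sd = sqrt(pool * p * (1 - p)). This choice reproduces the
    standard deviations quoted in the text; a binomial model over the
    positions alone would use `positions` in place of `pool` and would
    give a smaller value.
    """
    p = women / pool                      # share of women among the candidates
    expected = positions * p              # expected number of positions won by women
    sd = math.sqrt(pool * p * (1 - p))    # assumed formula, see docstring
    z = abs(observed - expected) / sd     # distance from expectation in sd units
    one_tailed = upper_tail(z)
    return expected, sd, z, one_tailed, 2 * one_tailed


# Scenario 1: all fifty-eight candidates (fifty men, eight women) pooled.
pooled = summarize(pool=58, women=8, positions=5)

# Scenario 2: each of the five elections treated as one man against one woman.
paired = summarize(pool=10, women=5, positions=5)

for label, result in (("pooled", pooled), ("paired", paired)):
    expected, sd, z, one, two = result
    print(f"{label}: expected={expected:.2f} sd={sd:.2f} z={z:.2f} "
          f"one-tailed={one:.1%} two-tailed={two:.1%}")
```

Under these assumptions the pooled scenario gives a one-tailed probability near 40%, and the paired scenario gives one between 5% and 10%, matching the ranges stated in the text.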
Far more troubling than the women candidates’ lack of success in these few elections is the widespread lack of women candidates for election, a phenomenon demonstrated for the early years by the report on the status of women, Px. 302b, and quantified for the later years by Dx. 77. If the defendants’ actions deterred members of the plaintiff class from entering the applicant pool, we could not simply rely on the success rate of actual women applicants in determining whether discrimination existed. The plaintiffs assert, and the majority agrees, that such misconduct on the defendants’ part was proved. I disagree. It is true, as the majority states, ante at 16, that subjective selection processes “might ... account for women’s reluctance to run,” but this possibility alone cannot override the specific finding by the magistrate, who observed the witnesses and heard the testimony, that “[f]ew females have chosen to become candidates but there is no evidence that the low number of female candidates is caused by any discriminatory practice against females by the defendants,” D.R. 98.
The plaintiffs point to two specific kinds of action by the defendants to support their contention that the university suppressed class members’ participation, but neither category is helpful to their case. First, they point out that several of the departments have no women members, making it impossible to select a woman chairperson unless candidates from outside the university are sought. It is undisputed that such external searches are possible only when there is a position to be filled within the department, because the chairperson is a member of the department’s budgeted faculty, see V.18.164-66; V.19.433-34; V.27.105-06; but the plaintiffs argue that external searches for chairpersons of all-male departments were allowed only three times between 1975/76 and 1979/80, see Px. 336, even though there were department vacancies filled in the same year as chairperson appointments in an additional ten cases, see PB 17. But even if this allegation is true7 it is not helpful, because the potential beneficiaries, women candidates from outside the university, are not members of the certified class. Inviting outside applications could not enhance the opportunities for current faculty members to be appointed; indeed, it could do the opposite by providing more competitors for the class members who did choose to run, as in the very election Craik complains of.
Second, the plaintiffs argue that the university failed to implement its affirmative action plan, by not setting goals and timetables for appointment of women as chairpersons, by the President’s failure to exercise his asserted total discretion prior to collective bargaining to appoint class members as chairpersons, and by the President’s failure to exercise his power under the collective bargaining agreement to reject department nominees in favor of temporary chairpersons of his choice. The university system’s affirmative action plan provides that “[t]he Chancellor and each of the college Presidents shall require of the Vice-Chancellors and each department within the colleges the development of numerical goals and timetables for overcoming present imbalances of minorities and women in all job classifications.” Px. 272b at 20-21, 272d at 8. I agree that willful failure to comply with the plan would be somewhat probative of intent (even though St. Cloud’s affirmative action plan was voluntary, unlike mandatory affirmative action plans, which require at least good faith efforts to meet goals and timetables, see U.S. Commission on Civil Rights, Affirmative Action in the 1980s: Dismantling the Process of Discrimination 23-24 (Clearinghouse Publication 65, 1981)), but I believe proof of such willful failure here is lacking. The defendants interpreted the plan not to require goals and timetables for chairperson appointments because they considered the department’s role to be dominant, both before and after collective bargaining, in the selection process. V. 15.133-34 (testimony of Chancellor Hays). I disagree with the majority’s outright rejection of this interpretation on the ground that one cannot bargain away rights assured by Title VII, ante at 472 n. 10, because, as the majority itself notes, “[n]either Title VII, 42 U.S.C. § 1983, nor the Fourteenth Amendment requires an employer to institute an affirmative-action plan,” ante at 472.
Title VII is therefore not a barrier to “bargaining away” an undertaking to set goals and timetables for chairpersonships. Moreover, even if the affirmative action plan, “properly construed” in some sense, would not allow the setting of goals and timetables for chairpersons to be excused in these circumstances, the university’s and State University Board’s contrary interpretation is not so implausible as to justify the conclusion that they acted in bad faith or the inference that they discriminated against women.
The plausibility of the interpretation putting chairperson appointments outside the area for which the university undertook to set goals and timetables depends to a degree on the interpretation of the President’s power to ignore departments’ nominations, for if those recommendations lack any force, they are no obstacle to the setting of goals. As I have discussed above, there is no evidence to support the plaintiffs’ assertion that prior to collective bargaining the President had the first and last say in such appointments; rather, the President’s role seems to have been the same as it is under collective bargaining, to act on departmental nominations. The plaintiffs argue further, however, that even after 1976 the President could reject department nominations, and would have forced departments to select women chairpersons had he been truly devoted to the principles of affirmative action; they conclude that failure to do so bespeaks discrimination. I believe this argument is mistaken for several reasons. First, the President had no obligation under Title VII or the fourteenth amendment to override department choices so as to appoint women to chairpersonships in proportion to their representation on the faculty, because if the plaintiffs had no Title VII right to goals and timetables, still less did they have a right to specific performance of the timetables they desired. The limits of the President’s discretion to accept or reject department nominees therefore must be determined by reference to the collective bargaining agreement rather than by reference to Title VII. The collective bargaining agreement (which we have no reason to believe is different from the uncodified practice prior to collective bargaining) requires that the President reject nominations only for cause, and that even temporary chairperson appointments be made in consultation with the faculty.
It is at least plausible that these provisions make the President’s ultimate power to appoint, like the power of the President of the United States to appoint with the advice and consent of the Senate, somewhat bounded. This interpretation gains additional plausibility from the magistrate’s finding (which cannot be rejected as clearly erroneous) that the relationship between the unionized faculty and administration was rather adversarial. M. 69-70. Indeed, President Graham testified that he understood his role in reviewing nominations to be limited to correction of gross errors of judgment. V. 17A.29.8 The defendants’ deference to departmental selection of chairpersons, precluding the independent setting of goals and timetables, is therefore not so unreasonable an interpretation as to justify an inference of bad faith. Of course, if the defendants were aware that the departmental selection process was itself discriminatory they would have had “cause” under the collective bargaining agreement to reject a nomination. But because underrepresentation of women is not by itself discriminatory under federal law, see 42 U.S.C. § 2000e-2(j) (1976), generalized underrepresentation seems to be insufficient cause to reject a nominee;9 and, as discussed below, I believe discrimination was not proved in the particular election in which Craik participated. Because the defendants reasonably concluded that department nominations could be overridden only for cause, that no such cause existed, and that goals and timetables were unnecessary for decisions so largely beyond their control, I cannot conclude that they shirked their affirmative action undertakings, thereby displaying bad intent.
Neither the failure (if there was a failure) to conduct external searches for chairpersons nor the university’s affirmative action record, therefore, supports an inference that the defendants discouraged class members from seeking chairpersonships. The remaining evidence on which the plaintiffs rely to show that the applicant pool was skewed by the defendants’ actions is less specific, relating to the atmosphere at St. Cloud, discussed further below. Evaluation of this kind of evidence depends almost entirely on credibility determinations, on which we are in a poor position to second-guess the magistrate. I conclude that the magistrate was not clearly erroneous in finding the defendants not responsible for the women faculty members’ low participation rate.10
As the final evidence of classwide discrimination in regard to chairpersonships the plaintiffs cite Craik’s unsuccessful bid for the chairpersonship of the psychology department in 1976. This individual instance of alleged discrimination was used to supplement the statistical evidence of underrepresentation in order to bring “the cold numbers convincingly to life.” Teamsters, 431 U.S. at 339, 97 S.Ct. at 1856. Without specifying the plaintiffs’ burden of proof on this issue, the majority uses the instance to bolster the statistical classwide case, ante at 16; yet later when considering the individual case in detail it would shift the burden of disproving discrimination to the defendants on the ground that the class case was proved, ante at 481. But it is improper to use the instance both ways: If proof of the individual instance is necessary to complete the class proof, shifting of the burden of persuasion on the individual claim is premature; only if the class claim is proved independently should the burden of persuasion shift to the “proved wrongdoer,” Teamsters, 431 U.S. at 359 n. 45, 97 S.Ct. at 1866 n. 45. Because I believe the class claim was not independently proved, Craik’s individual claim should be evaluated as an individual case. Therefore, although I agree with the majority that the magistrate erred in considering Craik’s individual claim under the McDonnell Douglas/Burdine test prior to considering the class claim, that error is inconsequential because of the disposition of the class claim.
Craik’s claim of discrimination in the 1976 selection of the psychology department chairperson is based on a series of allegedly discriminatory incidents directly related to the appointment, placed in context by a background atmosphere of hostility toward women, as illustrated by a different series of allegedly discriminatory incidents unrelated to the election. All the parties concede that at the time of the appointment the psychology department was in upheaval, but the defendants argue that the tensions were created by academic factionalism rather than by hostility to the prospect of a woman chairperson, as the plaintiffs suggest. It seems clear that two sources of unrest, one chronic and one election-specific, underlay the period of the appointment. First, the department had long been balkanized by differences in both theoretical and subject-matter orientation. The latter part of the difficulty was caused in part by the inclusion of the department within the College of Education rather than the College of Liberal Arts and Sciences, which would be more typical, an apparent vestige of St. Cloud’s evolution from a teachers' college. The consequent rivalry between those whose interests were defined by a particular theoretical approach and those whose interests were defined by subject-matter area, between specialists in educational psychology and counseling and those whose interests were more generalized, made the department less than cohesive and prompted periodic calls for reorganization or splitting the department between the colleges. Dx. 88, 89, 103; Px. 319 at 105-09; V. 5.117-18; V. 13.128-67; V. 18.26-29; V. 24.144-46; V. 30.140-42; V. 31.41-42, 127-29; Knutson at 28. 
Knutson resigned the chairpersonship partly because “there had been for years some degree of misunderstanding, shall we say, animosity, squabbling, bickering, political machination which had been going on, still are going on, and I got tired of trying to act as a referee between people.” Knutson at 28. At the time of the 1976 election the department was divided into four subgroups (behavior analysis, counseling, general psychology, and educational psychology), and each department member had formal allegiance to one group. Dx. 80; V. 5.92-93; V. 13.129-31. During the year between Knutson’s resignation as chairperson and the 1976 election a committee of four faculty members, one representing each subgroup, chaired the department. V. 5.92-93; Px. 1d, 1e. The factionalism continued under the committee, as it was perceived to be favoring one subgroup or another. Knutson at 29-30.
Second, in the course of evaluating applicants the department chairperson search committee became embroiled in a dispute about the requirements of affirmative action. After the candidates were narrowed to nine, including two women, there was extensive discussion of whether affirmative action would require one of the women to be hired, whatever the relative merits of the others. Knutson at 39-40. Several members believed that one of the two women finalists must be hired. Id. One member, Robert Murphy, resigned after a majority of the search committee recommended that four candidates be interviewed, because he believed affirmative action policies required the elimination of the male candidates. Id. at 46-47; Px. 1r; V. 31.132-36. The committee proposed informing the finalists “that there are both men and women among the finalists, that St. Cloud State University is an Equal Employment/Affirmative Action Employer, that there are no women chairpersons in the College of Education at present, and that they are invited to call ... if they are interested in arranging an interview,” Px. 1s; P. depos.ex. 8, but the administration advised the department not to send the letter because of possible “reverse discrimination” implications, Knutson at 47-50; V. 18.29-30; V. 24.132-34. Subsequently Lowell Gillett, the acting Vice President for Academic Affairs, met with the department and explained that affirmative action required efforts to seek women and minority candidates, but that it did not require preferential treatment of such a candidate if another had superior qualifications. Px. 1v; V. 24.147-50; see also V. 16.80.
Against this background several incidents occurred that the plaintiffs argue demonstrate sex discrimination. First, one member of the department responded to Murphy’s statement that only women candidates should be considered by remarking that the department would be “stuck with a woman,” Knutson at 46, and responded to the administration’s clarification of the requirements of affirmative action by remarking “that means, then, that we won’t get stuck with an inferior woman,” V. 24.150. These remarks can be interpreted two ways: as expressing hostility toward women, as the plaintiffs urge, or as expressing resentment at the prospect of having to use sex rather than relative merit as the criterion for choosing among the candidates, even if the woman candidate happened to prove “inferior,” as the defendants urge. See V. 7A.82-83; V. 24.173-74; V. 30.42; V. 31.60-63; V. 31.105-07; Kleiber at 19. After hearing the testimony of numerous witnesses to the statements, including the speaker, the magistrate credited the defendants’ explanation. M. 17.
Second, in the middle of the affirmative action controversy the department voted to suspend the search and continue temporary chair arrangements, pending a review of the department’s structural organization. The minutes of a series of department meetings reveal that reorganization of the entire College of Education had been discussed earlier that academic year, Px. 1t; that Knutson discussed the disputes within the search committee with Kenneth Ames, the Dean of the College of Education, who in turn discussed them with Vice President Gillett, id.; that the committee recommended discontinuing the search while reorganization of the department was studied, mistakenly believing that Gillett had recommended that course, id.; see V. 24.135-38, 146-47; V. 5.116-17; Px. 2; that Gillett met with the department and explained his preference that the search continue even if reorganization was under consideration, Px. 1v; that in a later meeting the department, with several students participating, nevertheless voted to discontinue the search, Px. 1w, although the necessary administrative approval was not sought, V. 18.32; V. 24.152; and that two days later the vote was rescinded because student participation was improper under the collective bargaining agreement, because suspending the search after the position had been advertised might subject the department to grievance proceedings, and because Craik had been advised by the state university system’s affirmative action officer that suspension might implicate affirmative action concerns, Px. 1x; see Px. 2. The plaintiffs argue that the search was suspended because of apprehension that a woman would be appointed, but testimony in the record indicates that department members may have favored the proposal as a means of defusing the long-standing explosive disagreements within the committee and department, rather than a means to avoid affirmative action. Knutson at 52-53; Kleiber at 22-25; V. 25.11-12.
Some department members testified that they wanted to suspend the search until after the reorganization because it would be unwise to appoint a chairperson and immediately thereafter to reorganize the department. V. 30.19-21; V. 31.137-39. The magistrate credited the defendants’ explanation. M. 11, 14.
Third, when she was being interviewed by the department as a chairperson applicant Craik was asked two questions she considered improper and indicative of hostility: whether she would sue if she lost the election, V. 2.171; V. 31.64, and whether she would work as hard on the chairperson’s duties as she did on women’s issues and other outside interests, V. 2.171, V. 31.142-43. In reply, the defendants offered evidence that the first question was legitimate because Craik had indicated a willingness to take action in connection with the election, Px. 1w, 1x, 2, and had been rumored to be prepared to sue if she lost, V. 31.64-66; and that the second question was also asked of other candidates, V. 31.144, and was appropriate because Craik had extensive outside commitments to women’s and other organizations, V. 1.87-88; V. 2.79-85; V. 31.143, because another candidate had expressed a desire to spend considerable time on outside consulting, V. 7A.83-84, and because the chairpersonship was demanding, Knutson at 27. Although the magistrate did not make detailed findings on these issues, he held that a bias against women was not demonstrated in the preelection incidents. M. 17.
Fourth, the plaintiffs argue that the voting pattern in the election was suspicious because on the first ballot Craik received ten votes, Terrance Peterson received seven votes, and Neil Wylie received six votes, but on subsequent ballots when the candidate with the least votes was dropped, all of the votes for Wylie shifted to Peterson. Px. 1aa, 4, 5. It is not true, as the plaintiffs assert, that “[t]he probability that all six of these votes would shift from one man to another is non-existent.” PB 21 n. 18. But even though the probability that it would occur by chance is small, the inference that sex discrimination rather than some other cause was at work is not compelled. The defendants presented evidence that all five members of the behavioral analysis subgroup cast votes for Wylie on the first ballots, and switched to Peterson when Wylie was eliminated because they believed Peterson better represented their subgroup’s factional interests. Dx. 80; V. 30.44-45, 148; V. 31.59, 66, 130-31, 146. The sixth faculty member who originally voted for Wylie, Mary Boltuck, testified that she later voted for Peterson because of his administrative experience and because his academic interests were representative of three of the subgroups. V. 30.114. Each of the three subgroups other than behavioral analysis, although not unanimous, tended to favor either Craik or Peterson on the first ballots. Dx. 80; see V. 30.23-25; V. 31.107-08. The magistrate credited the defendants’ explanation that the voting pattern was attributable to factional politics and reasonable assessments of Craik’s and Peterson’s relative qualifications. M. 15-16; D.R. 79.
Finally, the plaintiffs argue that President Graham displayed indifference to the sex discrimination issue by offering the chairpersonship to Peterson on May 28, 1976, before the department search committee had submitted an affirmative action report, contrary to the requirements of the affirmative action plan, see Px. 272b at 19,11 and by failing to investigate the conduct of the election despite Craik’s complaints to Graham and Dean Ames that she felt bias existed, V. 2.178-82, 185-86; V. 5.132-36; V. 17A.30-31; V. 18.32-33. Because the affirmative action report eventually submitted by Knutson on June 11, 1976, stated that a majority of the search committee believed the department and administration had violated affirmative action policies, on the ground that Craik was well qualified and the secret-ballot process might conceal discriminatory motivations, the plaintiffs urge that proper investigation would have led Graham to exercise his power under the collective bargaining agreement to reject Peterson as the department’s nominee. There is evidence in the record, however, that Graham and his advisers did not act blindly in approving the nomination. Vice President Gillett, Dean Ames, and James Kitchen, St. Cloud’s affirmative action officer, closely monitored the search process: Gillett consulted with Ames and Kitchen, V. 24.133-41, attended one department meeting to clear up disputes, V. 24.142-50; Px. 1v, and became “very much involved in the process,” Knutson at 48; Ames attended three department meetings, Px. 1t, 1v, 1y, and discussed disputed issues with members of the search committee, V. 18.29; V. 24.133; and Kitchen met with members of the search committee and with the committee as a whole “to discuss the recruiting procedures and ... what effect affirmative action would have on the entire search process,” Px. 141a, was “in close contact ... 
every day on the phone several times” with the head of the search committee when disputes arose, Knutson at 58; see also V. 5.124, and may have spoken with Craik after the election, V. 2.188. Craik testified that on May 17, 1976, the day of the election, while she was “very emotional and upset ... and angry and frustrated,” she complained to Ames of the voting pattern, the search cancellation, the “stuck with a woman” comments, and the questions at her interview, but was unsure whether she identified the department members who made the remarks, V. 5.132-33; she also testified that she discussed the same matters with Graham a day or two later, V. 5.134-35. Ames and Graham both testified, and the magistrate found, M. 17, that Craik did not identify those who made the remarks. V. 17A.31, 35; V. 18.33. Graham testified that his calendar showed that he met with Craik on May 26, 1976, V. 17A.30, and that he interpreted the collective bargaining agreement to require the decision to appoint or reject the nominee to be made by May 28, ten days following the election, V. 17A.34. On May 28 he met with Gillett, Ames, Kitchen, and Becker, his assistant, to discuss the nomination. V. 17A.32-33. Both Ames and Gillett had attended the meeting at which one of the “stuck with a woman” remarks was made and at which cancellation of the search was discussed, Px. lv; V. 18.25-32; V. 24.143-50, and therefore had first-hand knowledge of the context of those incidents. At the May 28 meeting Graham and the others discussed these incidents and the interview-question incident, and concluded that they were not so suggestive of discrimination as to invalidate the election. V. 17A.35-38; V. 18.34; V. 24.172-76. There was testimony that the shortness of the time in which a decision had to be made and the impossibility of determining who had cast which votes in the secret-ballot election made further investigation difficult. V. 12.153; V. 17A.35-36; V. 18.73; V. 24.175, 177; Px. 
141a, 141b; see also Knutson at 79. Instead, they considered that even if two different people were responsible for the remarks and question, invalidation of their votes as biased could not deprive Peterson of a majority of the votes. V. 17A.36-37; V. 24.175. They also independently assessed the relative strength of Peterson and Craik as candidates to determine whether the nomination was suspect on that basis, and concluded that Peterson was an appropriate nominee, not inferior to Craik. V. 17B.18; V. 18.34-36; V. 24.18-22, 167-70, 177. Having satisfied himself that the nomination was not procedurally or substantively outlandish, Graham thought he must accept the department’s choice. V. 17A.29. The magistrate concluded that intentional discrimination on Graham’s part had not been shown. M. 17-18.
The plaintiffs argue, however, that making the decision to appoint without the benefit of the affirmative action report indicates lack of good faith. But evidence in the record suggests that the administration was attempting to comply with the directives of both the collective bargaining *503agreement and the affirmative action plan, and to reconcile those directives when they seemed to conflict. There was testimony that the psychology department election was one of the first elections conducted under the 1976 collective bargaining agreement, V. 17A.29, and that the provision requiring appointment or rejection within ten days of the election was strictly construed, V. 17A.33-35. At the same time, it appears that despite several requests by Dean Ames, V. 18.41-43, Knutson was waiting until after the administration decided whether to appoint Peterson to submit the affirmative action report, and that when the ten days passed without the report appearing he was pressured to prepare it immediately, although he “had informed them what the content of the report would be even before.” Knutson at 75-76. The delay seems partly attributable to Knutson’s understanding of what the report should contain. The affirmative action plan required that “[w]hen qualified minority or female applicants are not recommended for hiring, those charged with the hiring responsibility must submit written justification for same as part of these individual reports,” Px. 272b at 19, and Knutson testified that he could not determine why faculty members had not voted for Craik because of the secret-ballot process prescribed by the collective bargaining agreement, Knutson at 78-79. The affirmative action report eventually submitted (nominally by a majority of the search committee, although three of the six members testified they did not subscribe to it, see V. 
30.25-28, 45-47, 112-13) reflects this concern, for one of the reasons it concludes affirmative action policies were not complied with is that the secrecy of the ballots masked the individual voters’ criteria of selection. Dx. 25b at 4.12 To an extent this concern is procedural, dealing with inability to say whether the criteria were bad rather than concluding that they were in fact bad;13 but to the extent it is substantive, dealing with the actual appropriateness of the criteria, it is speculative, for the report refers only to the possibility that some voters acted improperly.14 To support their contention that some voters actually acted improperly, the plaintiffs emphasize that the search committee ranked Craik first among the seven candidates remaining after the two top candidates withdrew. But this ranking was a composite of the individual rankings compiled by the search committee members, Knutson at 63, rather than one on which there was consensus, see, e.g., V. 20.26-27, 46, 113, and the candidates were ranked to determine who would be interviewed rather than as a recommendation of preference, Knutson at 40-42, 44-45, 55-57; V. 30.26, for the committee rankings were supposed to be confidential, Knutson at 57-58. That other faculty members ranked the candidates differently is not strong evidence of sex discrimination, especially because of the department’s history of political intrigue, because some members of the search committee *504who ranked Craik lower than Peterson originally ranked another woman candidate, who later withdrew, first, V. 30.21-22, 43; V. 31.145, and because of extensive testimony, which the magistrate credited, M. 
14-16, that department members believed Peterson’s qualifications to be objectively stronger than Craik’s, DB 23.15 There is therefore substantial evidence to support the conclusions that Graham carefully considered Peterson’s nomination before he confirmed it; that his failure to consult the affirmative action report prior to confirming the nomination was due to the time constraints and his inability to compel the immediate production of the report; that in any case the report concluded only that discrimination might have occurred; and that, even if one looks behind the report, no discrimination in fact occurred.
Although the plaintiffs raised serious questions whether discrimination was manifested in these various incidents surrounding the election, the evidence is not so one-sided, particularly in view of the role of assessments of credibility, as to compel the conclusion that the magistrate’s findings are clearly erroneous. The plaintiffs urge, however, that other incidents, unrelated to the election, demonstrate a pattern of hostility toward women within the psychology department and administration that should have colored the magistrate’s assessments of credibility. I believe this evidence, like the direct evidence concerning the election, is not compelling.
For example, because Craik was asked in 1970 whether she was interested in teaching statistics and then was not assigned to teach the course, the plaintiffs argue that she was the victim of sexual stereotyping. No evidence was offered to show the reason for that particular course assignment, but because there was testimony that in general faculty members prefer to and do retain courses they have once taught, V. 2.54-55; V. 14.58; V. 30.10-11, a nondiscriminatory reason for the assignment (that another faculty member was unwilling to relinquish the course) was possible. M. 19. In the absence of further evidence, I cannot agree with the majority’s conclusion, ante at 34, that an inference of stereotyping is necessary; the bare incident does not speak for itself. As another example, the plaintiffs argue that when Mary Dwyer, a class member, joined the psychology department to create a human services program, she was ostracized by the counseling subgroup, to which she considered herself to belong, by being assigned a storage closet for an office and being denied a mailbox and secretarial service. But the testimony shows that her office was a regular office, subsequently assigned to other faculty members, including men, V. 7A.92, which had been used for files and had not been cleaned out when she arrived, V. 7A.60-62, 91-92, and that at first she was told not to use the counselor education secretary, and was given a mailbox among other faculty members’ rather than a special box among the counselors’, because of a temporary dispute whether she was to be included in the counseling group, based on disagreement whether she had been hired exclusively for the undergraduate human services program, rather than for the relatively autonomous graduate counseling program, V. 6.188, 196-98; V. 7A.69; V. 30.9-10; V. 
31.87-88, 94-96, 97-99, and exacerbated by resentment on the part of the counseling group members that the department had voted to hire Dwyer in preference to another candidate they believed more qualified, V. 31.89-93. Dwyer also complained that in 1977 her photograph disappeared from a bulletin board with other faculty members’ photographs, and was not replaced; but there was no testimony regarding who removed the picture, and a *505male faculty member’s picture was also missing. V. 7A.26-29. Again I believe this evidence is not so unequivocal that it would be clear error not to infer sex discrimination. Further, the plaintiffs argue that David Lesar, a department member, displayed hostility toward women because he told the chairperson he had trouble working with Dwyer because she “remind[ed him] of a prison matron.” V. 31.101. But Lesar testified that he did not use the word “matron” to refer to older women, although others recounted his remarks using the terms “matronly,” “mature,” and “middle-aged,” V. 2.203; V. 7A.68-69; Knutson at 14-15, but used the term as descriptive of a “stern” and “unfeeling” personality type that he believed Dwyer, and not women in general, possessed, V. 31.100-02, 118-21; he also testified that he did not consider Craik to be matronly or to have that kind of personality, V. 31.102, 119-21. Without having observed the witnesses I cannot conclude that it would be plainly wrong to ascribe this incident to a particular personality conflict rather than general hostility toward women. As another example, the plaintiffs claim that Becker scornfully ignored Craik’s insistence that the sex-neutral term “chairperson” rather than “chairman” or “he” be used when referring to the head of the faculty senate, V. 
2.221-22, but Becker testified that the document in which the offending terms appeared was a directive to the current head of the senate, who happened to be a man; that the terminology was altered when another faculty member protested; that Craik later approached him as he was leaving a luncheon off campus for the Secretary of State of the State of Minnesota and publicly criticized his choice of terminology in strong language; and that in annoyance he replied in kind, V. 21.68-71; see V. 17B.44-45. The magistrate’s finding that sex discrimination did not underlie the incident is therefore adequately supported in the record. M. 20. As a final example, the plaintiffs argue that St. Cloud’s affirmative action committee failed to meet regularly during 1978-79 and 1979-80, contrary to the affirmative action plan’s requirement of monthly meetings, thereby demonstrating lack of concern about discrimination. But there was testimony that because the committee had a broad membership, meetings were difficult to schedule and had poor attendance, V. 20.26, on the part of both male and female members, Grachek II at 77-78. Barbara Grachek, the affirmative action officer, therefore suggested that an individual approach might prove more effective, and that approach was followed for a trial period. V. 20.26; Grachek II at 78-79. At the same time a separate Title IX committee, whose functions were similar, continued to meet regularly, V. 20.27, 31-32, and in 1980 the two committees were merged and have met regularly since, V. 20.27-31. There was therefore sufficient evidence to conclude that the administration was not indifferent to these issues.
It is true that the magistrate did not discuss each (although he did discuss many) of the innumerable incidents mentioned at trial, of which the foregoing are a representative sample. In each case, however, the incident’s tenor was the subject of dispute that could be resolved only by assessing the demeanor and credibility of the witnesses; and the magistrate did make ultimate findings that the incidents as a whole did not show discrimination, although they assuredly showed a great deal of friction. M. 19-25. When the reasons for an ultimate finding of this kind are clear, a district court is not bound to discuss each item of evidence or to make explicit subsidiary findings, Banerjee v. Board of Trustees of Smith College, 648 F.2d 61, 66 (1st Cir.), cert. denied, 454 U.S. 1098, 102 S.Ct. 671, 70 L.Ed.2d 639 (1981); Stanley v. Henderson, 597 F.2d 651, 654 (8th Cir. 1979) (per curiam); Falcon Equipment Corp. v. Courtesy Lincoln Mercury, Inc., 536 F.2d 806, 808 (8th Cir.1976) (per curiam); United States v. F.D. Rich Co., 439 F.2d 895, 898-99 (8th Cir.1971), unlike an administrative agency, whose policymaking and discretionary powers sometimes make more detailed findings necessary for adequate judicial review, see *506Citizens to Preserve Overton Park, Inc. v. Volpe, 401 U.S. 402, 417-21, 91 S.Ct. 814, 824-26, 28 L.Ed.2d 136 (1971); SEC v. Chenery Corp., 318 U.S. 80, 94-95, 63 S.Ct. 454, 462, 87 L.Ed. 626 (1943); S. Breyer & R. Stewart, Administrative Law and Regulatory Policy 348-50 (1979). To include in this case a litany of findings, based on credibility assessments, that each instance of dispute was based on personality clashes or misunderstandings rather than on discrimination would serve little purpose, for the individual determinations would reveal little more than, and would be as difficult to review as, the collective determination of credibility. 
But even if the lack of such a detailed discussion undermined our confidence that the magistrate had considered all the evidence in making the ultimate finding, we should not reverse based on our own imperfect assessment of the credibility of the transcribed testimony, but remand for proper findings — although I feel somewhat sheepish suggesting that the magistrate’s seventy-one page opinion and twenty-five page findings of fact and conclusions of law are insufficient.
In summary, I conclude that the magistrate’s rejection of Craik’s individual claim of discrimination is not clearly erroneous, because both the direct and indirect evidence presented at trial are capable of a nondiscriminatory interpretation; and that the rejection of the class claim of discrimination in regard to chairperson appointments is also not clearly erroneous, because of the weakness of the statistical and background evidence, the susceptibility of the evidence regarding class members’ low participation rate to a nondiscriminatory interpretation, and the failure of the individual claim. Although the opposite conclusion might have been equally justified had the magistrate made contrary assessments of the witnesses’ credibility, the assessments he did make seem unassailable; but if his lack of specificity casts doubts on the thoroughness of his assessments, the case should be remanded rather than reversed.
Rank
The plaintiffs argue that women faculty members at St. Cloud have been disproportionately concentrated in the lower ranks as compared to men, through discriminatory placement and promotion decisions. The bulk of their evidence is statistical. Px. 305b, for example, shows that between 1974 and 1979 a greater proportion of the women than of the men served as instructors (rank IV), and conversely a smaller proportion of the women than of the men served as full professors (rank I).16 These *507statistics alone do not necessarily justify a conclusion that women have been discriminatorily relegated to the lower ranks, however, because (1) they are “snapshots” of the faculty in given years and therefore, given the low turnover rate of academics, include many faculty members hired in the past when the criteria for hiring and promotion may have been different and when the applicant pool of women may have been smaller; and because (2) they do not consider whether the women and men in the snapshot had similar qualifications. It would be wrong to conclude that discrimination exists if men dominate the upper ranks only because they earlier dominated the applicant pool and meanwhile have progressed through the ranks, and if the women on the faculty have been treated similarly to similarly situated men. These are precisely the things the defendant sought to prove at trial, through “flow” statistics comparing the treatment of women and their male contemporaries, and through regression studies showing that women and men with similar qualifications did not differ significantly in rank. The plaintiffs countered with their own analysis, using a different regression model and different categories of criteria relevant to rank, showing that significant differences did exist, and with other statistics.
To understand the battle of the regression studies some background discussion of multiple regression technique is necessary. See generally D. Barnes, supra note 5, at 293-378. Regression analysis is a mathematical technique for sorting out the effects of various independent variables on a single dependent variable, when one has hypothesized in advance which independent variables (i.e., variables determined independently of the system under study) are likely to affect the dependent variable. As a simple example, if we know the salaries and various other facts about a sample of salespeople we may determine whether and how salary (the dependent variable) is determined by the various other facts (the independent variables). One hypothesis *508might be that salary is somehow a function of sales. From the known sample data on total salaries and sales one can calculate the formula that most nearly describes the data; the formula will have the form
y = a + bx,
where y = the dependent variable (total salary), a = a constant (i.e., a previously unknown fixed base salary), x = the independent variable (sales), and b = the coefficient of x (i.e., a previously unknown percentage of sales or commission rate). Such a function can be graphed as follows:
[[Image here]]
As noted above, the calculated formula is the one that most closely describes the sample data; particular salaries in the sample may not fall exactly on the line, but should cluster around it. In fact the formula usually is calculated (in a manner mathematically complex but conceptually simple) by minimizing the square of the deviations between the actual data and the formula. Particular salaries therefore are actually described by the formula
y = a + bx + u,
where u = an error term (which will vary among the members of the sample). If the error terms are fairly small and fairly uniform the regression formula is a useful description of the data. If they are large or nonuniform, however, one can conclude that the formula does not describe the relationship between the variables well; the flaw may be that the true relationship is nonlinear or that critical independent variables are missing. For instance, in the sales example it may be necessary to take variables other than sales, such as age and experience, into account. In that case the regression formula will have the form
y = a + b₁x₁ + b₂x₂ + b₃x₃
where x₁ through x₃ are the multiple independent variables (sales, age, and experience) and b₁ through b₃ are their coefficients (the calculated factors by which each of the variables independently affects salary). Such a formula is calculated and operates in a similar manner to the formula with a single independent variable. By sorting out the effects of the independent variables the regression formula allows one to compare the salaries of salespeople who do not share similar characteristics or to predict the salary of a salesperson whose characteristics are unlike the others’.
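The least-squares calculation for the single-variable formula y = a + bx + u can be sketched in a few lines. This is only an illustration: the four salespeople and their figures are invented, not drawn from the record, and the function name fit_line is hypothetical.

```python
# Hypothetical sample: sales and total salary for four salespeople.
# These figures are invented for illustration; they are not from the record.
sales = [1000.0, 2000.0, 3000.0, 4000.0]
salary = [1110.0, 1190.0, 1310.0, 1390.0]


def fit_line(x, y):
    """Return (a, b) minimizing the sum of squared deviations of y from a + b*x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope: the implied commission rate
    a = y_mean - b * x_mean    # intercept: the implied base salary
    return a, b


a, b = fit_line(sales, salary)

# Error terms u: the part of each salary that the fitted line leaves unexplained.
u = [yi - (a + b * xi) for xi, yi in zip(sales, salary)]

print(f"base salary a = {a:.2f}, commission rate b = {b:.4f}")
print("error terms u =", [round(ui, 2) for ui in u])
```

With these invented figures the salaries cluster around the fitted line rather than falling exactly on it, which is the role the error term u plays in the formula.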
In this example all the independent variables (sales, age, and years of experience) are numerical, and can take a continuous range of values. Sometimes, however, a relevant independent variable is simply either present or absent, or dichotomous rather than continuous. For example, we may include college education as an independent variable in our hypothesis about salespeople’s salaries, assigning the variable the value 1 if college education is present and 0 if it is absent. If, all other variables being the same, a college-educated salesperson earns $100 more (or less) than a non-college-educated one, the coefficient of the college-education variable will be 100 (or -100). Dichotomous variables of this kind are called “dummy variables,” because their values are conventional rather than inherent.
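The dummy-variable point can be illustrated with a small sketch. The data are invented and constructed so that college education adds exactly $100 to salary; the helper names solve and ols are hypothetical, and the normal-equations method shown is one standard way of computing a least-squares fit, not necessarily the method either party's expert used.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Invented data: sales in thousands of dollars, and a college dummy (1 = yes, 0 = no).
sales = [1.0, 2.0, 3.0, 4.0, 1.5, 2.5]
college = [0, 1, 0, 1, 1, 0]
# Salaries built as 1000 + 80 * sales + 100 * college, so the answer is known.
salary = [1080.0, 1260.0, 1240.0, 1420.0, 1220.0, 1200.0]

# Each row is (1 for the constant term, sales, college dummy).
rows = [[1.0, s, float(c)] for s, c in zip(sales, college)]
constant, b_sales, b_college = ols(rows, salary)

print(f"constant = {constant:.2f}")
print(f"coefficient of sales = {b_sales:.2f}")
print(f"coefficient of college dummy = {b_college:.2f}")
```

Because the invented salaries differ by exactly $100 between otherwise identical college-educated and non-college-educated salespeople, the coefficient of the dummy variable comes out at 100.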
The strength of an independent variable’s correlation (positive or negative) to the dependent variable is revealed by its coefficient in the regression formula. The weaker the relationship, the closer the coefficient of that variable will be to zero. If the formula exactly fit all of the data, the coefficient of a variable totally unrelated to the dependent variable would be zero. But because the formula is calculated from sample data rather than all the data in the world, and because some marginally relevant independent variables may be omitted, *509the formula is subject to a margin of error, and independent variables actually unrelated to the dependent variable may turn up with a coefficient slightly different from zero. To determine how confidently we can say that such variables are actually unrelated to the dependent variable, it is necessary to test statistically whether the coefficient is significantly different from zero, starting with a null hypothesis that the variables are unrelated (i.e., that b = 0). Significance testing here is similar to the standard deviation analysis described supra in note 5, because in both cases a certain amount of variation around the point predicted by the null hypothesis is normal, and the goal is to determine whether the deviation in a particular case is so large that the null hypothesis should be rejected. After calculating s.e.b, the standard error of the regression coefficient (a measure of the likely variation of the calculated coefficient, b, from the “true” coefficient, and the counterpart of the standard deviation), we can calculate the number of standard errors b is from zero by the formula b/s.e.b. The result is the coefficient’s calculated t value (the counterpart of the Z score). 
By reference to a Student’s t table one can determine whether b’s variation from zero is statistically significant at whatever level of significance one chooses; the larger the t value, the smaller the probability that the coefficient’s variation from zero is due to normal variation, and the likelier we are to reject the hypothesis that b is not statistically different from zero. To reject the hypothesis that b is indistinguishable from zero is to reject the hypothesis that the dependent variable and independent variable are unrelated.
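The calculation of b/s.e.b can be sketched for the single-variable case. The sample is invented, and the standard-error formula used (the residual variance divided by the sum of squared deviations of x, under a square root) is the textbook formula for a one-variable regression, assumed here rather than taken from the record. The critical value 4.303 is the two-tailed 5% figure for two degrees of freedom in a standard Student's t table.

```python
import math

# Invented sample: sales and total salary for four salespeople.
sales = [1000.0, 2000.0, 3000.0, 4000.0]
salary = [1110.0, 1190.0, 1310.0, 1390.0]

n = len(sales)
x_mean = sum(sales) / n
y_mean = sum(salary) / n
sxx = sum((x - x_mean) ** 2 for x in sales)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(sales, salary))

b = sxy / sxx                 # calculated coefficient
a = y_mean - b * x_mean       # calculated constant

# Residual variance, with n - 2 degrees of freedom (two estimated parameters).
residuals = [y - (a + b * x) for x, y in zip(sales, salary)]
residual_variance = sum(r ** 2 for r in residuals) / (n - 2)

se_b = math.sqrt(residual_variance / sxx)   # standard error of b
t_value = b / se_b                          # number of standard errors b lies from zero

# Two-tailed 5% critical value for 2 degrees of freedom, from a Student's t table.
T_CRITICAL_5PCT_DF2 = 4.303
significant = abs(t_value) > T_CRITICAL_5PCT_DF2

print(f"b = {b:.4f}, s.e.b = {se_b:.6f}, t = {t_value:.2f}")
print("reject the null hypothesis b = 0 at the 5% level:", significant)
```

In this invented sample the coefficient lies far more standard errors from zero than the table requires, so the null hypothesis that sales and salary are unrelated would be rejected.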
Slight errors in a regression coefficient therefore may result simply from the margin of error of the regression function itself, and allowances for such error can be made by assuming that the coefficient’s true value lies within a calculated “confidence interval” (of whatever degree of confidence we choose) above and below the calculated value of the coefficient. In addition to such errors a different kind of error in the value of a coefficient may result from the design of the regression model. In the abstract, a regression model becomes more accurate as more independent variables are included; that is, the calculation of the dependent variable becomes more accurate. Inclusion of numerous independent variables may decrease the accuracy of particular coefficients, however, if some of the independent variables are redundant. If two variables are correlated (because they are determined by or measure the same thing), inclusion of either one will be sufficient to capture the entire effect of their joint variation, but inclusion of both will decrease the coefficient that each would have independently, by splitting up the effects of their variation between them. For example, if age and experience are highly correlated in the salespeople-salary regression model discussed above, including both as independent variables will produce misleading coefficients, because part of the actual effect of variations in experience will be picked up in the age coefficient, or vice versa. This problem is called multicollinearity. In a properly designed regression model, therefore, the independent variables must be independent of each other.
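One rough way to see the multicollinearity problem is to measure how closely two proposed independent variables track each other before both are placed in the model. The sketch below uses invented ages and years of experience. The variance inflation factor it computes, 1/(1 - r²), is a conventional diagnostic introduced here as an assumption; it does not appear in the record or in the parties' studies.

```python
import math

# Invented data: age and years of experience for six salespeople.
age = [25.0, 30.0, 35.0, 40.0, 45.0, 50.0]
experience = [2.0, 8.0, 11.0, 19.0, 22.0, 28.0]


def correlation(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


r = correlation(age, experience)

# Variance inflation factor for a pair of regressors: how much the variance
# of each coefficient is inflated by including the other variable as well.
vif = 1.0 / (1.0 - r ** 2)

print(f"correlation between age and experience: r = {r:.4f}")
print(f"variance inflation factor: {vif:.1f}")
```

A correlation this close to 1 means the two variables measure nearly the same thing, so a model including both cannot reliably apportion the effect between their coefficients.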
Choosing the proper independent variables is therefore critical to the success of a regression model. It is important to note that the choice of variables should be the result of a theoretical determination, made before the data are examined, of what factors are likely to be relevant; the proper variables cannot be identified statistically, because statistically correlated relationships may be accidental, lacking practical significance. Indeed, the results of models constructed by trial and error in an effort to discover statistically significant relationships are suspect, because this method increases the likelihood of stumbling on coincidental correlations. See D. Barnes, supra note 5, at 369-70; Vuyanich v. Republic National Bank of Dallas, 505 F.Supp. 224 at 269 & n. 54.
The multiple regression techniques briefly sketched above may be used to test whether certain employment practices are *510discriminatory. For example, if we suspect that women’s salaries are discriminatorily lower than men’s, we can generate a regression equation from known data, using salary as the dependent variable, and including among whatever independent variables seem proper for the type of employment in question a dummy variable for sex (assigning one sex the value 1 and the other sex the value 0). If the coefficient of the sex variable is zero or not significantly different from zero (at whatever level of significance we choose), we can conclude that sex and salary are unrelated. If the coefficient is not indistinguishable from zero we can conclude that sex, or some missing variable correlated to sex, is related to salary. Such results going either way cannot be conclusive, despite their air of mathematical precision, because the calculations are only as strong as the model itself. The model may be subject to attack because of omission of relevant independent variables, multicollinearity, or on other grounds. The fiercest disputes in cases of this kind therefore typically concern the design of the regression model. Because the choice of proper variables is not an issue on which statisticians have particular expertise, many of these disputes are no more arcane than everyday disputes about relevancy.
The plaintiffs in the present case performed a multiple regression analysis to test whether women at St. Cloud were discriminatorily assigned to lower ranks than similarly situated men. Their model included the independent variables sex, highest degree (accounting for the three degrees B.A., M.A. or other master’s degree, and Ph.D. or other doctorate), years of experience, and years from highest degree, and used rank as the dependent variable, assigning each of the four ranks a numerical value (1 = professor, 2 = associate professor, 3 = assistant professor, and 4 = instructor). Their data base included all St. Cloud employees (IFO/MEA employees, MSUAASF employees, and administrators, see supra note 16) rather than just the certified class of teaching faculty, V. 11.46-47, and covered the years 1973/74 to 1978/79. Their analysis produced a coefficient for the sex variable of between .29 and .25 for the various years in the study. Although they did not report the standard error or the t value for these coefficients, they noted that each coefficient was statistically significant at the 1% level (i.e., we can be 99% confident that each was different from zero). Px. 305o; V. 11.53-54. The plaintiffs’ expert concluded that during those years between 2.5 and 2.9 women out of every 10 were ranked one step lower than similarly qualified men. Px. 305o at 2; V. 11.55.
The defendants attacked this analysis on three grounds: that it included comparisons beyond the scope of the class, that it failed to include certain relevant independent variables, and that it improperly used a kind of dummy variable as the dependent variable. More specifically, the first objection concerned the plaintiffs’ inclusion in their study of all three categories of employees included in the university’s computerized data base (teaching faculty represented by IFO/MEA, middle-level managers represented by MSUAASF, and upper-level administrators), rather than merely the teaching faculty included in the certified class. This overinclusion is not limited to a handful of administrators, as the majority suggests, ante at 20 n. 15; see V. 21.12-14; supra note 16. The second objection concerned the failure to include several subcategories of M.A. as independent variables. Under the university’s series of appointment and promotion guidelines, see supra at 489-490, these subcategories played various roles in determining rank. Under the State College Board Operating Policies, for example, the minimum criterion for assignment to rank IV was a master’s degree; for rank III, a master’s degree plus 45 graduate quarter hour credits; and for rank II, a master’s degree plus 90 quarter hour credits. Dx. 6, 7. Contrary to the majority’s assertion, ante at 477 n. 15, these subcategories did not fall from use after 1971. Although the criteria became more stringent, up through the time of the trial the minimum criteria for rank *511III included completion of all the coursework beyond a master’s degree required for a doctorate (called “all but dissertation” or “ABD” status), often the equivalent of M.A. + 90 status; and the criteria for rank IV included “minimum appropriate preparation,” defined to include a master’s degree plus additional coursework. Dx. 17, 18. 
In any case, even if the master’s degree subcategories had fallen from official use in determining rank, their inclusion in the regression study might have been warranted because many of the rank determinations for members of the faculty included in the study had been made in 1971 or before. Px. 305o at 2. I therefore believe the defendants’ first two objections have merit. I agree with the majority, however, that the plaintiffs met these objections by reworking their analysis with the proper data for the years 1976, 1978, and 1980. Px. 339. This revised analysis produced a coefficient for the sex variable of between .11 and .13, which the plaintiffs report to be statistically significant at the 5% level.
The defendants’ critical objection to the plaintiffs’ model, however, was their third one. They argued that it was improper to assign artificial values of one to four to the four ranks and to use rank, so quantified, as the dependent variable, for several reasons. The first is the most important: The dependent variable cannot be a dummy variable in an ordinary regression equation because such equations are designed to yield a continuous range of values rather than discrete steps when particular values for the independent variables are supplied. Plugging particular values into the plaintiffs’ model will not yield one of four discrete values, as the model seems to expect, but will yield a range of values that anomalously may predict one will be less than an instructor or more than a professor, or any gradation in between. Because regression analysis of variables that can take only discrete values (termed “qualitative” rather than “quantitative”) would be extremely useful in both the natural and the social sciences, alternative models have been devised and extensively surveyed in the econometric literature. See, e.g., G. Maddala, Econometrics 162-71 (1977); G. Maddala, Limited-Dependent and Qualitative Variables in Econometrics (1983); Amemiya, Qualitative Response Models: A Survey, 19 J. Econ. Literature 1483 (1981). In simplest terms, such models employ a transformation function so as to yield, instead of the discrete values 0 or 1 for a dichotomous dependent variable, the probability that the variable will have the value 1. H. Theil, Principles of Econometrics 628-33 (1971). If in the present case, for example, there were only two ranks, called 0 and 1, regression analysis could be performed, despite the impossibility of limiting the results of the regression equation to those discrete values, by transforming the continuous range of results into probabilities that faculty members with particular characteristics will be assigned rank 1. 
It would be improper simply to interpret the raw results that fall between 0 and 1 in the plaintiffs’ model as such probabilities, because the ordinary regression function is not so designed.
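The kind of transformation described can be illustrated with the logistic function, which the defendants' logit model (discussed below) employs. The raw scores in this sketch are invented, and the function name logistic is hypothetical. The point shown is only that an ordinary linear result can fall below 0 or above 1, whereas the transformed value always lies strictly between 0 and 1 and so can be read as a probability.

```python
import math


def logistic(z):
    """Map any real number z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Invented raw results of a linear formula a + b1*x1 + b2*x2 + ...
# Note that two of them fall outside the range 0 to 1.
raw_scores = [-0.4, 0.0, 0.5, 1.3]

probabilities = [logistic(z) for z in raw_scores]

for z, p in zip(raw_scores, probabilities):
    print(f"raw score {z:5.1f}  ->  probability of rank 1: {p:.3f}")
```

In an actual logit analysis the coefficients are estimated so that the transformed values best fit the observed 0 and 1 outcomes; the sketch shows only the transformation step, not the estimation.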
The second problem with the plaintiffs’ dummy dependent variable model compounds the first. Their dummy variable could take not two but four discrete values, arbitrarily assigned the numerical values one, two, three, and four. The model thus not only anticipated the generation of four discrete values (an expectation that suffers the flaws described above), but also assumed a particular numerical relationship among the ranks — specifically, that the “distance” between each of the successive ranks was equal. This assumption may or may not have been warranted; for example, it is possible that the rank of instructor is so much lower than the others that the proper values to be assigned to the four ranks were one, two, three, and ten. It is clear that the results of the regression could vary wildly depending on how these values are assigned. The plaintiffs’ assumption about the relationship among the ranks, other than their hierarchy, lacked foundation.
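The sensitivity of the sex coefficient to the numbers assigned to the ranks can be seen in a deliberately simplified sketch. The twelve faculty members are invented, and sex is the only independent variable, in which case the least-squares coefficient of the dummy equals the difference between the two groups' mean rank values. The actual studies included other variables, so the sketch illustrates only the direction of the problem, not its size.

```python
# Invented faculty: ranks held by six men and six women.
men_ranks = ["I", "I", "II", "II", "III", "IV"]
women_ranks = ["II", "III", "III", "IV", "IV", "IV"]


def mean(values):
    return sum(values) / len(values)


def sex_coefficient(coding):
    """Coefficient of a sex dummy (women = 1) when sex is the only regressor.

    With a single dummy regressor, the least-squares coefficient equals the
    difference between the mean of the coded ranks for the two groups.
    """
    men = [coding[r] for r in men_ranks]
    women = [coding[r] for r in women_ranks]
    return mean(women) - mean(men)


# The plaintiffs' equal-spacing assumption, and the alternative the text posits.
equal_spacing = {"I": 1, "II": 2, "III": 3, "IV": 4}
wide_gap = {"I": 1, "II": 2, "III": 3, "IV": 10}

coef_equal = sex_coefficient(equal_spacing)
coef_wide = sex_coefficient(wide_gap)

print(f"sex coefficient with ranks coded 1, 2, 3, 4:  {coef_equal:.3f}")
print(f"sex coefficient with ranks coded 1, 2, 3, 10: {coef_wide:.3f}")
```

The same people in the same ranks produce a coefficient nearly three times as large under the second coding, although nothing about their treatment has changed.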
*512 The final problem is related to the second. By combining the analysis of the four ranks in one equation the plaintiffs’ model assumes that the relative weight of the independent variables remains the same for each rank. This assumption may be unwarranted. It is possible, for example, that experience is an insignificant variable at the instructor level, but overshadows the other variables in importance at the professor level. In fact, it is rather unlikely that the relative importance of degree, experience, and time from degree is the same for each rank, because under the published criteria in effect since 1972, see Dx. 15-18, the same degree is required for ranks I and II, and no experience is required for rank IV. These requirements suggest that experience, which is unimportant for rank IV, may be determinative for distinctions between ranks I and II. Combining the analysis of the four ranks in one equation imposes a questionable uniformity on the criteria, and may produce misleading results.
The defendants’ analysis sought to correct each of these flaws in the plaintiffs’ study. To meet the first problem they performed a “logistic regression,” which is a transformation of the regression into probabilities by a technique known as the “logit model.” G. Maddala, Econometrics 163-64 (1977); G. Maddala, Limited-Dependent and Qualitative Variables in Econometrics 9 (1983); H. Theil, Principles of Econometrics 632-33 (1971). To meet the second and third problems they analyzed the four ranks separately. Their model thus operated in a three-stage sequence as follows: At the first stage it analyzed the probability that faculty members with given qualifications would be assigned rank I rather than a lower rank; at stage two it analyzed the probability that a faculty member with given qualifications would be assigned rank II rather than a lower rank; and at stage three it analyzed the probability that a faculty member with given qualifications would be assigned rank III rather than a lower rank (rank IV). This sequential analysis allows the use of a dichotomous dependent variable with the values of 0 or 1 at each step (stage one: professor = 1, lower ranks = 0; stage two: associate professor = 1, lower ranks = 0; stage three: assistant professor = 1, lower rank = 0); at each stage the transformed regression equation measures the probability that the dependent variable will have the value 1. See G. Maddala, Limited-Dependent and Qualitative Variables in Econometrics 49-51 (1983). Separating the analysis of each rank from the others also avoids any assumptions that the ranks are equally “spaced” and that the relative weights of the independent variables are uniform across ranks, and allows one to pinpoint at which ranks, if any, disparities in results are located.
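The coding of the dependent variable in such a sequence can be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of the defendants' exhibits: the function name and the sample faculty list are hypothetical, and the sketch assumes that faculty members already counted at a higher rank are dropped from the comparison at each later stage, as the phrase "rather than a lower rank" suggests.

```python
# Ranks from highest (I, professor) to lowest (IV, instructor).
RANK_ORDER = ["I", "II", "III", "IV"]


def stage_outcomes(ranks):
    """Build the 0/1 dependent variable for each of the three stages.

    Stage "I":   rank I = 1, all lower ranks = 0
    Stage "II":  rank II = 1, lower ranks = 0 (rank I members excluded)
    Stage "III": rank III = 1, rank IV = 0 (ranks I and II excluded)
    """
    stages = {}
    remaining = list(ranks)
    for target in RANK_ORDER[:-1]:
        stages[target] = [1 if r == target else 0 for r in remaining]
        # Assumption: those holding the target rank leave the comparison pool.
        remaining = [r for r in remaining if r != target]
    return stages


# Hypothetical faculty list, invented for illustration only.
FACULTY = ["I", "II", "III", "IV", "II", "IV"]
for stage, outcomes in stage_outcomes(FACULTY).items():
    print(f"stage {stage}: {outcomes}")
```

Each stage thus presents a dichotomous variable of the kind a logit model is designed to handle, and no numerical spacing among the ranks need be assumed.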
To illustrate the other claimed flaws in the plaintiffs’ regression study, the defendants performed two logistic regression analyses, one using the independent variables employed by the plaintiffs, and one using the independent variables degree category (B.A., M.A., M.A. + 45, M.A. + 90; Ph.D.), sex, time from degree, and experience. In the first analysis the coefficient of the independent variable “sex” was not statistically significant (i.e., not significantly different from zero) for the first stage (professor vs. lower ranks), but was progressively significant at stage two (associate professor vs. lower ranks) and stage three (assistant professor vs. lower rank). Dx. 154a; V.32.36-39, 48. When the same analysis was done with the additional degree categories M.A. + 45 and M.A. + 90, however, the coefficients for the sex variable became insignificant at each stage. Dx. 154b; V.32.15-32. The defendants therefore argue that the apparent difference in treatment of the sexes observed in Dx. 154a is actually attributable to the neutral factor of educational achievement rather than sex. If St. Cloud was entitled to reward such achievement and in fact a smaller proportion of the women faculty members than of the men had M.A. + 45 or M.A. + 90/ABD status, the proportional underrepresentation of women at certain ranks was not the result of sex discrimination.
*513 In addition to attacking the defendants’ logistic regression analysis on the ground that the method is useful only as a predictive and not a descriptive tool, V.32.94-95 (an argument I do not understand), the plaintiffs offered three responses to the defendants’ evidence. First, they argue that it is incomplete because it analyzes only the year 1978, and thereby avoids the more damning data from the years before the suit was filed. This argument is misleading, because the analysis was not limited to appointment and promotion decisions made in 1978, but included all the appointment and promotion decisions made concerning the faculty members employed at St. Cloud in 1978, regardless of the date of the decisions. Because of the high level of continuity in the membership of the faculty from year to year, the data on such decisions are largely cumulative from year to year, so that examining the records of current faculty members will capture most of the relevant history. V.32.43-44; see V.11.7. The defendants’ failure to perform largely duplicative analyses for the earlier years therefore does not undermine the usefulness of their conclusions.
Second, the plaintiffs argue that the use of particular degrees and subcategories of degrees as criteria for appointment and promotion has a disparate impact on women and has not been validated by proof of business necessity. This argument is troubled, because it assumes that the criteria must be compared with some independently determined job description. But I believe the criteria are themselves definitional, rather than being designed to satisfy a preexisting definition, and that universities are entitled to set their sights as high as they choose in determining the qualities of their faculties. See A. Larson & L. Larson, 2 Employment Discrimination § 50.71(c) (1983). Because higher academic degrees are widely regarded as evidence of educational achievement and scholarly promise, they are not a bizarre measure of the qualities universities ordinarily seek; and the courts should be timid of scrutinizing further the content of such academic determinations. See Lieberman v. Gant, 630 F.2d 60, 67 (2d Cir.1980) (Friendly, J.). Even if it were proper to inquire whether St. Cloud’s choice of criteria was compelled by external considerations, however, the plaintiffs’ argument would be unsuccessful, because the defendants presented evidence that St. Cloud’s increasing emphasis on doctorate and near-doctorate status, as it progressed from a teacher’s college to a full university with graduate programs, was required for accreditation. V.15.54-57, 79-80; V.16.58-63; V.18.13-18; Dx. 19, 20a, 20b, 20c.
Finally, the plaintiffs argue that the shortage of women at the highest degree levels at St. Cloud is but another manifestation of the university’s discriminatory practices, because the percentage of faculty members at those levels who were women was smaller than the percentage of women in the applicant pool. For example, the plaintiffs’ evidence showed that in the years 1976 to 1980 between 9.6% and 13.9% of the faculty members with Ph.D.s were women, Px. 305h, 305k,17 although the *514 average nationwide availability pool of Ph.D. recipients in the approximate years 1970 to 1975, weighted by discipline to reflect the composition of St. Cloud’s faculty, was 21% women, id.; V. 10.49; V. 11.19-23. The plaintiffs conclude that the disparity between the actual and available percentages of women with Ph.D.s is significant at the 1% level. Px. 305h at 2; V. 11.23-24. The short answer to this argument is that disappointed applicants are not members of the class certified by the district court; the definition of the class limits the inquiry to an examination whether women already on the faculty were treated discriminatorily, and the plaintiffs’ own evidence revealed that a much smaller proportion of the class members than of the men held higher degrees. The long answer to the plaintiffs’ argument is that they are comparing incomparables. Px. 305h and 305k are snapshot statistics, including faculty members appointed in the distant past, whereas their availability pool data were flow statistics confined to the recent past. The defendants presented extensive evidence that the percentage of Ph.D. recipients who were women rose dramatically in the early 1970s, Dx. 44 (Scientific Manpower Commission, Professional Women and Minorities (Supp. 1 1980)); Dx.
45 (National Center for Education Statistics, Degree Awards to Women: An Update (1979)), rendering suspect any assumption that the recent availability pool data were similar to data for earlier years; and in any case the defendants’ rate of hiring before 1970 was beyond the reach of the statute of limitation, and thus not directly relevant. See United Air Lines, Inc. v. Evans, 431 U.S. 553, 558, 97 S.Ct. 1885, 1889, 52 L.Ed.2d 571 (1977); Lamphere v. Brown University, 685 F.2d 743, 751 (1st Cir.1982) (per curiam). The proper comparison is therefore between the applicant pool and the rate of hiring during the limitation period. The defendants’ evidence on this issue shows that between 1974 and 1980 women received 41% of the faculty appointments for which there were applicants of both sexes, although they constituted 29% of the applicant pool, Dx. 47B, and 32% of the total faculty appointments, although they constituted 24% of the total applicant pool, Dx. 47;18 it also *515shows that 21.7% of the appointees who held doctorates or other terminal degrees were women, and 41.9% of the other appointees were women, Dx. 50.19 Women with doctoral degrees therefore were hired at the same rate as their representation in the availability pool as reported by the plaintiffs (21.7% vs. 21%), and overall women were hired at a rate in excess of their representation in the actual applicant pool. This is powerful evidence of nondiscrimination in hiring. The plaintiffs depreciate this evidence, like much of the defendants’ evidence, on the ground that it ignores the early part of the limitation period. But most of the statistics presented in the case, both by the plaintiffs and by the defendants, commence in 1974, because that was the year St. Cloud began to keep reliable computerized personnel files, V.29A.13, and the year EEOC record-keeping requirements began, 29 C.F.R. § 1602.49(a) (1975) (current version at id. (1983)). 
Because no dramatic change occurred between the two pre-lawsuit years covered by the data in this instance and the post-lawsuit years, there is no reason to suspect that the evidence as a whole is unrepresentative of still earlier years.
In addition to these broad attacks on the defendants’ regression analysis, the plaintiffs made more particularized arguments that at hiring women were not assigned the highest rank for which they were minimally qualified as frequently as men, and that afterwards women were not promoted to higher ranks at the same rate as similarly situated men. Because faculty members at St. Cloud progress through the ranks without skipping ranks, V. 10.113, and have been required since 1976 to spend three years in one rank before being considered eligible for promotion to the next, Dx. 17, 18, the effects of discriminatory initial placement could persist throughout one’s career at St. Cloud, and be exaggerated still further by discriminatory promotion rates.
The plaintiffs originally presented evidence of the number of years men and women faculty members holding each of the degrees Ph.D., M.A., and B.A. took to be promoted to each of the ranks. Their first such study included all employees (teaching faculty, administrative/service faculty, and upper-level administrators), and canvassed 635 promotions awarded to faculty members who were employed at any point between 1973 and 1979, regardless of the year of the decision; 411 of the decisions occurred before 1973, “some of them ... long before.” Px. 305p at 2; V.11.64-65. This study found the average difference in the rate of promotion of men and women who held M.A. degrees to be significant, and concluded that the typical promotion pattern (to rank III with the M.A. and then to ranks II and I with the Ph.D.) took 14.6 years for men and 17.5 years for women. Px. 305p. Because the criteria for promotion grew more stringent in the 1970s, however, see supra at 489-490, the inclusion of the earlier promotions, made at a time when women constituted a smaller percentage of the faculty, could skew the results; the plaintiffs therefore reworked their study, using the same employees (holding M.A.s and Ph.D.s) but including only promotions made between 1973 and 1979. This study also found significant differences at the M.A. level. Px. 305q.
*516The defendants argued that these studies had little value because they failed to consider whether or when the employees in question were eligible for promotion, but used only degree status as the basis for comparison. The plaintiffs’ own evidence, showing that within the M.A. category women lagged behind men in reaching M.A. + 45 and M.A. + 90 or ABD status, suggests that the use of more narrowly drawn criteria for comparison would affect the results of the studies. The defendants offered several counterstudies comparing the treatment of men and women, taking criteria for rank determination into account. In Dx. 69 they summarized the percentages of newly hired men and women actually appointed at each rank between 1974/75 and 1979/80 who could have been appointed to no higher rank, based on the criteria in effect at the time of appointment; they concluded that overall 92.7% of the women and 83.3% of the men were appointed at the highest rank possible.20 In Dx. 54 they analyzed what percentages of the men and women eligible for promotion to each rank between 1974/75 and 1979/80 were actually promoted, and concluded that overall 52.1% of the eligible women and 38.3% of the eligible men were promoted.21 In Dx. 169 they analyzed the promotions to rank III effective between 1974/75 and 1978/79, and concluded that of those who were eligible for promotion, men were eligible for a slightly longer time than women before being promoted, and that 57.1% of those who were promoted as exceptions to the eligibility criteria were women. In Dx. 
70 they analyzed, for men and women faculty members who held the appropriate degrees for promotion and were promoted to each rank between 1973/74 and 1979/80, the number of years spent in the previous rank and the number of years of creditable experience (i.e., experience that counted); they also listed the faculty members who were promoted without the appropriate degrees, as exceptions, and analyzed their years in rank and creditable experience. They concluded that women and men promoted with the appropriate degree spent similar amounts of time in each rank (and in fact each spent the same average number of years — 15.5— progressing through the ranks), and that on average women were promoted with slightly fewer years of experience. They also concluded that a greater proportion of the women (32.6%) than of the men (4.2%) was promoted without the appropriate degree, and that these exceptions had similar years in rank and creditable experience.22 When the plaintiffs objected that Dx. 70 *517was misleading because it counted as exceptions only those who failed to meet degree requirements and not those who failed to meet experience requirements, the defendants reworked the study, and found that, both among those who were promoted as qualified by degrees and experience and among those who were promoted as exceptions, men and women fared similarly. Dx. 161.23 Finally, in Dx. 139a and 139b they analyzed the promotion decisions made during 1972/73 and 1973/74, and concluded that similar percentages of women and men who were eligible were promoted at each rank.
The plaintiffs subjected these analyses to searching examination at trial in order to demonstrate that “eligibility” was not a concept susceptible to certain definition. For example, St. Cloud’s Institutional Studies and Research people, who prepared Dx. 54, did not always classify the same people as exceptions to the promotion criteria as Robert Becker, who prepared Dx. 69 and 70, did. Compare Px. 323 and Dx. 54B with Dx. 70 at 3; V.22.57-82; V.23.14-72; cf. Px. 258. A large amount of the disagreement appears to have arisen from the number and complexity of the standards, the interaction of successive sets of standards, and the history of interpolation when gaps in the written guidelines existed. In 1966 and 1971 the State College Board issued regulations providing only that promotion or appointment at rank I required a doctoral degree; rank II required a doctorate or M.A. + 90; rank III required an M.A. + 45; and rank IV required an M.A. Dx. 6, 7. These standards were not self-executing, entitling a faculty member with a particular degree a corresponding rank, but specified that any faculty member could “be assigned to a lower ranking group if deemed advisable.” Dx. 6. These bare standards continued in effect until 1974, despite the formulation in 1972 of the “deans’ guidelines,” Dx. 15, which were designed to specify when it would be advisable to appoint or promote to the maximum rank. 
Thus, although the guidelines specified both quantitative conditions (doctorate or other appropriate terminal degree and ten years’ teaching experience for rank I; doctorate or terminal degree and seven years’ teaching experience for rank II; doctorate or terminal degree and no experience, or ABD and four years’ teaching experience, for rank III; and master’s degree for rank IV) and qualitative conditions (teaching ability, demonstrated scholarship, academic responsibility, and professional development for rank I; teaching ability, scholarship, and academic responsibility for rank II) for rank determinations, these standards were considered advisory rather than new binding minimums. Px. 260d. There was testimony that a practice developed during the time these guidelines were in effect of recognizing unusual degree combinations or extensive experience as alternative qualifications for attainment of rank, in addition to those stated in the guidelines. V.22.30-31; V.23.60. In 1974 APT guidelines, largely replicating the deans’ guidelines, were adopted, superseding the State College Board regulations. Dx. 16. There was testimony that the interpretational history of the deans’ guidelines was carried over to the APT guidelines as well. V.22.65-69; *518 V.23.23. These rules continued in effect until the collective bargaining agreement was signed in 1976. As noted supra at 489-490, the collective bargaining agreement provided general rank criteria (doctorate or appropriate degree plus ten years’ experience for rank I; doctorate plus seven years’ experience for rank II; doctorate or appropriate degree for rank III; appropriate preparation for rank IV), Px. 271a at 9, and provided that specific guidelines could be developed through the meet-and-confer process, id. at 9, 25.
The guidelines adopted in 1976 after consultation with the faculty senate incorporated the standards of the predecessor guidelines, but in addition specified both which degrees or degree combinations would be considered “appropriate,” and which ranks would normally be attainable with those degrees through appointment, promotion, and promotion with extensive experience, thereby codifying the prior practice. Dx. 17; V.16.44. Even these more specific guidelines were subject to interpolation, however, as unusual degree combinations were encountered or gaps in the written standards were discovered. For example, the guidelines specify that the degree combination M.A. + C.P.E. is appropriate preparation for rank IV at appointment, rank IV through normal promotion, and rank II through promotion with extensive experience. There was testimony that the degrees M.S. + C.P.E. were interpreted by the university to satisfy this standard, though not literally included, because M.A. and M.S. degrees could be interchangeable, the label depending on the university awarding them. V.22.79; see also V.23.51-52, 59. There was also testimony that, although the standard in this case and several others literally allowed advancement only to ranks IV and II, the university interpreted it to allow an intermediate promotion to rank III, because at St. Cloud faculty members do not skip ranks. V.22.79-81; V.23.24. In 1977 new guidelines, largely the same as the 1976 guidelines but adding an explicit proviso that the President would determine whether exceptions to the written policies should be made, were adopted after consultation with the faculty senate. Dx. 18. In subsequent years the university did not meet and confer with the faculty about the guidelines, because their representative, IFO/MEA, refused to bargain about standards beyond the general and unweighted criteria for promotion included in the second collective bargaining agreement.24 V.16.48-49. 
Although the collective bargaining agreement standards thus were the sole official policy statements in effect after 1977, Grachek I at 71-72; Px. 261; V. 14.135, those standards by their nature required interpretation in order to be applied in particular cases; for example, the criteria include length of service at the university and related service outside the university, but fail to quantify the requirements for each rank. The university therefore used the old standards, which had previously been distributed to the faculty but ceased to be distributed when they ceased to be official policy, V.19.41-42,25 to interpret and quantify the collectively bargained *519 criteria. V.16.48-49; 50-51. The university was entitled to do so by the terms of the collective bargaining agreement. Px. 271b at 9 (“In all instances, the determination as to what shall be construed as appropriate experience and appropriate degree for purposes of assignment to rank shall be within the discretion of the Employer.”); Px. 271c at 9 (same).
On cross-examination of the defendants’ witnesses it was determined that many faculty members classified as eligible for promotion in Dx. 70 did not satisfy the written standards in effect at the time the promotion decision was made. Many of these classifications were attributable to reliance on the standards as glossed at the time rather than as written, V.22.65-68; one involved a judgment whether a candidate’s terminal degree was appropriate for his discipline, V.22.70-72; and one candidate was classified as eligible because he was interpreted at the time he was promoted to have ABD status, an interpretation that was later discovered to be mistaken, V.22.63-64. The plaintiffs make three arguments concerning the defendants’ eligibility analyses. First, they argue that the defendants applied the expansive interpretation of the guidelines unequally, giving men but not women the benefit of the unwritten interpolations. They identified only two women out of the fourteen classified as exceptions, however, who should have been deemed eligible under the guidelines as expanded by interpretation, V.23.69, still leaving women with a large proportion of the exception promotions. It is impossible from the record to determine what effect including the two women among the eligibles would have on Dx. 70’s comparison of the average years spent in rank and years of experience of men and women promoted as eligible, but the defendants did submit a revision of Dx. 70 going the other way, pulling out all the faculty members whom the plaintiffs argued were not eligible under the guidelines as written (one woman and seven men). As noted above, this study found that women and men fared similarly even when so classified. Dx. 161; see supra note 23.
Second, the plaintiffs argue that when the guidelines are applied strictly, the proportion of promotions classified as exceptions is so great that the eligibility criteria must be abandoned altogether as a basis for comparing the rate of promotion of men and women. They conclude that their comparisons of men and women holding B.A., M.A., and Ph.D. degrees are valid. This argument overlooks the fact that many of the individuals they classify as exceptions were evaluated under some standards, albeit unwritten. If those standards were applied evenhandedly to men and women there is no cause to complain. The fluid state of the eligibility criteria and the conflicts over their proper interpretation clearly make it difficult to assess the evenhandedness of the standards’ application. The defendants therefore presented evidence that, whatever the standards were, between 1975/76 and 1980/81, the years for which the data were readily available, similar percentages of the men and women who sought promotion under the largely self-nominating process were successful. Dx. 71, 165; V.27.130-31; see also V.19.93-94.26 This evidence, together with the results of the defendants’ regression analysis, strongly suggests that the more particularized analyses outlined above were not *520 wholly unreliable. Finally, the plaintiffs argue that, even if similar proportions of the men and women were promoted as exceptions overall during the period 1973/74 to 1979/80, women received no exception promotions between 1971/72 and 1973/74, when men received eight. Although the defendants disputed whether some of the men were exceptions, the small numbers do not reveal statistically significant differences in any case. If women constituted 20% of the faculty and men constituted 80%, and women could be expected to receive exception promotions in the same proportion as men if no discrimination existed, women could be expected to receive 1.6 promotions.
The standard deviation is 1.13, and the observed outcome — zero—is 1.42 standard deviations from the expected outcome. The probability that the observed outcome occurred by chance is about 10% under a one-tailed test, or about 20% under a two-tailed test.27 Statistical analysis therefore confirms what intuition would have urged: that the numbers involved are so small as to preclude meaningful deductions about the comparative treatment of men and women. Because none of the plaintiffs’ particularized contentions overrides the general evidence of nondiscrimination provided by the regression analysis, discrimination in promotion was not proved.
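The arithmetic behind the expected count and the standard deviation can be sketched under the binomial model the text appears to apply: eight exception promotions, each assumed to go to a woman with probability 0.20. The function and variable names are hypothetical. The sketch stops at the number of standard deviations and does not attempt to reproduce the tail probabilities, because those depend on the approximation chosen.

```python
import math


def binomial_deviation(n, p, observed):
    """Expected count, standard deviation, and distance of the observed
    count from the expected count in standard-deviation units."""
    expected = n * p
    std_dev = math.sqrt(n * p * (1.0 - p))
    deviations = abs(observed - expected) / std_dev
    return expected, std_dev, deviations


# Figures taken from the text: 8 exception promotions, women 20% of the
# faculty, and no exception promotions to women in the period.
EXPECTED, STD_DEV, DEVIATIONS = binomial_deviation(n=8, p=0.20, observed=0)

print(f"expected promotions to women: {EXPECTED:.1f}")
print(f"standard deviation:           {STD_DEV:.2f}")
print(f"standard deviations away:     {DEVIATIONS:.2f}")
```

On these inputs the expected count is 1.6 and the standard deviation about 1.13, so an outcome of zero lies roughly 1.4 standard deviations from the expectation.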
The plaintiffs argue further, however, that even if the defendants’ studies are credited, they show that during the pre-lawsuit period women were hired at the highest rank appropriate for their qualifications less frequently than men. In Dx. 69 the defendants showed similar rates of appointment to the highest possible rank for the period 1974/75 to 1979/80, but the plaintiffs argue that analysis of the pre-lawsuit years alone (1974/75 to 1976/77) reveals disparate rates for men and women. In particular, they focus on the appointments at the assistant professor level. At trial they determined that, although during the entire period of the study nearly equal percentages of the men and women appointed at that level were appointed at the highest rank for which they were qualified, during the pre-lawsuit years of the study 76.9% of the women and 87.2% of the men were appointed at their highest rank.28 V.27.39-45. The plaintiffs and the majority, ante at 22, conclude that during the pre-lawsuit period the defendants treated women worse than men, and that their *521conduct changed after the suit was filed. Neither the plaintiffs nor the majority consider, however, whether the difference in treatment in the early years is so large that chance may be rejected as an explanation. In fact, the difference is not statistically significant. This can be sensed intuitively by reflecting that if but one more of the thirteen women appointed at rank III in those years had been appointed at the highest possible rank, the percentages of men and women so appointed would be nearly the same. This intuition can be tested statistically. If we assume that under neutral circumstances 87.2% of the women (the same percentage as that of the men) would have been appointed to the highest possible rank, we would expect to find 11.3 (.872 x 13) of the women appointed to the highest rank. The standard deviation is 1.21, and the observed outcome is 1.07 standard deviations from the expected. 
The probability that the observed outcome occurred solely by chance is slightly more than 15% under a one-tailed test, or slightly more than 30% under a two-tailed test; these probabilities are far higher than the level that would make one suspect something other than chance was at work.29 Although assignment to the appropriate rank does not at first blush seem to allow for the operation of chance, the defendants offered testimony that when positions were advertised at a specified rank, candidates would be considered only for that rank, even if their qualifications were greater than the rank required. V.28.38-39. Slight differences in the rates at which this phenomenon affected men and women could be attributable to chance. Women’s rank assignments at rank III during the pre-lawsuit period were therefore not so different from men’s as to be suspicious. This conclusion is reinforced by the combined data for all ranks during the pre-lawsuit period: 92% of the women and 92.8% of the men were appointed to the maximum rank for which they were qualified. V.27.59-60. The evidence therefore does not support a conclusion that women were discriminatorily assigned to lower ranks than men before their lawsuit was filed.
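The intuition about a single additional appointment can be checked with a short sketch. It assumes that 76.9% of thirteen women corresponds to ten appointments at the highest possible rank, an inference from the percentages in the text rather than a figure stated in the record. The variable names are hypothetical.

```python
import math

WOMEN_APPOINTED = 13   # women appointed at rank III in the pre-lawsuit years
WOMEN_AT_HIGHEST = 10  # inferred from 76.9% of 13 (assumption)
MEN_RATE = 0.872       # share of men appointed at their highest possible rank

# Share of women at the highest rank as observed, and with one more such appointment.
observed_share = WOMEN_AT_HIGHEST / WOMEN_APPOINTED
share_with_one_more = (WOMEN_AT_HIGHEST + 1) / WOMEN_APPOINTED

# Binomial expectation and spread if women were appointed at the men's rate.
expected = WOMEN_APPOINTED * MEN_RATE
std_dev = math.sqrt(WOMEN_APPOINTED * MEN_RATE * (1.0 - MEN_RATE))
deviations = (expected - WOMEN_AT_HIGHEST) / std_dev

print(f"observed share:          {observed_share:.1%}")
print(f"share with one more:     {share_with_one_more:.1%}")
print(f"expected count:          {expected:.1f}")
print(f"standard deviations off: {deviations:.2f}")
```

With one more appointment the women's share would be about 84.6%, close to the men's 87.2%. The unrounded computation places the observed count about 1.11 standard deviations below the expectation; the 1.07 reported in the text follows from dividing the rounded figures (11.3 less 10, over 1.21).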
In summary, I conclude that the defendants’ regression analysis, which unlike the plaintiffs’ used a proper methodology, showed that rank assignments at St. Cloud may be explained by neutral criteria. Although the magistrate despaired of understanding this powerful evidence, M. 45, 49, he did analyze the more particularized evidence offered by both parties, and after close reexamination of the evidence I cannot conclude that his finding of no discrimination was clearly erroneous.
Compensation
Finally, the plaintiffs contend that women’s salaries markedly lag behind men’s at St. Cloud. Again their expert offered a multiple regression study to demonstrate the alleged disparity. The parties agree that multiple linear regression is an appropriate model for analyzing salaries, cf. supra at 508, unlike rank, but disagree about the proper data base and the appropriate independent variables to be included. The plaintiffs’ original study included data on teaching faculty, administrative/service personnel, and upper-level administrators, V.11.46-47, and included the variables highest degree (B.A., M.A., and Ph.D.), creditable experience, time from highest degree, and sex. Px. 305o. It concluded that between 1973/74 and 1978/79, all other variables being equal, women earned from $25 to $1198 less than men, differences that were significant at *522 the 5% or 1% level in five of the six years. The study also added the independent variable rank in order to determine the differential between men’s and women’s salaries within each rank, and found that within ranks women fared better than men in two years (significantly better in one year), and worse than men in four years (significantly worse in three). The plaintiffs therefore concluded that the lot of women faculty members was worsening.
The defendants made several challenges to this study similar to those already discussed above. First, they argued and the magistrate found that inclusion of personnel other than the teaching faculty was improper. They supported this argument with evidence that the salaries of teaching faculty members, administrative/service personnel, and administrators were determined under three separate schedules. V.20.12-13. Second, they argued that the degree categories M.A. + 45 and M.A. + 90 should be included, because it was likely that the incremental increases in education elsewhere recognized in the university would affect salary. Third, they argued that rank must be included for similar reasons. Finally, they argued that, because of differences in the market value of different departments, it was improper to omit a variable that would take market value into account. Because restricting comparisons to faculty members within single departments would so shrink the range of comparison as to prevent any useful conclusions, they proposed using three broad groupings of similar departments, which they called divisions. One division contained the colleges of fine arts and liberal arts; the second contained the colleges of business and industry; and the third contained the college of education and miscellaneous related programs. V.21.167-68; V.28.91.
I believe the defendants’ first two arguments are incontestable, for they have a logical basis, and indeed the limitation of the data to teaching faculty and the inclusion of degree categories as variables are shown by the plaintiffs’ own revised regression analysis to make a significant difference. Px. 337-339. The plaintiffs make more spirited arguments, however, against the inclusion of rank and division as variables. They argue that rank should be omitted because it is highly correlated with sex, and therefore poses a multicollinearity problem, threatening to mask as rank differences disparities actually due to sex. This argument stands or falls with their arguments that rank is determined discriminatorily; because I consider those arguments to have been properly rejected, there is no reason not to include rank as an independent variable in the regression. They also argue that division is an improper variable, because it is an artificial construct that plays no actual organizational role at St. Cloud. The variable need not be formal to be functional, however. It is commonly known, and evidence was offered at trial to show, Dx. 68, that members of different academic fields command different salaries depending on the competition for their services from private industry and from other universities. These disparities are likely to be most pronounced when disciplines for which there is private demand are simultaneously expanding academically (such as accounting), and when disciplines for which there is little demand in private industry are shrinking academically (such as English or philosophy). Indeed, the majority acknowledges that scarce-market fields may legitimately command higher salaries. Ante at 479. If it happens through forces independent of St. Cloud that a smaller proportion of women than of men pursues the high-market disciplines, the incidental effect on women of rewarding scarcity cannot be termed discriminatory.
I therefore believe division was properly included as a variable.
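The way an omitted market variable can masquerade as a sex effect may be illustrated by a minimal sketch. Every figure in it is hypothetical and is drawn from no exhibit in this record: eight invented faculty members whose salaries are fixed, by construction, by experience and division alone, with no sex effect whatever. The helper `ols` and all variable names are assumptions made for illustration only; this is not the method used by either party's expert.

```python
def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# HYPOTHETICAL data: (years of experience, high-market division?, female?)
faculty = [
    (2, 1, 0), (5, 1, 0), (8, 1, 0), (4, 0, 0),   # men, mostly high-market
    (3, 1, 1), (2, 0, 1), (6, 0, 1), (9, 0, 1),   # women, mostly low-market
]
# Salary depends on experience and division only; sex plays no part.
salary = [10000 + 500 * e + 3000 * d for e, d, f in faculty]

# Regression omitting division: columns are intercept, experience, sex.
without_division = ols([[1, e, f] for e, d, f in faculty], salary)
# Regression including division: intercept, experience, division, sex.
with_division = ols([[1, e, d, f] for e, d, f in faculty], salary)

sex_gap_without = without_division[2]
sex_gap_with = with_division[3]
print(f"sex coefficient, division omitted:  {sex_gap_without:.2f}")
print(f"sex coefficient, division included: {sex_gap_with:.2f}")
```

On these invented figures the regression that omits division attributes a salary gap of well over $1,000 to sex, while the regression that includes division attributes essentially none. The sketch shows only that such masking is possible; it says nothing about whether it occurred at St. Cloud.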
The plaintiffs submitted a revised regression analysis taking into account all of these variables (experience, degree category, division, rank, and sex), using data on the teaching faculty alone, and found no significant differences between men’s and women’s salaries. Px. 339. In addition, the defendants offered studies comparing each class member to at least five, and often many more, men in similar departments and with similar education, rank, and experience during the years 1975/76, 1977/78, and 1979/80. Dx. 56, 58, 60. The studies’ conclusions that women’s and men’s salaries were similar confirm the results of the regression analysis. I therefore believe the magistrate’s conclusion that salary discrimination was not shown is not clearly erroneous.
III. Conclusion
Almost all of the issues involved in this case are factual rather than legal. The few legal issues concern the proper relationship between classwide and subsidiary individual claims of discrimination. Although I agree with the majority that the magistrate, relying on past decisions of this circuit, erred in failing to consider the class claims first in order to relieve individual claimants of the burden of proof should the class claims succeed, I believe the error was of no moment in this case because the class claims did not succeed. The class claims’ lack of success turns on factual findings reviewable only for clear error. Pullman-Standard v. Swint, 456 U.S. 273, 287-90, 102 S.Ct. 1781, 1789-91, 72 L.Ed.2d 66 (1982). The majority concludes that many of the findings below were clearly erroneous because the magistrate failed to discuss all of the evidence offered at trial, and because the defendants’ widespread reliance on subjective evaluations calls for special scrutiny. Some of the gaps in the magistrate’s discussion of the evidence can be attributed to his generous policy of admitting huge amounts of peripheral evidence on issues barred by the limitation period or beyond the scope of the class, offered not for substance but for background. See, e.g., V.21.94. If the magistrate found such evidence to have little weight his failure to discuss each item is excusable. I have reviewed the evidence on a few of the numerous claims raised in this case (those on which the majority rejects the magistrate’s findings) in some detail, and conclude that the inferences of no discrimination drawn by the magistrate were supportable, although in some instances the contrary inferences might also have been supportable had different testimony been credited. Because the findings below do not lack support I believe they may not be rejected. The fact that subjective determinations are involved is not a thumb on the scale, discrediting the defendants’ evidence.
Subjectivity is sometimes legitimate; it would be odd to forbid universities to seek or reward faculty members who excel in teaching or scholarship on the ground that such excellence has no objective measure. The presence of subjectivity is not substantive evidence, but only a signal that the actual evidence calls for careful examination. My review of the record does not make me believe that the actual evidence was insufficiently aired; but if the admittedly sketchy style of the magistrate’s opinion (no doubt occasioned in part by the vast array of issues presented) casts doubt on the sufficiency of his scrutiny, I believe the case should be remanded for additional factfinding, not reversed. I therefore dissent.
. Q. [P]rior to [1976] the presidents ha[d] the authority to appoint people to chairperson jobs, did they not?
A. There was generally still a search process, a selection process in most institutions after [and?] a faculty committee within that department made the recommendation to the president.
. Q. Prior to the collective bargaining agreement, could the president pick anyone he wanted essentially as the chairperson?
A. Subject to the processes that were used in the various schools, yes.
Q. Now, when you say the processes, he could allow the faculty in a department to recommend somebody to him, for example?
A. Well, going back prior to the collective bargaining — the schools varied on this from my understanding, and I’m most familiar with the College of Liberal Arts which involved an agreed upon policy in the College of Liberal Arts that the dean would interview with each of the departments and each of the faculty members in the department and have a general meeting with them and then the dean would make a recommendation on the matter. So it was really more of a dean selection of chairperson, and I don’t know, I have no knowledge the president under that process going back in at least the early '70’s, no knowledge as to whether presidents accepted or rejected or what the incidence was.
Q. But it was a management decision, it was the dean or the president who called the shots?
A. Just as it is in the final sense under the collective bargaining.
Q. But you didn’t have to go through any kind of a process of consulting with the faculty, he didn't?
A. Well, I think every president had pretty much accepted the process of whatever each particular college had established. I don’t know of any president who didn't follow that or, you know, I don’t remember any deans not following that.
. Q. Isn’t it a fact that the department didn’t have much of a choice when Dr. Knutson was appointed as the chair of psychology back in 1971?
A. That’s not true, at all.
Q. You don’t think that’s true?
A. I participated in the whole process and we voted. We had, if I remember, three ballots and he clearly had a majority.
Q. That’s your recollection?
A. The minutes would support that, I’m sure. I haven’t looked at them, but I do remember we went through basically the same kind of process when he was hired with the three candidates coming in for interview and the department vote and Knutson had the vast majority of votes and his name was sent out to the dean and approved and it was a very similar process.
. Election Candidates
Year Department Male Female
1975/76 Health, Physical Ed. & Rec. 2
1975/76 Psychology 17
1976/77 Elementary Education 1
1978/79 Health, Physical Ed. & Rec. 29
1978/79 Psychology 1
. The expected outcome (“E”) is calculated by multiplying the percentage of women in the applicant pool (%r) by the number of positions (5). Because we are dealing with a binomial distribution (that is, the elements in the distribution can have only two values, male and female), the standard deviation ("SD") may be calculated as
SD = √(Pm × Pw × N)
where Pm = the percentage of men, Pw = the percentage of women, and N = the number in the distribution. See D. Barnes, Statistics as Proof 81-82 (1983). Here,
[[Image here]]
The disparity between the observed outcome (“O”) and E is -.69. By dividing the size of the disparity by the standard deviation we can calculate the number of standard deviations the observed outcome is from the expected. The result is the calculated Z score. See id. 198-99, 257-60. Here Z = .69/2.63 = .26. The Supreme Court has held that, “ ‘for ... large samples, if the difference between the expected value and the observed number is greater than two or three standard deviations,’ ” we may suspect that something other than chance or random variation is at play. Hazelwood, 433 U.S. at 308 n. 14, 97 S.Ct. at 2742 n. 14, quoting Castaneda v. Partida, 430 U.S. 482, 497 n. 17, 97 S.Ct. 1272, 1281 n. 17, 51 L.Ed.2d 498 (1977). This is true because we assume that, if chance were the sole determinant and the choices were repeated over and over, the observed results would cluster around the expected result in a normal distribution. By definition, 68% of the results should be within one standard deviation of (above or below) the expected mean, 95.4% should be within two standard deviations, and 99.7% should be within three standard deviations, if the results are determined solely by chance. See D. Barnes, supra, at 139-41. The Court’s “two or three standard deviations” rule seems to reflect the general principle employed by social scientists that a 5% or 1% likelihood that a given result was caused by chance is reason to suspect that something other than chance was at work. These probabilities can be determined more precisely than simply by eyeballing the number of standard deviations, as the Court did in Hazelwood and Castaneda, by reference to a Z table, or, when the sample size is small, to a Student’s t table. See id. at 404-07. (The Z table allows one to determine the likelihood of an outcome occurring by chance when the sample size is large.
The Z table loses validity for small sample sizes, however, because then small actual variations in the observed results produce large proportional variations; the t table progressively corrects for this phenomenon for progressively smaller sample sizes. Id. at 260.) In addition, one can refine the inquiry to determine the chance likelihood that the observed results of repeated trials will all fall on one side of the expected mean (a one-tailed test), rather than being distributed above and below the mean (a two-tailed test). Id. at 199-202.
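The relationship between standard deviations and chance probabilities described in this note can be checked with a minimal sketch. It reproduces large-sample (Z table) values only, using the normal distribution; it makes no attempt at the Student's t correction for small samples. The function names `within`, `two_tailed`, and `one_tailed` are assumptions made for illustration.

```python
import math

def within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

def two_tailed(z):
    """Chance of a result at least |z| standard deviations from the mean, either side."""
    return 1 - within(abs(z))

def one_tailed(z):
    """Chance of a result at least |z| standard deviations from the mean, one side only."""
    return two_tailed(z) / 2

# The 68% / 95.4% / 99.7% figures cited in the note.
for k in (1, 2, 3):
    print(f"within {k} SD: {within(k):.1%}")

# The Z score computed in the note from a disparity of .69 and an SD of 2.63.
z = round(0.69 / 2.63, 2)
print(f"Z = {z}: one-tailed {one_tailed(z):.3f}, two-tailed {two_tailed(z):.3f}")
```

On the large-sample assumption, a two-tailed probability of 5% corresponds to roughly 1.96 standard deviations and 1% to roughly 2.58, which is consistent with reading the Court's "two or three standard deviations" as a shorthand for those conventional thresholds.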
. SD = √(.5 × .5 × 10) = 1.58
Z = 2.5/1.58 = 1.58
The probabilities under the one- and two-tailed tests are determined by reference to a Student's t table, see D. Barnes, supra note 5, at 406-07, accounting for a sample size of 10. See id. at 260.
. The plaintiffs base their assertion on a comparison of Px. 336 (a list of departments with no full-time women faculty members that held chairperson elections between 1975/76 and 1979/80) and Px. 328a-328f (annual lists of new faculty by sex, department, rank, type of appointment, type of search, and number and sex of applicants). My own comparison of these exhibits confirms that there were ten departments in which chairperson appointments coincided with other faculty appointments, but it is clear to me that in many cases the new faculty member could not have been made the chairperson. The new faculty appointments were as follows:
1976/77 Theatre: fixed-term assistant professor
1977/78 Management & Finance: fixed-term instructor
        Industrial Education: fixed-term instructor
        Technology: (1) probationary instructor (2) probationary instructor
        Political Science: (1) probationary instructor (2) fixed-term assistant professor
1978/79 Mass Communications: (1) fixed-term instructor (2) probationary instructor (3) probationary instructor
1979/80 Theatre: (1) probationary instructor (2) probationary instructor
        Chemistry: probationary associate professor
        Economics: probationary assistant professor
        Philosophy: (1) fixed-term assistant professor (2) fixed-term assistant professor
It was established that fixed-term appointments, unless they have special funding, may last no more than two years, see Px. 271a-271c, art. XXI; V.15.95; V.16.21-22, and that the term of a chairpersonship is three years. The university is not entirely free to designate a position as probationary rather than fixed-term: Every new faculty position added since 1977 to accommodate enrollment growth rather than to replace faculty members lost through attrition has been fixed-term, because of the lack of permanent funding, V.15.182-83; V.16.17, 20; see supra at 15; and fixed-term appointments are also used for inherently temporary positions such as those created to replace a faculty member on sabbatical, and for last-minute vacancies whose timing precludes a full-blown national search — and incidentally precludes timely consideration for the chairpersonship, V. 16.20-24. In addition, the overwhelming majority of chairpersons have doctoral degrees, compare Px. 268 with Px. 320 and Px. 325a, which is understandable in light of the chairperson's supervisory responsibilities over other members of the department at every rank; yet many of the faculty appointments the plaintiffs cite were at the instructor level, where the doctorate is unusual. Even though there are no published qualification requirements for chairpersons, the university is surely entitled to make a commonsense judgment whether faculty members newly appointed at the most junior levels, and unlikely to possess advanced degrees or extensive experience, are appropriate candidates for chairpersonships. Cf. Px. lj (required qualifications for psychology department chairperson). It is therefore unlikely that the new faculty and chairpersonship appointments could be merged in many of these cases. 
Finally, the defendants point out that chairpersonship appointments are made in the spring, to be effective the following academic year, and that it was not shown that the vacancies cited by the plaintiffs occurred early enough to allow an external search for the chairperson. DB 20.
. Q. Now, when you are considering an appointment recommendation in regard to a chair, do you make an independent judgment concerning the qualifications of the candidate or are you viewing the recommendations which are made to you? What role do you play?
A. I think under the collective bargaining contract the weight is very much in favor of the election by the — by the department, and I regard my role as a reviewing role to determine whether there have been any procedural errors or gross — or, in my opinion, a gross error in judgment before deciding whether to confirm the nomination which the department has made.
. It is clear from the following colloquy between plaintiffs' counsel and President Graham that the plaintiffs expected specific nominees to be rejected on the basis of the general underrepresentation of women:
Q. Did you ever go in and consult with [the department faculty], Doctor, and say it is about time we start following our affirmative action plan and something we were told in a report in 1972, we've got to have some female Chairs, it is going to be your department, fellas?
A. We have — on occasion the Deans have before 1976 and probably to some degree since discussed this with Departments, and in a few instances before 1976 appointments were made of women after such consultation.
Q. You were able to get some women after consultation?
A. Yes.
V.17B.31.
. This conclusion does not leave the low participation rate inexplicable. As the 1972 report of The Commission on the Status of Women noted, the disparity might be attributable at least in part to deep-seated social conditioning and acceptance by men and women alike of sexual stereotypes. Px. 302b at 6, 7-8; see also id. at 16 (recommendation addressed to women faculty urging active pursuit of leadership positions). Even if these or other causes are based on discriminatory attitudes, however, the defendants cannot be held liable unless the fault is proved to be their own.
. The plan states:
Prior to confirmation of any appointment to a position within the Chancellor’s office or any of the colleges, a written report on the recruitment, evaluation and selection process shall be submitted to the Chancellor or to the college President, who shall approve the appointment within forty-eight (48) hours only if affirmative recruitment requirements have been met.
(Emphasis added.) There was some dispute at trial whether an affirmative action report had to be prepared prior to the initial contact with the proposed appointee, see, e.g., V.12.145-50; V.17A.43-45; Knutson at 77-78, because St. Cloud's affirmative action manual provided:
When the candidate has agreed to accept the position if offered, the operating supervisor should complete the official college form required to effect employment, attaching to it ... a State College Affirmative Action Recruiting Report which should include a clear statement by the operating supervisor explaining why he thinks the selected candidate is the best qualified of all the applicants for the position.
Px. 272b at 9.
. The report states:
The voting procedure used in the final selection process does not in any way take into account affirmative action considerations. In fact, since this was a subjective decision by the faculty, it is quite possible that they could have used discriminatory criteria in their selections (i.e., voting against a woman.)
. See Knutson at 78-79:
Q. And what was your justification, or did you have any justification as to why a female was not appointed?
A. I didn’t have one, and it was never — except that the IFO said you elect them so you elect them, and I felt, and I think the majority of the committee felt that was one of the bas[e]s for our statement that this was not in compliance with the policy, and that the — we were, you know, not doing what the policy said we were supposed to, and that is one of the major reasons that I felt and still do that it’s not in compliance.
. See id. at 71-72:
Q. [In] your report ... you make the comment that since this was a subjective decision by the faculty, it is quite possible that they could have used discriminatory criteria in their selections.... Why do you suggest this in your report?
A. Well, it’s an obvious possibility that could occur, could have occur[r]ed. I happen to personally believe that it occur[r]ed. I don't have any evidence but it does, I guess. As a possibility, not as an actuality there, I believe.
. Faculty members testified that they perceived Peterson to be stronger in administrative experience and skills, V.30.44, 114, 147; V.31.107-08, in communication skills, V.30.147; V.31.43-44, 107, 146-47, in his research orientation, V.30.44; V.31.146, in publications, V.30.44, in his support of professional development, V.31.44-45, in national recognition, V.30.81, in his liberal arts orientation, V.30.145-46, and in his ability, as an outsider, to help heal the department's schisms, V.30.23, 114, 147; V.31.67, 108, 150.
. Px. 305b reveals, in tabular form, what proportion of faculty members of each sex held positions at each rank. The data presented in this table can also be presented in a less precise but more immediately comprehensible graphic form as follows:
[[Image here]]
It will be noticed that in each year some percentage of the faculty members of each sex, ranging from 4.5% to 26.8%, is unaccounted for, despite Px. 305b’s representation that the percentages in each category add up to 100%. The plaintiffs' expert testified that the unaccounted for faculty members belonged to ranks other than the four listed in the table; these other ranks included assistant instructors (merged with instructors in 1975, see supra at 12) and administrators. V.10.22, 33. The data in Px. 305b suffer an additional flaw in that they include as members of the four ranks some non-teaching administrators, appointed prior to collective bargaining, who hold honorific academic rank, V.10.106-07; V.20.153; V.22.12-13; V.24.45-46, and are therefore not members of the certified class of teaching faculty members, see D.R. 28-29, 77; M. 50. The unevenness of the omissions and the improper inclusions in the data undercut to an extent the comparability of the categories included in Px. 305b, but the general point of the disproportionate distribution is evident in any case. The plaintiffs also offered further data, in tabular form, for the three years 1975/76, 1977/78, and 1979/80, which are not subject to these limitations and may be graphed as follows:
[[Image here]]
Px. 305d. (Px. 305d contains data on bargaining unit code 1, i.e., IFO/MEA or teaching faculty, only. V.10.124. Px. 305c includes additional data on bargaining unit code 2 and code 3, which include “administrative and service” personnel (middle-level managers such as dormitory directors, registrar personnel, and financial aid officers, V.21.13-14), represented by MSUAASF (Minnesota State University Association of Administrative and Service Faculty), and “excluded managerial” personnel (upper-level managers such as the president, vice-presidents, deans, and their assistants and associates, V.21.12-13), see V.10.118-19; these personnel are not members of the certified class of teaching faculty.)
. Px. 305h presents, in tabular form, the numbers and percentages of men and women on the faculty (including MSUAASF employees and administrators, see V.11.17) who held B.A., M.A., and Ph.D. degrees during the years 1976 through 1979. The data may be graphed as follows:
[[Image here]]
Px. 305k is a similar table, containing data for the years 1976, 1978, and 1980 and limited to teaching faculty, which additionally breaks down the M.A. degrees into the categories M.A., M.A. + 45, and M.A. + 90 or ABD. It may be graphed as follows:
[[Image here]]
. Portions of the yearly and summary data reported in Dx. 47 (all applicants and appointees) and Dx. 47B (applicants and appointees for positions for which there were applicants of both sexes) may be summarized as follows:
[[Image here]]
. Portions of the yearly and summary data reported in Dx. 50 may be summarized as follows:
[[Image here]]
. Part of the data contained in Dx. 69 may be summarized as follows:
[[Image here]]
. Part of the data contained in Dx. 54 may be summarized as follows:
[[Image here]]
. Part of the data contained in Dx. 70 may be summarized as follows:
[[Image here]]
Dx. 70 showed that 14 out of 43 women were promoted without the appropriate degree, and averaged 10.6 years in rank and 16.8 years of experience; and that 8 out of 191 men were promoted without the appropriate degree, and averaged 10.1 years in rank and 17.8 years of experience.
. The data contained in Dx. 161 may be summarized as follows:
[[Image here]]
Dx. 161 showed that 15 out of 43 women promoted, or 34.88%, lacked either an appropriate degree or the requisite years of experience, and averaged 10.4 years in rank and 15.8 years of experience; and that 15 out of 191 men, or 7.89%, lacked the appropriate degree or experience, and averaged 9.3 years in rank and 15.7 years of experience.
. Article XXIV of the agreement, Px. 271b at 26, and Article XXV of the successor agreement, Px. 271c at 25, provided in part:
The following criteria shall be used when considering a faculty member for promotion.
(1) Demonstrated ability to teach effectively.
(2) Scholarly achievement.
(3) Evidence of continuing preparation, study or research.
(4) Contribution to student growth and development.
(5) Service to the University and community.
(6) Length of service in rank and at the University.
(7) Related experience outside the University.
The President’s decision to promote or not to promote shall be based solely on his/her evaluation of some or all these criteria.
. The majority uses the failure to distribute the guidelines after 1978 to support its conclusion that discrimination in promotion occurred prior to 1976, in the pre-lawsuit period. Ante at 478. Because the guidelines were distributed during the entire pre-lawsuit period this inference is unsupportable.
. Dx. 71 and 165 may be summarized as follows:
SUCCESS RATE OF APPLICANTS FOR PROMOTION
Year        Women            Men
1975/76     33.3% (4/12)     40.3% (27/67)
1976/77     33.3% (6/18)     20.6% (14/68)
1977/78     40.0% (10/25)    28.9% (28/97)
1978/79     25.0% (6/24)     31.2% (25/80)
1979/80     27.3% (6/22)     36.0% (27/75)
1980/81     42.4% (14/33)    35.9% (28/78)
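The percentages in this table follow directly from the counts beside them, as a minimal sketch can confirm. The counts are copied from the table; the six-year pooled rates computed at the end are a simple aggregation that does not appear in Dx. 71 or 165 as summarized here, and the names `rate` and `pooled` are assumptions made for illustration.

```python
# (promoted, applied) for each year, copied from the table above.
women = {"1975/76": (4, 12), "1976/77": (6, 18), "1977/78": (10, 25),
         "1978/79": (6, 24), "1979/80": (6, 22), "1980/81": (14, 33)}
men = {"1975/76": (27, 67), "1976/77": (14, 68), "1977/78": (28, 97),
       "1978/79": (25, 80), "1979/80": (27, 75), "1980/81": (28, 78)}

def rate(promoted, applied):
    """Success rate as a percentage, rounded to one decimal place."""
    return round(100 * promoted / applied, 1)

def pooled(group):
    """Success rate over all years combined (an aggregation not in the exhibits)."""
    promoted = sum(p for p, _ in group.values())
    applied = sum(a for _, a in group.values())
    return rate(promoted, applied)

for year in women:
    print(year, "women", rate(*women[year]), "men", rate(*men[year]))

print("all years: women", pooled(women), "men", pooled(men))
```

Pooling years in this way ignores any year-to-year differences in the applicant pools, so the combined figures are offered only as a rough cross-check on the yearly rates.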
. This analysis assumes that the eight exception promotions should, under a null hypothesis of no discrimination, be distributed proportionally between the men and women. In that case, the expected number of promotions of women is .20 × 8, or 1.6; SD = √(.2 × .8 × 8) = 1.13; and Z = 1.6/1.13 = 1.42. The probabilities are determined by reference to a Student’s t table, accounting for a sample size of 8. A similar result is obtained by assuming that women should have received a proportional number of exception promotions (two) in addition to those awarded to men. In that case, SD = √(.2 × .8 × 10) = 1.26, and Z = 2/1.26 = 1.59. Reference to a Student’s t table, accounting for a sample size of 10, reveals that the probability of the observed outcome occurring by chance is between 5% and 10% under a one-tailed test, or between 10% and 20% under a two-tailed test. See supra note 5.
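The arithmetic in this note can be retraced with a minimal sketch under both of its assumptions. The sketch rounds the standard deviation to two decimals before dividing, as the note's figures appear to do. The probability function uses the large-sample normal (Z table) approximation as a stand-in; it is not the Student's t lookup the note relies on, and it is included only as a rough cross-check. The names `binomial_z` and `normal_one_tailed` are assumptions made for illustration.

```python
import math

def binomial_z(p_women, n, disparity):
    """Return (SD, Z) for n selections where women are share p_women of the pool.

    SD is rounded to two decimals before dividing, matching the note's figures.
    """
    sd = round(math.sqrt(p_women * (1 - p_women) * n), 2)
    return sd, round(disparity / sd, 2)

def normal_one_tailed(z):
    """Large-sample one-tailed probability (Z table); NOT the Student's t value."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# First assumption: the eight exception promotions are shared proportionally,
# so the expected number for women (and the disparity) is .20 x 8.
sd_8, z_8 = binomial_z(0.20, 8, 0.20 * 8)

# Second assumption: women receive two exception promotions in addition
# to the eight awarded to men, for ten in all.
sd_10, z_10 = binomial_z(0.20, 10, 2)

print(f"n = 8:  SD = {sd_8}, Z = {z_8}, normal one-tailed = {normal_one_tailed(z_8):.3f}")
print(f"n = 10: SD = {sd_10}, Z = {z_10}, normal one-tailed = {normal_one_tailed(z_10):.3f}")
```

Under either assumption the normal approximation yields a one-tailed probability between 5% and 10%. Because the approximation is unreliable for samples this small, it should not be read as replacing the t table figures.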
. The majority mischaracterizes this evidence when it says 76.9% of all the women hired and 87.2% of all the men hired were appointed at the assistant professor level. Ante at 22. Rather, as explained in the text, the evidence concerns the percentage of those hired at rank III who could have been appointed to no higher level. The data may be summarized as follows:
APPOINTMENTS TO RANK III
                   Percentage of men      Percentage of women    Total appointed
                   appointed at highest   appointed at highest   at highest
                   rank possible          rank possible          rank possible
1974/75-76/77      87.2% (34/39)          76.9% (10/13)          84.6% (44/52)
1974/75-79/80      82.5% (80/97)          82.8% (24/29)          82.5% (104/126)
. Significance testing is awkward in this instance because there is no larger population of women assistant professor candidates consisting partly of those qualified for no more than rank III and partly of those qualified for higher rank, from which the thirteen women appointees were selected. Instead I assume for purposes of this test that the expected proportions of women at highest rank and not at highest rank in the sample would match those of the men in the sample (87%/13%). In that case, SD = √(.87 × .13 × 13) = 1.21; the difference between the observed and expected outcomes is 1.3; and Z = 1.3/1.21 = 1.07. The probabilities that the observed outcome occurred by chance are determined by reference to a Student’s t table, accounting for a sample size of 13.