Richardson v. Lamar County Board of Education

729 F. Supp. 806 (1989)

Alice RICHARDSON, Plaintiff,
v.
LAMAR COUNTY BOARD OF EDUCATION, et al., Defendants.

Civ. A. No. 87-T-568-N.

United States District Court, M.D. Alabama, N.D.

November 30, 1989.

*807 *808 Joe R. Whatley, Jr., Falkenberry & Whatley, Birmingham, Ala., and Donald V. Watkins, Watkins, Carter & Knight, Montgomery, Ala., for Alice Richardson, plaintiff.

Donald B. Sweeney, Jr., Rives & Paterson, Birmingham, Ala., and Ronald H. Strawbridge, Vernon, Ala., for Lamar County Bd. of Education, et al., defendants.

MEMORANDUM OPINION

MYRON H. THOMPSON, District Judge.

Plaintiff Alice Richardson, an African-American, has brought this lawsuit claiming that defendant Lamar County Board of Education[1] wrongfully refused to renew her teaching contract in violation of Title VII of the Civil Rights Act of 1964, as amended.[2] Richardson charges the school board with two types of discrimination under Title VII. First, she asserts a claim of "disparate treatment":[3] that the school board refused to renew her contract because of her race. Second, she asserts a claim of "disparate impact": that the board's stated reason for not renewing her contract — that she had failed to pass the Alabama Initial Teacher Certification Test — is impermissible because the test has had a disparate impact on African-American teachers. The court's jurisdiction has been properly invoked pursuant to 42 U.S.C.A. § 2000e-5(f)(3).

Based on the evidence presented at a nonjury trial, the court concludes that Richardson may recover on her disparate impact claim but not on her disparate treatment claim. The court's disposition of Richardson's disparate treatment claim is simple and direct. The court simply applies the procedure set forth by the Supreme Court in Texas Department of Community Affairs v. Burdine, 450 U.S. 248, 101 S. Ct. 1089, 67 L. Ed. 2d 207 (1981). The court's disposition of her disparate impact claim is, however, much more difficult. The court first addresses and finds meritless two defenses raised by the school board: that Richardson's disparate impact claim is barred by principles of collateral estoppel and res judicata; and that under the framework set forth in Price Waterhouse v. Hopkins, ___ U.S. ___, 109 S. Ct. 1775, 104 L. Ed. 2d 268 (1989), Richardson would not have been reemployed even if she had passed the state certification test. The court then goes through a lengthy application of the disparate impact analysis outlined by the Supreme Court in Wards Cove Packing Co., Inc. v. Atonio, ___ U.S. ___, 109 S. Ct. 2115, 104 L. Ed. 2d 733 (1989).

I. BACKGROUND

Richardson taught in the Lamar County School System for three years, from 1983 to 1986. She was, however, unable to obtain a permanent teaching certificate and therefore had to teach with temporary and provisional certificates. To obtain a permanent certificate, Richardson, like all other teachers in the state at that time, had to *809 pass the Alabama Initial Teacher Certification Test, which consisted of a "core" examination and an examination aimed at the specific area in which the teacher sought to teach. Richardson wanted to teach in the areas of early childhood education and elementary education, and thus could meet the certification test's specific area requirement by passing the examination in either area. Between 1984 and 1986, Richardson failed the early childhood education examination twice and the elementary education examination three times.

In the spring of 1986, the Lamar County Board of Education decided that the elementary school where Richardson taught should be consolidated with another school. Because fewer teachers would be needed, the school board informed 15 nontenured teachers, including Richardson, that their contracts would not be renewed for the 1986-87 school year. Four of the 15 teachers were, however, rehired. Richardson, who would have acquired tenure if she had been rehired, was not one of the four.

Approximately a year later, in May 1987, this court enforced a consent decree requiring the State Board of Education to issue permanent teaching certificates to a court-defined class of black teachers who had failed the state teacher certification test.[4] Richardson received her certification pursuant to the consent decree.

II. DISPARATE TREATMENT CLAIM

As stated, Richardson charges the Lamar County Board of Education with two types of racial discrimination: "disparate treatment" and "disparate impact." With the former, an employee must prove intentional discrimination. Texas Department of Community Affairs v. Burdine, 450 U.S. 248, 253, 101 S. Ct. 1089, 1093, 67 L. Ed. 2d 207 (1981). With the latter, however, the employee challenges "practices that are fair in form but discriminatory in operation," Wards Cove Packing Co., Inc. v. Atonio, ___ U.S. ___, ___, 109 S. Ct. 2115, 2119, 104 L. Ed. 2d 733 (1989) (quoting Griggs v. Duke Power Co., 401 U.S. 424, 431, 91 S. Ct. 849, 853, 28 L. Ed. 2d 158 (1971)); the employee need not prove intentional discrimination. Wards Cove, ___ U.S. ___, 109 S.Ct. at 2119.

In Burdine, 450 U.S. at 253-56, 101 S.Ct. at 1093-95, the Supreme Court set out the procedure a trial court should follow in assessing a disparate treatment claim. An employee has the initial burden of establishing a prima facie case of intentional discrimination, which once established raises a presumption that the employer discriminated against the employee. If the employee establishes a prima facie case, the burden then shifts to the employer to rebut the presumption by producing sufficient evidence to raise a genuine issue of fact as to whether the employer discriminated against the employee. This may be done by the employer articulating a legitimate, nondiscriminatory reason for the employment decision, a reason which is clear, reasonably specific, and worthy of credence. The employer has a burden of production, not one of persuasion, and thus does not have to persuade the court that it was actually motivated by the reason advanced. Once the employer satisfies this burden of production, the employee then has the burden of persuading the court that the proffered reason for the employment decision is a pretext for intentional discrimination. The employee may satisfy this burden by persuading the court either directly that a discriminatory reason more than likely motivated the employer or indirectly that the proffered reason for the employment decision is not worthy of belief. By so persuading the court, the employee satisfies her ultimate burden of demonstrating by a preponderance of evidence that she has been the victim of intentional discrimination.

Where, as here, however, a disparate treatment case has been fully tried, the court need not employ the full Burdine analysis, but may simply proceed directly to the ultimate issue of discrimination. *810 United States Postal Service Bd. of Governors v. Aikens, 460 U.S. 711, 715, 103 S. Ct. 1478, 1482, 75 L. Ed. 2d 403 (1983); Moore v. Alabama State University, 864 F.2d 103, 105 (11th Cir.1989). However, as this court recently stated in Dunning v. National Industries, Inc., 720 F. Supp. 924, 929 n. 7 (M.D.Ala.1989),

trial courts should nonetheless use the Burdine analysis in fully tried cases in which the plaintiff relies on circumstantial evidence. Such cases pose difficult and sensitive issues of subjective intent and objective action. The Burdine analysis provides an invaluable method of weighing and considering evidence. By focussing the court's inquiry on the prima facie case, the employer's justification, and the issue of pretext, Burdine helps to assure that the court arrives at its ultimate conclusion less through intuition and more through factual reasoning and analysis. See, e.g., Noble v. Alabama Dep't of Environmental Management, 872 F.2d 361, 365 n. 4 (11th Cir. 1989); Nix v. WLCY Radio/Rahall Communications, 738 F.2d 1181, 1184 (11th Cir.1984).
Of course, the Burdine analysis should not be applied too rigidly; nor should it be viewed as an end in itself. In other words, it should not be used by the court as a "substitute" for reaching the ultimate issue of whether the plaintiff has, in fact, been a victim of discrimination or retaliation. Moore, 864 F.2d at 105.

With these principles in mind, the court will now apply the Burdine analysis to Richardson's disparate treatment claim.

Richardson may establish a prima facie case by showing that she is a member of a protected class; that she was qualified for her job; that she was not rehired; and that a person outside the protected class with equal or lesser qualifications was rehired. Nix v. WLCY Radio/Rahall Communications, 738 F.2d 1181, 1185 (11th Cir.1984); Lee v. Russell County Board of Education, 684 F.2d 769, 773 (11th Cir. 1982). Richardson has shown that she is African-American and thus a member of a protected class; that she was qualified for the position of a teacher in the Lamar County School System; that she was not rehired for the 1986-87 school year; and that numerous nontenured white teachers with qualifications equal to hers, in the broad sense that she and the white teachers were all qualified to teach in the Lamar County School System, were rehired for that school year.

The Lamar County Board of Education has, however, articulated a legitimate, nondiscriminatory reason for treating Richardson differently from the nontenured white teachers: Richardson did not have a permanent teaching certificate. Richardson has not shown this reason to be a pretext for refusing to rehire her because of her race. If Richardson had been rehired, she would have been in her fourth year in the school system and would have acquired tenure under state law; she would have become a permanent teacher. The superintendent and school board reasonably concluded that they did not want a tenured, or permanent, teacher with only a provisional certificate. The nontenured white teachers who were rehired did not pose the same problem for the school board. Either they had permanent teaching certificates, or although they had only provisional or temporary certificates they would not have acquired tenure when rehired.[5]

The court is sensitive to the fact that the decision makers here, the superintendent and members of the school board, are white. As the former Fifth Circuit explained in Phillips v. Joint Legislative Committee, 637 F.2d 1014, 1026 n. 21 (5th Cir.1981), subjective judgments about African-American employees when exercised by all-white supervisors or executives should be received with caution. However, as explained, the court is convinced that it was reasonable for the school board to conclude that Richardson should not be rehired as a tenured teacher without a permanent *811 teaching certificate. Richardson has not been a victim of intentional racial discrimination.

III. DISPARATE IMPACT CLAIM

The court turns next to Richardson's disparate impact claim. Richardson claims that the Alabama Initial Teacher Certification Test had an impermissible disparate impact on African-Americans, in violation of Title VII. The Lamar County Board of Education proffers two preliminary defenses to Richardson's claim: first, that her disparate impact claim is barred by principles of collateral estoppel and res judicata; and, second, that under the framework set forth in Price Waterhouse v. Hopkins, ___ U.S. ___, 109 S. Ct. 1775, 104 L. Ed. 2d 268 (1989), she would not have been reemployed even if she had possessed a permanent teaching certificate. The school board also denies that the state teacher certification test violates Title VII.

A. Res Judicata and Collateral Estoppel

The Lamar County Board of Education argues that, based on principles of res judicata and collateral estoppel, Richardson's challenge to the teacher certification test is barred by the consent decree entered in Allen v. State Board of Education. Allen was a class-action lawsuit brought by several African-American teachers who had failed the state teacher certification test, charging that the test violated Title VII and other federal statutory and constitutional provisions. They sued the State Board of Education and its members and superintendent.[6] The parties to Allen later agreed to a consent decree settling all claims. The decree required that the state pay $500,000 in liquidated damages and issue permanent teaching certificates to a large portion of the plaintiff class; the decree also provided for a new certification process.[7] As a member of the Allen class, Richardson received a permanent teaching certificate and shared in the liquidated damages.

Principles of res judicata apply to consent decrees as well as to ordinary judgments entered by a court. In re Birmingham Reverse Discrimination Employment Litigation, 833 F.2d 1492, 1498 (11th Cir.1987), aff'd sub nom. Martin v. Wilks, ___ U.S. ___, 109 S. Ct. 2180, 104 L. Ed. 2d 835 (1989).[8] For res judicata to preclude Richardson's suit against the school board, the following four elements must be present: there must be a final judgment on the merits; it must be rendered by a court of competent jurisdiction; the parties, or those in privity with them, must be identical in both suits; and the same cause of action must be involved in both cases. Hart v. Yamaha-Parts Distributors, Inc., 787 F.2d 1468, 1470 (11th Cir.1986). The third element is missing here. Although Richardson, as a member of the plaintiff class, was a party in Allen, the Lamar County Board of Education was not.

The school board argues that it was represented by the Alabama State Board of Education in the Allen litigation and that thus it was "in privity" to the lawsuit. Privity exists where a nonparty's interest is adequately represented by the party to the original suit.[9] Hart, 787 F.2d at 1472; 1B J. Moore, J. Lucas, & T. Currier, Moore's Federal Practice ¶ 0.411[1] (1988). A court may find privity on this *812 ground in only a narrow set of circumstances where the party and the nonparty's interest are so "closely aligned" that the party is the "virtual representative" of the nonparty. First Alabama Bank of Montgomery, N.A. v. Parsons Steel, Inc., 747 F.2d 1367, 1378 (11th Cir.1984), reversed on other grounds, 474 U.S. 518, 106 S. Ct. 768, 88 L. Ed. 2d 877 (1986).[10] Whether the county school board's interests were so closely aligned with the State Board of Education's interests in Allen as to preclude Richardson's current suit is a question of fact for the court. First Alabama Bank of Montgomery, N.A., supra.[11]

Other than the fact that they both have a role to play in the education of the state's children, the school board has offered nothing to show that its interests are so closely aligned with those of the State Board of Education that the State Board was in effect representing the school board as well as itself in the Allen litigation. First of all, although the State Board of Education interacts with local school boards in educating the state's children,[12] the State Board and local school boards are by law separate entities, with different authority.[13] Secondly, all parties in Allen proceeded as if the State Board was an entity separate from the local boards. The plaintiffs challenged specific actions of the State Board; they did not challenge the actions of local school boards throughout the state. And along the same lines, when the State Board settled the damages portion of Allen for $500,000, there was no implication that it was paying these damages on behalf of the local school boards as well; and likewise there was no implication that if an Allen plaintiff had been injured by a local school board then any lawsuit she brought against it would be merged with the Allen consent decree. The court therefore concludes that the State Board of Education did not represent local school boards in Allen, nor were the interests of it and county boards of education so closely aligned that when the Allen plaintiffs sued and then settled with the State Board of Education they by implication also sued and settled with every local school board in the state.

This conclusion is buttressed by the well-settled principle that "a consent decree by definition binds only those who explicitly or implicitly consent to it." In re Birmingham Reverse Discrimination Employment Litigation, 833 F.2d at 1499; United States v. Jefferson County, 720 F.2d 1511, 1518 n. 19 (11th Cir.1983).[14] The Lamar County School Board was not involved in and did not consent, explicitly or implicitly, to the consent decree in Allen.

Likewise, the school board's collateral estoppel claim is without merit. The typical elements of collateral estoppel "are that the fact or point now in issue is (1) identical to an issue in the former action, (2) actually litigated and determined by the parties, and (3) necessarily so determined." Barber v. International Brotherhood of Boilermakers, 778 F.2d 750, 757 (11th Cir. 1985); accord Hart, 787 F.2d at 1473. Some courts and commentators are of the view that, because the issues have not been actually litigated when a lawsuit is resolved, the doctrine of collateral estoppel is *813 not applicable to a consent decree.[15] However, according to this circuit, a consent decree may preclude relitigation of an issue by a person that was a party to the original lawsuit in that limited circumstance where such an intent is manifested in the consent decree or otherwise. Barber, 778 F.2d at 757.[16] As discussed above, the Allen parties had no intent under the consent decree that local school boards would be released from claims. Richardson is therefore not collaterally estopped from bringing her suit against the Lamar County Board of Education.

B. Price Waterhouse Defense

With its second defense, the Lamar County School Board is essentially asking that the court use the "mixed motives" framework articulated by the Supreme Court in Price Waterhouse v. Hopkins, ___ U.S. ___, 109 S. Ct. 1775, 104 L. Ed. 2d 268 (1989). Within this framework, if the plaintiff presents "direct evidence" that impermissible discrimination was a substantial, motivating factor in an adverse employment decision, then the burden shifts to the employer to prove by a preponderance of the evidence that it would have made the same decision even if it had not allowed the illegal discriminatory factors to enter its decisionmaking process. If the employer fails to sustain this burden, then the employee has established liability under Title VII. Id., ___ U.S. at ___, 109 S.Ct. at 1788-89, 1795 (Brennan, J., plurality opinion); id., ___ U.S. at ___, 109 S.Ct. at 1798-99, 1804 (O'Connor, J., concurring in the judgment).[17] Relying on Price Waterhouse, the school board argues that, although it refused to renew Richardson's contract because she lacked a permanent teaching certificate, it would have taken the same action even if she had possessed one. The evidence does not support the school board's argument. The court is convinced, and so finds, that, if Richardson had passed the state teacher certification test by the end of the 1985-86 school year, the school board would have reemployed her for a fourth year.[18]

First of all, when initially confronted in a pretrial deposition with the question of whether Richardson would have been reemployed *814 if she had possessed a permanent teaching certificate, the school system's superintendent testified that "chances are she probably would." The school board argues that the court reporter incorrectly transcribed the superintendent's testimony. The court, having reviewed the entire deposition, and in particular the verbal exchanges between the attorneys after the above statement, is convinced that the court reporter correctly transcribed the superintendent's statement. The school board also suggests that, when the superintendent made the statement, he misunderstood the question. The court does not believe that the superintendent misunderstood the question posed to him. Instead, he initially answered forthrightly and honestly, and later tried to change his testimony after his attorney made him aware of the legal implications of his answer.[19]

Secondly, the evidence submitted by the school board reflects that there were nine non-tenured elementary level teachers, who had passed the state teacher certification test, who had the same educational degree as Richardson had,[20] and who had experience in the Lamar County School System that was less than, or equal to, Richardson's; these teachers were either immediately rehired at the end of the 1985-86 school year[21] or although not retained at the end of that school year were later rehired with the 1986-87 school year.[22] Six of these nine teachers also had "overall" teaching experience — that is, both within and outside the county school system — that was less than, or equal to, Richardson's.[23] The school board has not shown why Richardson, if she had passed the state certification test, would not have been employed instead of one of these teachers, in particular, one of those with less experience than Richardson.[24]

Moreover, the court does not believe the school board would have compared Richardson with the above nine teachers solely on the basis of their experience and education. The court believes that a critical factor would have been each teacher's actual teaching ability as observed by those in the school system. However, the school board has presented no such evidence comparing Richardson with the other nine teachers; in other words, the school board has not shown that Richardson was less qualified than the other nine teachers both on paper and in practice.[25]

*815 C. Disparate Impact Analysis

In assessing a disparate impact claim, a court should use a three-step process similar to the one used in assessing a disparate treatment claim. First, the employee must identify the specific employment practice challenged and, further, must show that the challenged practice falls significantly more harshly on one group than another, that is, that the practice under attack has created "adverse impact." Wards Cove, ___ U.S. at ___, 109 S.Ct. at 2124. If this showing is made, the burden then shifts to the employer to produce evidence of employment justification for the employment practice. The employer's burden is one of production not persuasion, for the burden of persuasion always remains with the employee. Finally, if the employer satisfies its burden, the employee may prevail only if she shows that the employer's justification for the practice has no basis in fact or that another practice, without a similarly undesirable adverse effect, would also serve the employer's legitimate employment interests. Id. at ___, 109 S.Ct. at 2126.

Richardson has appropriately identified two components of the state teacher certification test — the early childhood education examination and the elementary education examination — which she contends have had a racially disparate impact on her and other African-Americans. The initial issue for the court is therefore whether these two examinations in fact have had such an impact.

i. Adverse Racial Impact

Adverse impact may be established by statistical evidence alone so long as the statistical pool — or sample, if appropriate — is logically related to the employment decision at issue and the method of comparison applied to that pool or sample is meaningful. See Connecticut v. Teal, 457 U.S. 440, 102 S. Ct. 2525, 73 L. Ed. 2d 130 (1982).[26] In testing cases, it is well established that actual examinees constitute the most logical statistical pool, and that the appropriate method of comparison to be applied to that pool is a measurement of the differences in pass-fail rates. B. Schlei & P. Grossman, Employment Discrimination Law 100 (2nd ed. 1983); Shoben, Differential Pass-Fail Rates in Employment Testing: Statistical Proof Under Title VII, 91 Harv.L. Rev. 793 (1978).

In the present case, the pass and fail data were segregated into three categories, only one of which is useful. One category of data consists of the "total test administration" data. This breakdown includes multiple retakes by the same person. Because these data cannot demonstrate adverse impact against individuals, and are thus not capable of being tested for statistical significance, the court declines to use these data in determining whether Richardson has established a prima facie case. There are also data organized according to final candidate results. To accept this data would amount to an acceptance of the proposition that, regardless of how many times a candidate failed the test, she became a "success" as soon as she passed. Such a proposition is erroneous because it fails to recognize that the initial failure was a discrete injury. See Jenkins v. United Gas Corporation, 400 F.2d 28, 31-32 (5th Cir.1968) (subsequent obtainment of a benefit previously denied for discriminatory reason does not cure the original discrimination). A final category organizes data according to the pass-fail rates for those persons taking the test for the first time. Stated differently, the data record the extent to which black and white examinees passed or failed the test on their *816 first try. The court is of the opinion that, of the data presented, this "first time" category is the most appropriate for a determination of adverse impact.

There is no set mathematical threshold that must be met in order to show significant disparate impact, Moore v. Southwestern Bell Telephone Company, 593 F.2d 607, 608 (5th Cir.1979) (per curiam), and various formulas can be used to measure the degree of impact in a specific case. The Equal Employment Opportunity Commission (EEOC) generally regards a selection rate that is less than 4/5, or 80%, of the rate for the group with the highest rate as an indication of significant adverse impact. 29 C.F.R. § 1607.4D (1988). Other courts have stated that, if the difference between the expected value and the observed number is greater than two or three standard deviations, then adverse impact has been shown. Hazelwood School District v. United States, 433 U.S. 299, 308-311 & nn. 14 & 17, 97 S. Ct. 2736, 2742-43 & nn. 14 & 17, 53 L. Ed. 2d 768 (1977); Castaneda v. Partida, 430 U.S. 482, 496 n. 17, 97 S. Ct. 1272, 1281 n. 17, 51 L. Ed. 2d 498 (1977). And one commentator has proposed that, in cases involving challenges to employment tests, courts should use the "test for difference between independent proportions." Shoben, Differential Pass-Fail Rates in Employment Testing: Statistical Proof Under Title VII, 91 Harv.L. Rev. 793, 797-800 (1978). In this case, however, the court need not decide what formula should be used to show adverse impact, because under any of the formulas both the early childhood education and the elementary education examinations have had a clear and significant adverse impact on African-American persons. The results of the application of the formulas to the early childhood education examination are as follows:

(1) EEOC formula: From June 1981 until June 1985, 310 black persons took the early childhood education examination for the first time, and 127 of them passed, for a pass rate of 41%. For this same period, 2,607 white persons took the examination for the first time, and 2,374 of them passed, for a pass rate of 91.1%. Thus, the pass rate for black persons is only 45% of that for white persons, well below 80%.
(2) Hazelwood formula: The difference between the expected value and the observed value is over eight standard deviations, much more than two or three standard deviations.[27]
(3) Shoben formula: The "Z" value under the test for difference between independent proportions is over 23, much more than the 1.96 necessary to support a finding of adverse impact.
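
The EEOC four-fifths comparison in item (1) can be checked with a few lines of arithmetic. What follows is an illustrative sketch only and is not part of the opinion: the function names are hypothetical, and the inputs are the first-time examinee counts quoted in item (1).

```python
def pass_rate(passed, took):
    """Fraction of first-time examinees who passed."""
    return passed / took


def impact_ratio(group_rate, highest_rate):
    """Selection rate of one group relative to the highest-rate group."""
    return group_rate / highest_rate


# Early childhood education examination, June 1981 to June 1985,
# first-time examinees only (figures as quoted in the opinion).
black_rate = pass_rate(127, 310)
white_rate = pass_rate(2374, 2607)
ratio = impact_ratio(black_rate, white_rate)

# Under the EEOC guideline, a ratio below 0.8 is generally regarded
# as an indication of adverse impact.
FOUR_FIFTHS = 0.8
print(f"black pass rate: {black_rate:.1%}")
print(f"white pass rate: {white_rate:.1%}")
print(f"impact ratio:    {ratio:.1%}")
print("below four-fifths threshold:", ratio < FOUR_FIFTHS)
```

The same two functions can be applied to the elementary education counts by substituting those figures.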

The results of the application of the formulas to the elementary education examination are as follows:

(1) EEOC formula: From June 1981 until June 1985, 496 black persons took the elementary education examination for the first time, and 246 of them passed, for a pass rate of 49.6%. For this same period, 4,144 white persons took the examination for the first time, and 3,885 of them passed, for a pass rate of 93.8%. Thus, the pass rate for black persons is only 53% of that for white persons, well below 80%.
(2) Hazelwood formula: The difference between the expected value and the observed value is over nine standard deviations, much more than two or three standard deviations.[28]
(3) Shoben formula: The "Z" value under the test for difference between independent proportions is over 29, much more than the 1.96 necessary to support a finding of adverse impact.
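
The "Z" values reported under the Shoben formula for both examinations can be approximated with a standard two-proportion z-test. What follows is an illustrative sketch only and is not part of the opinion: the function name is hypothetical, and the use of a pooled standard error is an assumption, since the opinion does not spell out the computation.

```python
import math


def two_proportion_z(pass_a, n_a, pass_b, n_b):
    """Z statistic for the difference between two independent proportions.

    Assumes the pooled-variance form of the test; the opinion does not
    state which form was used.
    """
    p_a = pass_a / n_a
    p_b = pass_b / n_b
    pooled = (pass_a + pass_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / std_err


# First-time examinee counts as quoted in the opinion (white, then black).
z_early_childhood = two_proportion_z(2374, 2607, 127, 310)
z_elementary = two_proportion_z(3885, 4144, 246, 496)

# 1.96 is the conventional two-sided cutoff at the 5% significance level.
CUTOFF = 1.96
print(f"early childhood Z: {z_early_childhood:.2f}")
print(f"elementary Z:      {z_elementary:.2f}")
print("both exceed cutoff:", min(z_early_childhood, z_elementary) > CUTOFF)
```

Under the pooled form, the two statistics come out above 23 and above 29 respectively, which is consistent with the values stated in item (3) for each examination.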

*817 ii. Employment Justification

Since Richardson has established that the early childhood education and elementary education examinations had an adverse racial impact, the burden shifts to the Lamar County Board of Education to produce evidence of employment justification. An understanding of the history of the Alabama Initial Teacher Certification Test is important to determining whether the school board has met its burden and, if so, whether Richardson has, in turn, shown that the school board's justification for the certification test has no basis in fact.

a. History of the Early Childhood Education and Elementary Education Examinations

In 1979, amidst a national groundswell in favor of teacher competency testing, the Alabama State Board of Education placed development of a uniform certification test at the head of its agenda. It retained a professor at Auburn University to conduct a feasibility study regarding implementation of a teacher testing program in Alabama; the state's Assistant Superintendent for Teacher Certification also participated in the study. After a rather cursory investigation, the two educators recommended implementation of a testing program similar to one designed by a private test developer for the State of Georgia.

The State Board agreed with the recommendation. In January 1980, it awarded a contract to the private test developer on a noncompetitive basis.[29] While the board did not always express its purpose for imposing the test requirement with perfect clarity, both the test developer and the board understood that the test would measure whether a teacher possessed enough minimum content knowledge to be competent to teach in the classrooms of Alabama.

The time frame for development of the Alabama Initial Teacher Certification Test, as it came to be known, was quite short. The test developer had one year to complete development and implementation of 36 separate examinations. The test developer created a "core" examination and 35 additional examinations that covered specific subject areas. As stated, a teacher had to pass the core examination and one subject area examination in order to receive certification.

The Assistant State Superintendent, the sole ranking state official charged with oversight of the private test developer's contract compliance, had a doctorate in educational administration; but neither he nor anyone on his staff had any expertise in test development. And no outside experts were retained to monitor the test developer's work. The developer's work product was accepted by the state largely on the basis of faith.

The test developer began by preparing a preliminary planning document. It next asked the State Department of Education to appoint Alabama educators to the various committees and panels necessary for completion of the project. According to criteria provided by the developer, these educators were selected to represent a fair cross section of persons from different geographic areas throughout the state. They were also selected in such a way that African-Americans and women were fairly represented overall; however, not all committees and panels had minority representatives.

The test developer's technical staff and subject area consultants then formulated topic outlines for the various examinations. They consulted state education standards, state courses of study, materials related to Alabama's student competency tests, and examples of textbooks used in Alabama public schools. They also developed actual test objectives. These objectives were more explicit statements of concepts embedded in the topic outlines. The objectives were reviewed by the developer's editors and management. The developer's in-house work was far below average.

*818 In October of 1980, approximately 200 Alabama educators attended a two-day conference to review the topic outlines and objectives for 36 examinations. They had previously been mailed orientation materials. After additional orientation, they were divided into curriculum committees to review the topic outlines for comprehensiveness, organization, accuracy, and absence of bias. The committees then reviewed the objectives to ensure that they matched the topic outlines. Taxonomic level, significance of content, accuracy, level of specificity, suitability, and lack of bias were considered. Decisions were reached by consensus during both stages of review. Modifications and deletions were recorded by the test developer's personnel assigned to each committee. In some cases, however, the developer made additional changes, or ignored suggested changes, without obtaining clearance from committee members. No effort was made at any time to link the topic outlines and objectives to the state-mandated curriculum for teacher training programs.

The test developer then sent a job analysis survey packet to approximately 3,000 in-service teachers throughout Alabama. The purpose of this survey was to determine the job relatedness of the test objectives.[30] However, in nine fields where there were fewer than 200 teachers throughout the state, the test developer's process resulted in very small response rates. The survey packet was sent to persons certified and teaching in specific content areas. The packet included a set of objectives for that content area, a survey form, and a set of instructions. The teachers were asked whether they had taught or used each objective in the past two school years. If the answer was yes, they were asked to rate the objective in terms of time and essentiality. The scales used to record those responses were balanced in favor of indicating that an objective was job related, and teachers were instructed to resolve doubts in favor of job relatedness. The results of the job analysis survey were tallied in such a way that responses from only those who indicated that an objective had been used in the last two years were reflected in the data. Those who indicated that an objective had not been used were ignored.

In January of 1981, the curriculum committees met for a second time. They were provided results from the job analysis survey and were asked to determine which objectives should generate questions to appear on the examinations. This step was called "objective selection." The survey results were a major determinant of which objectives were ultimately selected.

The test developer then prepared a "blue-print" for each examination. These blue-prints specified the number of test questions, or items, necessary to measure each objective. Test items were drafted by the test developer's content area consultants and edited by its staff. Again, the developer's in-house work was far below average.

In March of 1981, the test items were reviewed by Alabama curriculum committees for "item/objective" match, significance of content, accuracy, clarity, and absence of bias. This "item review" process lasted for two days. Committee revisions were recorded by the test developer's personnel. However, in some cases, the developer ignored the suggested changes, or made additional revisions, without consulting committee members for approval. In other cases, the developer simply added new items that had never been reviewed by committee members. As many as 20 items for each 120-question examination fell into one of these categories.

In late April of 1981, the test developer convened a separate group of educators to review the test items once again for content validity. The purpose of this session, which lasted one day, was to provide an independent check against the judgments already rendered by the previous committees of Alabama educators. The new panelists reflected a fair cross section of persons in their field and were qualified to make content validity judgments in their *819 field. Each educator worked separately, but votes were tallied as if educators had served on a committee. After orientation, the educators were asked to judge whether each item matched its objective, was accurate, was free of bias, and was not tricky, misleading, or ambiguous. If the item met these criteria, the item was rated content valid by that judge. If the item was deemed invalid, the judge's reason for rejecting that item was recorded. The test developer compiled these content validity ratings; a level of agreement among judges greater than 50% was required for an item to be deemed content valid. While a majority of items appearing on the final test instruments reflected the judgment of Alabama educators that those items were content valid, a significant number of items appearing on the tests did not reflect that judgment. These included those items that had been revised by the developer without obtaining clearance from the panelists.[31]
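For illustration only, the tallying rule described above can be sketched in Python. The function name, the vote encoding, and the sample ballots are hypothetical and are not drawn from the record; the only detail taken from the findings is that a level of agreement greater than 50% was required for an item to be deemed content valid.

```python
def is_content_valid(votes):
    """Return True when strictly more than 50% of judges rated the item valid.

    `votes` is a list of booleans, one per judge (True = rated content valid).
    The encoding is an assumption made for this sketch.
    """
    if not votes:
        raise ValueError("at least one judge's rating is required")
    # Compare 2 * yes-votes against the panel size to avoid float rounding.
    return 2 * sum(votes) > len(votes)


# Hypothetical ballots: three items, six judges each.
ballots = {
    "item_a": [True, True, True, True, False, False],     # 4 of 6 valid
    "item_b": [True, True, True, False, False, False],    # exactly 50%
    "item_c": [True, False, False, False, False, False],  # 1 of 6 valid
}

results = {name: is_content_valid(v) for name, v in ballots.items()}
print(results)
```

Under this reading, an item with exactly half of the judges in agreement would not qualify, because the threshold is "greater than 50%."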

The judges were also asked to make cut-score decisions for those items they had rated content valid. For these items, and those items only, judges were asked whether a teacher with minimum content knowledge in the field should be able to answer the item correctly. A yes-no response was requested. Judges were disqualified from making that same cut-score determination for any item they had previously rated content invalid. In essence, their expert judgment as to those items was ignored.
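The effect of disqualifying judges in this way can be illustrated with a short Python sketch. The ratings, the function name, and the alternative tally are hypothetical. In particular, treating a disqualified judge as a "no" vote is an assumption made purely for comparison; the record does not say how those judges would have voted.

```python
def yes_share(ratings, count_excluded_as_no=False):
    """Share of 'yes' cut-score votes for a single item.

    `ratings` holds one entry per judge: 'yes', 'no', or 'invalid'
    (the judge rated the item content invalid and was therefore not
    allowed to cast a cut-score vote). Returns None if nobody voted.
    """
    yes = ratings.count("yes")
    if count_excluded_as_no:
        # Assumption for comparison only: excluded judges count as 'no'.
        denominator = len(ratings)
    else:
        # As described above: excluded judges drop out of the tally.
        denominator = sum(1 for r in ratings if r != "invalid")
    if denominator == 0:
        return None
    return yes / denominator


# Hypothetical panel of ten judges rating one item.
ratings = ["yes"] * 5 + ["no"] + ["invalid"] * 4

share_as_tallied = yes_share(ratings)                             # 5 of 6
share_if_counted = yes_share(ratings, count_excluded_as_no=True)  # 5 of 10
print(round(share_as_tallied, 3), share_if_counted)
```

In this made-up example the apparent agreement that a minimally competent teacher should answer the item correctly is about 83% when the four dissenting judges are dropped, but only 50% if their doubts are counted.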

The test developer then assembled and produced the actual test instruments for all 36 examinations. Each examination had 100 items tentatively designated as scoreable and 20 items tentatively designated as nonscoreable. The examinations were first administered to a group of actual candidates. The test developer had originally contemplated a separate field tryout, but time constraints prohibited such a course. After the first administration, the developer examined item statistics to flag problem questions. Based on this item analysis, it selected 100 scoreable items and 20 nonscoreable items for each examination. The developer did not conduct empirical bias studies to determine whether the difficulty of items varied according to the race of examinees.

The test developer then set a minimum cut score for each examination. The developer's original plan was to take the panelists' cut-score ratings and subject them to a 10% non-cumulative binomial algorithm. This level of agreement among judges would then determine the minimum cut score. However, the developer's procedure yielded cut scores that were so astoundingly high that they signaled, on their face, an absence of correlation to minimum competence. For example, of the more than 500 teachers who took the first administration of the core examination, none would have passed if the original cut-score methodology had been followed.

Faced with this problem, the test developer made various mathematical "adjustments" to the original cut score. First, the developer applied a 10% cumulative binomial algorithm. When the cut scores still remained too high, it applied a 5% cumulative binomial algorithm. This process of applying successively stricter algorithms was referred to at trial as a "binomial twist." The developer engaged in this process without consulting the State Department of Education or any Alabama educators. In two fields — that of Music and that of Speech, Communication, and Theatre — the 5% binomial twist yielded cut scores that were much too low. The developer simply applied a different mathematical algorithm to those examinations; again, the developer consulted no one. For all special education and school counseling examinations, the developer recommended a uniform cut score cap of 80 to the State Department of Education. This recommendation was based on the developer's experience in the Georgia testing program. However, in Georgia, the decision to place a cap on cut scores was reached by state officials in conjunction with Georgia educators.[32]

*820 The State Department of Education was then given the option of dropping the cut scores, as set by the developer, by two or three standard errors of measurement (SEM's). It was clear at that time that cut scores, even after the various adjustments catalogued above, were not measuring competence. For example, even after the developer's 5% binomial twist, 78% of the teachers taking the first administration of the core examinations would have failed. The same would have been true for 93% of those taking the school counseling examination, 89% of those taking the learning disability examination, and 97% of those taking the library media examination. Instead of challenging what the developer had done, the state simply dropped the cut scores three SEM's in order to arrive at a "politically" acceptable pass rate. In so doing, the state knew that the examinations were not measuring competency.
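The arithmetic of lowering a cut score by a number of standard errors of measurement can be sketched as follows. This is a minimal illustration in Python. The cut score of 80 and the SEM of 4 are hypothetical values chosen for readability, and the function name is likewise an assumption; this passage does not report the SEM for any examination.

```python
def lower_cut_score(cut_score, sem, n_sems):
    """Lower a cut score by `n_sems` standard errors of measurement.

    The result is floored at zero, since a cut score cannot be negative.
    """
    if sem < 0 or n_sems < 0:
        raise ValueError("sem and n_sems must be non-negative")
    return max(0.0, cut_score - n_sems * sem)


# Hypothetical inputs: a developer-set cut score of 80 and an SEM of 4 points.
developer_cut = 80.0
sem = 4.0

two_sem_cut = lower_cut_score(developer_cut, sem, 2)    # 80 - 2 * 4
three_sem_cut = lower_cut_score(developer_cut, sem, 3)  # 80 - 3 * 4
print(two_sem_cut, three_sem_cut)
```

With these assumed numbers, the choice between two and three SEM's moves the passing threshold by one further SEM (here, four points), which in turn raises the share of examinees who pass.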

In 1982, the test developer formulated nine additional examinations. Its test construction procedures and quality of execution were essentially the same, with the following exceptions. First, the developer's job analysis survey form contained a rating scale with additional errors. Second, a more restrictive binomial table was used to calculate agreement among panelists on content validity questions. Third, a more accurate cut-score methodology was employed.

In 1983, the developer conducted a "topicality review" to update ten of the examinations already in use. A curriculum committee performed item and objective review. The committee's tasks were to determine whether items had become stale because of changes in the teaching field and to identify problems with items by reference to item statistics for the first eight administrations of the certification test. On average, 50% of the items in any given examination were replaced or revised. The developer did not convene a separate panel, as it had during the initial test development, to provide an independent screen for content validity, nor was an independent cut-score panel convened. The curriculum committee provided ratings used to set cut scores.

b. Validity of the Early Childhood Education and Elementary Education Examinations

The Lamar County Board of Education contends that the state teacher certification test was designed to determine whether a teacher is competent to teach in Alabama's classrooms. Richardson claims, as stated, that the early childhood education and elementary education examinations were invalid, that they did not measure competency.

Generally, validity is defined as the degree to which a certain inference from a test is appropriate and meaningful. APA Standards at 94.[33] It is suggested that validity evidence must necessarily be restricted to success on the job; and, to be sure, there are Title VII decisions that have approached the question of validity by asking whether a given score on a test yields an appropriate and meaningful inference about successful performance on the job. See, e.g., Contreras v. City of Los Angeles, 656 F.2d 1267, 1271-1272 (9th Cir.1981), cert. denied, 455 U.S. 1021, 102 S. Ct. 1719, 72 L. Ed. 2d 140 (1982); Guardians Association of New York City Police Dept., Inc. v. Civil Service Commission, 630 F.2d 79, 91 (2d Cir.1980), cert. denied, 452 U.S. 940, 101 S. Ct. 3083, 69 L. Ed. 2d 954 (1981). However, there is no magic to using success on the job as an anchor point for validity. Success on the job is just one of many constructs that a test can measure. Thus, a sound inference as to a different construct, such as minimal competence, may also form the basis for a finding of validity. In short, a test will be valid so *821 long as it is built to yield its intended inference and the design and execution of the test are within the bounds of professional standards accepted by the testing industry. APA Standards at 9; cf. Washington v. Davis, 426 U.S. 229, 247 & n. 13, 96 S. Ct. 2040, 2051 & n. 13, 48 L. Ed. 2d 597 (1976) (validity need not be limited to inference about success on the job).

In order to be valid, a licensure or certification test must support the inference that persons passing the test possess knowledge necessary to protect the public from incompetents. APA Standards at 63. Part of an appropriate validation strategy for licensure and certification tests is to define clearly and correctly the domain of minimum content knowledge necessary for competence. The test domain, once defined, must then be translated into actual test questions that measure competence. At all stages, validity flows from the expert judgment of practitioners in the field being tested. The test developer's role is to employ professionally accepted practices that accurately marshal the expert judgment of those practitioners. When the questions on a given test actually measure what practitioners in the field consider to be content knowledge associated with competency, the test instrument is held to possess content validity. However, mere content validity does not alone establish test validity. No matter how valid the test instrument, an inference as to competence or incompetence will be meaningless if the cut score, or decision point, of the test does not also reflect what practitioners in the field deem to be a minimally competent level of performance on that test. Again, the test developer's role in setting a cut score is to apply professionally accepted techniques that accurately marshal the judgment of practitioners.

In assessing the overall validity of the Alabama Initial Teacher Certification Test, the court must therefore address both content and cut-score validity. The test developer retained by the State Board of Education followed a multi-step procedure to build 36 teacher certification examinations in 1981. With minor variations, it followed the same procedure when it built nine additional examinations in 1982. The developer then applied a different procedure when it revised ten examinations in 1983. The content validity of each of these examinations turns on whether the developer's procedures were adequate, or were outside the bounds of professional judgment. For reasons that follow, the court concludes that the developer's procedures violated the minimum requirements for professional test development. Accordingly, none of the examinations, including the early childhood education and elementary education examinations, possesses content validity.

The test development process was outside the realm of professionalism due to the cumulative effect of several serious errors committed by the developer when it formulated the 45 examinations in 1981 and 1982. First, while practicing teachers were asked to offer their judgment about the job relatedness of test objectives, it is clear that the test developer's survey instrument distorted that judgment. Scales were balanced in favor of finding job relatedness and respondents were specifically instructed to resolve all doubts in favor of job relatedness. Moreover, the response of those teachers who indicated that they had not used an objective was ignored.

Second, Alabama educators serving on curriculum committees selected test objectives based on those survey results. It has been suggested that the survey was used only in an advisory capacity and that any survey errors were offset by the overall judgmental process undertaken by committee members. However, it is plain that the survey was conducted to solicit critical firsthand knowledge from in-service teachers. It is equally plain that curriculum committee members, aware that the survey had been conducted for that purpose, took the survey results quite seriously. The court concludes that the overall judgmental process for determining job relatedness of test objectives was distorted significantly by survey error.

Third, a significant number of items appearing on the examinations failed to reflect accurately the collective judgment of curriculum committee members. In some *822 cases, changes to actual test items were not implemented. In other cases, items that had never been reviewed by a curriculum committee appeared on examinations. It is suggested that, in any testing program of this size, a certain number of errors of this type will be found. The court agrees with this proposition in principle; however, the evidence reflects that the error rate per examination was simply too high.

Fourth, Alabama educators were never asked to determine whether the test items themselves were job related, even though such an approach is standard practice in the testing industry.

Fifth, many items appeared on the examinations even after they had been rated content invalid by the requisite number of Alabama panelists. It is suggested that, before any such item appeared on a final test form, it was revised by the test developer, and that all revisions were approved by Alabama panelists. However, neither the State Board of Education nor the test developer produced any documentation of this alleged revision and approval process. Moreover, not a single panelist was called at trial to confirm that the process had actually occurred. The court finds that no such process occurred and that the test developer simply substituted its own judgment for that of Alabama educators.

In 1983, the test developer conducted a topicality review for ten of the examinations already in use. It is suggested that, even if those ten examinations were previously content invalid, they gained content validity by way of the topicality review process. The court does not agree. The topicality review process resulted in changes to, or replacement of, only about 50% of any given examination's 120 items. Items that were not revised or replaced therefore remained just as invalid as they were at birth. Moreover, as to items that were revised or replaced, there was no separate content validity determination. The court agrees with Richardson's experts that, on balance, these two factors rendered the ten examinations subjected to the 1983 topicality review to be content invalid as well.[34]

Richardson advances an array of challenges to the cut-score methodology employed by the test developer. It is clear that, as to the 35 examinations developed in 1981, the cut scores bear no rational relationship to competence as that construct was defined by Alabama educators.[35] The *823 evidence reveals a cut-score methodology so riddled with errors that it can only be characterized as capricious and arbitrary. There was no well-conceived, systematic process for establishing cut scores; nor can the test developer's decisions be characterized as the good faith exercise of professional judgment. The 1981 cut scores fall far outside the bounds of professional judgment.

First and foremost, it is undeniable that cut scores for the 35 examinations developed in 1981 do not reflect the judgment of Alabama educators who served as panelists on the minimum cut score committees. This is a crucial error, because competence to teach is a construct that can only be given meaning by the judgment of experts in the teaching profession. Here, expert panelists who rated an item invalid as to content were automatically disqualified from going on to indicate whether that item should be counted toward the minimum cut score. This means that when a panelist indicated that an item should be excluded — because it contained inaccurate content, did not measure an objective, was tricky, ambiguous, or misleading, or was biased — that panelist's opinion was ignored for purposes of determining whether the item measured competence and should contribute to the cut score. The exclusion of such opinions resulted in a series of cut scores that reflected a distorted notion of competence.

Second, the court has no doubt that, after the results from the first administration of those 35 examinations were tallied, the test developer knew that its cut-score procedures had failed. The proof of this fact is that none of the more than 500 teachers who took the first administration of the core examination would have passed if the original cut score, calculated according to the developer's original plan, had been utilized. The court cannot conclude that all Alabama teachers who took that examination were totally and completely incompetent. It follows, therefore, that the developer knew that its cut-score procedure had utterly failed to reflect a valid construct of competence.

Third, instead of notifying the State Department that its cut-score procedure had malfunctioned, the test developer attempted to mask the presence of system failure by making various unilateral mathematical "adjustments" to the original cut score until an "acceptable" score had been reached. The most common adjustment was application of a "binomial twist" to the data collected from Alabama educators. This adjustment tended to lower cut scores. It is argued that lowering cut scores offset any system failure that might have occurred previously. This argument, however, misses the mark. The critical factor with respect to cut-score validity is not whether there was a net change in cut-score level, but whether the cut score itself accurately reflected the expert judgment of Alabama educators about whether examinees possess the competence to teach. This construct of competence cannot be guessed at by out-of-state test makers. It is also argued that the developer's resort to the "binomial twist" was an exercise of "tempered judgment" in light of actual examination data. Again, however, the fatal error is that it was the developer, and not Alabama educators, that exercised this judgment.[36]

Fourth, in two fields, that of music and that of speech, communication and theatre, the 5% binomial twist yielded cut scores that were much too low. In those areas, the developer simply applied a different mathematical algorithm to arrive at an acceptable cut score. Again, the developer substituted its judgment about competence for that of Alabama educators.

Fifth, for all special education and school counseling examinations, a uniform cut score of 80 was adopted. To be sure, the *824 State Department of Education made this decision, based on a policy judgment that no score should exceed 80. However, it is clear that the developer played an advisory role in that decision and that its advice was completely irresponsible. The developer recommended holding the scores at 80 based on its experience in the Georgia testing program. However, in Georgia, the decision to place a cap on cut scores was reached by the State Department of Education in conjunction with Georgia educators. The test developer never suggested that the State Department consult Alabama educators, and there is no evidence that such consultations in fact occurred. In effect, the developer assumed that the judgment of Georgia educators in a different testing program would be good enough for the people of Alabama. Once again, cut scores bore no relation to the expert judgment of Alabama educators. Moreover, if the rationale for adopting a cut score of 80 was to place a cap on such scores, it is difficult to understand why the cut scores for five special education examinations were actually raised to 80.

Sixth, the State Board did not drop the cut scores, as set by the developer, to advance bona fide psychometric or policy purposes. The board did not drop the scores three SEM's to account for measurement error; the developer recommended a drop of only two SEM's for that purpose. Nor were scores dropped three SEM's to reduce adverse impact against blacks; the State Assistant Superintendent in charge of the certification test was vehemently opposed to taking race into account in setting the cut scores. Finally, while cut scores may have been lowered by three SEM's in part for the permissible purpose of maintaining an adequate teacher supply, the court is convinced that the primary purpose for dropping three SEM's was to mask the obvious system failure generated by the developer's cut-score methodology. For example, even after the developer's binomial twist, 78% of the teachers taking the first administration of core examinations would have failed, and the same would have been true for 93% of those taking the school counseling examination, 89% of those taking the learning disability examination, and 97% of those taking the library media examination. It is apparent that these pass rates did not reflect a fair construct of minimal competence. Further adjustments were employed to back into a passing rate that would appear tolerable and reasonable. The State Board of Education and the test developer in effect abandoned their cut-score methodology, with the result that arbitrariness, and not competence, became the touchstone for standard setting.

The court would be inclined to uphold the cut-score procedures employed for the nine examinations developed in 1982 and the ten examinations subjected to topicality review in 1983; however, each of these examinations has already been shown to be content invalid. Since a valid cut score cannot be generated by items that lack content validity, the validity of the cut-score procedure itself is not enough. Accordingly, the cut scores for the 1982 and 1983 examinations are also invalid.

In reaching the above conclusions, the court has been sensitive to a number of factors. First of all, as stated earlier, close scrutiny of any testing program of this magnitude will inevitably reveal numerous errors, and these errors will not be of equal footing. Secondly, cut scores cannot be determined with mathematical certainty, and political considerations may properly enter into cut-score decisions. The court's task therefore is to assess the sum gravity of the defects found, and to determine whether, as a result of these defects, the examinations are invalid as to content and cut scores. The court recognizes that, in carrying out this task, it must proceed with caution, and even deference. Although the court must assess the credibility of testimony advanced by each side and arrive at an independent judgment, the court should not readily set aside the findings of those who developed a test; the mere fact that the court sees things differently should not, by itself, be considered sufficient to impeach such findings. But while a court should eschew an idealistic view of test validity, it should also be careful not to apply an "anything *825 goes" view. In other words, the mere presence of conflict in expert testimony does not prove that a test fails to meet minimum standards; nor does it prove that a test meets such standards. A court should find a test invalid only if the evidence reflects that the test falls so far below acceptable and reasonable minimum standards that the test could not be reasonably understood to do what it purports to do. The court is convinced that this was the case with the Alabama Initial Teacher Certification Test, and in particular with the early childhood education and elementary education examinations.[37]

IV. RELIEF

Since Richardson is entitled to prevail on her disparate impact claim, the court must now determine her relief. The court will require that the Lamar County Board of Education reemploy Richardson as an elementary school teacher at a salary and with such employment benefits and job security as would normally accompany the position had she been employed in the school system since 1983. The court will also require that the school board pay her all backpay and other employment benefits she would have received had the school board reemployed her for the 1986-87 school year. The court will also require that the school board pay reasonable attorney's fees to her attorney. 42 U.S.C.A. § 2000e-5(k). The court will give Richardson and the school board an opportunity to agree, between themselves, to the appropriate amount of attorney's fees, present pay, backpay, and other employment benefits to which Richardson is entitled. If the parties are unable to agree, the court will then set these matters down for a hearing.

An appropriate judgment will be entered.

JUDGMENT AND INJUNCTION

In accordance with the memorandum opinion entered this date, it is the ORDER, JUDGMENT, and DECREE of the court:

(1) That judgment be and it is hereby entered in favor of plaintiff Alice Richardson and against defendants Lamar County Board of Education and its superintendent and members;

(2) That it be and it is hereby DECLARED that plaintiff Richardson may recover on her "disparate impact" claim but not on her "disparate treatment" claim against defendants Lamar County Board of Education and its superintendent and members;

(3) That defendants Lamar County Board of Education and its superintendent and members, their officers, agents, servants, employees, attorneys, and those persons in active concert or participation with them who receive actual notice of this injunction by personal service or otherwise, be and they are each hereby ENJOINED and RESTRAINED from failing to reemploy forthwith plaintiff Richardson as an elementary school teacher in the Lamar County School System at a salary and with such employment benefits and job security as would normally accompany the position had she been employed in the school system since 1983;

(4) That plaintiff Richardson be and she is hereby awarded from defendants Lamar County Board of Education and its superintendent and members all backpay and other employment benefits she would have received had said defendants not illegally refused to reemploy her;

(5) That plaintiff Richardson and defendants Lamar County Board of Education *826 and its superintendent and members be and they are hereby allowed 21 days from the date of this order to file a request for the court to determine the appropriate amount of present pay, backpay and other employment benefits to which plaintiff Richardson is entitled, should the parties be unable to agree to these matters;

(6) That plaintiff Richardson be and she is hereby allowed 28 days from the date of this order to file a request for reasonable attorney's fees; and

(7) That all other relief sought by plaintiff Richardson that is not specifically granted be and it is hereby denied.

It is further ORDERED that this court retains jurisdiction of this case until further order.

It is further ORDERED that all costs of these proceedings be and they are hereby taxed against defendants Lamar County Board of Education and its superintendent and members, for which execution may issue.

The clerk of the court is DIRECTED to issue a writ of injunction.

NOTES

[1] Richardson has sued not only the Lamar County Board of Education but also its superintendent and members. However, because Richardson may obtain full relief from the school board the court has not treated the board members and the superintendent separately from the school board.

[2] Title VII is codified at 42 U.S.C.A. §§ 2000e through 2000e-17.

[3] Richardson's disparate treatment claim is also based on 42 U.S.C.A. § 1981 and the fourteenth amendment, as enforced by 42 U.S.C.A. § 1983, Jett v. Dallas Independent School District, ___ U.S. ___, 109 S. Ct. 2702, 105 L. Ed. 2d 598 (1989), with jurisdiction premised on 28 U.S.C.A. §§ 1331, 1343. Because a plaintiff must prove intentional discrimination to establish a disparate treatment claim under § 1981, § 1983 and the fourteenth amendment as well as under Title VII, Stallworth v. Shuler, 777 F.2d 1431, 1433 (11th Cir.1985), and because Richardson is seeking the same relief under all these statutory provisions, the court need not address separately her theories under §§ 1981, 1983, and the fourteenth amendment. The court also need not address whether Richardson has stated a cognizable claim under § 1981. Patterson v. McLean Credit Union, ___ U.S. ___, 109 S. Ct. 2363, 105 L. Ed. 2d 132 (1989).

[4] Allen v. Alabama State Board of Education, 816 F.2d 575 (11th Cir.1987) (directing district court to enforce consent decree); Allen v. Alabama State Board of Education, Civil Action No. 81-697-N (M.D.Ala. May 14, 1987) (enforcing the consent decree).

[5] Some of these nontenured white teachers were rehired at the end of the 1985-86 school year, and some, although not reemployed at the end of that school year, were rehired at the beginning of the 1986-87 school year.

[6] See Allen v. Alabama State Board of Education, 612 F. Supp. 1046, 1048 (M.D.Ala.1985) (describing in detail the plaintiffs' claims).

[7] See Allen v. Alabama State Board of Education, 816 F.2d 575 (11th Cir.1987) (directing enforcement of the consent decree); see also Allen v. Alabama State Board of Education, 636 F. Supp. 64 (M.D.Ala.1986) (describing the consent decree), rev'd on other grounds, 816 F.2d 575 (11th Cir.1987).

[8] See also United States v. Jefferson County, 720 F.2d 1511, 1517 (11th Cir.1983); Kaspar Wire Works, Inc. v. Leco Engineering and Machine, Inc., 575 F.2d 530, 539 (5th Cir.1978); 1B J. Moore, J. Lucas & T. Currier, Moore's Federal Practice ¶ 0.444 (1988).

[9] Privity also exists where a nonparty has succeeded to the party's interest in property, where the nonparty controlled the original suit, and where the party and nonparty have concurrent interests in the same property. Hart, 787 F.2d at 1472. The school board is not relying on these theories of privity.

[10] See also Delta Air Lines, Inc. v. McCoy Restaurants, Inc., 708 F.2d 582, 587 (11th Cir.1983); Southwest Airlines Company v. Texas International Airlines, Inc., 546 F.2d 84, 95 (5th Cir.), cert. denied, 434 U.S. 832, 98 S. Ct. 117, 54 L. Ed. 2d 93 (1977); Aerojet-General Corporation v. Askew, 511 F.2d 710, 719 (5th Cir.), appeal dismissed and cert. denied, 423 U.S. 908, 96 S. Ct. 210, 46 L. Ed. 2d 137 (1975); Restatement (Second) of Judgments §§ 30-32, 34, 39-41.

[11] See also Aerojet-General Corporation, supra.

[12] See 1975 Alabama Code §§ 16-3-1 through 16-3-37, 16-23-1 through 16-23-23.

[13] See 1975 Alabama Code §§ 16-8-1 through 16-8-43, 16-13-1 through XX-XX-XXX; see also Lee v. Macon County Board of Education, 267 F. Supp. 458, 479 (M.D.Ala.) (three-judge court) (per curiam), aff'd sub nom., Wallace v. United States, 389 U.S. 215, 88 S. Ct. 415, 19 L. Ed. 2d 422 (1967) (per curiam); Opinion of the Justices, 276 Ala. 239, 160 So. 2d 648, 650-51 (1964).

[14] See also Zenith Radio Corp. v. Hazeltine Research, Inc., 401 U.S. 321, 347, 91 S. Ct. 795, 810, 28 L. Ed. 2d 77 (1971) ("The straightforward rule is that a party releases only those other persons whom he intends to release.").

[15] 1B J. Moore, J. Lucas & T. Currier, Moore's Federal Practice ¶ 0.444[3] & n. 5 (1988) (discussing cases); see also Lawlor v. National Screen Service Corporation, 349 U.S. 322, 327, 75 S. Ct. 865, 868, 99 L. Ed. 1122 (1955) ("It is likewise true that the [consent] judgment was unaccompanied by findings and hence did not bind the parties on any issue").

[16] See also Kaspar Wire Works, Inc. v. Leco Engineering and Machine, Inc., 575 F.2d 530, 539-40 (5th Cir.1978); 1B J. Moore, J. Lucas & T. Currier, Moore's Federal Practice ¶ 0.444[1] (1988).

[17] See also Jones v. Gerwens, 874 F.2d 1534, 1539 n. 8 (11th Cir.1989).

[18] Richardson argues in her brief that the school board may not prevail by offering as a legitimate reason one that did not, in fact, motivate it at the time of its decision. She then maintains that, because the school board did not consider her teaching experience, education, and ability at the time it decided not to renew her contract, it may not now refer to such qualities, or lack thereof, in proving it would not have reemployed her even if she had passed the teacher certification test. Richardson's argument has some basis in the law. In Price Waterhouse, a plurality of the Court explained that

proving "that the same decision would have been justified ... is not the same as proving that the same decision would have been made." Givhan [v. Western Line Consolidated School District], 439 U.S. [410,] 416, 99 S.Ct. [693] 697 [58 L. Ed. 2d 619 (1979)], quoting Ayers v. Western Line Consolidated School District, 555 F.2d 1309, 1315 (CA5 1977). An employer may not, in other words, prevail in a mixed-motives case by offering a legitimate and sufficient reason for its decision if that reason did not motivate it at the time of the decision.

Id., ___ U.S. at ___, 109 S.Ct. at 1791 (Brennan, J., plurality opinion). It could be argued, however, that the above statement should not be applied here, first, because Price Waterhouse addressed an employer's burden in the face of direct evidence of intentional discrimination, not merely disparate impact, and, second, because the statement commanded only a plurality vote of the Supreme Court. This court need not resolve this issue because, as demonstrated later, even without the above limitation on an employer's proof, the Lamar County Board of Education has not convinced the court that it would have declined to reemploy Richardson even if she had possessed a permanent teaching certificate.

[19] In any event, even without this "admission" from the superintendent, the court would reach the same result.

[20] Richardson has a B.S. degree.

[21] These teachers, with their years of experience noted, are Tina Rye (1), Geraldine Abrams (2), Linda Gault (2), Melissa Smith (1), Alice Wriley (1), Teresa Price (1), and Linda Shivers (3). Richardson had, as stated, three years' experience in the Lamar County School System.

[22] These teachers are Jewelene Hannah (3) and Janet Baxter (1).

[23] These teachers, with their total years of experience noted, are Tina Rye (1), Geraldine Abrams (5), Melissa Smith (1), Alice Wriley (2), Teresa Price (3.5), and Jewelene Hannah (3). Richardson had five years "overall" teaching experience.

[24] Three of the nine teachers listed in notes 21, 22, and 23, supra, are African-Americans: Geraldine Abrams, Alice Wriley, and Jewelene Hannah. The other six are white. The court has not considered the race of these teachers to be critical. The issue is not whether Richardson would have been reemployed in spite of her race; the court has found that Richardson's race was not the reason behind the school board's decision not to reemploy her. Therefore, a comparative inquiry into the qualifications and abilities of Richardson and the white teachers would not be relevant. The issue instead is whether the school board would have reemployed Richardson even if she had possessed a permanent teaching certificate; and, accordingly, the relevant inquiry should be a comparison of Richardson's qualifications and abilities with those of teachers with permanent teaching certificates.

[25] In its post-trial brief, the school board mentions that in a supplemental deposition taken on March 9, 1988, the school system's superintendent compared Richardson's teaching ability with that of the teachers who were reemployed. This deposition was never filed with the court, however; the only supplemental testimony the court has received from the superintendent is an affidavit dated February 3, 1988. The only evidence, therefore, the court has regarding teaching qualifications consists of teaching experience and education.

The court also recognizes that Richardson was on the edge of gaining tenure when the school board refused to reemploy her. Since the school board has not suggested that this fact would have been significant or critical to its decision about Richardson, the court has not addressed it in the body of its opinion. In any event, another teacher, Jewelene Hannah, who was about to gain tenure and who had the same education as Richardson, but who had less overall experience than she had, was reemployed for 1986-87 and thus acquired tenure.

[26] See also Dothard v. Rawlinson, 433 U.S. 321, 97 S. Ct. 2720, 53 L. Ed. 2d 786 (1977); Maddox v. Claytor, 764 F.2d 1539, 1553-56 (11th Cir.1985); Eastland v. Tennessee Valley Authority, 704 F.2d 613, 619 (11th Cir.1983), cert. denied, 465 U.S. 1066, 104 S. Ct. 1415, 79 L. Ed. 2d 741 (1984).

[27] If the formula to calculate a standard deviation set forth in Hameed v. International Association of Bridge, Structural and Ornamental Iron Workers, Local Union No. 396, 637 F.2d 506, 512-14 & nn. 8 & 9 (8th Cir.1980), is used, the difference between the expected value and the observed value is over 35 standard deviations.
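
For illustration only, the kind of calculation this note refers to can be sketched as follows. The sketch assumes that the formula is the ordinary binomial one, in which the standard deviation is the square root of the product of the sample size, the expected proportion, and one minus that proportion. The function name and the figures in the example are hypothetical and are not drawn from the record.

```python
import math


def standard_deviations(n, p, observed):
    """Return how many binomial standard deviations separate the
    expected count (n * p) from the observed count.

    n        -- number of persons in the sample (hypothetical input)
    p        -- proportion expected absent any disparity (hypothetical input)
    observed -- count actually observed (hypothetical input)
    """
    expected = n * p
    # Assumed binomial standard deviation: sqrt(n * p * (1 - p)).
    sd = math.sqrt(n * p * (1 - p))
    return (expected - observed) / sd


# Hypothetical figures: 100 persons, 50 expected, 40 observed.
example = standard_deviations(100, 0.5, 40)
```

With these hypothetical figures the expected value is 50 and the assumed standard deviation is 5, so the observed value of 40 lies two standard deviations below the expected value.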

[28] Using the Hameed formula, the difference between the expected value and observed value is over 39 standard deviations. See note 27, supra.

[29] Board members anticipated that the test requirement would have an adverse impact on African-American applicants for teaching certificates. However, the same decision would have been reached without consideration of that factor. The board's action was predicated on a legitimate concern for improving the quality of education in Alabama.

[30] A stratified random sampling technique was employed to select survey respondents and a fair cross section of teachers was generally achieved.

[31] The test developer did not convene separate panels of minority educators at any stage of the item review or content validity process to screen items for possible bias.

[32] Although the cut scores in the special education area were intended to serve as an upper limit, the cut scores on five of those examinations were actually raised to 80 to achieve uniformity.

[33] The term APA Standards is a shorthand reference for the American Educational Research Association, American Psychological Association, National Council on Measurement in Education, Standards for Educational and Psychological Testing (1985).

[34] The court does not agree that the test developer's multi-step test development process was inherently self-correcting. There is substantial support in the record for the view that errors at one step not only survived the next step, but also created new errors.

Moreover, the fact that a validity study for the National Teachers Examination was upheld in United States v. South Carolina, 445 F. Supp. 1094 (D.S.C.1977), aff'd 434 U.S. 1026, 98 S. Ct. 756, 54 L. Ed. 2d 775 (1978), does not mandate the same result here. The validity of the present examinations must be assessed on the basis of evidence now before the court. Cf. York v. Alabama State Board of Education, 581 F. Supp. 779, 786 (M.D.Ala.1986) ("tests are not valid or invalid per se ...; the fact that the validity of a particular test has been ruled upon in prior litigation is not necessarily determinative in a different factual setting").

[35] The court must point out that three of Richardson's arguments with respect to the 1981 cut scores clearly lack merit. First, she asserts that Nassiff's 1978 "Two-Choice Angoff" method for yielding an original cut score was and is "without professional endorsement." However, professional literature published well before the initiation of Alabama's testing program endorsed methodologies similar to Nassiff's approach. See R. Thorndike, Educational Measurement at 514-515 (1971). Moreover, while current professional literature does not grant Nassiff's method the highest possible marks, it certainly does not condemn it as being wholly outside the bounds of professional judgment. See Berk, A Consumer's Guide to Setting Performance Standards on Criterion Referenced Tests, 56 Rev. of Educ. Research 137, 148 (1986). Second, Richardson contends that Nassiff's method was largely unproven and that an alternative cut-score methodology should have been used at the same time as a backup. While the court agrees that this might have been advisable, there is no evidence that the failure to use a backup cut-score method was unprofessional. Third, Richardson argues that the test developer's recent adoption of a more sophisticated cut-score methodology signals the bankruptcy of Nassiff's 1981 method. The court does not agree. The fact that, with new developments in the field, the test developer later changed its methodology should not be held against it as an admission of error.

[36] It is argued that the binomial twist was, in fact, implemented in consultation with the State Department of Education, and that such consultation somehow injects the judgment of Alabama educators into the cut-score process. However, the evidence is clear that the developer never consulted any official at the State Department of Education with respect to the binomial twist. In fact, the State Department was not advised of that twist until shortly before trial.

[37] The court recognizes that it has focussed not so much on the early childhood education and elementary education examinations, but on the Alabama Initial Teacher Certification Test as a whole. The court has done this because the history of the two examinations challenged by Richardson is the same as the history of the teacher certification test as a whole; the conclusions reached by the court regarding the certification test are also applicable to the two challenged examinations. Moreover, in order to appreciate fully the invalidity of the two challenged examinations, one must also understand just how bankrupt the overall methodology used by the State Board and the test developer was.

The court also recognizes that it has focussed on the development and implementation of several individual examinations which have not been challenged by Richardson. The court has included these examinations as additional evidence of the invalidity of the State Board and test developer's overall methodology.