Police Officers for Equal Rights v. City of Columbus

KENNEDY, Circuit Judge.

Police Officers for Equal Rights (POER), appellants in this Title VII employment discrimination case, are a class of black police officers employed by the City of Columbus, Ohio. On January 8, 1985, Judge Duncan issued an opinion and order finding that appellees, the City of Columbus and a number of its employees, had discriminated against members of the plaintiff class in the areas of promotions, assignments, transfers and certain other terms and conditions of employment in violation of Titles VI and VII of the Civil Rights Acts. Police Officers for Equal Rights v. City of Columbus, 644 F.Supp. 393 (S.D.Ohio 1985) (City of Columbus I). Judge Duncan found that appellees’ 1976, 1978, and 1982 sergeant examinations had a disparate impact on black officers and that those examinations had not been sufficiently shown to be job related. The court made no such finding with respect to the promotional examinations for the ranks of lieutenant and captain, however, citing the lack of evidence concerning the effect of those examinations because of the almost complete absence of black officers in the upper ranks.

Judge Graham was then called upon to fashion a remedy for the discrimination found by Judge Duncan and thereafter issued a series of orders designated as Interim Orders 1 through 17. The court ordered affirmative race-conscious relief in the rank of sergeant, requiring the City to fill half of all existing vacancies in that rank with qualified black officers and to make future promotions to sergeant at the rate of one black officer and one white officer until a goal of 14.9% of black sergeants, proportional to the percentage of black police officers, had been achieved. The court declined to order affirmative relief in the ranks of lieutenant or captain because there had been no finding that the promotional examinations for those ranks had discriminated against black candidates, and because the limited pool of black candidates left little or no discretion in the selection procedure or in determining relative qualification. The court concluded that the best remedy for black underrepresentation in the ranks above sergeant would come through the eventual promotion of the additional black sergeants appointed pursuant to the court’s order.

In Interim Order Number 12, the court established a procedure for the review of future promotional examinations found to have an adverse impact on blacks:

If any future promotional examination has an adverse impact on blacks, the examination will be reviewed by the plaintiffs’ and defendants’ experts pursuant to this Order to determine whether it is job related in accordance with the standards of the Uniform Guidelines on Employment Selection Procedure, (29 C.F.R. 1607).

Pursuant to the provisions of this Order, the court was provided with a report from the Columbus Civil Service Commission on May 24, 1989 reporting that the results of the 1989 lieutenant promotional examination revealed that the examination had an adverse impact on black candidates. On June 22, 1989, appellants’ counsel advised the court that they had retained an expert to evaluate the job relatedness of the examination. On July 19, 1989, the court granted the motion of the intervenor Capital City Lodge No. 9, Fraternal Order of Police (FOP), and permitted it to participate in the proceedings. The FOP is the collective bargaining representative of all sworn employees of the Columbus Division of Police with the exception of the chief and five deputy chiefs.

On February 5, 1990, the District Court found that “plaintiffs did not sustain their burden of proving that the examination was not job related or that there was some other test or selection device without an adverse impact which would also serve the defendants’ legitimate interests.” Appellants contend that the trial court erred in finding that the 1989 lieutenant examination was job related. Appellants also challenge the trial court’s decision denying affirmative relief in the upper ranks.

We AFFIRM the judgment of the District Court.

I.

The District Court provided an extensive discussion of the background of the 1989 lieutenant examination. The court noted that in 1986, in the aftermath of the liability decision, the City hired a consultant to review completely the procedures employed by the Columbus Civil Service Commission (CCSC) in developing promotional examinations for the Division of Police and to help it design a state of the art promotional examination. The consultant selected was Frank J. Landy, Ph.D., of Landy, Jacobs & Associates, Inc., located in State College, Pennsylvania. Dr. Landy is a nationally recognized expert in employment testing with extensive experience in preparing entry level and promotional examinations for public safety organizations. Dr. Landy guided the City in the development of promotional examinations for the ranks of sergeant, lieutenant and captain which were given in 1986-87. In preparing those examinations, Dr. Landy received substantial advice from Dr. Joseph Craney of Bowling Green State University, appellants’ expert in the liability phase of this case. Dr. Craney made several suggestions that were accepted by Dr. Landy and the CCSC. A new format was developed for the 1986-87 examinations, which included open-book and closed-book written examinations as well as a work sample exercise and an oral exercise.

The 1989 police lieutenant examination was prepared by the uniform testing unit of the CCSC under the supervision of Dr. Landy. The staff assigned to the test consisted of six individuals and was headed by Dr. S. David Kriska. The staff included one individual with a Ph.D., three individuals with master’s degrees in psychology, and two individuals with bachelor’s degrees. Pursuant to the provisions of the Collective Bargaining Agreement, the FOP’s testing expert, Bonnie A. Sandman, Ph.D., of the firm of Smith, Sandman & McCreery, had a consulting role with the CCSC in the preparation of the examination. Drs. Landy and Sandman both reviewed the examination to evaluate its validity or job relatedness.

In developing the 1989 lieutenant examination, the CCSC used many of the concepts developed by Dr. Landy in conjunction with the 1986-87 examinations. The CCSC also used some of the underlying data used by Dr. Landy in constructing the earlier examinations.

The examination consisted of four parts; each part was worth 25% of an applicant’s overall score. Parts I and II consisted of multiple choice questions. Part I was a closed-book test and Part II was an open-book test. The knowledge tested in these two parts was derived from three sources: division directives, division policies, and an outside textbook entitled Introduction to Police Administration by Sheehan and Cordner. Questions derived from the Sheehan and Cordner text constituted one-third of the questions on the closed-book portion of the exam. Part III of the examination consisted of three sub-components: an in-basket test (comprising 70% of the Part III score); a letter writing test (comprising 20% of the Part III score); and a letter review test (comprising 10% of the Part III score). In the in-basket portion of Part III, candidates were presented with twenty-two items, such as transfer requests, division memos and letters from the public, and were instructed to indicate what their responses and reactions would be with respect to those items. Candidates were also instructed to make priority judgments for each of the items. Candidates were allowed two and one-half hours in which to complete this test. Part IV of the examination was an oral test. This portion of the examination did not result in an adverse impact.
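The weighting described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the opinion gives the weights (25% per part; 70/20/10 within Part III) but not the scoring scale or the Commission's actual computation, so the function names, the dictionary keys, and the assumption that every part and sub-component is scored on a common 0-100 scale are all hypothetical.

```python
# Weights taken from the opinion; all names and the 0-100 scale are assumptions.
PART_WEIGHTS = {"I": 0.25, "II": 0.25, "III": 0.25, "IV": 0.25}
PART_III_WEIGHTS = {
    "in_basket": 0.70,
    "letter_writing": 0.20,
    "letter_review": 0.10,
}


def part_iii_score(sub_scores):
    """Combine the three Part III sub-component scores using the 70/20/10 split."""
    return sum(PART_III_WEIGHTS[name] * sub_scores[name] for name in PART_III_WEIGHTS)


def overall_score(part_scores):
    """Combine the four part scores, each counting for 25% of the total."""
    return sum(PART_WEIGHTS[part] * part_scores[part] for part in PART_WEIGHTS)


def in_basket_share_of_total():
    """Fraction of the overall score attributable to the in-basket test alone."""
    return PART_WEIGHTS["III"] * PART_III_WEIGHTS["in_basket"]
```

Under these assumptions the in-basket test alone accounts for 0.25 x 0.70 = 17.5% of a candidate's overall score, and Parts I through III, the portions challenged at the hearing, account for 75%.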

Examination scores were ranked in order of performance. In accordance with the city charter, a rank order selection process was employed in order to promote applicants to the position of lieutenant.

II.

Appellants argue that appellees failed to produce evidence that the 1989 lieutenant examination was content valid and that appellants in fact proved that the examination was not job related. Appellants contend that the trial court misallocated the burden of proof in reaching its conclusion.

The United States Supreme Court enunciated how the burdens of proof are to be set forth in a disparate impact case such as the present one in its recent decision, Wards Cove Packing Co. v. Atonio, 490 U.S. 642, 109 S.Ct. 2115, 104 L.Ed.2d 733 (1989). First, the plaintiff must show that a facially neutral employment practice has a significant adverse impact on a protected group. Id. at 2125. Once the plaintiff has established a prima facie case of adverse impact, the burden shifts to any business justification the employer offers for its use of the employment practice. “This phase of the disparate-impact case contains two components: first, a consideration of the justifications an employer offers for his use of these practices; and second, the availability of alternate practices to achieve the same business ends, with less racial impact.” Id. The Court noted in Wards Cove:

The touchstone of this inquiry is a reasoned review of the employer’s justification for his use of the challenged practice. A mere insubstantial justification in this regard will not suffice, because such a low standard of review would permit discrimination to be practiced through the use of spurious, seemingly neutral employment practices. At the same time, though, there is no requirement that the challenged practice be “essential” or “indispensable” to the employer’s business for it to pass muster: this degree of scrutiny would be almost impossible for most employers to meet, and would result in a host of evils....

Id. at 2126.

The Court then stated that “[i]n this phase, the employer carries the burden of producing evidence of a business justification for his employment practice. The burden of persuasion, however, remains with the disparate-impact plaintiff.” Id.

Finally, if on remand the case reaches this point, and respondents cannot persuade the trier of fact on the question of petitioners’ business necessity defense, respondents may still be able to prevail. To do so, respondents will have to persuade the fact-finder that “other tests or selection devices, without a similarly undesirable racial effect, would also serve the employer’s legitimate [hiring] interest[s];” by so demonstrating, respondents would prove that “[petitioners were] using [their] tests merely as a ‘pretext’ for discrimination.”

Id. (quoting Albemarle Paper Co. v. Moody, 422 U.S. 405, 425, 95 S.Ct. 2362, 2375, 45 L.Ed.2d 280 (1975)).

Appellants argue that the trial court misallocated the burdens of proof in the present case. They contend that the trial court believed it was compelled to amend Section III of its Second Amended Interim Order No. 12 as a result of the Supreme Court’s decision in Wards Cove.

The title of Section III of the trial court’s Interim Order No. 12 reads as follows:

IF PROMOTIONAL EXAMINATIONS HAVE AN ADVERSE IMPACT ON BLACKS, DEFENDANTS MUST SHOW THAT THEY ARE JOB RELATED

The trial court later amended Interim Order No. 12 after the Supreme Court decided Wards Cove, stating:

This title was adopted without discussion from a draft of a proposed order submitted by plaintiffs’ counsel. The court did not intend that the title should have substantive effect and it is to be noted that the language of Section III does not specifically address the burden of proof to be applied in a hearing. The Court, and apparently counsel as well, assumed that the phraseology of the title of Section III correctly stated the law in the Sixth Circuit at the time the order was issued, namely, that upon showing adverse impact the burden of proof shifted to the defendant to show that the examination was job related. However, in Wards Cove ... the Supreme Court clarified the law in this area holding that a showing of adverse impact only shifts the burden of production, not the burden of proof, which remains at all time with the plaintiff.

In choosing the title for Section III of Interim Order No. 12, this Court had no intention to impose a greater burden of proof on the defendant than that required by existing law. The Court intended only to provide an accelerated vehicle for the determination of the validity of future promotional examinations which were shown to have adverse impact against black candidates.

At the pretrial conference which preceded the hearing on the validity of the 1989 lieutenants examination, the issue of burden of proof was raised and the Court indicated that Wards Cove would govern that hearing and that Interim Order No. 12 was not intended to modify the burden of proof in such a hearing. At a status conference the Court reiterated its original intentions regarding the burden of proof in a hearing held under the provisions of Section III of Interim Order No. 12 and the Court requested that the parties submit proposed modifications or amendments to that order....

This Court has the inherent authority to modify its interlocutory orders. The Court hereby modifies Interim Order No. 12 to reflect its original intentions. The title of Section III shall be amended to read as follows:

IF PROMOTIONAL EXAMINATIONS HAVE AN ADVERSE IMPACT ON BLACKS, PLAINTIFFS WILL BE AFFORDED AN OPPORTUNITY TO REVIEW AND CHALLENGE THE JOB RELATEDNESS OF THE EXAMINATIONS,[.]

Order Amending Interim Order No. 12 (citations omitted).

Appellants argue that the title of Section III placed the burden of proof on appellees, and that because this placement of the burden was part of an affirmative action order, the placement of the burden was not disturbed by the Court’s holding in Wards Cove. Appellants contend that Wards Cove involved the initial burdens of parties to show whether an employer’s actions are discriminatory under an adverse impact theory whereas in the present case, the trial court misapplied Wards Cove at the remedy phase to alter an affirmative action order.

Appellants’ argument is premised upon the contention that Wards Cove did not require the court below to disturb its earlier affirmative action order. The court did not believe it was compelled to alter its prior order, however. Rather, the court stated that it assumed that the original title of Section III reflected the state of the law of this Circuit. The court altered its order in light of Wards Cove not because it believed it was required to do so, but rather because it desired to place the burdens of proof in a manner consistent with the current state of the law.1 The court properly asserted that it had “the inherent authority to modify its interlocutory orders.” 2 The trial court could have altered the burdens of proof in light of or in spite of Wards Cove. Thus the trial court did not err in amending its order so that the burdens of proof in a hearing determining job relatedness would be placed in a manner consistent with the law set forth in Wards Cove.

III.

A.

Appellants next argue that the 1989 lieutenant examination focused only on knowledge and failed to identify, operationally define, and test for important skills and abilities of the job. Appellants assert that although knowledge is perhaps a component of a police lieutenant’s position, the supervisory skills and behaviors are a critical aspect of the job as well. Appellants further assert that a test of knowledge, even knowledge associated with a supervisory job, is not a valid test where more than knowledge is required in the performance of a job. They note that Judge Duncan explained the reasoning against tests of knowledge where skills and behaviors are required on the job. Judge Duncan discussed the difference between knowing that a car accelerates when the gas pedal is pressed down and knowing what proportions of air and gas are mixed in the carburetor in order for an engine to be engaged. City of Columbus I, 644 F.Supp. at 415. Knowing the former is necessary to drive a car; knowing the latter is not. Judge Duncan used this analogy in invalidating examinations that were “little more than a test of reading comprehension and memory.” Id. at 416.

The trial court found that the 1989 lieutenant examination was not merely a test of reading comprehension and memory. The court reasoned that a reading level analysis of the lieutenant’s job and of the examination revealed that the reading level of the job was college level and the reading level of the examination required only a tenth grade reading level. It is from these reading analyses that the court appears to have concluded that the examination was not a test of reading comprehension and memorization. Appellants are correct in their contention that an exam may have a reading level lower than the reading level of the job for which the examination is screening applicants and yet still be an exam of reading comprehension and memorization. The court, however, went on to state that appellants seem to ignore a basic truth — “namely that being able to do something requires knowing how to do it.” Appellants’ argument assumes that a test of knowledge is not a proper test. As the trial court noted, however, “ ‘knowing the reason for correct action is some indication of acting accordingly.’ ” (District Court quoting Bridgeport Guardians v. Bridgeport Police Dep’t, 431 F.Supp. 931, 938 (D.Conn.1977)). We agree with the District Court that the test was not improper merely because it was, to some extent, a test of knowledge.

Appellants also argue that the job analysis used in preparing the 1989 lieutenant examination was inadequate. Appellants state that the heart of content validity analysis3 is identification of the underlying knowledge, skills, and abilities (KSA’s) required for a job, and then designing a test that measures the applicant’s possession of those KSA’s. Appellants contend that although appellees devised a list of tasks and organized them into a set of task categories, appellees focused only on knowledge implicated in those task categories and ignored the undisputed skills and abilities required to be a police lieutenant.

The trial court found that:

[t]he City chose to base the examination on the knowledge and knowledge sources necessary to perform the job and relied on subject matter experts to identify the knowledge sources, tasks, task categories, and their relative importance to the job. This approach is consistent with content validity, whereas Dr. Lefkowitz’s views4 are more appropriate to a test based on construct validity.

District Court’s Opinion and Order (footnote added). The Uniform Guidelines do not suggest that in order to demonstrate content validity, one must show that an examination assesses knowledge, skills, and abilities. Rather, the Uniform Guidelines state:

In the case of a selection procedure measuring a knowledge, skill, or ability, the knowledge, skill, or ability being measured should be operationally defined. In the case of a selection procedure measuring a knowledge, the knowledge being measured should be operationally defined as that body of learned information which is used in and is a necessary prerequisite for observable aspects of work behavior of the job.... For any selection procedure measuring a knowledge, skill, or ability the user should show that (a) the selection procedure measures and is a representative sample of that knowledge, skill, or ability; and (b) that knowledge, skill, or ability is used in and is a necessary prerequisite to performance of critical or important work behavior(s).

29 C.F.R. § 1607.14(C)(4) (emphasis added).

This language does not foreclose the possibility of an examination being validated pursuant to a content validity study where the examination is one testing solely for knowledge. Further, the District Court found that the 1989 lieutenant examination was not solely a test of knowledge, a finding that is not clearly erroneous. Therefore appellants’ argument that the job analysis and thus the resultant examination are invalid because they involve knowledge is without merit.

B.

Appellants also argue that the lieutenant examination does not represent5 the requirements of the position of lieutenant because the test does not measure attributes in proportion to their importance and frequency of use in the performance of the job.

In response to this argument, the District Court noted:

The Uniform Guidelines, however, speak in terms of representativeness, not proportionality. As noted earlier, representativeness is not defined by the Guidelines. The City and its experts contended that precise proportionality is impossible to achieve in a job as complex as police lieutenant where the job categories are not precisely compartmentalized but are interrelated. Based upon the testimony of the experts, the Court concludes that the degree of proportionality is largely a matter within the professional judgment of the test writer based upon the particular attributes of the job in question. Proportionality is only one aspect of representativeness, and while it may be a proper goal, precise proportionality is not a prerequisite to job relatedness. In certain instances, exact proportionality may be impossible to achieve and in any event it is not the standard by which job relatedness should be measured.

The court noted that the evidence presented revealed that the City:

[identified] the tasks involved in the job, that it rated them according to importance and frequency and in the process identified the most important task categories. It constructed a test which tested for all or nearly all of the task categories, and emphasized the most important task categories.

The court found the test sufficiently representative of the job to satisfy the requirements of content validity.

Appellants argue that testing for nearly all of the task categories and emphasizing the most important categories is insufficient. We disagree. In Guardians Association of New York City Police Department v. Civil Service Commission, 630 F.2d 79 (2d Cir.1980), cert. denied, 452 U.S. 940, 101 S.Ct. 3083, 69 L.Ed.2d 954 (1981), the court stated:

The reason for a requirement that the content of the exam be representative is to prevent either the use of some minor aspect of the job as the basis for the selection procedure or the needless elimination of some significant part of the job’s requirements from the selection process entirely; this adds a quantitative element to the qualitative requirement— that the content of the test be related to the content of the job. Thus, it is reasonable to insist that the test measure important aspects of the job, at least those for which appropriate measurement is feasible, but not that it measure all aspects, regardless of significance, in their exact proportions.

Id. at 99 (emphasis in original).

We agree with the Second Circuit that relatedness does not require precise proportionality. Thus we find that the District Court did not err in so finding.

C.

Appellants next argue that although the District Court refused to allow the parties to present evidence regarding Phase IV of the lieutenant exam, the court in its opinion on liability stated that appellants should have presented evidence regarding Phase IV to show that the test was not representative of the job. Phase IV comprised the oral portion of the examination.

In its post-decree denial of plaintiffs’ motion for a stay, the trial court stated:

Phase IV was excluded from the hearing on the issue of job relatedness because it had no adverse impact on black candidates. This did not prevent the plaintiffs from showing what knowledge, skills and abilities were tested on Phase IV in order to support their argument that the examination failed to adequately represent the job domain. In any event, this Court analyzed the issue of representativeness in the context in which plaintiffs presented their case which was essentially as though Phases I, II and III constituted the entire examination and on that basis, the Court found that the examination satisfied the requirement of representativeness.

There is no evidence to suggest that the court banned all evidence regarding Phase IV of the examination. Rather, the court merely indicated that it would not look to the issue of whether Phase IV was job related. We do not find this to be inconsistent with the court’s statement in its Opinion and Order of February 5, 1990 that “[i]t should be noted that plaintiffs’ analysis of proportionality focused only on the multiple-choice examinations and ignored the remaining half of the examination which covered various task categories in various degrees. Any meaningful criticism of proportionality should have included consideration of the knowledge tested in Phases III and IV.”

IV.

Appellants next argue that the time limit placed on the in-basket portion of Part III of the examination was a limiting factor and did not adequately simulate the job task. Appellants contend that appellees failed to comply with the following provisions of the Uniform Guidelines:

If a test purports to sample a work behavior or to provide a sample of a work product, the manner and setting of the selection procedure and its level and complexity should closely approximate the work situation.

29 C.F.R. § 1607.14(C)(4).

Establishment of time limits, if any, and how these limits are related to the speed with which duties must be performed on the job, should be explained.

29 C.F.R. § 1607.15(C)(5).

The trial court found that appellants did not demonstrate that the time limit for the in-basket test had an adverse impact on black applicants. The only support for the allegation of adverse impact presented by appellants was a comment contained in a book entitled Assessment Centers and Managerial Performance by Thornton and Byham. These authors stated, “Russell and Byham (1981) found in two studies that blacks completed fewer in-basket items than whites when held to a tight time limit.” The District Court noted that the studies were not placed into evidence and the appellants’ expert did not testify on this point. The court further noted that the evidence in the present case did not reveal that blacks completed fewer items than whites on the in-basket portion of the test. Approximately thirty percent of the candidates did not complete the last three items and the summary sheet, and this thirty percent consisted of an equal portion of blacks and whites. We find that in light of the lack of evidence supporting appellants’ contention of adverse impact, the District Court was not clearly erroneous in failing to find that an adverse impact was present. As discussed above, Wards Cove does not require this inquiry to go any further.

V.

Appellants argue that the examination overemphasized familiarity with an outside textbook, Sheehan and Cordner’s Introduction to Police Administration. One third of the items in the closed-book portion of the examination were derived from this textbook. It is undisputed that the text was not used as a knowledge source or reference material by the Division, and that the text referred to principles, procedures and terminology that were not necessarily used by the Columbus Division of Police. Appellants argue that the text is not otherwise job related.

The court found:

Defendants have made a judgment that its lieutenants should have some training in management principles as a prerequisite to advancement to that rank. It has chosen to use the lieutenants examination as the tool to accomplish such training. This is little different than requiring that each candidate for promotion to the rank of lieutenant successfully pass a course in police management principles.

The court concluded that the use of the text was appropriate and job related.

In response to appellants’ argument that the use of the text was not job related, appellees contend that the text constituted a pre-examination training program for a specific part of the lieutenant job domain and as such was valid. When asked about the usage of an outside text in the testing process, appellees’ expert Dr. Landy stated, “I think of it, the management text, as really a training program of some relatively specific part of the job.” We agree with appellants’ assertion that employers may not test for candidates’ abilities to perform in a training program unless they can show that the program is job related. See 29 C.F.R. § 1607.14(C)(7); see also Craig v. Los Angeles County, 626 F.2d 659, 662-63 (9th Cir.1980), cert. denied, 450 U.S. 919, 101 S.Ct. 1364, 67 L.Ed.2d 345 (1981).

The trial court, in concluding that the outside text was job related, stated:

The Court has reviewed the text and finds that it is a well written and informative text on management principles applicable to police departments and that it contains information which would appear to be useful to a police lieutenant in performing his or her duties. The Court has also reviewed the test items based on the text and finds that they fairly test the important principles contained in it.

We find that the trial court was not clearly erroneous in finding that the use of the Sheehan and Cordner text was appropriate and job related.

VI.

Appellants next argue that the trial court failed to require the requisite degree of representativeness or job relatedness required for rank order selection. In concluding that the examination was sufficiently reliable to justify rank ordering, the trial court considered the standards set forth in the Uniform Guidelines, the principles adopted by the Society for Industrial and Organizational Psychology, Inc. (SIOP principles), and the requirements set forth in the testimony of Dr. Landy in analyzing the appropriateness of rank ordering.

The Uniform Guidelines provide:

If a user can show, by a job analysis or otherwise, that a higher score on a content valid selection procedure is likely to result in better job performance, the results may be used to rank persons who score above minimum levels. Where a selection procedure supported solely or primarily by content validity is used to rank job candidates, the selection procedure should measure those aspects of performance which differentiate among levels of job performance.

29 C.F.R. § 1607.14(C)(9).

The SIOP principles provide:

If selection instruments measure a substantial and important part of the job reliably, and provide adequate discrimination in the score ranges involved, persons may be ranked on the basis of its results.

Dr. Landy stated that three requirements are necessary for rank ordering: (1) there must be a sufficient spread among the scores of candidates; (2) there must be composite reliability (the whole test must be reliable) and component reliability (each component must be reliable); and (3) there must be a reasonable job analysis. The trial court appears to have relied primarily upon the three requirements set forth by Dr. Landy in reaching its decision.

Appellants contend that the trial court held that appellees’ use of rank order hiring was job related where the test was content valid and “reliable” while ignoring that this Court has held that test scores must vary directly with job performance. In Williams v. Vukovich, 720 F.2d 909, 924 (6th Cir.1983), this Court stated that “[r]anking is a valid, job-related selection technique only where the test scores vary directly with job performance.” In so stating, this Court relied upon Guardians, 630 F.2d at 100, which in turn relied upon the Uniform Guidelines. The court in Guardians noted that “[i]f a test is content valid, it may be reasonable to infer that the test scores make some useful gross distinctions between candidates.... And it may even be that within some range of scores, some incremental improvements in scores show some positive correlation with improvements in job performance.” The court cautioned, however, “[C]ontent validity is not an all or nothing matter; it comes in degrees. A test may have enough validity for making gross distinctions between those qualified and unqualified for a job, yet may be totally inadequate to yield passing grades that show positive correlation with job performance.” Id. The expert testimony in the present case demonstrated that the test had “enough validity” to “yield passing grades that show positive correlation with job performance.” Id. The trial court relied upon expert testimony in finding that:

The Commission calculated the reliability of the components of the examination as follows: Phase I .86; Phase II .85; Phase III .97; Phase IV .85. Composite reliability was calculated to be .95. Based on this, Dr. Landy calculated the standard error of measurement to be 1.3 points on the reported score scale. Dr. Lefkowitz conceded that the lowest standard error of measurement he had seen in his experience was in the range of 2.0. There was a spread of more than forty points among 71 test takers. The highest score was 89.66, the passing score was 70. The top fifteen scores ranged from 78.60 to 89.66.
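The relationship between the reliability and standard error figures quoted above can be illustrated with the standard classical test theory formula, SEM = SD × sqrt(1 − reliability). The following is a minimal sketch only: the opinion does not state which formula Dr. Landy applied, and the score standard deviation is not given in the record quoted here, so the value below is merely what the formula would imply from the reported 1.3-point standard error and .95 composite reliability. The function and constant names are hypothetical.

```python
import math

# Figures reported in the trial court's finding quoted above.
COMPOSITE_RELIABILITY = 0.95
REPORTED_SEM = 1.3  # points on the reported score scale


def standard_error_of_measurement(score_sd: float, reliability: float) -> float:
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    return score_sd * math.sqrt(1.0 - reliability)


def implied_score_sd(sem: float, reliability: float) -> float:
    """Invert the SEM formula to recover the score SD it implies."""
    return sem / math.sqrt(1.0 - reliability)


# ASSUMPTION: the score SD is not stated in the opinion; this is only the
# value implied if the classical formula was the one used.
sd = implied_score_sd(REPORTED_SEM, COMPOSITE_RELIABILITY)

# Standard error of the difference between two candidates' scores,
# assuming both scores carry the same, independent measurement error.
sed = REPORTED_SEM * math.sqrt(2.0)

print(f"implied score SD: {sd:.2f}")
print(f"standard error of a score difference: {sed:.2f}")
```

Under these assumptions, the measurement error attached to the difference between two candidates' scores is under two points, which is the kind of quantity at issue when experts debate whether adjacent scores should be ranked or "banded."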

Appellees’ experts, Drs. Landy and Kriska, agreed that the reliability figures for the examination warranted rank-ordering under the SIOP principles. Appellants’ expert, Dr. Lefkowitz, testified that appellees did not document the degree of reliability necessary to justify rank ordering. Dr. Lefkowitz performed calculations that he believed demonstrated that the margin of error on the test may be as high as nine points. He thus suggested that all candidates who score within nine points of each other be “banded” and treated as if each had attained the same score. Appellees argue that Dr. Lefkowitz used intercorrelation statistics rather than reliability statistics. Drs. Landy, Sandman and Kriska each criticized Dr. Lefkowitz’s reliability calculations. The trial court stated:

It appears that Dr. Lefkowitz used a formula for measuring composite reliability of an examination based on the reliability of the parts but inserted intercorrelation data where the formula called for reliability data. At trial, he attempted to correct his calculations but committed further errors. The Court cannot accept Dr. Lefkowitz’s calculations of composite reliability.

We find that the trial court was not clearly erroneous in accepting the testimony of Drs. Landy, Sandman and Kriska on the issue of reliability6 and rank order scoring and thus in finding that the examination was sufficiently reliable to justify rank order scoring. The district judge is entitled in questions of this kind which require expert opinion to rely on that opinion.

VII.

Appellants also contend that the trial court erred in not granting relief to blacks in the upper ranks of the Columbus Division of Police. In 1987 the trial court declined to order the relief requested by appellants in promotions to the positions of lieutenant and captain, stating:

The Court has declined to grant affirmative relief in the matter of promotions to lieutenant and captain because the limited pool of candidates leaves little or no discretion in the selection procedure or in determining relative qualifications. In the case of captain, there is only one candidate who has had experience in the next lower rank. In the case of lieutenant, there is only one candidate who was sufficiently interested in promotion to take the last examination.
The Court concludes that the best remedy for black underrepresentation in the ranks above sergeant will come through the eventual promotion of many of the additional black sergeants who will become sergeants pursuant to this order and the eventual promotion of other black officers who achieve promotion to sergeant through the fair and non-discriminatory procedures ordered by Judge Duncan.

(Interim Order No. 5, May 28, 1987).

Appellants then sought to enjoin promotions based on the results of the 1986-87 lieutenant and captain promotional examinations and sought affirmative race-conscious promotions to these ranks until parity was reached. The trial court again denied this remedy, stating:

[T]he Court continues to feel that the best remedy for black underrepresentation in the ranks above sergeant will come through the eventual promotion of black sergeants promoted pursuant to this Court’s remedy order and the eventual promotion of other black officers who achieve promotion to sergeant through fair and non-discriminatory examinations as ordered by this Court’s predecessor.

(Interim Order No. 9, 1987).

Appellees contend that appellants did not file an appeal from the denials of injunctive relief in Interim Orders 5 and 9 or from the trial court’s denial of affirmative race-conscious promotional relief above the rank of sergeant following the trial court’s Final Order. Further, appellees contend that although appellants challenged the 1989 lieutenant promotional examination on the ground of job relatedness, appellants did not seek injunctive relief in the form of affirmative race-conscious promotions as part of the job-relatedness hearing. Thus appellees argue that the issue of denial of affirmative race-conscious relief above the rank of sergeant was not appealed in a timely manner and is not properly before this Court. We agree. Appellants did not appeal Interim Orders 5 and 9 within the permissible time period. Further, these orders do not involve a request for relief in light of the results of the 1989 lieutenant examination. Appellants inform us that on March 5, 1990, they filed a motion requesting affirmative action in the upper ranks after the trial court held that the 1989 lieutenant examination was job related. The trial court has not, to our knowledge, ruled upon the motion. Appellants note that if the court grants their motion, their appeal of this issue will become moot. If the court denies appellants’ motion, they will have the opportunity to appeal the court’s order. Thus we await further action regarding appellants’ motion before considering this issue.

For the reasons discussed above, we AFFIRM the February 5, 1990 order of the District Court.

. When the court denied appellants’ motion for injunctive relief pending appeal, the court stated:

Regardless, as the Court pointed out in its opinion and order of February 5, 1990, it was not intended that Interim Order No. 12 in any way alter the burden of proof in a hearing arising under the provisions of Interim Order No. 12.... If the Court had been asked to modify the burden of proof as part of an affirmative action order, it would have declined to do so.

Because the court has the power to alter its order, we need not determine whether the court's initial order was at odds with Wards Cove.

. Appellants attempt to rely on United States v. City of Buffalo, 721 F.Supp. 463 (W.D.N.Y.1989), for the proposition that Wards Cove did not require the trial court to amend its prior order. The court in City of Buffalo, however, stated, "Wards Cove ... [did not alter] ... the broad power of federal district courts to implement relief that operates both retrospectively to redress past discrimination and prospectively to ensure that it does not recur.” Id. at 467. It is this power that permits the court to alter its order in light of Wards Cove.

. The Uniform Guidelines describe three methods that may be used to validate an examination: (1) content validation, (2) construct validation, and (3) criterion-related validation. 29 C.F.R. § 1607.5(B). The City chose to construct the 1989 lieutenant examination based upon content validity.

. Dr. Lefkowitz stated:

[Task and task categories] play some role [in constructing an examination when a content validity strategy is used], but they really ought to be playing a much smaller role than they played here because what was not done in the job analysis here, which we haven’t really talked about much, is the job development of the particular items of knowledge and abilities that are requisite in order to perform those specific tasks and cumulative tasks within each category, because that's what you want to focus on. You are assessing people. You are assessing people for promotion. People don’t have tasks, they have attributes. People either have requisite knowledge or they don’t. People are either able to perform certain operations or they are not. That's what you are assessing.
So you are assessing people’s knowledge, skills, abilities, so-called KSAs. So that’s why in a job analysis you should be defining the job in terms of those KSAs, so that you can then develop a test specification as to what KSAs you should be looking at, so that you can then develop a testing procedure that adequately represents those KSAs.

. The Uniform Guidelines state:

To demonstrate the content validity of a selection procedure, a user should show that the behavior(s) demonstrated in the selection procedure are a representative sample of the behavior(s) of the job in question or that the selection procedure provides a representative sample of the work product of the job.

29 C.F.R. § 1607.14(C)(4) (emphasis added).

. For discussion of reliability see Guardians Ass’n of New York City v. Civil Serv. Comm’n, 630 F.2d 79, 100-06 (2d Cir.1980), cert. denied, 452 U.S. 940, 101 S.Ct. 3083, 69 L.Ed.2d 954 (1981).