Vulcan Pioneers, Inc. v. New Jersey Department of Civil Service

*531 OPINION

SAROKIN, District Judge.

INTRODUCTION

This case challenges the validity of civil service examinations administered on numerous occasions over a considerable period of time for promotional positions in fire departments throughout New Jersey. Although to a large extent the proofs in the case are predicated upon statistics, it is important not to lose sight of the fact that we are here dealing with individuals whose employment futures will be seriously affected by this matter. On the one hand, there are those who have taken and passed the test who now wait patiently for implementation of its results, particularly those who rank high on the resulting lists and are on the brink of appointment. On the other hand, there are minorities who the government contends have been discriminated against and thereby deprived of the same promotional opportunities. Finally, there is the public, which has a right to expect that responsible positions of leadership in the fire departments will be filled expeditiously by persons clearly qualified to supervise and perform the highly important and courageous work of protecting life and property.

There can be little doubt that the failure to have minorities in responsible positions of supervision is a direct result of historic discriminatory practices. Earlier failures to appoint minority firefighters at the entry level obviously affect the current availability of such minorities for appointment to higher positions. That limitation, in turn, affects the statistical basis upon which the court’s analysis must focus. It is therefore unfortunate that, despite the requirements of the consent decree and the warnings of the Justice Department to the State regarding the claimed invalidity of the test, the test has been offered so many times and has raised the expectations of so many. Those circumstances now require the court to resolve these highly complex and sensitive issues.

PROCEDURAL POSTURE

This matter is before the court on the May 7, 1984 motion of plaintiff United States of America to enforce the May 30, 1980 Consent Decree entered into between the parties. Plaintiff’s notice of motion requested the court “to enjoin the State defendants and the defendant cities from using eligibility lists [for promotion] based upon the present examination and to require the State to use a test in the future which is job-related and consistent with the Uniform Guidelines on Employee Selection Procedures, 28 C.F.R. 50.14, or which has no adverse impact upon minorities.” At issue is the validity of examinations administered by the State of New Jersey for the first level supervisory rank (i.e., fire captain or lieutenant) since entry of the decree in the twelve defendant municipalities: Atlantic City, Camden, East Orange, Elizabeth, Hoboken, Jersey City, New Brunswick, Newark, Passaic, Paterson, Plainfield and Trenton.

The procedural history of this matter is traced in the court’s Opinions of October 1, 1982, Vulcan Pioneers, Inc. v. New Jersey Department of Civil Service, Civil Action No. 81-281, unpub. op. at 2-5 (D.N.J. Oct. 1, 1982), and May 3, 1984, Vulcan Pioneers, Inc. v. New Jersey Department of Civil Service, 588 F.Supp. 716, 719-20 (D.N.J.1984). That history culminated, for purposes of this motion, in the May 30, 1980 Consent Decree. That decree does not contain findings of discrimination against defendants, but does require that defendants

refrain from engaging in any act or practice which has the purpose or effect of unlawfully discriminating against any black or Hispanic employee of, or any black or Hispanic applicant for employment with their respective fire departments because of such individual’s race, color, or national origin. Specifically the defendants shall not discriminate against any such individual in hiring, assignment, training, discipline, promotion or discharge because of race, color or national origin.

*532 Consent Decree ¶ 1. To this end, the Decree, inter alia, barred the use of certain examinations for purposes of making appointments to the rank of firefighter, ¶ 2, and set forth numerical goals for the appointment of minority firefighters in each of the defendant municipalities, ¶ 3. With respect to the issue of promotions, the Decree required the defendant State of New Jersey to

review the composition of the current selection process for appointment to ranks above the level of firefighter to ensure job relatedness and with the goal of eliminating adverse impact on black or Hispanic applicants in accordance with Title VII of the Civil Rights Act of 1964, as amended, and the Guidelines issued thereunder. As part of this review, the State defendants shall conduct a thorough job analysis of each fire department promotional classification for defendant cities in a manner consistent with the Uniform Guidelines of Employee Selection Procedures, 28 C.F.R. 50.14, and other professionally accepted standards and shall establish cut-off scores, examination components and the duration of eligibility lists in a manner consistent with the results of the job analyses, to assure qualified candidates and with the goal of eliminating adverse impact on black and Hispanic applicants.

¶ 7(a). Minimum time-in-grade requirements were set forth, ¶ 7(b), and certain reporting obligations imposed upon the defendant State. ¶ 7(c). Finally, the Decree states:

Should plaintiff United States, at any stage of the process set forth ... above, or thereafter, determine that the promotional selection process will have the purpose or effect of discrimination against black or Hispanic applicants, plaintiff shall notify the applicable State and municipal defendants, and the affected parties shall meet within a reasonable period to discuss resolution of the matter. If the parties fail to resolve the matter, any affected party may move the Court for resolution. If an objection is made by plaintiff, no persons shall be certified for appointment pending resolution by the Court.

¶ 8. The parties agree, and the court finds, that plaintiff met the notice requirement of this paragraph, see Plaintiff’s Exh. 103, and that efforts informally to resolve this matter have failed. As a result, and after extensive discovery, the court held an exhaustive hearing on this matter, commencing February 6, 1985. At the close of plaintiff’s case, the defendant State of New Jersey and certain intervenors moved for a directed verdict; after briefing, the court reserved judgment on the matter, and addresses it herein.

FINDINGS OF FACT

A. The Selection Procedure

The defendant, State of New Jersey, through the President of the Civil Service Commission, is responsible for determining the components of the selection process for entry and promotional positions in the fire departments of the defendant municipalities, the content of and weight assigned to each such component, and whether the components are to be used on a rank order or pass/fail basis. The administration of the responsibilities of the President of the Civil Service Commission is carried out through the New Jersey Department of Civil Service, which prepares and administers examinations and promulgates eligibility lists for promotion to all uniformed fire department ranks above the entry level rank of firefighter for the fire departments in the twelve defendant municipalities herein. The defendant municipalities, in turn, are required to use the selection procedures prescribed by the State in making appointments to first level fire supervisory ranks. See generally N.J.S.A. 11:1-1, et seq.; N.J.A.C. 4:1-1, et seq.

The defendant municipalities have established the first level supervisory rank in their respective fire departments as follows:

Atlantic City — Fire Captain

Camden — Fire Captain

East Orange — Fire Captain

*533 Elizabeth — Fire Captain

Hoboken — Fire Captain

Jersey City — Fire Lieutenant until 3/26/84; Fire Captain since 3/26/84

Newark — Fire Captain

New Brunswick — Fire Lieutenant until 12/3/81; Fire Captain since 12/3/81

Passaic — Fire Lieutenant

Paterson — Fire Captain

Plainfield — Fire Lieutenant

Trenton — Fire Captain

The State has determined that, for these cities, the supervisory ranks of fire lieutenant and fire captain are equivalent positions and, since at least May 1980, has used the same selection procedure for the two ranks. Indeed, where examinations have been given for both ranks on the same day, the State has generally used substantially the same written examination for the two ranks.1

In the defendant cities, appointments to ranks above the level of firefighter are made from the next lowest rank within the fire department. By July 10, 1979 order of the New Jersey Civil Service Commission, the State determined that three years service as a firefighter was adequate experience for firefighters seeking promotion to the first level supervisory rank in all Civil Service jurisdictions; since that date this has been the minimum time-in-grade requirement used by all jurisdictions.

For at least the period subsequent to May 1, 1975, the State has determined the rank order of applicants for the first level supervisory ranks of fire captain and fire lieutenant within a defendant municipality in the following manner:

(a) A cut-off score is established based upon the results of a written, multiple-choice, job knowledge examination administered by the State. All applicants with a score at or above the cut-off are deemed to have passed the examination. All individuals with lower scores are deemed to have failed.

(b) A “final average score” is computed for each individual who passed the written examination.

(c) Twenty percent (20%) of the “final average score” is attributed to the applicant’s seniority and service record.

(d) Seniority is computed on the basis of one point per year to a maximum of 85.000 (15 points added to the 70.000 points with which all candidates are credited). Applicants receive an additional ten points for efficiency and record of service and five points for a national commendation as a firefighter. Points (to a maximum of 10) are deducted from the efficiency and record of service award if an applicant has been suspended for disciplinary reasons.

(e) Eighty percent (80%) of the “final average score” is attributed to the score achieved by the applicant on the written examination.

(f) Applicants who pass the written examination are then ranked on the “eligible roster” in descending order based upon their “final average score.”
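The scoring steps in (a) through (f) can be sketched in code. This is only an illustrative reading, not the State's actual procedure: it assumes the written score is on a 0 to 100 scale, that the seniority and service record component is the sum of the 70-point base, up to 15 seniority points, up to 10 efficiency points (less any disciplinary deduction) and 5 commendation points, and that ties are left in input order. All function and field names are hypothetical.

```python
# Illustrative sketch of the ranking procedure described in (a)-(f).
# Assumptions (not stated in the opinion): 0-100 written scale, additive
# record component, and no special tie-breaking rule.

BASE_CREDIT = 70.0           # points credited to every candidate
MAX_SENIORITY_POINTS = 15.0  # one point per year, capped (70 + 15 = 85)
EFFICIENCY_POINTS = 10.0     # efficiency and record of service
COMMENDATION_POINTS = 5.0    # national commendation as a firefighter
WRITTEN_WEIGHT = 0.80
RECORD_WEIGHT = 0.20


def seniority_and_record_score(years, suspension_deduction=0.0,
                               national_commendation=False):
    """Seniority and service record component (assumed maximum of 100)."""
    seniority = min(max(years, 0.0), MAX_SENIORITY_POINTS)
    # Deductions for disciplinary suspensions are capped at 10 points.
    deduction = min(max(suspension_deduction, 0.0), EFFICIENCY_POINTS)
    commendation = COMMENDATION_POINTS if national_commendation else 0.0
    return BASE_CREDIT + seniority + (EFFICIENCY_POINTS - deduction) + commendation


def final_average_score(written, record):
    """80% written examination, 20% seniority and service record."""
    return WRITTEN_WEIGHT * written + RECORD_WEIGHT * record


def eligible_roster(candidates, cutoff):
    """Drop candidates below the cut-off, then rank by final average score."""
    passed = [c for c in candidates if c["written"] >= cutoff]
    scored = [dict(c, final=final_average_score(c["written"], c["record"]))
              for c in passed]
    return sorted(scored, key=lambda c: c["final"], reverse=True)
```

Under these assumptions, a candidate who fails the written examination never appears on the roster regardless of seniority, and among those who pass, the written score dominates the ranking because of its 80% weight.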

Upon receiving a request from a municipality for candidates for promotion to fire captain or lieutenant, the State follows the “Rule of Three” in certifying individuals from the eligibility roster for promotion. Under the Rule of Three, the State certifies for promotion the number of individuals equal to the number of vacancies plus two, in rank order from the eligibility roster. The city may then appoint any of those certified, based upon the relative abilities of the candidates in terms of their experience, and other relevant considerations such as leadership and supervisory experience. See generally N.J.S.A. 11:22-4; N.J.A.C. 4:1-12.4. Where a person with disabled or general veteran preference is ranked number one on the certification list, he may not be passed over in favor of a non-veteran; this, however, is the only advantage accruing to veterans. See N.J.S.A. 11:27-6. Promotional eligibility lists *534 promulgated by the State are valid for two years, although they may be extended upon request of a municipality or affected individuals and/or labor organizations. See N.J.S.A. 11:9-10; 11:22-32; 11:22-34.1.
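The certification step of the Rule of Three, as described above, lends itself to a short sketch. The function names are hypothetical, and the second helper assumes a single certification covering all vacancies at once; it ignores the veteran preference and any removals from the roster between certifications.

```python
# Illustrative sketch of certification under the "Rule of Three".
# Names are hypothetical; veteran preference is not modeled.

def certify(roster, vacancies):
    """Return the top (vacancies + 2) names from a rank-ordered roster."""
    if vacancies < 1:
        return []
    return list(roster[: vacancies + 2])


def vacancies_needed(rank):
    """Smallest number of vacancies at which a candidate of the given rank
    is certified, assuming one certification covering all vacancies."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return max(1, rank - 2)
```

On this simplified reading, a candidate ranked 26th on a roster would not even be certified for consideration until a single request covered at least 24 vacancies.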

The following promotional examinations were administered by the State for the defendant municipalities since May 1980:

Exam Book No. Date City (Exam Symbol No.)

1243310 6/13/81 Atlantic City (PM 4003B)

1243310 6/13/81 Jersey City (PM 4011B)

1243310 6/13/81 Passaic (PM 4012B)

1523331 12/19/81 East Orange (PM 0661C)

1523331 12/19/81 Elizabeth (PM 0662C)

2243319 6/12/82 Trenton (PM 1250C)

2513315 12/18/82 Camden (PM 1873D)

3243326 6/11/83 Newark (PM 1875D)

3243326 6/11/83 Plainfield (PM 2518D)

3513314 12/17/83 Hoboken (PM 1282E)

3513314 12/17/83 Paterson (PM 1283E)

4243317 6/16/84 Atlantic City (PM 1649E)

4243317 6/16/84 East Orange (PM 1664E)

4243317 6/16/84 Elizabeth (PM 1650E)

4243317 6/16/84 Passaic (PM 1658E)

4243317 6/16/84 Trenton (PM 1655E)

B. Adverse Impact

As previously stated, only those individuals who have served at least three years as a firefighter are eligible to compete for the fire captain and lieutenant ranks. Hence, minority firefighters hired as a result of the Consent Decree have only recently become eligible to take the tests here at issue. A history of past discrimination against them has thus resulted in a concentration of minority personnel in the entry firefighter position, while they constitute a growing proportion of the total work force of the defendant municipalities’ fire department personnel. See Plaintiff’s Exh. 117, tabs b-c.

The court finds that the examinations administered by the State since May 1980 for the first level promotional ranks in the defendant municipalities have exhibited a pattern of impact adverse to minority candidates. It accepts the following statistics regarding success on the examination as persuasive evidence of such impact:

Exam Date / % White Passed / % Minority Passed / Adverse Impact Ratio

06/13/81 56.8% 37.7% 64.6%

12/19/81 68.3% 60.0% 87.8%

06/12/82 45.0%

12/18/82 54.6% 26.7% 48.9%

06/11/83 60.4% 31.3% 51.8%

12/17/83 53.2% 20.0% 37.5%

06/16/84 (capt. only) 58.1% 30.4% 52.3%

06/16/84 (total) 57.3% 32.7% 57.0%
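The adverse impact ratio in the table above is the minority pass rate divided by the white pass rate. The sketch below recomputes it from the rounded pass rates shown and compares each ratio with the four-fifths (80%) rule of thumb found in the Uniform Guidelines on Employee Selection Procedures. That threshold is an editorial assumption for illustration; the passage above does not itself invoke it. The 06/12/82 administration is omitted because only one figure survives for that row. Ratios recomputed from rounded percentages can differ slightly from the tabulated figures.

```python
# Illustrative recomputation of adverse impact ratios from the pass rates
# in the table. The 0.8 threshold is the four-fifths rule of thumb and is
# an assumption here, not a figure taken from the opinion.

FOUR_FIFTHS = 0.8

# (exam date, % white passed, % minority passed)
PASS_RATES = [
    ("06/13/81", 56.8, 37.7),
    ("12/19/81", 68.3, 60.0),
    ("12/18/82", 54.6, 26.7),
    ("06/11/83", 60.4, 31.3),
    ("12/17/83", 53.2, 20.0),
    ("06/16/84 (capt. only)", 58.1, 30.4),
    ("06/16/84 (total)", 57.3, 32.7),
]


def adverse_impact_ratio(minority_rate, white_rate):
    """Minority pass rate as a fraction of the white pass rate."""
    if white_rate <= 0:
        raise ValueError("white pass rate must be positive")
    return minority_rate / white_rate


def falls_below_four_fifths(minority_rate, white_rate):
    """True when the ratio is under the four-fifths rule of thumb."""
    return adverse_impact_ratio(minority_rate, white_rate) < FOUR_FIFTHS


if __name__ == "__main__":
    for date, white, minority in PASS_RATES:
        ratio = adverse_impact_ratio(minority, white)
        flag = "below 4/5" if ratio < FOUR_FIFTHS else "at or above 4/5"
        print(f"{date}: {ratio:.1%} ({flag})")
```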

Statistics regarding appointment rates, or the likelihood of appointments, confirm the effect of these disparate pass rates on the examination. Thus, the chart below shows the two highest ranking minorities remaining on the lists currently in existence and accordingly demonstrates the meager possibilities for future minority appointments if the lists are allowed to be used.

Exam Date / City / First Minority / Second Minority

12/81 East Orange 26th 29th

12/82 Camden 31st 36th

06/83 Newark 25th 44th

06/83 Plainfield 9th 17th

12/83 Paterson 35th None

12/83 Hoboken None None

06/84 Atlantic City 9th 39th

06/84 East Orange 7th 13th

06/84 Elizabeth None None

06/84 Passaic 22nd None

06/84 Trenton 58th None

Undisputed evidence of projected vacancies in the fire captain and lieutenant ranks in these cities indicates that, at most, only two additional minorities could expect to be appointed from existing promotional lists.

The court rejects defendants’, and intervenors’, assertion that these statistics constitute an improper aggregation of statistically insignificant test results, because the tests aggregated contained different questions, had different cutoff scores and were administered to different persons. The extreme similarity of the tests renders aggregation proper and the prevailing legal standards *535 compel a finding of adverse impact and a corresponding denial of defendants’ and intervenors’ motion for a directed verdict.

Nor, finally, does the court find that the State defendant’s mere attempt to comply with the Consent Decree, by conducting a thorough job analysis, renders the relief sought inappropriate where, as here, such attempt resulted in an examination with so profound an adverse impact upon minority candidates, which impact was made known to the State defendants.

C. The Job of Fire Lieutenant/Captain 2

As previously noted, the first level supervisory rank in the fire departments of the defendant municipalities is either captain or lieutenant. A captain/lieutenant is in command of a fire company, which in these cities consists of the company officer and two to four firefighters. Typically, each company is assigned to a single piece of apparatus, either a “pumper” or a ladder truck. Although fire alarm response procedures vary from city to city, and within a city depending upon the time of day, it is generally the case that a single company responds to “still alarm” fires (e.g., grass fires, kitchen stove fires, etc.), three companies and a battalion chief to a one-alarm fire, and numerous companies, under the command of a deputy chief or the department chief, to a two- or three-alarm fire. The first fire captain who arrives at a fire scene has full responsibility for combating the fire, including “sizing it up” and assigning firefighting duties, until the arrival of a superior officer. This period lasts from two to eight minutes and the decisions made at this time are extremely critical to the ultimate success of the firefighting effort.

The primary distinction between the job duties of a firefighter and those of a fire captain/lieutenant is that the latter is responsible for supervising the firefighters assigned to him, although officers routinely assist firefighters at the fire scene. It is thus important that fire captains/lieutenants possess substantial knowledge concerning the science of firefighting. In particular, they must know what steps are to be taken in such emergencies, though not necessarily the theoretical scientific basis for taking such steps. Additionally, all fire department personnel, firefighters as well as officers, are responsible for reporting signs of arson discovered at a fire scene. Although from time to time fire captains/lieutenants are called upon to pursue arson investigations and should, in the normal course of fighting fires, be able to recognize evidence of such crime, the primary responsibility for detecting and investigating arson belongs to specialized personnel, and not to captains/lieutenants.

In many of the defendant municipalities, fire captains/lieutenants are responsible for conducting in-service inspections of non-residential buildings and multi-family dwellings. In others this “preplanning” for fires, or identification of obvious hazards and code violations by officers, occurs less frequently. In seven departments, there exist fire prevention bureaus, which have primary responsibility for detecting code violations. In these departments, officers have little or no responsibility in this regard; in others, they have somewhat greater, though limited, fire prevention responsibilities.

D. The 1981 Job Analysis

Pursuant to its obligations under the Consent Decree, the defendant State’s fire examiner conducted job analyses of the positions of fire captain/lieutenant in February and March 1981. These job analyses involved two one-day sessions with “subject matter experts” (“SMEs”) who held the rank in question or a higher rank in a fire department and who were well acquainted *536 with the duties of the fire captain/lieutenant. Two SMEs from each of three defendant cities attended the lieutenant session on February 23, 1981. Two SMEs from each of eight other defendant cities attended the captain session held on March 9, 1981.3

At the fire lieutenant session, the SMEs, as a group, developed task statements listing the activities performed by a fire lieutenant and delineated the knowledges, abilities and skills (“KASOs”) required to perform each task. The SMEs rated the frequency and “criticality” (i.e., consequence of error) of each task on a scale of 1 to 4, ranked the order of importance of the KASOs for each task and indicated the importance of the KASOs in percentage terms to the completion of the task. The SMEs were asked to indicate whether each KASO was brought to the job or learned on the job and whether each KASO was appropriate for qualifying candidates.4

At the fire captain session, the State fire examiner did not ask the SMEs to generate task statements or KASOs, but used the information developed at the fire lieutenant session, along with his own knowledge, to generate a standard set of task and KASO statements from which the SMEs worked. These statements were reviewed by the SMEs, who agreed that they accurately reflected the contents of a fire captain’s job. As at the lieutenant’s session, the SMEs were requested to rate the frequency and criticality of the tasks and the importance of the KASOs to each task, and to state whether each KASO was brought to the job or learned on the job and whether each KASO was appropriate for qualifying candidates. Finally, the SMEs were asked to suggest texts from which test questions for a particular KASO might be obtained.

The court finds that no report or memorandum memorializing the manner in which these sessions were conducted was ever prepared by the personnel involved. Nonetheless, the 1981 job analysis provided the basis for all examinations administered by the State for promotion to the first level supervisory rank since May 1980. In particular, the State’s fire examiner prepared a summary “Job Analysis Chart,” from which he determined the number of questions that were to be devoted to each KASO on a multiple-choice examination, by use of a so-called “complex index.” If a KASO, such as oral communications, could not be tested by written examination, it was not measured at all, primarily because of the impracticability of testing such a large candidate population in an oral manner. Indeed, the State concedes that there were numerous KASOs for which there was no testing, either for this reason or because the State deemed them unimportant.

E. Validity

The defendant State asserts that its selection procedure for promotion to the ranks of lieutenant/captain is content valid, i.e., that the procedure measures essential knowledges and abilities required to perform those jobs. In particular, the State claims that its 1981 job analysis identified knowledges critical to the successful performance of the duties of fire captain/lieutenant, that the resulting examinations measure those knowledges, and that the examination is appropriate for use as a *537 strict rank ordering device. The State further contends that an analysis performed by its expert in the fall of 1984 established that the examinations measured 29 to 30 percent of the “generic human abilities” required for the jobs, even though the questions were not expressly constructed to measure such abilities, that 61 percent of the abilities required for successful job performance could be measured by a paper and pencil examination, and that more than half of the items on the examinations involved abilities important to the duties and responsibilities of a fire captain/lieutenant. However, the State’s expert conceded that he was uncomfortable basing any claim of validity on his 1984 analysis.

The State’s expert was retained in late August or early September 1984 for purposes of studying seven promotional examinations administered by the State to determine if such examinations were valid, i.e., whether individuals who scored well on the examinations would perform well on the job. He reviewed the forms and documents utilized by the State in performing its job analysis, the examinations themselves, job analyses for the position of fire officer in two other jurisdictions, a study of firefighters performed by the United States Civil Service Commission, depositions of the State employees and SMEs who participated in the job analysis at issue, and other State Civil Service materials. Additionally, the State’s expert performed his own study to identify the key abilities required by the captain/lieutenant position and to determine whether the challenged examination measured such abilities. In conducting this study, he consulted with 69 fire captains and firefighters from six of the defendant municipalities. He concluded that more than half of each test did, and that the examinations were valid, within the meaning of the Uniform Guidelines, supra.

The State’s expert testified that the State’s job analysis was appropriate, because it was conducted the way he would have done it and because the method of obtaining information from SMEs was entirely “normal and reasonable”. Furthermore, he found that the technical standards manual promulgated by the State Civil Service Commission provided “the best description of job analysis” he had ever seen in agency regulations or a textbook. However, he admitted that the State’s analysis differed from his own procedures in a number of respects, that he was unaware of or misinformed about a number of significant facts regarding the development and use of the job analysis and that his conclusion was based, at least in part, upon an assumption that the State fire examiner followed the procedures set forth in the technical procedures manual. The court finds that such assumption was incorrect.5

Indeed, the court finds that the job analyses performed for the ranks of lieutenant and captain based upon the February 24, 1981 and March 9, 1981 sessions do not meet the requirements of the Uniform Guidelines or other professional standards. A thorough job analysis must provide a clear description of the tasks of the job, measure the frequency and/or criticality of such tasks, define the knowledges, abilities and skills required to perform the tasks, and explain the relationship between the task and its related KASOs. Both the State’s expert and plaintiff’s expert testified that SMEs should be involved in a job analysis, emphasizing the need for a representative group of SMEs to assure that all important aspects of a job are addressed.

The court finds that the task statements utilized herein were too general to provide useful information in constructing an examination for promotion to captain/lieutenant. Many important aspects of the duties and activities of captains/lieutenants were omitted, both in the original ten task statements *538 and, even more so, in the six more general statements derived by the State fire examiner. For example, the job descriptions ultimately arrived at do not contain any information regarding the number of individuals supervised, the extent to which captains may take individual action independent of orders from their superior officers, the manner in which their job duties relate to those of the fire prevention bureau or other specialized personnel within or outside the fire department, or the level or complexity of their job duties.

The overly general nature of the tasks arrived at made an accurate evaluation of their relative importance and criticality impossible. Many discrete activities may be included within one task statement: for example, Task One subsumes all activities at the fire scene. Indeed, the State’s expert estimated that 250 tasks were covered by the six groups.6 Thus, although the SMEs were instructed as to the meaning of the various “anchors” utilized in measuring criticality, on a scale of one to four, the result of applying the scale to such broad task groups is that the same criticality rating was given to all activities within the task group, irrespective of the relative importance of the individual activities within such group. The court specifically rejects the State’s contention that the task statements that resulted from the 1981 job analysis were proper, or that the consolidation of all tasks into six large groups was appropriate either because it eliminated repetition and combined similar functions or because it resembled other task statements developed in other contexts.

The court similarly finds that the KASO statements resulting from the 1981 job analysis were too general, and contained no information about the level of complexity or proficiency required on the job. For example, even the State’s expert conceded that “ability to supervise,” which was listed in four of the six captain task statements and weighted heavily by the SMEs, was not adequately covered by the KASOs, as they were worded, and that no useful information as to this essential characteristic was thus provided. Though the State’s expert argued that supervision is a generic term, and that several KASOs are, in effect, components of supervision, the court finds incredible the contention that the ability to supervise ought not have been specifically and unambiguously delineated as a KASO, and tested. What distinguishes the firefighter from the captain/lieutenant, more than anything else, is the role of supervision.

The court also agrees with plaintiff that the KASOs contained unexplained inclusions, exclusions and overlaps, some of which produced anomalous results in the KASO ratings, and that the validity of the KASOs is placed in doubt by the wide variance in the importance ratings of particular KASOs among the cities. For example, the importance ratings for the ability to evaluate fires ranged from thirty percent in one city to five percent in two others. Such differences were neither investigated nor explained by the State. If they reflect reality, the job of fire captain appears to be significantly different in different cities, a fact not reflected in the actual tests. Alternatively, the variance may be an indication that the SMEs did not fully comprehend what they were asked to do and, thus, that the process by which the information was obtained was unreliable.

Because the State uses the written examination as both a qualifying and a ranking device, professional standards, including the State’s own Technical Standards Manual, require that it test only for those knowledges and abilities regarding which higher scores on the test would indicate better performance on the job. The court finds that the tests here at issue are not appropriate for ranking candidates. First, the elements of the job of fire captain/lieutenant *539 which were identified and rated in the State’s job analysis were almost exclusively abilities, while the examination was intended to be exclusively a test of job knowledge. Hence, the State did not select a test which measured the important work behaviors constituting most of the job, although the State’s expert concluded that the test seemed to measure certain cognitive abilities important to the job. Second, in the course of the 1981 job analysis, the SMEs were not even asked whether the KASOs arrived at were appropriate for ranking; rather, they were asked to indicate whether the KASOs ought to be qualifying. Significantly, no SME indicated that any KASO was qualifying for any task, although several KASOs, such as knot-tying or first aid, appear to be abilities or knowledges common to all experienced firefighters, and as to which only a very insignificant range of performance ought to have been possible. Thus, there is no basis upon which the State could conclude that the KASOs were properly utilized as ranking devices, even though the more specific tasks upon which they were based were evaluated in terms of relative frequency and criticality. By not specifically asking the SMEs whether the KASOs were appropriate for ranking, however, the State ran afoul of its own Technical Standards Manual. See Joint Exh. 3, at 3-81 to 3-82. This is but one reason why the court does not credit the testimony of the State’s expert to the effect that the examinations here at issue complied with the requirements of the Uniform Guidelines: such expert did not recognize either that the procedure utilized was not explicitly oriented to ranking or that the State had therefore not followed its own Manual. Additionally, the court notes that the form of the expert’s own study as well as his testimony in similar matters elsewhere both belie his assertion of the validity of the selection process used by the State and here challenged by plaintiff. Indeed, he himself found that the reliability of the examinations was poor, and the State’s method of calculating such reliability “lousy,” though he concluded that a high level of reliability would not necessarily have been expected.

Third, the testimony of plaintiff’s expert revealed several important flaws in the tests administered: they placed a premium on test-taking ability rather than the knowledges, abilities and skills which ought to have been examined, such that one could often eliminate some or all of the incorrect responses without knowing the correct answer; 7 they were ambiguously phrased or diagrammed and thus tested the ability to understand questions rather than job knowledge itself; they tested for knowledge of certain terminology, such as “nozzle reaction” or “unity of command”, rather than for an understanding of the underlying concept; and such concepts were often tested in terms of underlying theories or principles, rather than in terms of their practical application in identifying a dangerous situation and taking the appropriate corrective action. In part, these problems arose because of the State’s failure to have SMEs review test items or otherwise participate in the test-writing process, as well as its decision not to pretest the examinations, for security reasons. In part, the problems arose because of the State’s heavy reliance upon such textbooks as Essentials of Firefighting, by the International Fire Service Training Association (2d ed. 1983) or The Fire Chief’s Handbook, edited by James F. Casey (4th ed. 1978). Such books concede that they set forth but one method for accomplishing particular tasks, while various municipalities have their own standard operating procedures. More fundamentally, the reliance upon specific sources in the questions themselves (which often asked, for example, “According to Firefighting Principles and Practices, ____ is the quality that firefighters most want in an officer,” see Joint Exh. 8(c), at 7) rendered the tests more probative of the test taker’s ability to recall what a particular text stated on a given topic than of his firefighting or supervisory knowledge or abilities. The court concludes *540 that the tests were often irrelevant: they tested memory, rather than the “best” or “most accepted” means of dealing with particular situations, or presented several correct answers to choose from.

To this extent, the court credits the testimony of plaintiff’s expert. The court specifically rejects the State’s contention that it ought not do so simply because plaintiff’s expert lacked the qualifications of an expert on firefighting, or the supervision thereof. It simply cannot be denied that plaintiff’s expert is a formidable authority on employment testing and test construction in general: he has worked as an industrial psychologist for over thirty years and has been involved in the development of all of the recognized professional guidelines and standards in the field. He also was directly involved in the development of a selection procedure for fire captain found valid for the City of St. Louis. Moreover, the court finds some of the deficiencies in the examination discussed by plaintiff’s expert to be so glaring that even a layperson could recognize them.

Finally, although the court does not rely upon this information in arriving at its conclusion that the tests used were invalid, it cannot help but note that the apparently satisfactory performance of firefighters performing as “acting captains” though they may have failed the test, undermines any contention of the test’s validity.

Nor does the court find the 1984 analysis of the State’s test performed by its expert to establish the validity of such test. Such analysis, which plaintiff characterizes as a measure of certain “generic human abilities,” was conducted as follows: the State’s expert regrouped the task requirements developed earlier into sets which he believed to be more amenable to identifying cognitive abilities. Next, he took so-called Fleishman ability sets and modified the ability descriptions therein to fit the context of firefighting by adding appropriate examples illustrative of each Fleishman ability. Third, he assembled a group of representative SMEs to evaluate which abilities were used to carry out which fire captain tasks and the importance of each;8 a second group of SMEs were assembled to evaluate the extent to which such abilities were measured by the test in question. Finally, he examined the overlap between abilities required and abilities tested to determine whether the test as administered actually tapped or measured important abilities to a substantial extent: if three of the five members of the State’s expert’s firm agreed that an ability was so measured by a particular test item, he considered there to be agreement as to that item. As a result, the State’s expert concluded that the fire captain job is 88% cognitive — substantially more so than the job of firefighter — and that an outstanding fire captain would have significantly more of each of the nine cognitive abilities testable by paper and pencil tests than would a barely competent fire captain.

The court, however, finds the procedure used by the State’s expert to have been flawed; it thus concludes that such procedure did not provide credible or reliable information regarding the abilities required in the job of fire captain or tested by the State’s selection procedure. Most fundamentally, the State’s expert has not persuaded the court that a multiple choice, written examination which heavily emphasizes specific texts actually evaluates the abilities required of a fire captain. Indeed, the court has serious doubts as to whether any written, multiple choice examination can adequately test this type of essentially practical ability; alternative means of testing ought to be developed to evaluate the *541subtle and complex factors involved, factors which are not reflected in one’s ability to remember what a particular book or manual states.

First, the State’s expert evaluated “generic human abilities”, unrelated to observable work behavior and more relevant to making inferences about the mental processes underlying such behavior. Whether “constructs” or not, they are not the appropriate yardstick for evaluating potential job performance, as opposed to the ability to think about it.

Second, even the State’s expert conceded that his analysis did not justify the use of the test as a rank-ordering device.

Third, the task statements evaluated by the SMEs were not representative of the jobs actually performed by fire captains/lieutenants. To the extent that they were based upon the 1981 job analysis, they suffered from the same problems as did that analysis, such as excessive generality. Moreover, to the extent that the 1981 job analysis was unilaterally modified by the State’s expert, or a modified version of a national study of the firefighter rank was used, the 1984 analysis represented task statements not independently confirmed as accurate and complete descriptions of those jobs, as performed in the cities at issue. In particular, the court agrees with plaintiff that the task statements utilized by the State’s expert exclude much of the physical activity required by the job. It was, accordingly, an analysis designed to conclude that the fire captain job is highly cognitive; that it so concluded is thus neither surprising nor convincing.

Fourth, by weighting task statements equally in the averaging process, the State assumes that the twelve captain tasks and seventeen firefighter tasks are each equally important to the jobs at issue. The court finds this assumption contrary to common sense; it also notes that it is highly inconsistent with the 1981 job analysis. Thus, for example, the 1984 analysis assumes that “fire evaluation” and “supervision” are no more important to successful job performance than are “public relations” and “report writing”. The 1981 job analysis demonstrated to the contrary: “report writing” was given a very low frequency/criticality rating and public relations was not even mentioned. Moreover, the court is reminded that as the most important function of the firefighter is to fight fires, the most important job of the fire captain is to evaluate conditions at a fire scene and direct that appropriate action be taken. A valid job analysis must so conclude; one that equates “report writing” with supervision at the fire scene must, the court concludes, be fundamentally in error.

Fifth, the court finds the job analysis to suffer from the hypothetical nature of the judgments asked of the SMEs, the use of ambiguous instructions9 and the unanchored rating scales utilized. The result, the court believes, was an inadequate rating of abilities, with, for example, wide variations in judgments among SMEs from the same cities, and the virtually uniform ratings of some SMEs with regard to each and every task.

Finally, the court agrees with plaintiff that the Kappa statistic utilized by the State’s expert was not calculated properly and that, if it had been, it would have shown a low level of reliability with respect to the psychologist SMEs’ evaluation of the examination at issue. For this reason too, the court finds that the State’s 1984 analysis fails to justify the use of an examination which, the court finds, is not content valid, and, as such, not job-related.
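By way of illustration only, the kind of chance-corrected agreement measure at issue can be sketched in code. The opinion does not state which kappa variant the State’s expert used, how it was computed, or what the underlying ratings were; the sketch below assumes Fleiss’ kappa (a common form for more than two raters), and the function name and the ratings are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who placed item i in category j (hypothetical helper)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of raters")
    n_cats = len(counts[0])

    # Observed agreement: share of rater pairs that agree, averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Agreement expected by chance, from overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    p_expected = sum(p * p for p in proportions)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical ratings: five raters, categories = [measured, not measured].
# Each item has a bare 3-of-5 majority one way or the other.
bare_majorities = [[3, 2], [2, 3], [3, 2], [2, 3]]
unanimous = [[5, 0], [0, 5], [5, 0], [0, 5]]

print(fleiss_kappa(bare_majorities))
print(fleiss_kappa(unanimous))
```

On these made-up inputs, items decided by a bare three-of-five majority produce a kappa below zero, while unanimous ratings produce a kappa of one. The sketch shows only why a majority-vote threshold and a reliability statistic can point in different directions; it says nothing about the actual figures in the record.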

DISCUSSION OF THE LAW

The law applicable to this case is, for the most part, agreed upon by the parties. *542The State urges the court to apply “general Title VII law” to this matter; plaintiff, however, contends that the Consent Decree requires that the impact and content of the examinations here at issue be evaluated in light of the Uniform Guidelines on Employee Selection Procedures, 28 C.F.R. § 50.14. The court agrees that the Decree mandates compliance with the Uniform Guidelines and other professionally accepted testing standards, as well as Title VII. See Consent Decree ¶¶ 3, 7(a). However, it finds that the law differs little, if at all, depending on the source of the legal requirements utilized. Nor does the court find the use of different requirements to mandate different results in this case.

First, Title VII of the Civil Rights Act of 1964, 42 U.S.C. § 2000e et seq., requires that a plaintiff challenging an employment examination demonstrate that such examination has a discriminatory impact. See generally Connecticut v. Teal, 457 U.S. 440, 446, 102 S.Ct. 2525, 2530, 73 L.Ed.2d 130 (1982), citing Dothard v. Rawlinson, 433 U.S. 321, 97 S.Ct. 2720, 53 L.Ed.2d 786 (1977); Albemarle Paper Co. v. Moody, 422 U.S. 405, 95 S.Ct. 2362, 45 L.Ed.2d 280 (1975); Griggs v. Duke Power Co., 401 U.S. 424, 91 S.Ct. 849, 28 L.Ed.2d 158 (1971). Such impact exists if “the tests in question select applicants for hire or promotion in a racial pattern significantly different from that of the pool of applicants.” Albemarle Paper Co., supra, 422 U.S. at 425, 95 S.Ct. at 2375, citing McDonnell Douglas Corp. v. Green, 411 U.S. 792, 801, 93 S.Ct. 1817, 1823-24, 36 L.Ed.2d 668 (1973). Cf., Dothard v. Rawlinson, supra, 433 U.S. at 330, 97 S.Ct. at 2727 (disproportionate impact may be based on analysis of general population demographic data, instead of characteristics of actual applicants). Similarly, the Uniform Guidelines state:

The use of any selection procedure which has an adverse impact on the hiring, promotion, or other employment or membership opportunities of members of any race, sex, or ethnic group will be considered to be discriminatory and inconsistent with these guidelines, unless the procedure has been validated in accordance with these guidelines ...

28 C.F.R. § 50.14(3)(A). The Guidelines then go on to set forth a definition of adverse impact.

A selection rate for any race, sex, or ethnic group which is less than four-fifths (4/5) (or eighty percent) of the rate for the group with the highest rate will generally be regarded by the Federal enforcement agencies as evidence of adverse impact, while a greater than four-fifths rate will generally not be regarded by Federal enforcement agencies as evidence of adverse impact.

28 C.F.R. § 50.14(4)(D). In evaluating Title VII claims, however, courts have varied in their adherence to this so-called “Four-Fifths Rule”. As recently summarized by the United States Court of Appeals for the Ninth Circuit,

The Uniform Guidelines are not legally binding. See General Electric Co. v. Gilbert, 429 U.S. 125, 141-42 [97 S.Ct. 401, 410-11, 50 L.Ed.2d 343] (1976). They have not been promulgated as regulations and do not have the force of law. Aguilera v. Cook County Police and Corrections Merit Board, 760 F.2d 844, 847 (7th Cir.1985). Additionally, the 80 percent rule has been sharply criticized by courts and commentators. See E. Shoben, Differential Pass-Fail Rates in Employment Testing: Statistical Proof Under Title VII, 91 Harv.L.Rev. 793, 805 (1978) (ill-conceived rule capable of producing anomalous results because it fails to take account of differences in sampling size).

Clady v. County of Los Angeles, 770 F.2d 1421, 1429 (9th Cir.1985). Hence, while many courts adhere to the four-fifths rule in evaluating adverse impact, see, e.g., Firefighters Institute for Racial Equality v. City of St. Louis, 616 F.2d 350, 356-57 (8th Cir.1980), cert. denied, 452 U.S. 938, 101 S.Ct. 3079, 69 L.Ed.2d 951 (1981); Thomas v. City of Evanston, 610 F.Supp. 422, 427-28 (N.D.Ill.1985), others evaluate disparate impact in terms of other, more *543scientific measures of statistical significance, see, e.g., Fudge v. City of Providence Fire Department, 766 F.2d 650, 658 and nn. 8-9 (1st Cir.1985), citing Castaneda v. Partida, 430 U.S. 482, 496 n. 17, 97 S.Ct. 1272, 1281 n. 17, 51 L.Ed.2d 498 (1977); Hazelwood School District v. United States, 433 U.S. 299, 97 S.Ct. 2736, 53 L.Ed.2d 768 (1977); Rivera v. City of Wichita Falls, 665 F.2d 531, 545 (5th Cir.1982); Louisville Black Police Officer Association v. City of Louisville, 511 F.Supp. 825, 832-33 (W.D.Ky.1979), while still others look to a combination of the two standards. See, e.g., Easley v. Anheuser-Busch, Inc., 758 F.2d 251, 256 n. 8 (8th Cir.1985); Guardians Association of the New York City Police Department v. Civil Service Commission of the City of New York, 630 F.2d 79, 88 (2d Cir.1980), cert. denied, 452 U.S. 940, 101 S.Ct. 3083, 69 L.Ed.2d 954 (1981). Finally, some courts apply a more flexible mode of analysis, looking generally to whether statistical disparities are “substantial” or “significant” in a given case. Clady, supra, 770 F.2d at 1429-29, citing Contreras v. City of Los Angeles, 656 F.2d 1267, 1274-75 (9th Cir.1981), cert. denied, 455 U.S. 1021, 102 S.Ct. 1719, 72 L.Ed.2d 140 (1982); Craig v. City of Los Angeles, 626 F.2d 659, 661-62 (9th Cir.1980), cert. denied, 450 U.S. 919, 101 S.Ct. 1364, 67 L.Ed.2d 345 (1981); Aguilera, supra, 760 F.2d at 846.

Here, the court has no doubt but that plaintiff demonstrated the adverse impact of the examination at issue upon black and Hispanic candidates for promotion. While no evidence was offered at trial of the statistical significance of the figures cited supra, the court finds the evidence of the discriminatory effect of the tests at issue to be overwhelming. Minority candidates passed the examination at a rate 57% of that at which whites passed; their pass rate was approximately 25% lower than the pass rate of whites. The result is that minorities stand well down the lists, and that only two minority firefighters stand a reasonable chance of being promoted based upon these tests. These figures are similar to or greater than others based upon which adverse impact has been found. See, e.g., Clady, supra, 770 F.2d at 1429 (adverse impact on Hispanics, based upon 22.1% difference in pass rates, and 73% adverse impact ratio), citing Craig, supra, 626 F.2d at 661 n. 3; Aguilera, supra, 760 F.2d at 846 (adverse impact on Hispanics, based on 50% adverse impact ratio, although statistics imperfect), citing Walls v. Mississippi State Dept. of Public Welfare, 730 F.2d 306, 315 and n. 8 (5th Cir.1984); Hameed v. International Association of Bridge, Structural & Ornamental Iron Workers, Local Union No. 396, 637 F.2d 506, 510-11 and n. 4 (8th Cir.1980); Easley, supra, 758 F.2d at 256 n. 8 (adverse impact upon blacks, based upon 20% difference in pass rates and 60% adverse impact ratio); Jones v. International Paper Co., 720 F.2d 496, 499 (8th Cir.1983) (adverse impact on blacks where only two promotions found); Guardians Association, supra, 630 F.2d at 85-88 (adverse impact on blacks and Hispanics where differences in pass rates were 29% and 25% and adverse impact ratio was approximately 40%); United States v. City of Chicago, 549 F.2d 415, 428-29 (7th Cir.1977) (adverse impact found although whites passed test at eight-sevenths the rate of minorities, where whites had a 7.07 percent chance of being promoted, compared to a 2.23 percent chance for minorities); Jones v. Mississippi Department of Corrections, 615 F.Supp. 456, 464 (N.D.Miss.1985) (selection differential of 32.6%, with adverse impact ratio of 58% sufficient to present a prima facie case); Louisville Black Police Officers, supra, 511 F.Supp. at 834 (adverse impact gleaned from the fact that only two blacks were accepted into the recruit school in two separate years, in light of the relevant labor market); Vanguard Justice Society, Inc. v. Hughes, 471 F.Supp. 670, 723-27 (D.Md.1979) (various statistics). Cf., Fudge v. City of Providence, supra, 766 F.2d at 657 (9% difference in pass rates insufficient, when not statistically significant and where sample size is small); Rivera, supra, 665 F.2d at 545 (no adverse impact in absence of statistically significant difference be-*544tween expected and actual Hispanic hires); Moore v. Southwestern Bell Telephone Co., 593 F.2d 607, 608 (5th Cir.1979) (no adverse impact where 7.1% selection differential and 93% adverse impact ratio); Stewart v. Hannon, 469 F.Supp. 1142, 1149 (N.D.Ill.1979) (no adverse impact where selection differential was 6.5%). The court thus has no trouble finding there to be adverse impact in this case.
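The two measures that recur in the authorities above, the adverse impact ratio and the difference in pass rates, are simple arithmetic, and a minimal sketch may make the distinction concrete. The function names and the candidate counts below are hypothetical and are not taken from the record; only the four-fifths threshold comes from the Guidelines as quoted.

```python
def selection_rate(passed, total):
    """Fraction of a group's candidates who passed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return passed / total


def adverse_impact(group_passed, group_total, ref_passed, ref_total):
    """Compare one group's pass rate with that of the highest-rate group.

    Returns the impact ratio (group rate divided by reference rate), the
    difference in pass rates, and whether the ratio is under four-fifths.
    """
    group_rate = selection_rate(group_passed, group_total)
    ref_rate = selection_rate(ref_passed, ref_total)
    if ref_rate == 0:
        raise ValueError("reference group has a zero pass rate")
    ratio = group_rate / ref_rate
    return {
        "impact_ratio": ratio,
        "rate_difference": ref_rate - group_rate,
        "below_four_fifths": ratio < 0.8,
    }


# Hypothetical counts: 34 of 100 candidates pass in one group,
# 120 of 200 in the group with the highest pass rate.
result = adverse_impact(34, 100, 120, 200)
print(result)
```

The ratio and the difference answer different questions: a small difference in pass rates can still yield a low ratio when both rates are low, which is one reason the cases cited report both figures.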

Nor is the court swayed in such finding by the application of the defendant State and intervenors New Jersey State FMBA and Newark FMBA Local No. 4 for a directed verdict. That application is based upon plaintiff’s failure to adduce evidence of the statistical significance of its data regarding adverse impact. Additionally, the State, and intervenors, argue that in contending that the examination at issue did, in fact, have adverse impact upon minority candidates for promotion, plaintiff seeks improperly to aggregate the results of various tests. Such aggregation ought not occur, they contend, because the tests were different, in that they contained different questions and had different cut-off scores; furthermore, they argue that plaintiff improperly omits mention of statistics regarding the same examinations here at issue, with respect to cities not encompassed by the Consent Decree.

The court finds these contentions to be without merit. Although it would have preferred to receive testimony regarding the statistical significance of the figures relied upon by plaintiff, it has no trouble concluding, as did the State’s expert, that such figures reveal a clear pattern of results adverse to minorities. Moreover, while it would also have benefitted from testimony regarding the aggregability of these tests, the court’s own review of the evidence leaves it convinced that aggregation is proper. The court earlier concluded that other intervenors were unlikely to succeed in showing the impropriety of aggregation. It wrote:

Because the cases, as well as the regulations, see 28 C.F.R. § 50.14(D)(4), indicate that problems of statistically insignificant samples may be overcome by aggregating test results over several administrations of a single examination, see, e.g., Chicano Police Officer Association v. Stover, 526 F.2d 431, 438-49 and nn. 6, 8 (10th Cir.1975), remanded, 426 U.S. 944 [96 S.Ct. 3161, 49 L.Ed.2d 1181] (1976), on a state-wide basis, see, e.g., Kirkland v. New York State Department of Correctional Services, 520 F.2d 420, 425 (2d Cir.1975), based upon their preparation or utilization by a single employer, see, e.g., Ezell v. Mobile Housing Board, 709 F.2d 1376, 1382 (11th Cir.1983), or even upon the type of examination used, see, e.g., Boston Chapter N.A.A.C.P. v. Beecher, 504 F.2d 1017, 1021 (1st Cir.1974), cert. denied, 421 U.S. 910, [95 S.Ct. 1561, 43 L.Ed.2d 775] (1975), the court was led to conclude that this was a problem demanding factual resolution. However, the uniformity of the development, type and administration of the tests given [by] the municipalities, as well as “the historical context” of a defendant state aware of the adverse impact of its examinations, but unwilling or unable to change them, see Commonwealth of Pennsylvania v. Rizzo, 466 F.Supp. 1219, 1233 (E.D.Pa.1979), now prevents the court from finding that intervenors are likely to succeed in this respect.

Vulcan Pioneers, Inc. v. New Jersey Department of Civil Service, Civil Action No. 950-73, unpub. op. at 8-9 (D.N.J. Oct. 29, 1984) (footnotes omitted). Having heard the evidence in this matter, the court now finds such aggregation entirely appropriate: although different tests were administered, which tests were scored slightly differently, the tests were extremely similar. All resulted from the same job analysis. All were of the paper-and-pencil variety. All were studied for by reading essentially the same textbooks. And all were utilized by the same employer, the New Jersey Department of Civil Service. Nor have the State, or intervenors cited any caselaw which contraindicates the propriety of aggregating test results in circumstances such as these: indeed, the many cases relied upon by these parties to the effect that *545small sample sizes ought not be relied upon in establishing adverse impact, see, e.g., Brief of Intervenor New Jersey State Fireman’s Mutual Benevolent Association at 10-11 (citing cases); see also Kim v. Commandant, Defense Language Institute, 772 F.2d 521, 524 (9th Cir.1985), citing Mayor of Philadelphia v. Educational Equality League, 415 U.S. 605, 620-21, 94 S.Ct. 1328, 1333, 39 L.Ed.2d 630 (1974), stand for the proposition that unaggregated test scores may not suffice to prove adverse impact. They do not argue against aggregation where appropriate. The court finds aggregation to be appropriate here. And, the court finds, once aggregated, the tests here at issue can only be viewed as adversely affecting minority candidates for promotion.10
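The point that small samples may fail to show a disparity that pooled results reveal can be illustrated with a short sketch. It assumes a pooled two-proportion z statistic, one common measure of the kind of "standard deviation" analysis referred to in the cases cited earlier; the opinion records that no such evidence was offered at trial, so the method, the function names, and every count below are hypothetical.

```python
from math import sqrt


def two_proportion_z(passed_a, total_a, passed_b, total_b):
    """Pooled two-proportion z statistic for the difference in pass rates."""
    rate_a = passed_a / total_a
    rate_b = passed_b / total_b
    pooled = (passed_a + passed_b) / (total_a + total_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_error == 0:
        raise ValueError("z is undefined when everyone or no one passed")
    return (rate_a - rate_b) / std_error


def aggregate(administrations):
    """Sum counts column by column across test administrations.

    Each entry is (minority_passed, minority_total, white_passed, white_total).
    """
    return tuple(sum(column) for column in zip(*administrations))


# Hypothetical results from three administrations of similar tests.
administrations = [
    (2, 6, 12, 20),
    (1, 5, 15, 25),
    (3, 9, 18, 30),
]

for admin in administrations:
    print(admin, round(two_proportion_z(*admin), 2))

pooled_counts = aggregate(administrations)
print(pooled_counts, round(two_proportion_z(*pooled_counts), 2))
```

With these invented counts, each administration taken alone yields a smaller z statistic in absolute value than the pooled counts do, although the disparity in pass rates is similar throughout. Whether pooling is legitimate in a given case is the separate question the court addresses above: it depends on the similarity of the tests, not on the arithmetic.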

Finally, the court here reiterates its position that, all of the above notwithstanding, the Consent Decree obligated the State to conduct a thorough job analysis and develop a valid examination for promotions. Consent Decree ¶ 7(a). For the reasons set forth, the court finds that it has not done so. Hence, it has violated the Consent Decree irrespective of the adverse impact demonstrated by plaintiff. See Vulcan Pioneers, Inc. v. New Jersey Department of Civil Service, supra, unpub. op. at 9 n. 3, citing United States v. County of San Diego, 20 E.P.D. ¶ 30,154 at 11,813 (S.D.Calif.1979). That such adverse impact has been shown, in terms convincing to the court, strengthens plaintiff’s case. It may not, however, have been necessary.

Either way, the court finds that plaintiff has indeed established a prima facie case of discrimination. Hence, the burden shifted to defendants to demonstrate the job-relatedness, or “validity”, of the test at issue. See generally Teal, supra, 457 U.S. at 446-47, 102 S.Ct. at 2530-31; Dothard, supra, 433 U.S. at 329, 97 S.Ct. at 2727; Albemarle Paper Co., supra, 422 U.S. at 425, 95 S.Ct. at 2375; Griggs, supra, 401 U.S. at 431, 91 S.Ct. at 853. A test is job-related “if it measures traits that are significantly related to the applicant’s ability to perform the job.” Griggs, supra, 401 U.S. at 436, 91 S.Ct. at 856. Generally, job-relatedness is determined by or through three validation strategies: criterion-related validity, content validity and construct validity. See Uniform Guidelines, 28 C.F.R. § 50.14(14)-(15). See, e.g., Washington v. Davis, 426 U.S. 229, 247 n. 13, 96 S.Ct. 2040, 2051 n. 13, 48 L.Ed.2d 597 (1976); Gillespie v. State of Wisconsin, 771 F.2d 1035, 1040 (7th Cir.1985); Rivera v. City of Wichita Falls, supra, 665 F.2d at 537-38; Harless v. Duck, 619 F.2d 611, 616 n. 5 (6th Cir.), cert. denied, 449 U.S. 872, 101 S.Ct. 212, 66 L.Ed.2d 92 (1980); United States v. City of Chicago, supra, 573 F.2d at 425-26 (content validity); Firefighters Institute for Racial Equality v. City of St. Louis, 549 F.2d 506, 510-11 (8th Cir.), cert. denied, 434 U.S. 819, 98 S.Ct. 60, 54 L.Ed.2d 76 (1977); Richardson v. McFadden, 540 F.2d 744, 746-47 (4th Cir.1976), cert. denied, 435 U.S. 968, 98 S.Ct. 1606, 56 L.Ed.2d 59 (1978). Criterion-related validity is established by showing a significant statistical correlation between test performance and reliable measures of job performance, content validity by demonstrating that the content of a test closely approximates the *546tasks required to be performed on the job, and construct validity by identifying underlying traits necessary to job performance and showing how they are measured by the test at issue. See, e.g., Washington v. Davis, supra, 426 U.S. at 247 n. 13, 96 S.Ct. at 2051 n. 13.

Here, the State’s claim of job-relatedness is bottomed solely on an assertion of content validity. In order to succeed in such assertion, the State concedes that it must show five attributes of the examination.

The first two concern the quality of the test’s development: (1) the test-makers must have conducted a suitable job analysis, and (2) they must have used reasonable competence in constructing the test itself. The next three attributes are more in the nature of standards that the test, as produced and used, must be shown to have met. The basic requirement, really the essence of content validation, is (3) that the content of the test must be related to the content of the job. In addition, (4) the content of the test must be representative of the content of the job. Finally, the test must be used with (5) a scoring system that usefully selects from among the applicants those who can better perform the job.

Guardians Association, supra, 630 F.2d at 95 (footnote omitted), citing Uniform Guidelines, supra.

In order for the job analysis to have been appropriate, under the Uniform Guidelines, it must have included

an analysis of the important work behavior(s) required for successful performance and their relative importance and, if the behavior results in work product(s), an analysis of the work product(s). Any job analysis should focus on the work behavior(s) and the tasks associated with them. If work behavior(s) are not observable, the job analysis should identify and analyze those aspects of the behavior(s) that can be observed and the observed work products. The work behavior(s) selected for measurement should be critical work behavior(s) and/or important work behavior(s) constituting most of the job.

28 C.F.R. § 50.14(14)(C)(2). See also 28 C.F.R. § 50.14(15)(C)(3). As discussed above, the court here finds both the 1981 and 1984 job analyses to have been flawed. The 1981 analysis suffered from utilizing task statements that omitted many important tasks — such as supervising numbers of individuals — and which were too general to allow them to be used accurately to measure the frequency or criticality of tasks. The resulting KASOs were, consequently, similarly flawed. Such KASOs also failed to account for large differences between subject municipalities. The 1984 job analysis was even more suspect: it did not even purport to establish a relationship between the test and job behaviors, and did not, in fact, do so. Indeed, no new task statements were developed, and the conclusion of the State’s expert that the job of fire captain is 88% cognitive is accordingly rejected as not linked to observable or non-observable job behaviors, or as suffering from the same flaws as the 1981 analysis. Differences in frequency or criticality were substantially ignored in the 1984 analysis, rendering it inadequate for this reason as well. Finally, the court reiterates its previous findings concerning the technical flaws in both job analyses, including, for example, scoring problems and ambiguous instructions.

Second, the construction of the test was also deeply flawed: hence, the court here reiterates not only its doubts as to whether, for example, the State had any basis for assuming that the KASOs were appropriate for ranking, as well as qualifying, candidates, but also its finding that the tests improperly tested memory over ability, knowledge of abstract concepts over practical know-how, terminology over the ideas they represented, and test-taking ability overall. In addition, the tests had technical flaws in particular questions, flaws which might have been avoided had the State pretested its examination or utilized experts to judge its relevance, comprehensibility or lack of ambiguity. See Guardi*547ans Association, supra, 630 F.2d at 96 (citing cases).

Third, the court finds that the content of the test was not adequately related to the job of fire captain/lieutenant. See 28 C.F.R. §§ 50.14(14)(C)(4); 50.14(15)(C)(5). As discussed, it focused upon abilities easily testable by paper and pencil tests, tests which have been found by other courts to be inherently incapable of measuring the single attribute that most separates firefighters from fire captains — the ability to supervise. See, e.g., Guardians Association, supra, Firefighters Institute for Racial Equality, supra, 616 F.2d at 359; Firefighters Institute for Racial Equality, supra, 549 F.2d at 512. Moreover, such tests, while benefitting from the objective nature of their results and scoring, suffered for having qualities common to written tests: they tested reading, memory and test-taking ability over any knowledge or abilities required for the job of fire captain. In so testing, many important abilities were omitted; less important ones were included. The third and fourth prongs of the validity test are not here satisfied. That no justification for utilizing the test as a rank-ordering device emerges from the job analysis leaves the fifth prong unfulfilled as well.

There is, of course, considerable risk in a matter of this type in permitting hindsight and picayune analysis to discredit a prior test. It is rare that a detailed analysis after the fact does not disclose some discrepancies, deficiencies and ambiguities in the test analyzed. Therefore it is important in judging the adequacy of defendants’ proofs in support of their burden to consider the totality of the test and its overall validity or lack thereof.

Here, it is clear from the form of each and every question that they tested the applicant’s knowledge and memory of specific sources. Rather than test for the specific knowledge needed to perform the job of fire captain, the questions primarily asked what a particular authority said. If the answer sought the knowledge necessary for the job, the source should be immaterial.

Furthermore, the court is satisfied that the sole use of written multiple choice questions renders the test invalid. What is involved here is a position which involves tasks and characteristics which cannot be tested exclusively by such tests. Indeed, the most important aspects of the job are not susceptible of such testing, which tends, according to the writings of defendant’s expert, systematically to disadvantage minority test takers.

Finally, in having only a written test, undue emphasis is placed upon the reading ability of the applicants as well as their ability to deal with hypothetical and abstract situations.

In sum, for the reasons set forth herein, the court finds that the State’s promotional examination was wholly inadequate for purposes of selecting who ought to serve as fire captain/lieutenant. The job analyses conducted did not properly describe the job of captain/lieutenant, the test developed did not adequately measure the necessary skills, abilities or knowledges, and the adverse impact of the State’s former selection procedure was not eliminated. See Consent Decree ¶ 7(a). Finally, the State did not consider alternative selection procedures made known to it by plaintiff, such as those utilized in St. Louis and elsewhere, which include oral interviews, assessment center exercises, etc. The court finds that such alternative procedures should have been considered and that the experience of other jurisdictions, as well as common sense, indicate that these procedures would have a far more valid claim to content validity than did the tests here at issue.11

*548For these reasons, the court hereby enters judgment for plaintiff on its May 7, 1984 motion to enforce the Consent Decree. The tests which gave rise to the promotional lists now in effect were invalid, and should not have been used by the state and forced upon the defendant municipalities. However, the court reserves decision on the consequences of this holding and urges the parties to come to some agreement regarding how such lists can be used, if at all, in order to minimize the harm to those who have been waiting on such lists for some length of time. If such agreement cannot be reached, the court will, of course, address the issue. Pending such resolution the court also reserves on plaintiff’s application for attorney’s fees.

CONCLUSION

The court expresses its concern for the time and expense devoted to this matter. The extremely qualified experts who were engaged by the parties might better have devoted their time and energies to developing a test for the future rather than critiquing those in the past. The present litigation results in a finding which negates the validity of tests already given but does little to assure the validity of those yet to be developed.

Absent cooperation between the parties, a new test may well be the subject of further litigation. As a result, cities in need of permanent leadership in their fire departments, applicants who have stood by patiently awaiting promotion, and minority firefighters who have been wrongfully denied that opportunity all will continue to flounder in a sea of uncertainty.

The need for judicial intervention in these matters ill serves the public interest and the private rights here involved.

Undoubtedly the courts, once again, will be blamed for the adverse consequences which will befall the successful candidates who will be denied promotion at this time because of this decision. But such criticism is akin to blaming the firefighter for causing a fire simply because he is called upon to extinguish it. What is oft described as judicial activism is more accurately characterized as executive inaction. Rather than seek the court’s determination that a test is or is not valid, the public and the firefighters affected would have been better served if the parties had cooperated to create one that was.

1. The single exception to the similarity of the captain’s exams vis-à-vis the lieutenant exams occurred with respect to the June 16, 1984 examination, in which a different number of questions was asked in the test for captains in Atlantic City, Elizabeth, East Orange and Trenton than for lieutenants in Passaic.

2. To some extent, the findings in this section are based upon the depositions of municipal fire chiefs which, the defendant State points out, were held inadmissible as against it. However, the defendant State essentially agrees with the facts set forth in these depositions, and so states in its proposed findings. The court treats such facts, therefore, as stipulated.

3. No SMEs from Hoboken attended either session.

4. The evidence adduced at the hearing of this matter showed that certain problems arose in connection with the lieutenant analysis. Thus, although the State’s fire examiner testified that the task statements and KASOs were agreed upon by all six participants, the evidence shows that there were differences in the wording and number of KASOs arrived at. Similarly, some SMEs did not, apparently, check that the percentages of time attributed to particular tasks totalled 100%. With respect to Passaic, for example, the percentages assigned by the two SMEs totalled 190% and 205%. Finally, the element of “preparation time,” which the fire examiner considered important, was not included in any lieutenant task statements.

The fire examiner attempted to rectify these problems, and utilized a different process in the fire captain session. He did, however, use the results of the lieutenant's session in constructing examinations for the three cities involved.

5. Nonetheless, the court in no way means to cast aspersions upon the qualifications or competence of the State’s expert, Dr. Landy. Indeed, while the court here rejects many of his conclusions, it found his testimony to be extremely enlightening, helpful and candid, and wishes to express its appreciation for the assistance which he rendered to the court in this matter.

6. Moreover, in evaluating how much time was spent on a particular task group, the lieutenant analysis omitted mention of preparation time altogether, while the instructions on preparation time given at the captain’s session could not be determined from the testimony of the State fire examiner and expert.

7. Moreover, on some questions, more than one answer was correct.

8. Plaintiff notes that while the SMEs were to be appropriately experienced, some did not have the requisite one year in rank. It also challenges the extent to which such SMEs in fact showed above average job performance.

The State, for its part, concedes that the SMEs found it difficult to rate, on a 100-point scale, how much of each of the 35 cognitive and motor abilities was required by firefighters to complete each of the seventeen tasks arrived at by the State’s expert, within the three-hour time period allotted. However, approximately equal numbers of judgments were made for each particular task by the SMEs.

9. It may well be that some SMEs used an incorrect rating scale. It is undisputed that at least three did so. Additionally, plaintiff contends, and the evidence seems to indicate, that the raters assumed that the nine abilities listed by the expert as testable by multiple choice tests were sufficiently important to require high ratings. Notably, some such abilities had been found inconsequential in the 1981 analysis. Thus, the ambiguity of the instructions may well have contributed to the bias of the 1984 analysis.

10. The court also agrees with the plaintiff that it ought not consider the results of the examinations here at issue from their administrations in non-consent decree cities. Such cities have never been, and are not now, the subjects of this litigation, which seeks to explore the question of discrimination in particular New Jersey municipalities. Whatever the result of the State’s tests in non-consent decree cities, their adverse impact remains established for the cities here before the court. Nor have intervenors or the State demonstrated that such adverse impact would not exist were these other municipalities to be considered. In light of the fact that defendants herein, including the State, were not able to provide statistics regarding the non-consent decree municipalities, the burden so to demonstrate properly rests with defendants and intervenors. Indeed, the Uniform Guidelines provide that where a user, such as the State, does not maintain such data, as it has not here, an inference of adverse impact may be drawn. 28 C.F.R. § 50.14(4)(D). Hence, the court also rejects this ground of attack on plaintiff’s assertion of adverse impact.

11. Because the defendant State failed to demonstrate the job-relatedness of its test, plaintiff was not required to demonstrate that alternative testing procedures existed which would serve defendants equally well without a similarly undesirable racial effect. See, e.g., Albemarle Paper Co., supra, 422 U.S. at 425, 95 S.Ct. at 2375.
