concurring in part and dissenting in part:
This court’s opinion leaves no doubt which way my colleagues would have voted on Barbour’s failure-to-promote claim had they been jurors. Our job as appellate judges, however, is not to weigh the evidence ourselves, but simply to assess its legal sufficiency. Because I find sufficient evidence in the record to support the jury’s failure-to-promote verdict, I cannot join that portion of the court’s opinion. In the end, however, I too would reverse, but for a different reason: I agree with EPA that the district judge made improper and prejudicial comments in the jury’s presence.
In a sense, the jury’s role in this case has now been usurped twice: first by the district judge, who jeopardized its impartiality with his prejudicial comments, and now by my colleagues, who have substituted their judgment for the jurors’. Because both sides were entitled to have this discrimination dispute resolved by the jury, see 42 U.S.C. § 1981a(c)(1) (1994), I respectfully dissent.
Failure to Promote
Beginning with the court’s discussion of the standard of review governing Barbour’s failure-to-promote claim, I think my colleagues’ formulation fails to capture the very limited scope of our role. It has long been settled law, as the court seems to acknowledge, see Maj. Op. at 1345-46, that the standard of review governing the jury’s verdict is “whether the evidence was sufficient for a reasonable jury to have reached [it].” Barbour v. Merrill, 48 F.3d 1270, 1276 (D.C.Cir.1995); see also Swanks v. WMATA, 179 F.3d 929 (D.C.Cir.1999). Nowhere in its opinion, however, does the court acknowledge that sufficiency challenges require us to view the evidence “in the light most favorable” to the prevailing party, and to give the prevailing party “the advantage of every fair and reasonable inference that the evidence may justify.” Coburn v. Pan American World Airways, Inc., 711 F.2d 339, 342 (D.C.Cir.1983) (internal quotation marks and citation omitted). Judgment as a matter of law is appropriate only “if the evidence, together with all inferences that can reasonably be drawn therefrom, is so one-sided that reasonable [jurors] could not disagree on the verdict.” Hayman v. National Academy of Sciences, 23 F.3d 535, 537 (D.C.Cir.1994) (internal quotation marks and citation omitted). Bearing this highly deferential standard in mind, I turn to the record in this case.
In 1990, Douglas Sellers, then section chief of EPA’s Public Information Section, began recruiting Joyce Barbour, an African-American who had worked for EPA for twelve years, to join his staff. See Trial Tr. 3/18/97 at 34, 38. Sellers needed someone to take over contract oversight duties that previously had been performed by Janette Peterson, a white employee who had worked for EPA for five years. See Trial Tr. 3/19/97 at 23-24, 42-45. Peterson had performed the contract oversight job for two years, first as a GS-12 and then as a GS-13. See id. at 28.
*1350 When Barbour took over, she was given a GS-12 grade. She was also given a position description that EPA concedes had nothing to do with her actual job; rather, it had been written for a group of employees performing other tasks in a different section, and Barbour testified without contradiction that she never performed any of the duties detailed in it. See Trial Tr. 3/18/97 at 40. An EPA personnel officer explained that this position description had not been properly updated following a 1986 division reorganization. See Trial Tr. 3/19/97 at 119-20.
In addition to the erroneous GS-12 position description, Sellers gave Barbour a set of “performance standards” — goals against which her performance would be measured — that he fashioned based upon Peterson’s GS-13 position description. See Trial Tr. 3/20/97 at 58; Trial Tr. 3/18/97 at 113. Sellers devised Barbour’s performance standards simply by photocopying the standards he had written for Peterson as a GS-13, neglecting on several pages even to change the grade from Peterson’s GS-13 to Barbour’s GS-12. See id. at 42.
Although Barbour did not immediately complain to Sellers about being a GS-12, she testified that Sellers took it upon himself to promise that he would promote her to GS-13 if she performed well for a year:
He brought it up. As a matter of fact, I never brought it up because I didn’t have to. He always did. When I first took the job, he told me, “Joyce, if you’re in the job for a year and have your first performance evaluation, I see no reason why I can’t promote you, and I will.” And throughout the year, as time went on, he constantly reminded me of that.
Id. at 44; see also id. at 109, 136. Never flatly denying that he made these statements, Sellers testified only that he “d[id] not remember ever promising her a promotion.” Trial Tr. 3/20/97 at 11.
At Barbour’s first annual performance review in October 1991, Sellers rated her as “[e]xceeds expectations,” just a few points below “outstanding.” See Trial Tr. 3/18/97 at 45. According to Barbour, Sellers again brought up the prospect of promoting her to GS-13, telling her, “ ‘Joyce, I see no reason why I can’t initiate promoting you in three to four months.’ ” Id. When Barbour reminded Sellers of his earlier promise to promote her upon her first evaluation, not months thereafter, Sellers instructed her to consult Sarsah McClean, the personnel officer, to find out what needed to be done to secure a promotion. See id. at 45-46. McClean told Barbour that because her position description (the concededly erroneous one) did not allow for promotion past GS-12, Sellers could only promote her either through a process called “accretion of duties,” or by creating a new GS-13 position for which she would have to compete. See Trial Tr. 3/19/97 at 114. Under the accretion-of-duties route, the employee’s supervisor writes a memo to the personnel officer explaining that the employee is actually performing duties at a level higher than the grade specified in the position description. See id. Although this process often includes a desk audit — where the personnel officer sits down with the employee and examines the duties she is performing — the supervisor can request a waiver of a desk audit. See id. at 121-22. From EPA’s Office of Personnel Management, Barbour confirmed that an employee can obtain an accretion-of-duties promotion without a desk audit so long as the supervisor agrees that the employee is actually performing duties at a higher grade level. See Trial Tr. 3/18/97 at 47.
Two months after her performance review, Barbour testified, she tried to confront Sellers about the status of her promotion. See id. Although he initially attempted to avoid her, they finally got together in December, at which time Sellers told her that she would have to have a desk audit. See id. at 48-49. When Barbour explained that both McClean and OPM confirmed that he *1351 had authority to waive the desk audit requirement, Sellers asked her to write a memorandum justifying her promotion to GS-13. See id. at 49.
In response, Barbour prepared a memorandum dated February 4, 1992, which relied primarily on the fact that she had the same performance standards that Peterson had when Peterson was a GS-13. See id. at 49-50. Claiming the memorandum was insufficient, Sellers told Barbour that he needed something detailing the duties she was actually performing. See Trial Tr. 3/20/97 at 16. Barbour prepared a second memorandum, this time appending to it a copy of Peterson’s GS-13 position description, which described the duties Peterson performed before Barbour took over her job. This second memorandum expressly asserted that Barbour was performing each task itemized in Peterson’s GS-13 position description. See Trial Tr. 3/19/97 at 161.
In May, Sellers wrote Barbour a memorandum of his own, agreeing that Peterson’s GS-13 position description was the relevant comparison, see Trial Tr. 3/20/97 at 17-18, but concluding that he could not recommend her for promotion because she was not performing all of the duties detailed in that position description, see id. at 16-17. Sellers’s memorandum listed seven specific duties that, if Barbour began performing, would justify her promotion in six months. See id. at 17. Barbour testified not only that she was already performing most of those duties, but also that any duties that she was not performing Peterson had not performed either. See id. at 41-45. Four and a half years later, without assuming any additional duties, and without undergoing a desk audit, Barbour received a promotion to GS-13. See Trial Tr. 3/18/97 at 51, 53.
EPA makes three arguments challenging the sufficiency of Barbour’s evidence, none of which is persuasive. First, the agency argues that no reasonable juror could have found race discrimination based on a comparison between the experiences of Barbour and Peterson because, unlike Barbour, Peterson obtained her promotion by submitting to a desk audit. However, not only did EPA’s personnel officer testify that a supervisor can waive a desk audit, see Trial Tr. 3/19/97 at 121-22, but the record contains at least three examples of employees in Barbour’s section who were promoted to GS-13 without desk audits: Sarsah McClean, Kimberly Orr, and Barbour herself, see id. at 109, 115. To be sure, only one of these three non-desk audit promotions occurred “in the early 1990’s.” Maj. Op. at 1346-47. That the other two promotions did not occur until 1996, however, is irrelevant absent evidence that EPA’s desk audit policy changed in the interim. EPA offered no such evidence. Indeed, toward the end of the trial EPA’s lawyer obtained leave from the district court to call an additional witness to testify on precisely this subject, see Trial Tr. 3/19/97 at 166, but inexplicably never did.
Not only does the promotion of these three employees to GS-13 without desk audits undercut EPA’s argument that Barbour and Peterson were not similarly situated, but it amounts to affirmative pretext evidence that reasonably could have led the jury to doubt the agency’s truthfulness. Assuming the role of jurors, however, my colleagues disregard Barbour’s evidence that EPA’s desk audit justification was false, concluding instead that the justification could not have been “intentionally deceitful” because EPA applied the putative desk audit rule to a white employee, not just to African-American employees. See Maj. Op. at 1347-48. It is true that Aka v. Washington Hospital Center suggests two hypothetical situations in which no reasonable juror could infer discrimination despite the demonstrated falsity of the employer’s asserted justification: where “the plaintiff shoots himself in the foot” by proving improvidently that the employer’s real motivation was something other than discrimination; or where the evidence undercutting the employer’s stated justification is weak and there is also “abundant *1352 independent evidence in the record that no discrimination has occurred,” such as evidence that the employer “has a strong record of equal opportunity employment.” 156 F.3d 1284, 1291 (D.C.Cir.1998) (en banc). Neither hypothetical bears any relationship to the facts of this case. Barbour never shot herself in the foot, and not only did EPA fail to introduce any evidence of a “strong” EEO record, but Barbour actually introduced evidence that the agency’s EEO record was poor. See infra p. 1354.
This court now creates a third situation in which evidence disproving an employer’s asserted justification cannot support an inference of discrimination: where the false justification has not been applied exclusively to African-Americans. This proposition assumes that an employer who tells the same lie to two different employees necessarily does so for the same reason. Although this assumption may well be accurate in some situations, it may be inaccurate in others. Under Aka, the jury was entitled to conclude that EPA’s false desk audit justification — viewed in light of all of the other record evidence of discrimination, see infra pp. 1352-55 — was pretext for race discrimination even though as applied to Peterson it was not. My colleagues’ novel holding to the contrary creates an impenetrable legal safe harbor from Title VII liability: An employer who has denied promotion to a minority employee ostensibly because of tardiness, writing deficiency, or inability to get along with others, for example, can render legally irrelevant all evidence demonstrating the falsity of that justification merely by asserting that it has denied promotion to a white employee for the same reason.
EPA next argues that no reasonable juror could have found race discrimination based on a comparison between the experiences of Barbour and Peterson because “Barbour failed to refute Sellers’ and Peterson’s testimony that the duties of the two women differed.” Appellant’s Br. at 16-17. EPA insists that the record demonstrates that Peterson performed fifteen task management duties as a GS-13 and that Barbour took over only seven, see id. at 17, but the portion of Sellers’s testimony it cites belies this assertion. While it is true that EPA’s entire contract with CBSI entailed a total of fifteen task management functions, no one — not even Peterson — testified that Peterson performed all fifteen. Indeed, the obvious gist of Sellers’s testimony was that Peterson was performing eight task management functions — not fifteen — and that when Barbour took over she inherited all but one. See Trial Tr. 3/19/97 at 45-46; see also id. at 88-89. Asked at oral argument how many of Peterson’s task management functions Barbour would have to have performed before jurors could reasonably conclude that she and Peterson were “nearly identical” in all relevant aspects, EPA’s counsel, believing erroneously that Peterson had been performing all fifteen duties, conceded that thirteen out of fifteen would certainly suffice. Why then is it not sufficient for my colleagues that Barbour in fact took over seven out of eight?
To be sure, the record reflects that in addition to those task management functions that Barbour did inherit, Peterson had been performing various GS-14 level policy functions that Barbour did not inherit. According to my colleagues, that these policy functions are GS-14 functions, not GS-13 functions, is of no significance because the fact “[t]hat Peterson was capable of handling more important GS-14 level tasks is plainly relevant to whether she would acquit herself adequately in a GS-13 level position — or so an employer is entitled to believe.” Maj. Op. at 1345-46. The question before us, however, is not what this court thinks an employer is entitled to believe, but whether the jury reasonably could have believed that the fact that Barbour performed no GS-14 level policy functions was not the real reason why EPA refused to promote her to GS-13. The record contains ample evidence to support such a conclusion.
*1353 To begin with, when Sellers first hired Barbour, he did not tell her, “Joyce, if you’re in this job for a year and have your first performance evaluation, and if I determine at that time that you are performing not only GS-13 level functions but also GS-14 level policy functions like your friend Janette Peterson, I see no reason why I can’t promote you, and I will.” Quite to the contrary, the jury heard testimony that Barbour’s promised promotion in no way hinged on her performing GS-14 level functions. See Trial Tr. 3/18/97 at 44, 109, 136. In his May 1992 memorandum responding to Barbour’s promotion request, moreover, Sellers made no mention of her failure to take on GS-14 level policy duties; his memo focused exclusively on duties in Peterson’s GS-13 position description that he said Barbour would have to perform for six months in order to earn a promotion. See Trial Tr. 3/20/97 at 16-17. And in the end Barbour was promoted to GS-13 without taking on any additional GS-14 level policy duties. See Trial Tr. 3/18/97 at 52-53. If by pointing out that “this is not a contract case” my colleagues mean to suggest that a supervisor’s statements regarding promotion criteria are, as a matter of law, irrelevant to the question of pretext in Title VII cases, see Maj. Op. at 1345-46 n.*, they are mistaken.
The court’s conclusion that Barbour’s eventual promotion without assuming additional duties is somehow irrelevant because her performance may have improved between 1991 and 1996 is also mistaken. See id. at 1346. Just as Sellers’s testimony regarding Barbour’s improvement supports my colleagues’ belief about why EPA eventually promoted her, it likewise supports the jury’s apparent conclusion that EPA lied about its justification for not promoting her in the first place. If Sellers had testified that he refused to promote Barbour in 1991 because her performance of existing duties needed improvement — not that she needed to undergo a desk audit and take on additional duties (as he actually testified) — this court’s view of the evidence might well have carried the day in the jury room. Weighing the evidence, my colleagues conclude for themselves that Sellers’s varying statements were not “inconsistent,” id. at 1351-52 n.**, but this court has no authority to ignore the jury’s totally plausible conclusion that they were inconsistent. To be sure, the court correctly observes that “Barbour does not argue there is any conflict between” Sellers’s statements, id., but she made no such argument for a good reason: EPA itself never argued that it refused to promote her because her performance needed improvement — not at trial, not in its opening appellate brief, not in its reply brief, and not at oral argument.
Finally, EPA argues that no reasonable juror could have found race discrimination based on a comparison between the experiences of Barbour and Peterson because Peterson was a GS-12 task manager for two years before being promoted to GS-13, whereas Barbour sought her promotion after only one year. Once again, however, the jury reasonably could have concluded from abundant record evidence that this fact had nothing to do with Barbour’s non-promotion. In testimony that the jury was entitled to credit, Barbour said that Sellers expressly promised her that she would be promoted after one year, not two. See Trial Tr. 3/18/97 at 44, 109, 136. Then after one year, Sellers told her he would promote her in three to four more months, not twelve more months. See id. at 45. And EPA ultimately took six years, not two, to promote Barbour to GS-13. See id. at 51.
My colleagues give two reasons for distinguishing this case from Aka. First, they say that unlike the plaintiff in Aka, “Barbour calls into doubt only part of the EPA’s proffered explanation for its refusal to promote her.” Maj. Op. at 1346-47. But Barbour actually called into doubt all of EPA’s proffered explanations: the putative desk audit requirement, which Barbour demonstrated was not just waivable in theory but actually waived for at least *1354 three employees in her section; the fact that Peterson performed some GS-14 policy functions, which Barbour demonstrated had nothing to do with her eligibility for promotion to GS-13; and the fact that Peterson had an additional year of experience as a GS-12 task manager, which Barbour also demonstrated had nothing to do with her GS-13 eligibility. See supra pp. 1351-52. This case is thus just like Aka. There, as here, the record contained evidence from which a reasonable juror could conclude that the challenged employment decision was inexplicable absent invidious discrimination. As Aka said: “Events have causes; if the only explanations set forth in the record have been rebutted, the jury is permitted to search for others, and may in appropriate circumstances draw an inference of discrimination.” 156 F.3d at 1292.
As its second ground for distinguishing Aka, the court says that Barbour’s “apples-and-oranges” comparison of herself and Peterson fails to establish even a prima facie case of race discrimination. Maj. Op. at 1347. This is a curious point given my colleagues’ concession that the entire burden-shifting paradigm is now irrelevant and that the only question before the jury was the “ ‘ultimate question of discrimination vel non.’ ” Id. at 1347 (quoting United States Postal Serv. Bd. of Governors v. Aikens, 460 U.S. 711, 715, 103 S.Ct. 1478, 75 L.Ed.2d 403 (1983)). But even taking the comparison issue on the court’s terms, the question is simply whether the jury reasonably could have concluded from the record that Barbour and Peterson were similarly situated in all relevant respects. Surely a hypothetical jury would be free to conclude that two employees were similarly situated for purposes of a given promotion even if the employer introduced evidence that one was more polite or better read than the other, so long as the record reasonably supported the conclusion that politeness or erudition were not relevant promotion criteria. The record in this case amply supports the jury’s apparent conclusion that Barbour was similarly situated to Peterson in all respects relevant to the GS-13 position.
Also missing from the court’s Aka discussion is any mention of the fact that in addition to Barbour’s evidence that she and Peterson were similarly situated with respect to the GS-13 position, and in addition to her evidence that each of EPA’s proffered justifications was pretextual, Barbour testified that EPA has a poor equal employment opportunity record with respect to African-Americans in her division:
The history of the program has been that minorities have been pretty much on the lower end of it. Out of 400 to 500 staff people, you only have, I’d say, maybe two section chiefs who were at the 14 level. One was temporary. 13’s in IMD out of my division, 50, 60 people, maybe four — maybe five or six 13’s who were African-American, if that many.
Trial Tr. 3/18/97 at 59. Perhaps there is a good answer to Barbour’s assertion. For example, perhaps these numbers — five or six African-American GS-13s out of fifty or sixty total GS-13s — actually reflect the availability of African-Americans in the relevant labor market. But EPA never offered any such evidence, nor did it move to strike Barbour’s testimony as either irrelevant or lacking in foundation. As Aka made clear, the jury could properly have considered Barbour’s unrebutted testimony in determining whether EPA failed to promote her because of her race. See Aka, 156 F.3d at 1295 n. 11.
Of course Title VII “does not authorize a federal court to become ‘a super-personnel department that reexamines an entity’s business decisions.’ ” Maj. Op. at 1346 (quoting Dale v. Chicago Tribune Co., 797 F.2d 458, 464 (7th Cir.1986)). But neither does Title VII authorize federal judges to become super-jurors, weighing evidence and drawing independent conclusions regarding the ultimate question of discrimination. I have certainly seen stronger Title VII cases than this one; indeed, had *1355 I been a juror, I might well have cast my vote for the employer. But acknowledging that the merits of this case are debatable is a far cry from holding that no rational person could agree with the jury’s conclusion.
Racial Harassment
I do agree with my colleagues that the record contains insufficient evidence to support the jury’s conclusion that CBSI’s treatment of Barbour rose to the level of actionable racial harassment. Even giving Barbour “the advantage of every fair and reasonable inference that the evidence may justify,” Coburn, 711 F.2d at 342, the most this record demonstrates is that CBSI employees sometimes put Barbour’s requests at the bottom of the pile, and that on one occasion a CBSI employee turned her back on Barbour in a meeting. Though we must not reverse a jury verdict unless the evidence “is so one-sided that reasonable [jurors] could not disagree,” Hayman, 23 F.3d at 537, I cannot fathom on what basis the jury could have determined that Barbour’s “ ‘workplace [was] permeated with discriminatory intimidation, ridicule, and insult that [was] sufficiently severe or pervasive to alter the conditions of [her] employment and create an abusive working environment.’ ” Oncale v. Sundowner Offshore Services, Inc., 523 U.S. 75, 118 S.Ct. 998, 1001, 140 L.Ed.2d 201 (1998) (quoting Harris v. Forklift Sys., Inc., 510 U.S. 17, 21, 114 S.Ct. 367, 126 L.Ed.2d 295 (1993)).
Perhaps the answer is this: The jury never made that determination because it was never instructed regarding the meaning of the legal term of art “harassment.” The only instruction the district court gave the jury with respect to Barbour’s harassment claim was the following:
[T]he plaintiff must show ... that [she] gave notice to the defendant ... that racial harassment was being engaged in by the corporation or by the employees of the contractor, and that the defendant failed to take ... prompt and adequate remedial action against it.
Trial Tr. 3/21/97 at 46. The jury thus had no way of knowing that to rule for Barbour, it had to find not just “harassment,” but “severe or pervasive” harassment. Although EPA does not raise this issue, I suspect the district court’s incomplete instruction may explain the jury’s untenable harassment verdict.
The District Court’s Comments on the Evidence
Since I would affirm the district court’s denial of judgment as a matter of law on Barbour’s failure-to-promote claim, I must address EPA’s alternative argument that it is nonetheless entitled to a new trial because the district court prejudiced the jury through improper comments on the evidence. Because I agree with EPA that the district court’s comments were prejudicial, I would reverse and remand for a new trial.
Federal judges have “inherent authority ... to comment on the evidence,” United States v. Liddy, 509 F.2d 428, 438 (D.C.Cir.1974), but that authority “is not arbitrary and uncontrolled, but judicial, to be exercised in conformity with the standards governing the judicial office,” Quercia v. United States, 289 U.S. 466, 470, 53 S.Ct. 698, 77 L.Ed. 1321 (1933). Judges must “ ‘use great care that an expression of opinion upon the evidence should be so given as not to mislead, and especially that it should not be one-sided.’ ” Wabisky v. D. C. Transit Sys., Inc., 326 F.2d 658, 659 (D.C.Cir.1963) (quoting Quercia, 289 U.S. at 470).
Applying this standard, I believe the trial judge crossed the line by making statements that the jury could have viewed as signaling not just his hostility toward the agency, but also that he believed the evidence demonstrated that Barbour was a victim of discrimination. For example, in overruling an EPA objection during Barbour’s cross-examination, the district judge said this:
*1356 Let me just give you the reason why I overruled your objection. As far as I am concerned, in these discrimination cases coming out of federal agencies, the agencies have all the powerful people in there, from the director or chairman or administrator on down; they have all the records; they have all the files; they make up the rules; and they can go on and on, and the person who is complaining about them is usually alone, with just one lawyer and maybe a couple of people who also claim they are discriminated against. When they come to court, which is the first time that they come to a place where justice is done — where people don’t protect each other, where people don’t agree with each other from the lowest to the highest — here they get a fair shake and here they get a chance to talk, and they are going to get a chance to talk as long as I am here whether you object to it or not.
Trial Tr. 3/20/97 at 54-55. At another point, the judge responded to the testimony of a defense witness (an EPA employee) by stating: “No wonder the public and the Congress are upset about agencies in Washington.” Id. at 38. In a trial like this, where the agency’s veracity was central to its defense, I can hardly imagine anything more prejudicial than for the judge to tell the jury that agencies like EPA “have all the power[],” that they “make up the rules,” that they cover up for each other, that they refuse to do justice until hauled into court, and that the public no longer has any confidence in them.
The judge also challenged Sellers’s credibility: “That’s under oath? You are testifying under oath?” Id. at 13. Because Barbour’s failure-to-promote claim ultimately hinged on Sellers’s credibility, our statement in United States v. Tilghman applies here as well: “Because juries, not judges, decide whether witnesses are telling the truth, and because judges wield enormous influence over juries, judges may not ask questions that signal their belief or disbelief of witnesses.” 134 F.3d 414, 416 (D.C.Cir.1998).
District judges certainly enjoy wide discretion to manage trials, including questioning witnesses aggressively and commenting on the evidence. In fact, most of the judge’s comments that EPA challenges were not at all inappropriate. But because of the particular comments discussed above, I think the district court went too far. Indeed, the judge’s comments may well help explain why the jury ruled for Barbour on this relatively weak (though sufficient) record. Just as Barbour deserved to have her case decided by the jury without improper judicial interference, so did EPA.