OPINION
O’SCANNLAIN, Circuit Judge. We must decide whether the district court fulfilled its obligation to ensure that the testimony of an expert witness was sufficiently reliable before it was presented to the jury.
I
California State University, Hayward (“CSUH”) hired Mohamed Osman Elsayed Mukhtar (“Elsayed”), a Muslim of Sudanese origin,1 in 1990 as a tenure-track *1057 professor in its Mass Communications Department to teach broadcast television and other foundational courses. He was the first black tenure-track professor hired by the department. CSUH eventually denied Elsayed’s application for tenure. CSUH claims it did so because he lacked sufficient scholarly activity; Elsayed contends it was because of his race, religion, and national origin. Due to the nature of the dispute, it is helpful to trace Elsayed’s career at CSUH, starting from the time of his hiring.
While the Ph.D. degree is normally a pre-requisite for hire in CSUH tenure-track positions, the University makes exceptions for applicants close to completing the doctorate. Non-degreed candidates, such as Elsayed when he started at CSUH in 1990, are generally expected to complete the Ph.D. within their first year of employment. Elsayed, however, did not receive his Ph.D. from the University of Missouri until December 1995. His consistent failure to obtain his Ph.D. during his five-year probationary period at CSUH was a source of great concern among Elsayed’s superiors. During that five-year period, he did not publish a single article. He did, however, consistently receive positive student evaluations and was named departmental Teacher of the Year for the 1995-96 and 1996-97 academic years. He mentored students of color, served on the Multicultural Council, and was active in the Black Faculty Association. Elsayed also participated in international humanitarian and community organizations, such as the Islamic African Relief Agency and the American Society of Humanitarian Aid and Development. He made oral presentations in numerous workshops and seminars, but none involved presenting written work.
During Elsayed’s first year at CSUH, he received a reduced teaching load to enable him to complete his dissertation, as is customary. In his first annual evaluation in January 1991, Alan Smith, Dean of the School of Arts, Letters, and Social Sciences (“ALSS”), noted that if Elsayed “has not completed his degree by the time of his 1991-92 evaluation for retention, the dossier should include a statement from the candidate concerning the schedule for the completion of his degree.” He also commented that Elsayed was “an enthusiastic, conscientious, and accessible teacher with a genuine interest in the education of his students.” Finally, he noted that Elsayed had an article accepted for publication by Gazette, a mass communications journal. However, at trial Elsayed admitted that the editor had simply “showed interest,” and the article was never published.
In Elsayed’s second annual evaluation in January 1992, Robert Terrell, Chair of the Department of Mass Communications, expressed concern at Elsayed’s failure to obtain a Ph.D.:
We feel it incumbent upon us to point out at this juncture that significant allowances/considerations were made particularly during the first year of employment to free [Elsayed] from the traditional obligations of the appointment in order that he might be encouraged to complete the degree. Such was not the case. We are now informed that completion is anticipated during the summer, 1992. We point out the seriousness of the condition to recommend to [Elsayed] that future evaluations will bear heavily on his showing good faith and meeting this obligation by this new target date.
(emphasis added). Terrell noted the “uniformly positive response” to Elsayed as a teacher; his evaluations were “more than generally enthusiastic.” Terrell also encouraged him to “broaden his horizon and seek positions in the faculty governance *1058 structure,” rather than limiting himself to the Multicultural Council. Finally, he recommended that Elsayed “adhere more stringently to the specific obligations attendant to a faculty appointment ... namely, keeping posted office hours and regular attendance at departmental meetings.”
Dean Smith also warned Elsayed in January 1992 that “failure to complete this essential task [of finishing his Ph.D.] will negatively influence his chances of maintaining his current relationship with the university.” While noting his “impressive” classroom skills, Smith also recommended that Elsayed assume more responsibility for the department’s administrative obligations.
In his third evaluation in April 1993, Terrell expressed frustration over Elsayed’s lack of a Ph.D., going so far as to recommend that his employment at CSUH be terminated at the end of the 1993-94 academic year if he did not complete his dissertation by the end of summer 1993.
Last year Professor Elsayed was informed that the failure to complete his dissertation would “negatively influence his chance of maintaining his current relationship with the university.” This observation has been repeated on numerous occasions since. Professor Elsayed was released from all responsibilities during the summer of 1992 so he would have ample time to complete his dissertation. At the end of the summer, he promised to be finished by December. In December he promised to be finished by June. He currently promises to be done by the end of the summer of 1993. Professor Elsayed’s failure to complete his dissertation, despite numerous chances to do so, raises serious questions regarding his continued employment at CSUH.
(emphasis added). While recognizing that Elsayed was “popular with many students,” Terrell noted that his “teaching evaluations indicate an ongoing problem with organization.” Furthermore, Elsayed “has done little or no publishing since arriving at CSUH three years ago.” Finally, “he rarely participates in mainstream faculty governance procedures,” despite being “strongly urged to broaden his involvement with the campus community.”
In April 1993, Mary Cullinan, Interim Dean of ALSS, echoed Terrell’s concern regarding Elsayed’s lack of a Ph.D. She noted that Elsayed was a “popular instructor,” but should “create detailed syllabi for his classes, provide ample office hours for students, and give students early and ample feedback on how they are doing.” She also highlighted his failure to complete a number of professional works in progress.
The fourth evaluation, in March 1994, noted that Elsayed did not meet the summer 1993 Ph.D. deadline and set the new deadline for summer 1994. Terrell again stated that Elsayed’s failure to complete his dissertation “might have significant, negative consequences.” He observed that although Elsayed was a “sensitive, caring professor,” he did not grade assignments in a timely fashion and taught in a “disorganized fashion.” Terrell encouraged him to “devote more time and attention to becoming a better organized teacher” and expressed concern over Elsayed’s practice of occasionally having students do portions of his work, which, Terrell felt, came “perilously close to being a violation of professional ethics.”
Terrell wrote that “[t]oo many of the activities [Elsayed] cites as evidence of his professional achievement are insufficiently centered on mass communication.” He recommended that Elsayed “devote more time and attention to producing articles and other texts such that he can clearly demonstrate his mastery of scholarship.” He continued to note Elsayed’s lack of *1059 campus involvement, stating that he “needs to get more directly involved in advising students, working with alumni, participating in faculty governance, [and] representing the department at campus and professional meetings.” In short, “Professor Elsayed is not pulling his fair share of the department’s load at this time.” Terrell ended his evaluation with this stern warning: “I strongly recommend that [Elsayed] be informed — in the bluntest possible fashion — that prolonging completion of his dissertation will be a mistake of the most serious sort.”
The new Dean of ALSS, Carlos Navarro, reiterated Terrell’s concerns, stating that Elsayed needed “to make dramatic improvements” in the area of campus involvement. He also noted that Elsayed’s position would be strengthened if he contributed to scholarly publications and organizations.
Elsayed’s fifth evaluation in March 1995 indicated that the Mass Communications Promotion and Tenure Committee (“Tenure Committee”) was “troubled” by Elsayed’s “lack of progress toward attaining a doctorate.” It “urge[d] him to complete some of his works in progress,” and advised that it was “important” for him to “publish[ ] in the appropriate journals in his field or show evidence of completed scholarly productivity.”
Dean Navarro’s May 1995 evaluation recommended that Elsayed be retained for a sixth year, “but not without grave reservations about the quality of [his] faculty profile,” including the lack of the Ph.D. Dean Navarro again recognized that Elsayed received “strong teaching evaluations”; however, “unfortunately, there [were] no peer evaluations which allow him to be evaluated by professionals in his area of expertise.” In his letter, Dean Navarro recommended that an independent outside reviewer assess Elsayed’s professional achievement. Again noting that Elsayed “has not played any significant role in the life of the department and the university,” despite being advised to, he warned that Elsayed “needs to make dramatic improvement in this area.”
With respect to Elsayed’s scholarly activities, Dean Navarro wrote:
There is very little evidence, if any, that the candidate has made any serious efforts to publish or to present papers or productions at professional meetings .... This candidate has not apparently seen this area as part of his professional responsibility to strive towards excellence. Certainly papers based on his dissertation in progress might have been presented at scholarly meetings. It would not be unfair to argue that this candidate has demonstrated weakness in his scholarly activity.
(emphasis added). Elsayed testified that it took him five years to receive his Ph.D. because one committee chair died, the second left to teach elsewhere, and the third took a sabbatical. Thus, he had to take frequent trips to Missouri to recruit three new chairs and familiarize them with his methodology, literature review, and preliminary analysis and findings.
In the fall of 1995, Elsayed applied for tenure. CSUH’s tenure policy states: “[T]enure constitutes more than a recognition of past teaching performance and scholarly work. It is a judgment by the faculty that the individual will contribute in the future to the development of the University.” Faculty are evaluated on five criteria, which are ranked in order of most to least weighty: (1) possession of a Ph.D. degree (required), (2) instructional achievement, (3) professional achievement, (4) internal university contributions, and (5) external representation. A candidate must clearly satisfy all the criteria, but the tenure policy provides that “[e]xceptional ratings on one or more of the criteria may *1060 offset minor deficiencies with respect to other criteria.” (emphasis added).
CSUH President Norma Rees has the ultimate authority to make tenure decisions. Prior to her review, a candidate’s dossier is reviewed by three separate Tenure Committees — the department committee, the school committee, and the University committee — as well as the department chair and school dean. Each makes an independent assessment of the candidate without relying on the previous evaluation.
At the time he applied for tenure, Elsayed reported completing his dissertation and anticipated receiving his Ph.D. in December 1995. He also reported that his article, “Elijah, Wallace and Farrakan: The Growth and Schism of the Black Muslims as Reflected In Their Newspapers, 1930-1984,” had been accepted for publication in the refereed journal of the International Institute of Islamic Thought, The American Journal of Islamic Social Sciences (“AJISS”). Contrary to his original report, the article had simply been recommended for publication sometime in 1996. It was never published. Elsayed testified that he asked AJISS to suspend publishing his article because Dean Navarro and President Rees had told him that AJISS was not an appropriate scholarly publication.
The Mass Communications Departmental Committee, by a vote of 2-1, recommended Elsayed for tenure. The person voting against tenure believed that “the delayed activity toward achieving the Ph.D. degree signals a lack of promise in professional achievement.” The ALSS School Committee, by a vote of 4-1, also endorsed Elsayed. The dissenter found “that the areas of [teaching] strength did not compensate for an overall weak profile of achievement and contribution.” CSUH Provost Frank Martino testified that departmental and school recommendations for tenure are typically unanimous, and a “split vote signals trouble.”
Dean Navarro recommended against tenure, finding a “dearth of evidence in the area of professional achievement.” He did not regard AJISS as a journal of Elsayed’s peers, and Elsayed had not produced any other papers or documentary films during his five years at CSUH. Also lacking was any “significant evidence of contributions in the area of committee or faculty governance work.” Dean Navarro received the report of an outside reviewer he commissioned to evaluate Elsayed, Professor John Hewitt.2 Dr. Hewitt found Elsayed’s professional achievement “a bit thin,” but “would expect him to blossom out in the next few years with more work.” Regarding internal university contributions, Hewitt noted that he “would expect more from a junior faculty member.” Nevertheless, Hewitt recommended Elsayed for tenure.
The University Committee voted 3-2 in favor of tenure. The dissenters found that Elsayed’s instructional achievement did not offset “serious deficiencies” in the area of professional achievement and internal university contributions. President Rees was troubled by the split votes; ten faculty (including Dr. Hewitt, the outside reviewer) had recommended tenure and five had not. Rees was not impressed with Elsayed’s professional achievement, and she found little evidence that Elsayed challenged his students or developed creative learning procedures. In her letter to Elsayed informing him of her decision, she wrote that the reason for denial was that he “did not meet university standards in the areas of instructional and professional achievement.”
*1061 Elsayed grieved the denial of tenure. The arbitrator found procedural errors in the tenure review process and remanded for further consideration. In late 1997, the new interim Dean of ALSS, James Fay, conducted a second tenure review. This time, both the Departmental and School Tenure Committees’ recommendations were against granting Elsayed tenure.
Dean Fay found that Elsayed’s “failure to publish, or to present alternative evidence of sufficiently vigorous scholarly endeavor, renders his professional achievement below the standard ... needed for tenure.” As of December 1997, Elsayed still had “no publications to his credit,” and no one at the AJISS could confirm that Elsayed’s article was going to be published. In April 1998, the five-member University Committee unanimously recommended against tenure, finding insufficient documentation of instructional or professional achievement.
In May 1998, Rees again denied tenure. Elsayed grieved the second denial, and the same arbitrator again found in his favor, directing that Elsayed be awarded tenure. However, a state court vacated the arbitration award.
Elsayed then brought this employment discrimination action under Title VII of the Civil Rights Act of 1964, 42 U.S.C. § 2000e, against CSUH and Navarro, Rees, and Martino in their official and individual capacities. At trial, both sides presented witnesses who testified as to whether Elsayed was qualified for a tenured appointment. Dr. David Wellman, an expert witness who has devoted much of his career to investigating how racism persists without open bigotry, testified that race was a factor in CSUH’s decision to deny Elsayed tenure. After a nine-day jury trial, the jury found for Elsayed and awarded him $637,000 in economic, emotional distress, and punitive damages.3
CSUH’s timely appeal followed.
II
On appeal, CSUH challenges two of the district court’s evidentiary rulings, which, it argues, were not harmless.
A
First, CSUH contends that the district court erred by allowing Elsayed to testify that, after the second round of arbitration, the arbitrator ordered that he be granted tenure because CSUH did not have reasonable support for its decision.4 Having the jury hear the decision of a quasi-judicial factfinder caused unfair prejudice, CSUH argues.5 Federal Rule of Evidence 403 provides that evidence, even if relevant, “may be excluded if its probative value is substantially outweighed by the danger of unfair prejudice.” We have emphasized that it is important to view the challenged evidence in light of the record *1062 as a whole. United States v. Nguyen, 284 F.3d 1086, 1090 (9th Cir.2002).
Here, the district court allowed the testimony because it did not want “to artificially exclude facts in the narrative,” so the arbitration could be “mentioned, but [not] played up in any way.” When Elsayed testified about the arbitrator’s decision, the judge explicitly told the jury that it had been vacated by a state court, and Elsayed himself also testified that the arbitrator’s grant of tenure had been vacated. Finally, the jury instructions included the information that “[t]he university asked the state court to vacate the arbitrator’s award, and the state court did so.”
While we agree that the prejudice of having the jury hear this marginally relevant evidence would normally outweigh the value of providing the jury with the case’s entire factual history, the jury repeatedly heard that a state court had vacated the arbitrator’s decision. Because it had the “full story,” we believe that the jury was able to give the arbitrator’s decision the weight it merited — very little. As such, any prejudice flowing from learning about the arbitrator’s ruling was immediately (and repeatedly) minimized by the revelation that a state court had vacated it. CSUH’s argument fails; the district court did not abuse its discretion under these circumstances.
B
Second, CSUH challenges the admission of testimony by Dr. Wellman, Elsayed’s expert on racial discrimination. Dr. Wellman has developed eight criteria for “decoding” white behavior — all of which he found present in CSUH’s decision to deny Elsayed tenure:
a. The University’s justification for denying tenure lacked “credence;”
b. Tenure criteria were applied inconsistently;
c. Inconsistent tenure criteria advantaged whites and disadvantaged blacks;
d. Tenure criteria shifted when challenged;
e. Statistical evidence showed disparate treatment;
f. Procedural violations occurred in the tenure process;
g. University officials trivialized and dismissed Elsayed’s qualifications and accomplishments; and
h. University officials failed to follow procedures for reducing racial inequality.
1
As a threshold matter, Elsayed argues that because CSUH failed to make a contemporaneous objection to Dr. Wellman’s testimony, we should review the district court’s decision to admit his testimony for plain error. See United States v. Varela-Rivera, 279 F.3d 1174, 1177 (9th Cir.2002); see also Sablan v. Dep’t of Finance, 856 F.2d 1317, 1323 (9th Cir.1988) (“[A] plain error standard ... requires us to consider whether the alleged error was highly prejudicial and whether the error affected the substantial rights of the [appellant].”). However, CSUH made explicit objections to Dr. Wellman’s testimony in its motion in limine, which the district court denied.
Contemporaneous objection is not required where, as here, the trial court definitively ruled on a motion in limine after exploring CSUH’s objection. See Fed.R.Evid. 103(a)(2) (“Once the court makes a definitive ruling on the record admitting or excluding evidence, either at or before trial, a party need not renew an objection or offer of proof to preserve a claim of error for appeal.”); see also *1063 Varela-Rivera, 279 F.3d at 1177-78; Scott v. Ross, 140 F.3d 1275, 1285 (9th Cir.1998).
Here, the district court made clear at the outset of the trial that it was admitting Dr. Wellman’s testimony subject to the limitation that he not offer a legal conclusion. However, it declined to exclude Dr. Wellman’s testimony concerning an ultimate factual issue, i.e., whether race was a factor in CSUH’s decision to deny Elsayed tenure. Dr. Wellman’s testimony stayed within these parameters, and because he did not violate the district court’s in limine ruling, no additional objection was necessary. Therefore, we are satisfied that CSUH properly preserved the arguments it now makes before us, and, thus, we review the district court’s evidentiary rulings for an abuse of discretion.6 See Varela-Rivera, 279 F.3d at 1177.
2
Federal Rule of Evidence 702 allows admission of “scientific, technical, or other specialized knowledge” by a qualified expert if it will “assist the trier of fact to understand the evidence or to determine a fact in issue.” Expert testimony is admissible pursuant to Rule 702 if it is both relevant7 and reliable. Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579, 589, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993). The trial court must act as a “gatekeeper” to exclude “junk science” that does not meet Rule 702’s reliability standards by making a preliminary determination that the expert’s testimony is reliable. See Kumho Tire Co. v. Carmichael, 526 U.S. 137, 147-48, 119 S.Ct. 1167, 143 L.Ed.2d 238 (1999); Gen. Elec. Co. v. Joiner, 522 U.S. 136, 142, 118 S.Ct. 512, 139 L.Ed.2d 508 (1997); Daubert, 509 U.S. at 589-90, 592-93, 113 S.Ct. 2786. As the Supreme Court emphasized, however, “[t]he inquiry envisioned by Rule 702 is ... a flexible one,” Daubert, 509 U.S. at 594, 113 S.Ct. 2786, and must be “tied to the facts of a particular case.” Kumho Tire, 526 U.S. at 150, 119 S.Ct. 1167 (quotation marks omitted).
The trial court’s “special obligation” to determine the relevance and reliability of an expert’s testimony, Kumho Tire, 526 U.S. at 147, 119 S.Ct. 1167, is vital to ensure accurate and unbiased decision-making by the trier of fact. Kumho Tire described the “importance of Daubert’s gatekeeping requirement ... to make certain that an expert ... employs in the courtroom the same level of intellectual rigor that characterizes the practice of an expert in the relevant field.” Id. at 152, 119 S.Ct. 1167. Or, more specifically, the trial judge must ensure that “junk science” plays no part in the decision. Maintaining Daubert’s standards is particularly important considering the aura of authority *1064 experts often exude, which can lead juries to give more weight to their testimony.8
Daubert provided a non-exhaustive list of factors for determining whether expert testimony is sufficiently reliable to be admitted into evidence, including: (1) whether the scientific theory or technique can be (and has been) tested, (2) whether the theory or technique has been subjected to peer review and publication, (3) whether there is a known or potential error rate, and (4) whether the theory or technique is generally accepted in the relevant scientific community. Daubert, 509 U.S. at 593-94, 113 S.Ct. 2786; see also Kumho Tire, 526 U.S. at 141, 119 S.Ct. 1167 (holding that Daubert’s factors apply to testimony based on technical and specialized knowledge, not just scientific knowledge).
A trial court not only has broad latitude in determining whether an expert’s testimony is reliable, but also in deciding how to determine the testimony’s reliability. United States v. Hankey, 203 F.3d 1160, 1167 (9th Cir.2000); see also Kumho Tire, 526 U.S. at 152, 119 S.Ct. 1167. Indeed, a separate, pretrial hearing on reliability is not required. E.g., United States v. Alatorre, 222 F.3d 1098, 1102 (9th Cir.2000) (“Nowhere in Daubert, Joiner, or Kumho Tire does the Supreme Court mandate the form that the inquiry into relevance and reliability must take.”); Hopkins v. Dow Corning Corp., 33 F.3d 1116, 1124 (9th Cir.1994). Surely, however, the trial court’s broad latitude to make the reliability determination does not include the discretion to abdicate completely its responsibility to do so.
3
Thus, the question we must answer: did the district court fulfill its gatekeeping function and determine that Dr. Wellman’s testimony was reliable? In response to CSUH’s motion in limine to exclude Dr. Wellman’s testimony, the district court ordered each side to submit Daubert briefs. Before ruling that Dr. Wellman’s testimony was admissible, the district court reviewed two briefs in support of CSUH’s motion, three opposition briefs, two declarations from Dr. Wellman, excerpts from Dr. Wellman’s deposition, his preliminary report, and his curriculum vitae. On the first day of the trial, the district court decided to admit his testimony, but without any discussion of its reliability.
The district court’s analysis of and ruling on CSUH’s motion in limine consisted in its entirety:
Well, I see Wellman and [CSUH’s expert witnesses on Elsayed’s academic qualifications] as essentially parallel, and I would prefer that none of them express their own opinion about whether this decision was right or wrong. But if any of them are going to, then I guess all of them have to.
And since both sides have prepared on the basis that they all will, I suppose you would prefer that I let all of them do it, rather than none of them do it. But I really would want to downplay as much as possible any of them substituting their judgment for what the jury ultimately has to find, which is whether, in fact, this decision was based on race discrimination or based on legitimate academic concerns.
They each have their own opinion, which is essentially what the jury will have to decide, so I don’t exactly know how we’re going to avoid having each of them go through all of the evidence and *1065 essentially deliberate as jurors and argue about which — what means what and what factor goes which way.
It’s not really appropriate, so I guess we’ll just have to try to keep it as brief as possible, in Dr. Wellman’s case, on general factors that would lead to such decisions, and likewise in [CSUH] expert’s case on general factors that would lead to such decisions, as opposed to their trying to convince the jury of their own view of what the truth is of what the underlying state of mind was.
So that’s about all the guidance I can give on that.
Then, in response to a comment from CSUH’s counsel, the district court commented that
each [expert] is opining on the true reason when, in fact, it’s the jury who has to decide what the true reason is. So I guess we’ll just have to play it by ear in terms of trying to direct both of their testimony towards general factors that the jury can apply, as opposed to making — presenting their opinions of how the decision should be made based on the evidence.
And I don’t want to hear them each go through all the evidence and say “this means this and that means that,” and “I read this testimony and that testimony.” That will take too long and it really will invade the province of the jury.
So I guess Wellman will be first, and so we’ll just have to apply what guidelines we can to his testimony and then the same guidelines will be applied to [CSUH] witness’ testimony.
As is apparent from the above recital, the district court said nothing about the reliability of Dr. Wellman’s testimony.9 It appears that the district court was concerned only with whether the expert witnesses would testify on an “ultimate issue” that is properly for the jury to decide.10 *1066 In fact, the only indication we have that the district court found Dr. Wellman’s testimony reliable is the fact that it was admitted over CSUH’s reliability objections. Surely Daubert and its progeny require more.11
In Alatorre, we upheld the district court’s admission of expert testimony; it had ruled on the testimony’s reliability after allowing detailed voir dire of the expert in front of the jury. Alatorre, 222 F.3d at 1105. We were careful to distinguish United States v. Velarde, 214 F.3d 1204 (10th Cir.2000), where the Tenth Circuit reversed and remanded for a new trial because the district court failed to conduct any reliability determination, instead deciding that since it had “had this [expert] testimony before in trials, and it’s not new and novel,” it was admissible. Id. at 1208. Velarde held that “[w]hile ... the trial court is accorded great latitude in determining how to make Daubert reliability findings before admitting expert testimony, Kumho and Daubert make it clear that the court must, on the record, make some kind of reliability determination.” Id. at 1209 (emphasis in original).
Instead, we found Hankey, where the district court conducted extensive voir dire of the expert witness, on point. Hankey, 203 F.3d at 1168-69. Importantly, the district court in Hankey “made findings that the foundation for [the expert’s] opinions was relevant and reliable.” Id. at 1170. Likewise, in Alatorre, the district court “after voir dire ... ruled on the relevance and reliability of [the expert’s] testimony.” Alatorre, 222 F.3d at 1105.
The fact that we drew a distinction between the district court’s explicit findings of reliability in Alatorre and Hankey and the district court’s complete failure to make any reliability finding in Velarde suggests that we require a district court to make some kind of reliability determination to fulfill its gatekeeping function. Here, we find the district court’s reliability determination (or lack thereof) analogous to the district court’s failure in Velarde. We agree with the Tenth Circuit that “some ... reliability determination must be apparent from the record” before we can uphold a district court’s decision to admit expert testimony. Velarde, 214 F.3d at 1210. As such, we must conclude that the district court abdicated its gatekeeping role by failing to make any determination that Dr. Wellman’s testimony was reliable and, thus, did not fulfill its obligation as set out by Daubert and its progeny.12
III
Despite the district court’s evidentiary error in admitting Dr. Wellman’s testimony without a reliability finding, the jury’s verdict is reversible on appeal only if CSUH can demonstrate that the error was not harmless, i.e., “a party must demonstrate that the allegedly erroneous evidentiary ruling more probably than not was the cause of the result reached.” Jauregui v. City of Glendale, 852 F.2d 1128, 1133 (9th Cir.1988); United States v. Rahm, 993 F.2d 1405, 1415 (9th Cir.1993). If we are unable to say that the probabilities favor the same result and are unsure whether the error was harmless, a new *1067 trial is required. United States v. Mitchell, 172 F.3d 1104, 1111 (9th Cir.1999).
CSUH argues that Dr. Wellman’s testimony was not harmless because it was cloaked in authority and addressed the central element of Elsayed’s case. See United States v. Arenal, 768 F.2d 263, 270 (8th Cir.1985) (finding prejudice because the erroneously admitted testimony enjoyed an expert’s “aura of expertise”). Without Dr. Wellman’s testimony, CSUH asserts, Elsayed’s evidence could show only a difference of academic opinion regarding his tenure qualifications. Thus, because the remaining evidence was evenly divided and contradictory, Dr. Wellman’s testimony was not harmless.
To establish racial discrimination in the employment context, Elsayed must demonstrate that the reason CSUH gave for denying Elsayed tenure — lack of scholarly achievement — is a mere pretext for illegal racial discrimination. Reeves v. Sanderson Plumbing Prods., Inc., 530 U.S. 133, 147, 120 S.Ct. 2097, 147 L.Ed.2d 105 (2000). Thus, we look to what evidence Elsayed presented, other than Dr. Wellman’s testimony, that would tend to establish pretext.13 Six CSUH professors testified that they believed Elsayed was qualified for a tenure appointment, and Dr. Hewitt, the outside expert hired to evaluate Elsayed, also recommended tenure despite his misgivings. The jury knew of Elsayed’s instructional achievement, book-length dissertation, and article recommended for publication in AJISS, which, in fact, was never published. Finally, both arbitrations found procedural errors in the tenure process, although the second arbitration had been vacated in state court. Thus, Elsayed argues, there was plenty of evidence, even apart from Dr. Wellman’s testimony, upon which the jury could have based its decision. We are not persuaded.
Once Dr. Wellman’s testimony is excluded, the remaining evidence seems to indicate, at most, a mere difference of academic opinion — not discrimination — and does not undermine the University’s nondiscriminatory reason for denying Elsayed tenure.14 Indeed, academic tenure decisions involve subjective judgments on scholarship that neither courts nor juries are well qualified to make. See Lynn v. Regents of Univ. of Cal., 656 F.2d 1337, 1342-44 (9th Cir.1981). Furthermore, the jury heard evidence on CSUH’s record of hiring people of color, who comprise a slightly higher percentage of CSUH’s faculty than expected based on the academic labor pool. Without Dr. Wellman’s *1068testimony, Elsayed’s racial discrimination case becomes a disagreement among academic professionals, which is something that Title VII does not proscribe.
Dr. Wellman drew the inference of discrimination for the jury in a case otherwise based entirely on less-than-convincing circumstantial evidence. Thus, it is hard for us to see how Dr. Wellman’s testimony, which addressed the central issue of Elsayed’s case, was harmless; rather it “more probably than not was the cause of the result reached.” Jauregui, 852 F.2d at 1133; see also Hester v. BIC Corp., 225 F.3d 178, 185-86 (2d Cir.2000) (where plaintiff’s racial discrimination case was otherwise based on comparative evidence, the expert’s testimony drawing an inference of discrimination for the jury could have been enough to tilt the balance); Burkhart v. Washington Metro. Area Transit Auth., 112 F.3d 1207, 1214-15 (D.C.Cir.1997) (where evidence of discrimination was not “strong,” expert testimony about regulatory standards was prejudicial and required new trial).
Therefore, the district court’s erroneous admission of Dr. Wellman’s testimony without the proper reliability determination was not harmless, and CSUH is entitled to a new trial.15
VACATED AND REMANDED for a new trial. Each party shall bear its own costs and attorney’s fees on appeal. This panel shall retain jurisdiction over any future appeal in this matter.
. After undergoing tribal initiation as a youth in the Sudan, tribal elders carved warrior marks into Elsayed's face. He also wears the warrior hair-style, commonly known as dreadlocks.
. In obtaining an outside reviewer, Dean Navarro violated CSUH policy by not first securing Elsayed's consent and by failing to put in writing his reasons for requesting the review.
. The district court granted Martino's motion for judgment as a matter of law, which reduced the punitive damages award by $15,000. Elsayed does not appeal from that order.
. Elsayed testified:
After two years of arbitration, and motions, and examinations of hundreds of documents, the arbitrator concluded that the University in the second-in the remand not only committed the same procedural errors, but they actually added new ones.
And as far as their judgment, they also found that they did not use reasonable arguments to support their position. And, therefore, she concluded that the only remedy for the situation is to grant me tenure.
And she concluded by granting me tenure, which was later vacated by a judge.
. In a motion in limine, which the district court denied, CSUH objected to having the jury hear the second arbitration’s result. CSUH also made a contemporaneous objection at trial, which the district court denied.
. Elsayed also argues that the doctrine of invited error should preclude our review because CSUH cross-examined Dr. Wellman as to why he thought race influenced the tenure decision. Invited error occurs when the appellant opens the door to objectionable testimony by introducing it, rather than waiting for the appellee to do so. Ohler v. United States, 529 U.S. 753, 755, 120 S.Ct. 1851, 146 L.Ed.2d 826 (2000); United States v. Segal, 852 F.2d 1152, 1155 (9th Cir.1988). Where, as here, a party simply cross-examines a witness to counter evidence already admitted, the error is not invited. United States v. Weitzenhoff, 35 F.3d 1275, 1287 n. 8 (9th Cir.1994).
. Encompassed in the determination of whether expert testimony is relevant is whether it is helpful to the jury, which is the "central concern” of Rule 702. United States v. Rahm, 993 F.2d 1405, 1413 (9th Cir.1993); see also Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579, 591, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993). Despite arguing that the substance of Dr. Wellman's testimony was well within the common sense knowledge of laypersons and, thus, not helpful to the jury, CSUH conceded at oral argument that Dr. Wellman’s testimony was "absolutely” relevant. Therefore, we proceed only on the question of reliability.
. This concern is also reflected in our prohibition on expert witnesses offering their legal conclusions. See infra n. 10.
. Nor did the district court determine that Dr. Wellman's testimony would be helpful to the jury. To be admissible, "expert testimony must ... address an issue beyond the common knowledge of the average layman." United States v. Vallejo, 237 F.3d 1008, 1019 (9th Cir.), amended by 246 F.3d 1150 (9th Cir.2001); see also United States v. Hanna, 293 F.3d 1080, 1086 (9th Cir.2002). Elsayed argues that Dr. Wellman's specialized sociological knowledge was helpful because it assisted the jury to identify and to analyze coded expressions of contemporary racism. As Elsayed argued in his opposition to CSUH's motion in limine, data indicate "that while there is an increasing trend toward verbal tolerance in relation to issues of race and racism, there is a discrepancy between such statements and the routine everyday practices of white Americans. What white Americans say in opinion polls on the subject of race is often contradicted by their behavior." Indeed, “[s]ocial scientists in particular may be able to show that commonly accepted explanations for behavior are, when studied more closely, inaccurate. These results sometimes fly in the face of conventional wisdom.” Tyus v. Urban Search Mgmt., 102 F.3d 256, 263 (7th Cir.1996); see also United States v. Hall, 93 F.3d 1337, 1342-43 (7th Cir.1996) ("[S]ocial science testimony is an integral part of many cases, [including] employment discrimination actions.”). We express no opinion, however, as to whether Dr. Wellman’s testimony would, in fact, be helpful to the jury. See supra n. 7.
. It is well-established, however, that expert testimony concerning an ultimate issue is not per se improper. E.g., Shad v. Dean Witter Reynolds, Inc., 799 F.2d 525, 529 (9th Cir.1986). Indeed, Federal Rule of Evidence 704(a) provides that expert testimony that is "otherwise admissible is not objectionable because it embraces an ultimate issue to be decided by the trier of fact.” However, an expert witness cannot give an opinion as to her legal conclusion, i.e., an opinion on an ultimate issue of law. E.g., McHugh v. United Serv. Auto. Ass'n, 164 F.3d 451, 454 (9th Cir.1999); United States v. Duncan, 42 F.3d 97, 101 (2d Cir.1994) ("When an expert undertakes to tell the jury what result to reach, this does not aid the jury in making a decision, but rather attempts to substitute the expert's judgment for the jury’s.”) (emphasis in original).
. CSUH does not argue that Dr. Wellman was unqualified to provide expert testimony. Indeed, his curriculum vitae is quite impressive.
. We express no opinion on the merits of whether Dr. Wellman’s testimony was, in fact, reliable or of the reliability of such evidence in general. See Daubert, 509 U.S. at 595, 113 S.Ct. 2786 (“The focus, of course, must be solely on principles and methodology, not on the conclusions that they generate.”). This determination must be made in the first instance by the trial court.
. "It is not enough ... to disbelieve the employer; the factfinder must believe the plaintiff's explanation of intentional discrimination.” Reeves, 530 U.S. at 147, 120 S.Ct. 2097 (quotation marks omitted) (emphasis in original).
. Elsayed alleges that Dean Navarro called him a "pothead” and commented that foreign professors do not understand American students’ culture and attitudes. Dean Navarro's testimony is less sinister than Elsayed makes it seem, however. Elsayed's counsel asked, "Did you ever make the statement that foreigners don't understand American students’ attitudes?” Dean Navarro responded, "Yes, I may have made the generalization that sometimes I found that some foreign professors don't understand American high school culture. I have two young people that just graduated from high school, and sometimes their standards and expectations for behavior in class or their knowledge gets to be a little — or lack of knowledge sometimes is shocking to professors abroad.”
Far from being a criticism of foreign professors, Dean Navarro's comment is, in fact, more an indictment of American teenagers. Also, Dean Navarro testified that he did not recall ever referring to Elsayed as a "pothead.” Seen in their proper context, Dean Navarro's remarks fail to live up to the discriminatory billing first given to them by Elsayed. We find that the jury could not have based its decision on Dean Navarro's remarks alone.
. Because we remand for a new trial, we do not reach the question of whether there was sufficient evidence to support the award of punitive damages against Dean Navarro and President Rees. However, we note that Title VII provides for punitive damages, which may be awarded "if the complaining party demonstrates that the respondent engaged in a discriminatory practice ... with malice or with reckless indifference to the federally protected rights of an aggrieved individual.” 42 U.S.C. § 1981a(b)(1) (emphasis added). To award punitive damages, the individuals' conduct must have been more than just intentional discrimination — instead they must have known they were acting in violation of federal law. Kolstad v. Am. Dental Ass’n, 527 U.S. 526, 535-36, 119 S.Ct. 2118, 144 L.Ed.2d 494 (1999); see also Ngo v. Reno Hilton Resort Corp., 140 F.3d 1299, 1304 (9th Cir.1998) ("Punitive damages may not be awarded ... where a defendant’s discriminatory conduct is merely 'negligent in respect to the existence of a federally protected right.’ ” (quoting Hernandez-Tirado v. Artau, 874 F.2d 866, 870 (1st Cir.1989))).