dissenting:
Due process demands that no defendant should face a biased jury. Nonetheless, the mental gymnastics demanded by a retrospective jury analysis taking place decades after the trial suggest that Justice Marshall was prescient in his concurrence in Batson: “The decision today will not end the racial discrimination that peremptories inject into the jury-selection process. That goal can be accomplished only by eliminating peremptory challenges entirely.” Batson v. Kentucky, 476 U.S. 79, 102-03, 106 S.Ct. 1712, 90 L.Ed.2d 69 (1986) (Marshall, J., concurring).
I part ways with the majority’s ultimate conclusion that Crittenden has proven that the prosecutor’s challenge to the single black juror was substantially motivated by race in violation of Batson. 476 U.S. 79, 106 S.Ct. 1712. I join the majority as to Part I (the Teague analysis) and Part II.A (the lack of AEDPA deference to the California Supreme Court’s Wheeler analysis). Otherwise, I respectfully dissent.
Let me turn now to what happened in this case in 1989. In observing voir dire, the trial judge characterized potential juror Manzanita Casey as “indecisive” and noted that she “couldn’t decide whether or not she would be able to follow the law.” He presciently observed that a “Wheeler motion would be inappropriate.” Striking a juror who is a death penalty “wobbler” is hardly a basis to impute purposeful discrimination to the prosecutor. In light of the evidence presented in state court, and the heavy deference we owe to the trial judge’s firsthand observations, we should not disturb the trial court’s fact-bound determination that Crittenden did not make out a prima facie case of discrimination under Batson.
Crittenden’s case does not improve under step three of the Batson analysis. At this stage, our review is de novo because the California Supreme Court invoked the wrong legal standard. But de novo review does not mean that we can, after the fact, stack inference upon inference, impute motive when none was demonstrated, or use new evidence to construct a hypothetical record that never existed. Because the evidence is ultimately inconclusive as to the prosecutor’s state of mind in 1989, and does not clearly support pretext, Crittenden failed to prove purposeful discrimination.
In the end, Crittenden’s case should not rise or fall on the after-the-fact significance imputed to the prosecutor’s XXXX rating of Casey.1 The majority’s analysis boils down to a feeling that although perhaps Casey deserved an XX or even XXX rating, the fourth X looms large and could only signify racial bias. This entire mode of analysis is folly, for it grafts scientific certitude onto a back-of-the-hand rating system, which the prosecutor himself described as a “de minimus approach.” According to the prosecutor, the precise number of Xs “wasn’t a very scientific notation,” and the same juror “could have been a two or a three, or a three or a four.” We have no Rosetta Stone to unlock the meaning of the fourth X; it is a *1021mistake to order a new trial based on this speculative foundation as to a single juror.
I. Batson Step One — Crediting the Trial Court’s Factual Finding
Under Batson step one, Crittenden must “show[] that the totality of the relevant facts gives rise to an inference of discriminatory purpose.” Batson, 476 U.S. at 93-94, 106 S.Ct. 1712. The state trial court found that Crittenden did not meet this standard. Under both the Anti-Terrorism and Effective Death Penalty Act of 1996 (“AEDPA”) and Batson principles, overturning such a finding requires “exceptional circumstances.” Davis v. Ayala, — U.S. -, 135 S.Ct. 2187, 2201, 192 L.Ed.2d 323 (2015) (quoting Snyder v. Louisiana, 552 U.S. 472, 477, 128 S.Ct. 1203, 170 L.Ed.2d 175 (2008)). Because there is nothing amiss about the trial court’s finding — much less exceptionally wrong — that conclusion should have ended the matter. Instead, the majority second-guesses the fact-bound decision of the state trial judge with a raft of new evidence introduced in federal habeas proceedings. I dissent from this upside-down approach to deference.
The starting point is AEDPA, 28 U.S.C. § 2254(e)(1), under which the trial court’s factual finding is “presumed correct” and Crittenden “has the burden of rebutting that presumption by ‘clear and convincing evidence.’ ” Ayala, 135 S.Ct. at 2199-2200 (quoting Rice v. Collins, 546 U.S. 333, 338-39, 126 S.Ct. 969, 163 L.Ed.2d 824 (2006)). In light of AEDPA’s mandate, “we normally review the state trial court’s fact-specific determination of whether a defendant has made a prima facie case of a Batson violation deferentially, applying AEDPA’s ‘statutory presumption of correctness.’ ” Fernandez v. Roe, 286 F.3d 1073, 1077 (9th Cir.2002) (quoting Wade v. Terhune, 202 F.3d 1190, 1195 (9th Cir.2000)). In contrast, “where the trial court has applied the wrong legal standard, AEDPA’s rule of deference does not apply.” Id.; see also Cooperwood v. Cambra, 245 F.3d 1042, 1046 (9th Cir.2001).
Nothing reflects that the trial court applied the wrong legal standard or otherwise erred in its application of Batson step one. Importantly, neither the majority nor Crittenden suggests otherwise. Although, in 1994, the California Supreme Court conflated Batson’s “reasonable inference” test with Wheeler’s more stringent “strong likelihood” test, see Majority Part II.A, there is no reason to think that the trial judge committed that same mistake five years earlier.2 Nor can Crittenden summon clear and convincing evidence that the trial court erred in assessing whether there was a prima facie case of purposeful discrimination based on the evidence before the state court. The prima facie determination is a factual inquiry that is “peculiarly within a trial judge’s province,” Ayala, 135 S.Ct. at 2201 (quoting Snyder, 552 U.S. at 477, 128 S.Ct. 1203), because the trial judge plays a pivotal role supervising voir dire and is “best situated to evaluate both the words and the demeanor of jurors who are peremptorily challenged, as well as the credibility of the prosecutor who exercised those *1022strikes,” id. See also Tolbert v. Page, 182 F.3d 677, 683 (9th Cir.1999) (en banc) (noting that, at Batson step one, “the trial judge’s unique perspective of voir dire enables the judge to have first-hand knowledge and observation of critical events” and to “personally witness[ ] the totality of circumstances that comprises the ‘factual inquiry’ ” at issue, making heavy deference appropriate).
In his Wheeler motion, Crittenden’s counsel made two primary points. He noted that the same prosecutor faced an unsuccessful Wheeler challenge in a previous case. Additionally, Casey, as an African-American, was “a member of a cognizable racial group” and was in fact the “only member of the identifiable group” among the voir dire panelists. Neither contention satisfied the requirements for a prima facie showing.
The prosecutor’s earlier Wheeler challenge was “weak” evidence because, as we explained in Crittenden’s first appeal, it was “one isolated incident in which the trial court denied the Batson objection,” and “it did not add significantly to his prima facie case.” Crittenden I, 624 F.3d at 957 n. 4. Nor does “the fact that the juror was the one Black member of the venire,” in and of itself, “raise an inference of discrimination.” Terhune, 202 F.3d at 1198 (quoting United States v. Vasquez-Lopez, 22 F.3d 900, 902 (9th Cir.1994)). “More is required.” Id.
Benchmarked against the defense counsel’s proffer at the prima facie stage, the trial judge gave specific reasons, based on his firsthand observations, for finding no inference of discrimination. Even before the prosecutor and defense counsel began jury selection, Crittenden’s counsel alerted the trial judge that he planned to make a Wheeler motion — and already had prepared a written motion to that effect — if the prosecutor struck Casey, the sole black member of the venire. The judge therefore was acutely attuned to the issue of discrimination and took notes on Casey’s demeanor and voir dire answers. The judge’s notes and impressions “revealed that at the very time that we questioned Ms. Casey, my exact quotation is: ‘This is a case where a Wheeler motion would be inappropriate, because of the fact that she is indecisive and cannot guarantee that she would vote in a certain way.’ ... She couldn’t decide whether or not she would be able to follow the law.”
Context is key. Before striking Casey, the prosecutor used peremptory strikes against 14 white jurors — consistently targeting those who expressed doubt about the death penalty. To cite a few examples, the prosecutor used his first peremptory against juror Smith, who stated, “I do not believe in [the death penalty] as a general rule — there are exceptions.” The prosecutor used his fourth peremptory against juror Gilbert, who described himself as an “extremely liberal person” and said he “would have a difficult time voting for the death penalty.” The sixth strike removed juror Pisarek, who generally opposed the death penalty but recognized it was the law and, unlike Casey, was unequivocal that she could vote for it. The prosecutor’s tenth strike went against juror Works, who believed that “all life is precious” but added that she wouldn’t conscientiously object to voting for the death penalty. The prosecutor struck juror Henley, whom he labeled as “Borderline DP weak” despite Henley’s bland statement on his juror questionnaire that “[t]here are times and circumstances when I have considered [the death penalty] appropriate.”
The strike of Casey hardly stands out. Casey opposed the death penalty, and the death penalty was the overriding focus of Crittenden’s capital trial. On her juror *1023questionnaire, Casey wrote: “I don’t like to see anyone put to death.” During her voir dire question-and-answer session, Casey continued to express hesitancy about capital punishment. “I am against death — being put to death,” she said at one point. “And I am against people killing people.” Given the prosecutor’s pattern of peremptory strikes and Casey’s death penalty views, the trial judge understandably cited “abundant [] reasons” why he expected and accepted a peremptory challenge against her.
The prior panel compared Casey to two white jurors — Clark and Krueger — who ultimately served on Crittenden’s jury. Crittenden I, 624 F.3d at 957. However, that prior decision was issued before Cullen v. Pinholster, 563 U.S. 170, 131 S.Ct. 1388, 179 L.Ed.2d 557 (2011). Now, “[w]hen examining a petitioner’s habeas claim through the AEDPA lens, we ‘focus [] on what a state court knew and did,’ ” and “thus consider ‘how the [state court] decision confronts [the] set of facts that were before [it],’ rather than how it should have confronted a new set of facts presented for the first time in federal court.” Jamerson v. Runnels, 713 F.3d 1218, 1226 (9th Cir.2013) (last four alternations in original) (quoting Pinholster, 131 S.Ct. at 1399). The two white jurors entered the jury box after the prosecutor struck Casey and the trial judge denied Crittenden’s Wheeler motion. Hence, when the trial judge denied the prima facie case, he could not have divined that Clark and Krueger later would be permitted to serve on the jury. Nor did Crittenden’s counsel renew his Wheeler motion at any subsequent point. A post-hoc, comparative analysis in these circumstances has no place in evaluating the trial court’s finding of fact at the prima facie stage. Even if the juror analysis is appropriate, the comparison hardly provides clear and convincing evidence that the trial judge got it wrong, because both subsequently seated white jurors are readily distinguishable from Casey. See Section II.B.
In repudiating the trial court’s prima facie finding, the majority mistakenly relies on evidence produced at the 2002 federal evidentiary hearing — namely, the prosecutor’s notations rating jurors in the margins of their questionnaire sheets. No state court was ever privy to this evidence. As we recently explained, “after Pinholster, a federal habeas court may consider new evidence only on de novo review, subject to the limitations of § 2254(e)(2).” Murray v. Schriro, 745 F.3d 984, 1000 (9th Cir.2014). As we explained in Murray, Pinholster cabins our review under § 2254(e)(1), because it “eliminated the relevance of ‘extrinsic’ challenges when we are reviewing state-court decisions under AEDPA.” Id. at 999.
Of course, where the conditions for de novo review are satisfied — i.e., when the factual finding is rebutted under § 2254(e)(1) — Pinholster may allow for new evidence adduced during federal habeas proceedings. But first things first: Because our review under step one is constrained by AEDPA deference and Crittenden has not effectively rebutted the trial court’s initial factual finding, we are not in de novo review mode at this stage. This conclusion follows from a faithful reading of Murray. I acknowledge that the post-Murray cases cited by the majority may be in tension with Murray, given that they appear to support new fact-finding simply on the basis that the California Supreme Court alone rendered a decision contrary to clearly established law under § 2254(d)(1). See Hurles v. Ryan, 752 F.3d 768, 778 (9th Cir.2014); Johnson v. Finn, 665 F.3d 1063, 1069 n. 1 (9th Cir.2011). That error cannot be imputed to the state trial court, however. Under Murray, the situation here is clear: the *1024state trial court did not err in its factual finding that Crittenden failed to carry his burden, and therefore our review is cabined by the evidence before the trial court. In any event, the majority lacks authority to overrule Murray and cannot escape its holding simply by dismissing it as an earlier case; indeed, it was decided in 2014, the same year as Hurles. See Rodriguez v. AT & T Mobility Servs. LLC, 728 F.3d 975, 979 (9th Cir.2013). Crediting the state trial court’s factual finding, I would deny Crittenden’s habeas petition at step one of the Batson analysis.
II. Batson Step Three: Failure to Establish the Prosecutor’s Purposeful Discrimination
Even if it were appropriate to reach the ultimate Batson step three question of purposeful discrimination, I would still deny the petition because Crittenden has not shown that the prosecutor harbored substantial racist intent.3 Decades after the voir dire, we are like archaeologists without a framework trying to piece together forgotten motives from small shards of imperfect and inconclusive evidence. The record does not establish that the prosecutor was “motivated in substantial part by discriminatory intent.” Cook v. LaMarque, 593 F.3d 810, 815 (9th Cir.2010) (quoting Snyder, 552 U.S. at 485, 128 S.Ct. 1203).
A. Batson Standard of Review
The majority starts off on the wrong foot in its phase three Batson analysis, categorically deferring to the district court under a clear-error standard. The appropriate standard of review, given a context where we share the district court’s task of reviewing a cold record, should be de novo review.4
We routinely recite Rule 52 of the Federal Rules of Civil Procedure and the “clear error” standard without putting the rule in context. Notably, in Batson itself, the Court stated that a “reviewing court ordinarily should give [factual findings] great deference,” but only because “the trial judge’s findings in the context under consideration here largely will turn on evaluation of credibility.” 476 U.S. at 98 n. 21, 106 S.Ct. 1712 (emphasis added). Our cases applying Batson likewise continue to emphasize that deference is due particularly where the facts go to the demeanor and credibility of the prosecutor. Cook, 593 F.3d at 815-16 (quoting Hernandez v. New York, 500 U.S. 352, 365, 111 S.Ct. 1859, 114 L.Ed.2d 395 (1991)).
What gets lost in this case is the layer-upon-layer review at issue. Because there was no step three analysis in the state courts, both we and the district court review the habeas petition de novo. To the extent there were true factual findings at the district court level, I agree that we should evaluate those findings under a “clear error” analysis. However, the reality is that aside from a handful of non-determinative factual findings made by the *1025magistrate judge, which the district court neither relied on nor contested, the district court was simply reviewing a cold record of documentary evidence.
In short, our task is identical to that of the district court: applying the familiar tools of comparative juror analysis to a fixed record. In the unusual context of this case where nothing hinges on testimony from the evidentiary hearing, our review should be de novo. See Holder v. Welborn, 60 F.3d 383, 388 (7th Cir.1995) (holding that no deference is owed to a district court deciding Batson habeas case on a cold record). The majority and I simply disagree on the standard of review.
The only factual findings and credibility considerations were made by the magistrate judge at the 2002 evidentiary hearing. At that hearing, the prosecutor testified that he “did not remember anything of significance to the exercise of his peremptory challenge” against Casey, which had occurred some 13 years earlier (emphasis in original). The magistrate judge found that the prosecutor was “forthright in his factual testimony” about his lack of an independent recollection of the events of Crittenden’s voir dire. Otherwise, the prosecutor testified about administrative matters, such as his handwriting and markings on juror questionnaires. The prosecutor also spoke, in general terms, about his methodology for ranking jurors, though he couldn’t recall why he ranked any particular juror positively or negatively. The magistrate judge credited the prosecutor’s testimony as to those matters and ultimately recommended denying Crittenden’s habeas petition after analyzing the questionnaires and voir dire transcript, concluding that the prosecutor harbored “significant” but not “substantial” bias in striking Casey.
The district court did not disturb any of the magistrate’s uncontroversial specific findings, which even if credited do not dictate whether the prosecutor’s peremptory strike against Casey was legitimate. At best, the absence of specific evidence about the prosecutor’s methodology simply means that Crittenden lacks evidence of animus. The district court rejected the magistrate’s ultimate recommendation to deny Crittenden’s habeas petition without holding a new evidentiary hearing, precisely because the live testimony and underlying factual findings from the 2002 hearing do not alter the outcome of the case. See Majority Op. Part III. The district court’s analysis was based entirely on a retrospective review of the records from voir dire.
The habeas standard of review vis-a-vis Batson depends on which court’s findings and determinations are under review. Of course in Batson itself, the Supreme Court emphasized the importance of giving deference to the trial court and reversing only in the case of clearly erroneous findings. 476 U.S. at 98 n. 21, 106 S.Ct. 1712. As we explained, “the trial judge’s unique perspective of voir dire enables the judge to have first-hand knowledge and observation of critical events” and to “personally witness[ ] the totality of circumstances that comprises the ‘factual inquiry’ ” under Batson, making deference appropriate. Tolbert, 182 F.3d at 683. “An appellate court can read a transcript of the voir dire, but it is not privy to the unspoken atmosphere of the trial court — the nuance, demeanor, body language, expression and gestures of the various players.” Id. at 683-84. None of those rationales for deference apply here, where the federal district court played no role in voir dire, had no occasion to soak in the “unspoken atmosphere of the trial court,” id., and never took stock of the demeanor and body language of the prosecutor and jurors.
Nor is this a case in which the district court reconstructed the Batson hearing *1026and, following testimony, made credibility determinations that affect the disposition of the Batson step three inquiry. The majority states that “the magistrate judge did not make — and hence the district court did not reject — any credibility determination.” Maj. Op. 1011. This view is not precisely accurate as the magistrate judge did credit the prosecutor’s testimony — it was just that the prosecutor’s testimony didn’t have any substance. Compare with Harris v. Haeberlin, 752 F.3d 1054, 1061 (6th Cir.2014) (deferring to federal district court where it made credibility determinations based on newly discovered videotape evidence of voir dire and the prosecutor’s live testimony, so that the case turned on “in-person credibility assessments which clearly the district court is in the best position to make”) (internal citation omitted); Jordan v. Lefevre, 293 F.3d 587, 594 (2d Cir.2002) (holding that, once a district court reconstructs a Batson hearing in federal habeas proceedings, “we will accord deference to the reconstructing court’s credibility assessments”).
As a point of stark comparison, a recent Eleventh Circuit case involving a reconstructed Batson hearing is instructive. Madison v. Comm’r, Ala. Dep’t of Corr., 761 F.3d 1240 (11th Cir.2014). Significantly, in deferring to the district court, the Eleventh Circuit noted that the district court judge did more than consider “the prosecutor’s trial notes and the testimony authenticating it.” Id. at 1247. The court explained:
The District Court heard the live testimony of Mr. Madison’s trial prosecutor and had the opportunity to observe his demeanor when he offered his explanations for striking the jurors he did. While [the prosecutor] Mr. Cherry relied on his notes to provide his reasons for striking individual jurors, he never testified that he had no recollection of the decisions he made during Mr. Madison’s voir dire. In fact, Mr. Cherry was able to answer several questions about his strategy in picking the jury, his awareness of the Mobile County District Attorney Office’s history of Batson violations, and his experience as a defense attorney. His testimony about these things went beyond the four corners of his voir dire notes.
Id. at 1247-48. Because the prosecutor provided substantive testimony and was cross examined by defense counsel, “the District Court was in a superior position to assess [the prosecutor’s] credibility and the genuineness of his explanations for striking black jurors at Batson’s third step.” Id. at 1248.
Unlike in Madison, the prosecutor here testified that he had no recollection of Casey and provided no explanation for striking her that “went beyond the four corners of his voir dire notes.” Id. The facts are fixed in a cold record, so our Batson step three analysis involves nothing more than a run-of-the-mill review of the voir dire records and comparative juror analysis. This case is closely akin to Welborn, where the Seventh Circuit concluded that de novo review of the federal district court’s Batson step three determination was appropriate:
Although the magistrate judge was able to hear the explanations given by the prosecutors at the Batson hearing, she was not in the same position to make credibility determinations as is a trial judge who has the opportunity to observe the responses from the venire and to hear the attorney’s explanation for a peremptory immediately after it is exercised. In fact, the prosecutors admitted that at the time of the Batson hearing, they had little, if any, recollection of the actual voir dire, and found it necessary to testify with aid from the voir dire *1027transcript and from their contemporaneously taken notes. Therefore since [the magistrate judge, the district judge], and the members of this panel all have basically been provided with only a cold record from which to determine if a Batson violation occurred at Holder’s jury trial, we find that no deference is warranted under these circumstances.
60 F.3d at 388. Likewise, I conclude that no deference to the federal district court is warranted: the prosecutor had no recollection of why he struck Casey and the magistrate judge, the district court, and the Ninth Circuit are all working from the same decades-old records from voir dire in rendering the ultimate Batson step three determination.
B. Batson Step Three: Purposeful Discrimination
The remaining question is whether, in striking Casey, the prosecutor had a discriminatory purpose. “ ‘Discriminatory purpose’ ... implies more than intent as volition or intent as awareness of consequences. It implies that the decisionmaker ... selected ... a particular course of action at least in part ‘because of,’ not merely ‘in spite of,’ its adverse effects upon an identifiable group.” Hernandez v. New York, 500 U.S. 352, 360, 111 S.Ct. 1859, 114 L.Ed.2d 395 (1991) (plurality) (quoting Personnel Adm’r of Mass. v. Feeney, 442 U.S. 256, 279, 99 S.Ct. 2282, 60 L.Ed.2d 870 (1979)). The touchstone, as described in our caselaw, is whether race was a “substantial motivating factor” in the prosecutor’s decision to strike Casey. Cook, 593 F.3d at 815.
Gleaning the secret truth of the prosecutor’s state of mind is rarely simple, especially years or decades after the trial has drawn to a close. Our assignment is doubly difficult because we’re missing the key piece of evidence — the prosecutor’s explanation for striking Casey. That testimony is often the focal point of the step three analysis. However, the prosecutor should hardly be penalized for his honesty. He merely declined to manufacture a convenient reason post hoc.
I don’t begrudge the majority its careful comparative juror analysis. A lot of ink has been spilled in these habeas proceedings now going on 16 years.5 That so many diligent jurists have reached differing and conflicting conclusions underscores that the prosecutor’s notes, while slightly illuminating, are ultimately inconclusive. In my view, the prosecutor’s XXXX rating of Casey cannot bear the weight ascribed to it by the majority, nor can a rehashing of the voir dire transcript trump the trial court’s factual finding on Casey’s demeanor. In proving purposeful discrimination, the “burden of persuasion rests with, and never shifts from,” the defendant — Crittenden. Johnson v. California, 545 U.S. 162, 171, 125 S.Ct. 2410, 162 L.Ed.2d 129 *1028(2005) (quoting Purkett v. Elem, 514 U.S. 765, 768, 115 S.Ct. 1769, 131 L.Ed.2d 834 (1995)). Whether the standard of review of the district court’s phase three determination is clear error or de novo, Crittenden has failed to meet his burden.
To begin, the prosecutor testified that Xs meant a venire member was “opposed to the death penalty and strongly stated it ... Checkmarks were people who either were for the death penalty or medium ground that I thought to some degree I would be able to tolerate having on the jury.” The more Xs the juror received, the less favorably the prosecutor viewed that juror; the more checkmarks, vice versa. The prosecutor made the ratings after reviewing the written juror questionnaires and listening to the voir dire answers of each member of the venire.
By the time Casey entered the jury box, the prosecutor already had used peremptories against seven of the nine venire members to whom he gave a negative rating of at least one X. The prosecutor did not strike Casey at the first opportunity upon her draw to the jury box; instead, he removed a juror with a “/?” rating. He then struck Casey. When the prosecutor did so, the jury box included the following venire members, as rated by the prosecutor:
• Corrao — ///
• Casey — XXXX
• Fisher — /
• Rehm — /
• Tennies — //
• Naess — /
• Bertrando — XXX
• Shalley — /
• McMahan — ///
• Stewart — (no rating but listed as one of the “good jurors” by the prosecutor; she was later excused for hardship).
• Fortier — ///
• Curtis — ///
Facing a potential jury in which a majority held neutral or favorable views toward the death penalty, the prosecutor did what anyone would expect: he struck Casey and then Bertrando, who stated on his juror questionnaire that “killing people isn’t right no matter who is doing it” and that life imprisonment actually is a “worse punishment.” Having removed every juror with at least an X rating, the prosecutor used his remaining peremptories against those with a / or //? rating. The jury ultimately was comprised exclusively of jurors with at least a // rating; all but two scored even higher.
As I described earlier, see Section I, the prosecutor used his first 14 peremptories against white jurors, many of whom expressed less doubt about the death penalty than did Casey. The upshot is that, by the time Casey was seated in the jury box, the prosecutor already had removed most of the jurors he considered unfavorable to the case for capital punishment — even those whose death penalty views bent more toward ambivalence than opposition. Leaping to racism as the substantial explanation for the strike against Casey ignores the obvious, because Casey and Bertrando fell right in line with the prosecutor’s pattern of previous strikes. Overall, the prosecutor used 21 of his 26 peremptories against venire members who opposed the death penalty in some fashion. As the magistrate judge noted, Casey “would have been stricken regardless of her race.”
Nor is Casey the only juror who received the XXXX rating. The prosecutor also gave Smith, who was white, the same rating, so race is hardly the only reason for the fourth X. The majority says that Smith was more deserving of the XXXX rating because she “arguably expressed stronger opposition to the death penalty *1029than did Casey.” But it takes no leap of logic to conclude the opposite, as the magistrate judge did: “Arguably, this four ‘X’ juror was more disposed to render a death verdict than Mrs. Casey.” As one example, the prosecutor asked Smith whether her views would impair her ability to fairly and impartially consider the evidence in favor of the death penalty. Smith’s response: “I don’t think so. I think that there are circumstances where I would be able to agree with the death penalty.” When the prosecutor asked Casey the nearly word-for-word identical question, her answer came with a heavy dose of hesitation: “I can’t say yes. I can’t say no. I really don’t — don’t know.”
Smith had a negative run-in with law enforcement, when her husband apparently was falsely identified by a witness to a crime, and described the prospect of jury service as “horrifying” and “frightening.” But in a similar vein, Casey found the idea of capital jury service “scary” and was “very upset about it.” Although Smith used stronger adjectives, both potential jurors exhibited a demeanor poorly suited to sentencing someone to death, setting them apart from others in the jury pool who were ideologically opposed to capital punishment.
The majority emphasizes that Casey’s substantive death penalty answers alone did not warrant the XXXX rating, but this view obscures what the trial court said about Casey — that “she is indecisive” and “couldn’t decide whether or not she would be able to follow the law.” Unlike the array of appellate and federal judges to weigh in over the 26 years since Crittenden’s conviction, the trial judge was there. He supervised voir dire, personally questioned Casey, and took notes on her answers and demeanor. We shouldn’t lightly disregard his impressions. See, e.g., § 2254(e)(1); Cook, 593 F.3d at 816 (“[W]e must defer to the trial judge’s findings regarding the demeanor of the individuals in the courtroom.”).
The voir dire transcript confirms Casey’s apparent angst and anguish. Asked about the prospect of serving on the jury, she replied: “Not good,” and explained: “It is scary.” Later, when asked whether she could be open and objective about whether to impose the death penalty, Casey equivocated: “I can’t say fully. I would try.” She continued, “I can’t sit here and really say for sure if I could ... I have to say I think I could. This is all new to me. So I am very upset with it.” She agreed with the prosecutor that she’d have difficulty reaching a decision on the death penalty. Her testimony cannot be characterized as coming to a concrete, definitive willingness to vote for the death penalty. The prosecutor came away with the same impression — writing “[c]an’t say if would set aside” on Casey’s juror sheet.
The trial judge’s factual finding that Casey was indecisive separates her from juror Clark, who described herself as a “pretty decisive person” who makes big decisions without guilt or self-doubt. Significantly, Clark also articulated a distinctly law-and-order outlook. In prior jury service, Clark had voted to convict a criminal defendant of drunk driving and said she was “really disturbed” by a holdout juror who wanted to acquit because that juror “hated cops. It was very disturbing to me.” When defense counsel asked Clark whether she “believed in law enforcement,” she responded, “I certainly do.” She later added: “I feel very strongly that people should be punished for what they do. I feel very strongly about the law.” Not surprisingly, the prosecutor’s notations say that, aside from her death penalty views, Clark was an “[o]therwise strong” prosecution juror. By contrast, Casey had never served on a jury, made no similar pro-police statements, and can fairly be described as tentative in her answers.
The majority’s focus on potential jurors Sullivan and Tennies is also misplaced. Maj. Op. 1015-17. The prosecutor used peremptories on both of them after having struck Casey. Comparative juror analysis is supposed to comprise “side-by-side comparisons of some black venire panelists who were struck and white panelists allowed to serve.” Miller-El v. Dretke, 545 U.S. 231, 241, 125 S.Ct. 2317, 162 L.Ed.2d 196 (2005). Sullivan and Tennies were not “allowed to serve,” so any comparison between them and Casey is not illuminating. In other words, that the prosecutor struck Casey, Sullivan, and Tennies, all of whom expressed varying degrees of hesitancy about the death penalty, does nothing to prove racial bias.6
With the evidence stacked against the proposition that race was the real reason for striking Casey, the district court concluded that Crittenden has met his burden under Batson by showing that the prosecutor was motivated at minimum by unconscious bias. Although I am very sympathetic to the notion of unconscious bias (stealth bias is destructive and real, even though it is often difficult to document), it is not an easy fit within the Batson framework, which focuses on the purpose of the prosecutor rather than the subconscious social and cultural factors that influence decisionmaking.7 The Supreme Court has never endorsed the view that unconscious bias can form the basis for a Batson challenge.8 The only circuit court to address the issue held that “evidence of ‘subconscious’ discrimination is not relevant” to purposeful discrimination under Batson. United States v. Roebke, 333 F.3d 911, 913 (8th Cir.2003).
The majority puts a wishful spin on the district court’s decision. Maj. Op. 1019. To recap: the district court held that the prosecutor “was motivated, consciously or unconsciously, in substantial part by race” and granted Crittenden’s habeas petition. Upon the government’s motion for a stay pending appeal, the district court left its earlier decision intact and added some interpretive gloss that it meant to “leave[] no doubt that it concluded [the prosecutor’s] strike of Casey was purposeful discrimination.” However, the district court went on to reiterate that it couldn’t say “why [the prosecutor] was motivated by race,” i.e., whether “by conscious or unconscious racism,” so the court hardly disavowed its unconscious racism theory. In any event, the district court did not retract or amend its order granting the writ, which is the order under review on appeal.
In sum, Crittenden has not shown that the prosecutor’s strike was motivated by purposeful discrimination. The record simply does not support the conclusion that the references to Casey’s demeanor and death penalty views were pretext for racial bias. In a case such as this, we should be especially wary of overreading isolated snippets of a voluminous voir dire transcript. As the Supreme Court recently reminded us, in capital cases jurors often will express varying degrees of hesitancy about imposing the death penalty. Ayala, 135 S.Ct. at 2201. Both prosecution and defense must make “fine judgment calls about which jurors are more or less willing to vote for the ultimate punishment. These judgment calls may involve a comparison of responses that differ in only nuanced respects, as well as a sensitive assessment of jurors’ demeanor.” Id. Prosecutors must act on instinct; they don’t have the hindsight-laden benefit of a leisurely review of a complete transcript. The prosecutor’s actions here fit well within that band of discretion, so far as the cold record reveals.
This case calls to mind Justice Breyer’s observation that the Batson inquiry can be an “awkward, sometime hopeless, task of second-guessing a prosecutor’s instinctive judgment — the underlying basis for which may be invisible even to the prosecutor exercising the challenge.” Miller-El, 545 U.S. at 267-68, 125 S.Ct. 2317 (Breyer, J., concurring). In view of the record of what actually happened, the trial judge’s findings, and the ultimate composition of the jury, our retrospective parsing simply cannot elevate an ambiguous, speculative foundation to proof that the prosecutor was motivated in substantial part by racism.
I respectfully dissent.
. In denying Crittenden's prima facie case, in February 1989, the trial judge did not detail the standard he was applying. Before 1994, we presume that California state courts applied the correct Batson standard. Terhune, 202 F.3d at 1196-97. Even absent the Batson-specific presumption, the Supreme Court repeatedly has "instruct[ed] us to give state courts the benefit of the doubt when the basis for their holdings is unclear.” James v. Ryan, 733 F.3d 911, 916 (9th Cir.2013). Significantly, we owe deference to the state trial court notwithstanding the California Supreme Court’s subsequent legal error. See Rever v. Acevedo, 590 F.3d 533, 537 (7th Cir.2010).
. Under Batson step three, I agree with the majority that we could properly consider the new evidence from the 2002 federal evidentiary hearing. See Johnson v. Finn, 665 F.3d 1063, 1069 n. 1 (9th Cir.2011). Our review on this issue is de novo and the evidentiary limitations in § 2254(e)(2) do not apply because Crittenden cannot be faulted for a "lack of diligence” in the state courts. Williams v. Taylor, 529 U.S. 420, 432, 120 S.Ct. 1479, 146 L.Ed.2d 435 (2000).
. Even under a clear-error standard, I would reverse the district court. The weak evidence of racial motivation, the state court’s factual finding on Casey’s demeanor and decisiveness, and the district court's reliance on a theory of unconscious bias counsel denial of the petition.
. To recap the tortured procedural history of this case: In January 1999, the magistrate judge issued a Finding and Recommendation ("F & R") stating that the California courts did not unreasonably deny Crittenden's prima facie Batson challenge. In May 2002, the district court rejected the F & R and ordered an evidentiary hearing, which was held in December 2002. The magistrate judge concluded that "race played some part in the prosecutor’s evaluation of Ms. Casey” but that race was “not the real reason or effective reason for her being stuck from the jury.” The district court agreed and denied Crittenden's habeas petition, but the Ninth Circuit reversed in Crittenden I. 624 F.3d at 959-60. On remand, the magistrate’s third F & R concluded that although the prosecutor’s racial motivation in striking Casey was "significant,” it was "not substantial,” and again recommended denial of the Batson claim. The district court rejected that conclusion and found that the "prosecutor was motivated, consciously or unconsciously, in substantial part by race” and therefore granted Crittenden’s petition. This appeal followed.
. The majority cites Sullivan and Tennies as proof that anti-death penalty views were not determinative, because they both expressed some hesitation about the death penalty yet received // ratings. This overreads the significance of the rating notations, elevating them to scientific certainty and excluding evaluation of the jurors' other characteristics. The four-month voir dire featured extensive questioning on the death penalty from the judge, prosecutor, and defense counsel. The prosecutor used 21 of his 26 peremptories against jurors who opposed the death penalty, including Sullivan and Tennies. The magistrate noted that a review of the "entire voir dire transcript” shows that "for the most part” the proceedings "focused on the death penalty....” Just because the prosecutor didn’t view all jurors who expressed anti-death penalty sentiments as exactly the same hardly shows that the entire enterprise was a sham.
. To be sure, Batson's requirement of purposeful discrimination does not lack for critics. Recently, for example, the Washington Supreme Court bluntly declared that "Batson is ... failing us,” because modern-day racism isn’t overt but is embodied in "stereotypes that are ingrained and often unconscious.” State v. Saintcalle, 178 Wash.2d 34, 309 P.3d 326, 334-36 (2013) (en banc). "Unconscious stereotyping upends the Batson framework,” which is "only equipped to root out ‘purposeful’ discrimination, which many trial courts probably understand to mean conscious discrimination.” Id. at 336.
. Two Supreme Court justices have referenced unconscious or subconscious bias in the Batson context. In Batson itself, Justice Marshall concurred to warn that "trial courts are ill equipped to second-guess" facially neutral reasons offered by prosecutors, who may not be conscious of their own bias. 476 U.S. at 105, 106, 106 S.Ct. 1712 (Marshall, J., concurring). In Miller-El, Justice Breyer echoed Justice Marshall’s views and cited evidence that, despite Batson, widespread racial discrimination in jury selection has persisted. 545 U.S. at 267-68, 125 S.Ct. 2317 (Breyer, J., concurring). Both concurrences pointed out shortcomings with the Batson framework and advocated eliminating peremptories altogether; neither is a binding pronouncement of Batson law.