United States v. Aguila-Montes De Oca

BERZON, Circuit Judge, concurring in the judgment, joined by Chief Judge KOZINSKI and Judges W. FLETCHER, M. SMITH, and N.R. SMITH:

It is common ground — as it of course has to be — that Taylor v. United States, the “grandfather” Supreme Court case on the question of applying federal recidivism statutes to particular prior convictions, instructs sentencing courts assessing a criminal defendant’s prior conviction to employ a “categorical approach.” 495 U.S. 575, 600, 110 S.Ct. 2143, 109 L.Ed.2d 607 (1990). Under that approach, we are directed to consider the elements of the crime of conviction in general, not the conduct underlying the defendant’s conviction in particular. Id. at 602, 110 S.Ct. 2143. The problem we address today arises when those elements and the requirements of the federal recidivist statute in question do not match.

As to that question, Taylor tells us that the categorical approach may be modified, but only “in a narrow range of cases where [the trier of fact] was actually required to find all the elements of [the] generic [crime].” Id. at 602, 110 S.Ct. 2143 (emphases added). This “modified categorical approach” must remain categorical, not factual or “circumstance-specific.” Nijhawan v. Holder, — U.S.-, 129 S.Ct. 2294, 2298, 174 L.Ed.2d 22 (2009); see Taylor, 495 U.S. at 600-02, 110 S.Ct. 2143. Were we to abandon the categorical focus, we have been warned repeatedly, a number of practical and constitutional difficulties would ensue. See Chambers v. United States, 555 U.S. 122, 129 S.Ct. 687, 690, 172 L.Ed.2d 484 (2009); James v. United States, 550 U.S. 192, 214, 127 S.Ct. 1586, 167 L.Ed.2d 532 (2007); Shepard v. United States, 544 U.S. 13, 25, 125 S.Ct. 1254, 161 L.Ed.2d 205 (2005) (plurality op.); Shepard, 544 U.S. at 28, 125 S.Ct. 1254 (Thomas, J., concurring); Taylor, 495 U.S. at 600-01, 110 S.Ct. 2143.

The majority finds in these instructions license for sentencing courts to “consider[ ] to some degree the factual basis” of a (possibly decades-old) prior conviction. Bybee op. at 935. So long as the sentencing court is “confident,” upon examining the “prosecutorial theory of the case” and “the facts put forward by the government” in the earlier proceeding, that the trier of fact was “required” (in a practical, but not legal, sense) to find facts that would satisfy the generic crime, then it may enhance a defendant’s sentence on that basis. Id. at 936-37. And the sentencing court must do so not only when there was a trial, but also where there was a guilty plea, and thus no way to determine what “theory of the case” the nonexistent trier of fact must have adopted. Most crucially, the sentencing court need no longer confine itself to the facts related to the elements of the crime of conviction, even though the prior proceeding, whether ended with a jury verdict or a guilty plea, will have been concerned at bottom only with assessing those elements, and even though elements have long been the touchstone of the categorical and modified categorical approach.1 See Sykes v. United States, — U.S.-, 131 S.Ct. 2267, 2272, 180 L.Ed.2d 60 (2011); Johnson v. United States, — U.S. -, 130 S.Ct. 1265, 1272, 176 L.Ed.2d 1 (2010); Chambers, 129 S.Ct. at 690-91; Nijhawan, 129 S.Ct. at 2297-98; Begay v. United States, 553 U.S. 137, 145, 128 S.Ct. 1581, 170 L.Ed.2d 490 (2008); Gonzales v. Duenas-Alvarez, 549 U.S. 183, 186-87, 127 S.Ct. 815, 166 L.Ed.2d 683 (2007); Shepard, 544 U.S. at 19, 125 S.Ct. 1254; Leocal v. Ashcroft, 543 U.S. 1, 7, 125 S.Ct. 377, 160 L.Ed.2d 271 (2004); Taylor, 495 U.S. at 600-02, 110 S.Ct. 2143. In short, the majority has converted the modified categorical approach into a modified factual one.

The majority’s fact-based approach simply cannot be reconciled with Taylor and its many Supreme Court progeny. Taylor warned that “the practical difficulties and potential unfairness of a factual approach are daunting,” and therefore rejected a factual approach, even though “[i]n some cases, the indictment or other charging paper might reveal the theory or theories of the case presented to the jury.” Id. at 601, 110 S.Ct. 2143. And the Court has recognized that a fact-based approach has even less traction in the guilty plea context. See id. at 601-02, 110 S.Ct. 2143.

In adopting its fact-based approach, the majority overrules our circuit’s controlling precedent; dismisses as “dicta” and “illustrative” the Supreme Court’s clear guidance on this very question, Bybee op. at 927-28, 931-32; misinterprets Taylor and Shepard; ignores the constraints of the Sixth Amendment, as developed in the Apprendi line of cases; misapprehends several essential characteristics of our nation’s institutions of criminal justice; and refuses to follow the limited modified categorical approach adopted by every circuit that has addressed the question since the Supreme Court made the proper approach lucidly clear in the last few years — in particular, since Johnson and Nijhawan. Because I believe that the modified categorical approach has been strictly limited to “divisible statutes,”2 I concur only in the judgment.

I.

Before delving into the “modified categorical” problem on a conceptual level, I begin where intermediate appellate judges ought to begin — with whether the issue before us is open to fair dispute as a matter of binding Supreme Court precedent. Unlike the majority, I conclude that it definitely is not, as virtually all other circuits have recently recognized.

As will appear, I do not think Taylor and Shepard ever meant the modified categorical approach to go beyond what the majority calls “divisible statutes.” Taylor and Shepard are examined in detail below. Suffice it to say for present purposes that Taylor held that when determining whether a defendant’s prior conviction qualifies under one of several federal recidivist statutes, sentencing courts are ordinarily instructed to compare the elements of the particular crime for which the defendant was convicted with the elements of the “generic” federal definition of that crime. See Taylor, 495 U.S. at 600-02, 110 S.Ct. 2143. When the elements match, the conviction qualifies for the recidivist enhancement. If, however, at least one of the elements of the crime of conviction is written in the disjunctive — criminalizing, for example, commission of an act with “a gun or a knife” — and a conviction under one statutory phrase (gun) would qualify under the federal recidivist statute, while a conviction under the other phrase (knife) would not, Taylor and Shepard allow the later sentencing court to consult a short list of records about that prior conviction to ascertain whether the crime of conviction meets the federal definition. See id. at 602, 110 S.Ct. 2143; Shepard, 544 U.S. at 26, 125 S.Ct. 1254.

As I discuss later, the majority is right that the modified categorical approach outlined in Taylor and Shepard was, at times, interpreted differently in this court and in other circuits. See Bybee op. at 922-28, 931-32. But Taylor and Shepard are no longer the last word. It is therefore most useful to start with the Supreme Court’s recent cases, Nijhawan v. Holder, — U.S.-, 129 S.Ct. 2294, 174 L.Ed.2d 22 (2009), and Johnson v. United States, — U.S.-, 130 S.Ct. 1265, 176 L.Ed.2d 1 (2010), as they dispel any ambiguity concerning the reach of Taylor and Shepard. Nijhawan and Johnson indicated in the clearest of terms that the modified categorical approach is reserved for determining under which portion of a divisible statute a defendant was convicted.

A.

Nijhawan considered a provision of immigration law that authorizes removal of non-citizens who have a prior conviction for “an offense that ... involves fraud or deceit in which the loss to the victim or victims exceeds $10,000.” 129 S.Ct. at 2297 (alteration and emphasis in original) (quoting 8 U.S.C. § 1101(a)(43)(M)(i)). The Court was called upon to consider “whether the italicized language refers to an element of the fraud or deceit ‘offense’ as set forth in the particular fraud or deceit statute defining the offense of which the alien was previously convicted.” Id. “If so,” Nijhawan explained, “then in order to determine whether a prior conviction is for the kind of offense described, the immigration judge must look to the criminal fraud or deceit statute to see whether it contains a monetary threshold of $10,000 or more.” Id. (citing Taylor, 495 U.S. 575, 110 S.Ct. 2143) (emphasis added). After extensive textual analysis, see id. at 2298-2302, Nijhawan concluded that because the italicized language in the aggravated felony definition “does not refer to an element of the fraud or deceit crime,” it required a factual inquiry into the amount of loss actually occasioned by the alien’s particular prior criminal conduct, rather than a categorical inquiry into the elements of the prior conviction offense. Id. at 2298-99 (emphasis added).

The Court’s reasoning in Nijhawan could not be clearer: if the loss threshold referred to an element of the generic crime, then a crime of conviction would only qualify if “the criminal fraud or deceit statute ... contains a monetary threshold of $10,000 or more.” Id. at 2297. In other words, Nijhawan envisions a binary world of federal recidivism statutes: Factual inquiries into the circumstances of prior convictions are permitted if, but only if, the federal statute does not refer to the element of the prior crimes, but to the underlying circumstances of the prior crime — in which case the entire Taylor categorical analysis is inapplicable. See id. at 2298-2302.

The remainder of Nijhawan confirms this conclusion in spades. The petitioner in Nijhawan argued that even if the $10,000 loss threshold was not an element of the prior crime, the factual inquiry into the nature of that crime should be limited to the same documents in the record of conviction to which Shepard limits sentencing courts under the modified categorical approach. See 129 S.Ct. at 2302-03. The Court rejected the argument for several reasons, two of which are particularly relevant here.

First, Nijhawan said that “Taylor, James, and Shepard, the cases that developed the evidentiary list to which petitioner points, developed that list for a very different purpose, namely that of determining which statutory phrase (contained within a statutory provision that covers several different generic crimes) covered a prior conviction.” Id. at 2303 (emphasis added). Nijhawan then referred to its earlier description of the modified categorical approach, in which it explained that:

[T]he categorical method is not always easy to apply. Sometimes a separately numbered subsection of a criminal statute will refer to several different crimes, each described separately. And it can happen that some of these crimes involve violence while others do not. A single Massachusetts statute section entitled “Breaking and Entering at Night,” for example, criminalizes breaking into a “building, ship, vessel or vehicle.” In such an instance, we have said, a court must determine whether an offender’s prior conviction was for the violent, rather than the nonviolent, break-ins that this single five-word phrase describes (e.g., breaking into a building rather than into a vessel), by examining the indictment or information and jury instructions, or, if a guilty plea is at issue, by examining the plea agreement, plea colloquy or some comparable judicial record of the factual basis for the plea.

Id. at 2299 (emphasis added, citations and quotation marks omitted).

The second reason the Court gave for not restricting sentencing courts to the Shepard-sanctioned documents in ascertaining the loss amount was that “[this] proposal itself can prove impractical insofar as it requires obtaining from a jury a special verdict on a fact that ... is not an element of the offense.” Id. at 2303 (emphasis added).

In short, Nijhawan is crystal clear: The modified categorical approach is used to determine under which provision of a divisible statute a defendant was convicted, and it cannot be used to find non-elemental facts. The majority discusses Nijhawan (although tellingly, it does not mention its “special verdict” reasoning) but brushes its guidance aside as “dicta.” Bybee op. at 927-28. Regardless of one’s definition of “dicta,” this isn’t it.

Nijhawan’s discussion of the scope, purpose, and applicability of the categorical and modified categorical approach to Nijhawan’s case was in direct response to arguments briefed and pressed by the parties. The arguments were further developed in four amicus curiae briefs and in the opinions of the Third Circuit, Nijhawan v. Att’y Gen. of the U.S., 523 F.3d 387 (3d Cir.2008), and the Board of Immigration Appeals, In re Nijhawan, No. A39 075 734, 2006 WL 3088788 (B.I.A. Aug. 8, 2006). There is no concern, therefore, that the issues were not adequately presented — or, as Judge Posner has put it, that they “w[ere] not refined by the fires of adversary presentation.” United States v. Crawley, 837 F.2d 291, 293 (7th Cir.1988).

Moreover, as explained above, the Court’s holding — that a categorical approach is inapplicable to non-elemental facts, and, consequently, the court or agency may rely on non-elemental facts only where a statute permits a non-categorical approach — was ultimately dispositive of Nijhawan’s case. The Court’s extended discussion of the categorical and modified categorical approach was thus “grounded in the facts of the case,” id., and was certainly “ ‘germane to [its] eventual resolution,’ ” Miranda B. v. Kitzhaber, 328 F.3d 1181, 1186 (9th Cir.2003) (per curiam) (citation omitted). These were not “statement[s] ... uttered in passing without due consideration of the alternatives.” United States v. Johnson, 256 F.3d 895, 915 (9th Cir.2001) (en banc) (Kozinski, J., concurring). Additionally, since these holdings were part of the logical reasoning provided in support of Nijhawan’s outcome, the reasons that the Court gave for its conclusions cannot be described as “unnecessary,” Miller v. Gammie, 335 F.3d 889, 902 (9th Cir.2003) (en banc) (Tashima, J., concurring), any more than the ground floor is “unnecessary” to a multi-story building. So while the majority may find it convenient to dismiss Nijhawan’s pertinent reasoning as mere “dicta,” that does not make it so. Further, even if it were dicta, as the majority suggests — and again, it certainly is not — we must “treat Supreme Court dicta with due deference,” not brush it aside, as the majority does. United States v. Baird, 85 F.3d 450, 453 (9th Cir.1996). “As we have frequently acknowledged, Supreme Court dicta have a weight that is greater than ordinary judicial dicta as prophecy of what that Court might hold; accordingly, we do not blandly shrug them off because they were not a holding.” United States v. Montero-Camargo, 208 F.3d 1122, 1132 n. 17 (9th Cir.2000) (en banc) (citation and quotation marks omitted).

B.

Were Nijhawan not clear enough — and it is — Johnson dispels any remaining doubt. Johnson held that a conviction under Florida’s divisible battery statute, Fla. Stat. § 784.03, was not categorically a violent felony because the statute encompassed convictions for “any intentional physical contact, no matter how slight.” Johnson, 130 S.Ct. at 1269-70 (citation and quotation marks omitted). Such convictions, the Supreme Court held, lacked the “violent force” necessary to make a conviction thereunder a “violent felony” for purposes of the Armed Career Criminal Act, 18 U.S.C. § 924(e)(2)(B). See Johnson, 130 S.Ct. at 1271. The dissenters objected that this holding would make it more difficult to remove noncitizens convicted under that statute and other “generic felony-battery statutes that cover both violent force and unwanted physical contact.” Id. at 1273 (characterizing dissenting opinion of Alito, J.). The Court responded:

This exaggerates the practical effect of our decision. When the law under which the defendant has been convicted contains statutory phrases that cover several different generic crimes, some of which require violent force and some of which do not, the “modified categorical approach” that we have approved permits a court to determine which statutory phrase was the basis for the conviction by consulting the trial record....

Id. (citation and quotation marks omitted).

The majority reads Johnson’s description of the functioning of the modified categorical approach as merely “illustrative.” Bybee op. at 931-32. But it gives no reason to take such a view of this passage in Johnson, and there is none. Johnson sought to highlight the flexibility that the modified categorical approach provides. So, if the modified categorical approach does, as the majority in this case maintains, allow the sentencing court to examine “the facts put forward by the government” in the earlier proceeding to determine the facts on which the conviction must have rested, Bybee op. at 936-37, the Supreme Court had every reason to say so. That Johnson instead contemplates a far more circumscribed, less flexible inquiry — one limited to identifying “which statutory phrase was the basis for the conviction,” 130 S.Ct. at 1273 — confirms that nothing more is permissible.

The majority says that it has “several reasons” for declining to follow the clear import of Nijhawan and Johnson, but it only names two. Both amount to the assertion that since the Supreme Court did not say explicitly that we cannot do what the majority now does, it’s fair game. See Bybee op. at 930-31. Not so. If we dutifully apply the principles enunciated by the Supreme Court, we can only conclude that the modified categorical approach applies only to divisible statutes.

C.

Seeking support for its conclusion that the question we address is an open one in the Supreme Court, the majority incorrectly maintains that the other courts of appeals are in broad disagreement as to the correct modified categorical approach. In fact, since 2008, and especially since Nijhawan, there has been a steady march toward applying the modified categorical approach only to divisible statutes.

It is fair to say that the courts of appeals — including this one — failed at first fully to appreciate the outer limits of the categorical and modified categorical approaches. At one time, the courts of appeals settled into essentially three camps: Some, recognizing that juries are never required to find facts that go beyond the elements of the crime, ruled that the modified categorical approach is available only to determine under which portion of a divisible statute the defendant was convicted. See, e.g., United States v. Smith, 544 F.3d 781, 786-87 (7th Cir.2008); United States v. Howell, 531 F.3d 621, 622-23 (8th Cir.2008); United States v. Gonzalez-Terrazas, 529 F.3d 293, 297-98 (5th Cir.2008). Other courts applied the modified categorical approach more liberally, finding that prior convictions rested on facts that appeared nowhere in the statute of conviction. See, e.g., United States v. Armstead, 467 F.3d 943, 947-48 (6th Cir.2006); Vargas v. Dep’t of Homeland Sec., 451 F.3d 1105, 1108-09 (10th Cir.2006).

In this circuit, after considerable waffling, we struck a middle course, interpreting Taylor and Shepard to permit resort to the modified categorical approach to find generic facts that are fairly encompassed within an element of the statute of conviction — i.e., in both what the majority terms the “divisible statute” and “broad element” situation, see Bybee op. at 924-25 — but not when the generic element was entirely missing. Navarro-Lopez v. Gonzales, 503 F.3d 1063, 1073 (9th Cir.2007) (en banc).3 This approach required a connection between the generic fact to be found at sentencing and the elements found in the prior proceeding, and so provided assurance that facts mined from the record of conviction would have been viewed by the parties to the prior proceeding as material and thus worth contesting. See also Li v. Ashcroft, 389 F.3d 892, 900 (9th Cir.2004) (Kozinski, J., concurring).

This three-way split is gone. The majority strains mightily to manufacture ambiguity in the jurisprudence of a few circuits, see Bybee op. at 10606-12, but the stark reality is that since Nijhawan and Johnson, every circuit to examine the issue has applied an approach consistent with those two cases’ pronouncements, the majority’s relegation of those pronouncements to the “dicta” wastebasket notwithstanding.

For example, the majority cites two Tenth Circuit cases (from 2006 and 2007, respectively) as support for a fact-based approach. See id. at 934 & n. 15. But those cases do not allow a sentencing court to review non-element facts.4 Moreover, the majority overlooks later Tenth Circuit case law clarifying further that the modified categorical approach “does not involve a subjective inquiry into the facts of the case, but rather its purpose is to determine ‘which part of the statute was charged against the defendant and, thus, which portion of the statute to examine on its face.’ ” United States v. Charles, 576 F.3d 1060, 1067 (10th Cir.2009). Indeed, in 2008 the Tenth Circuit resolved an intra-circuit split that involved, at times, cases that were “not always focused on the elements of the prior conviction.” United States v. Zuniga-Soto, 527 F.3d 1110, 1121 (10th Cir.2008). The Tenth Circuit resolved this conflict by declaring that when determining whether a prior conviction was for a “crime of violence,” thereby qualifying the defendant for a sentencing enhancement, courts “must consider only the statutory definition of the prior offense and not the specific factual circumstances underlying the defendant’s conviction.”5 Id.

The majority similarly, and similarly erroneously, claims support for its open-ended approach from a 2006 case from the Sixth Circuit, United States v. Armstead, 467 F.3d 943, 947-48 (6th Cir.2006). Bybee op. at 933 & n. 13. But in United States v. Bartee, 529 F.3d 357 (6th Cir.2008), the Sixth Circuit rejected an argument that presaged the analysis adopted by the majority.

The criminal defendant in Bartee had been previously convicted of violating a statute that criminalizes “sexual contact with another person.... under circumstances involving the commission of any other felony.” Mich. Comp. Laws Ann. § 750.520c(1)(c); see Bartee, 529 F.3d at 360. The information to which the defendant had pleaded guilty identified the “other felony” as “solicitation of a minor for immoral purposes.” Id. The information even identified the individual with whom the defendant had sexual contact while soliciting a minor for immoral purposes as “Angela,” and there was no dispute in Bartee that Angela was 15 years old. Id. at 360-61. The Government conceded that the defendant’s conviction for criminal sexual contact lacked the federal generic crime’s element of contact with a minor, but urged the court to “use ‘common sense’ to infer that since the defendant had sexual contact with Angela ‘while’ soliciting a minor, Angela must have been that minor and, therefore, the sexual contact must have been with a minor.” Id. at 361.

The Sixth Circuit emphatically rejected the Government’s argument. Although it recognized that “this [inference] appears to have been the case factually,” the court agreed “with defendant that, categorically speaking, the conviction did not necessarily require proof of sexual contact with a minor.” Id. Bartee in fact criticized the district court for permitting “facts [to] invade[ ] [its] analysis.” Id.6

Some other examples: Citing a 2004 case, the majority claims that the Third Circuit’s approach is “ambiguous.” Bybee op. at 934-35 & n. 16. Whether or not that is a fair characterization of the Third Circuit case cited,7 the Third Circuit now consistently applies the divisible statute approach. Jean-Louis v. Attorney General of the United States, 582 F.3d 462 (3d Cir.2009), summarized that approach:

Where a statute of conviction contains disjunctive elements, some of which are sufficient for conviction of the federal offense and others of which are not, we have departed from a strict categorical approach. In such a case, we have conducted a limited factual inquiry, examining the record of conviction for the narrow purpose of determining the specific subpart under which the defendant was convicted.

Id. at 466; see also Thomas v. Att’y Gen. of U.S., 625 F.3d 134, 143-47 (3d Cir.2010); United States v. Stinson, 592 F.3d 460, 462 (3d Cir.2010); United States v. Johnson, 587 F.3d 203, 208, 214 (3d Cir.2009).

The majority similarly, and incorrectly, describes the state of the law in the Second and Eleventh Circuits as “ambiguous.” Bybee op. at 934-35 & nn. 15 & 17. Not so. Both the Second8 and Eleventh9 Circuits restrict the modified categorical approach to the divisible statute situation. The majority’s attempt to muddy the Seventh Circuit’s jurisprudence is similarly unavailing, as that circuit also refuses to apply the modified categorical approach unless the statute is divisible.10 See United States v. Sonnenberg, 628 F.3d 361, 367 (7th Cir.2010) (refusing to apply the modified categorical approach to the statute at issue, as it “simply was not drafted so as to be divisible in th[e] [relevant] manner,” and even though the conduct underlying the conviction was clear); United States v. Woods, 576 F.3d 400, 406 (7th Cir.2009); United States v. Smith, 544 F.3d 781, 786-87 (7th Cir.2008).

The final count: All of our sister circuits (except for the District of Columbia Circuit, which apparently has had no occasion to weigh in on whether the modified categorical approach applies beyond the divisible statute context11) now apply the modified categorical approach only to divisible statutes.12 By overruling Navarro-Lopez, our circuit becomes the only one to expand the scope of our modified categorical inquiries in the wake of the Supreme Court’s recent, lucid direction that we narrowly restrict them.

II.

Even were we free to ignore the more recent Supreme Court cases — and of course we are not — the majority could not adopt the rule that the modified categorical approach is available to find any facts the jury “must have found.” Bybee op. at 935. Taylor and Shepard simply do not admit of that interpretation.

A.

Taylor considered a conviction under a state burglary statute.13 495 U.S. at 578 n. 1, 110 S.Ct. 2143. The Court first concluded that crimes described by federal recidivism statutes must be understood generically — that is, as describing uniform elements rather than the elements defined by each state’s law. Id. at 592, 110 S.Ct. 2143. After defining the generic crime of burglary, Taylor turned to the problem of determining whether a particular state conviction was for the generic, federally-defined crime — here and in Taylor, the crime of burglary. Framing the choice of approaches to this problem as “whether the sentencing court ... must look only to the statutory definitions of the prior offenses, or whether the court may consider other evidence concerning the defendant’s prior crimes,” Taylor adopted the then-uniform position of the courts of appeals, that courts must look “only to the statutory definition[ ].” Id. at 600, 110 S.Ct. 2143 (citing, inter alia, United States v. Chatman, 869 F.2d 525, 529 (9th Cir.1989), and United States v. Sherbondy, 865 F.2d 996, 1006-10 (9th Cir.1988)).

Taylor then directly addressed the question before us today: How do we match a prior state conviction to the crime covered by a federal recidivism statute where the state conviction was under a statute that prohibits both conduct covered by the federal statute and other conduct that does not count for federal purposes? In that circumstance, Taylor permitted federal courts and agencies to go beyond consulting the state statutory definition, but only in “a narrow range of cases where a jury was actually required to find all the elements of [the] generic [crime].” Id. at 602, 110 S.Ct. 2143 (emphasis added).

A jury is only “required” to find whether, on the facts before it, the elements of the crime charged have been proven. Other factual circumstances surrounding the crime — if it was a dark and stormy night, whether the postman actually rang twice, that the defendant wore a scarlet kimono, whether defendant harmed the victim using a gun or a blunt object, that a note with the word “moor” was found — may be central, even essential, considerations for the jury in determining what actually happened, as any reader of Sherlock Holmes stories or Agatha Christie novels knows. Still, while “a jury in a federal criminal case cannot convict unless it unanimously finds that the Government has proved each element,” the jury need not “decide unanimously which of several possible sets of underlying brute facts make up a particular element, say, which of several possible means the defendant used to commit an element of the crime.” Richardson v. United States, 526 U.S. 813, 817, 119 S.Ct. 1707, 143 L.Ed.2d 985 (1999).14 Because juries are never “required” to find anything other than elements of the crime as set out in the pertinent statute, the fact-based rule the majority adopts today is at odds with Taylor.

The majority’s approach is no more consistent with Shepard’s more specific instructions for applying the modified categorical approach to guilty pleas. See 544 U.S. at 16, 125 S.Ct. 1254. Shepard, like this case, considered a guilty plea under a divisible, non-generic burglary statute. Id. Translating Taylor’s “actually required” standard to the plea context, Shepard held that the modified categorical approach is available only to find facts that were “necessarily admitted” in the prior proceeding. Id. Disapproving a standard strikingly similar to the one the majority adopts in this case, Shepard declared off-limits factual determinations “about what the defendant and state judge must have understood as the factual basis of the prior plea.” Id. at 25, 125 S.Ct. 1254 (plurality op.) (emphasis added). Shepard also emphasized that no matter the likelihood in light of the plausible theories apparent from the record that a conviction rested on generic facts, that likelihood could not satisfy Taylor’s “demand for certainty.” Id. at 21-22, 125 S.Ct. 1254 (majority op.).

Crucially, Shepard rejected an alternate, factual approach even where, as was the case in Shepard itself, “the records of the prior convictions ... are in each instance free from any inconsistent, competing evidence on the pivotal issue of fact separating generic from nongeneric burglary.” Id. at 22, 125 S.Ct. 1254. The plurality15 portion of the opinion in Shepard explained that it was “limit[ing] the scope of judicial factfinding on the disputed generic character of a prior plea” in order “to avoid serious risks of unconstitutionality” presented by the need — in light of the intervening decision in Apprendi v. New Jersey, 530 U.S. 466, 120 S.Ct. 2348, 147 L.Ed.2d 435 (2000), discussed in more detail below — for a jury to find “any disputed fact essential to increas[ing] the ceiling of a potential sentence.” Shepard, 544 U.S. at 25-26, 125 S.Ct. 1254.

In short, Shepard, like Taylor, permits application of the modified categorical approach only when a prior conviction can be said as a matter of law to have rested on generic elements found by a jury or admitted by the defendant. No inferences from the factual context are allowed.

B.

The majority’s fact-based approach, limited though it purports to be, just cannot be squared with Taylor or Shepard, let alone with Nijhawan and Johnson.

According to the majority, sentencing courts need not stick to what the trier of fact was legally required to find, or the defendant was legally required to admit; they are free to determine what must have been found or admitted, in light of the “prosecutorial theory of the case.” Bybee op. at 936. As the majority concedes, its approach authorizes sentencing courts to enhance sentences based on factual inferences concerning the prior conviction.

The majority’s own examples show why this formulation flies in the face of Taylor and Shepard. Positing a hypothetical aggravated assault statute with only one element, harmful contact, the majority asserts that a sentencing court could use the modified categorical approach to find that, “given the facts put forward by the government, the jury was ‘required’ to find that the defendant used a gun,” id., if the record includes “an indictment alleging that the defendant used a gun to inflict harmful contact on a victim from 200 feet away.” Id. at 937. This allegation alone, we are told, would establish that “the fact-finder was actually required to find the facts that satisfy the elements of generic aggravated assault.” Id.

Again, not so. The majority’s example seems to assume that where an indictment alleges a fact not essential to the conviction — like the fact that a gun was used, when the statute is violated by any harmful contact — the jury must find that fact to convict. That’s wrong. Although state law may impose additional requirements, the federal constitution requires only that juries agree as to elements of the crime, and juries are generally free to disagree as to means by which the defendant committed a particular element. See Richardson, 526 U.S. at 817, 119 S.Ct. 1707 (“[A] federal jury need not always decide unanimously ... which of several possible means the defendant used to commit an element of the crime.”); Schad, 501 U.S. at 631-32, 111 S.Ct. 2491 (plurality op.); United States v. Hofus, 598 F.3d 1171, 1176-77 (9th Cir.2010) (holding that the jury did not need to agree as to which particular act of the defendant was a “substantial step” toward the commission of a crime sufficient to find the defendant guilty of attempt).

So, to work, the majority’s “theory of the case” thesis must depend on an antecedent inference — namely, that no evidence supporting any other theory of harm was ever presented to or could have been inferred by the factfinder. And that is a question of fact. Consequently, and despite its protestations to the contrary, the majority’s approach does permit a factual inquiry — specifically, an inquiry into what the participants in the prior proceeding must have been thinking and doing. What is the standard of proof for this factual inquiry, according to the majority? Apparently, before enhancing a defendant’s sentence, the judge need only be “confident” that the factfinder in the prior proceeding was “required” to find a fact that it was not actually required to find. See Bybee op. 936-37.

The Shepard majority in no uncertain terms forbade the majority’s fact-lite “theory of the case” approach. Shepard recognized the logic underlying that approach:

If the transcript of a jury trial showed testimony about a building break, one could say that the jury’s verdict rested on a finding to that effect. If the trial record showed no evidence of felonious entrance to anything but a building or structure, the odds that the offense actually committed was generic burglary would be a turf accountant’s dream.

544 U.S. at 22, 125 S.Ct. 1254. But Shepard was emphatic that despite its commonsense appeal, this fact-based investigation would be “a menace to Taylor ” and would overstep the “limitation [at] the heart of [that] decision.” Id. at 22-23, 125 S.Ct. 1254.

Indeed, Taylor expressly rejected basing any inferences on the “theory or theories of the case” presented to the factfinder:

In some cases, the indictment or other charging paper might reveal the theory or theories of the case presented to the jury. In other cases, however, only the Government’s actual proof at trial would indicate whether the defendant’s conduct constituted generic burglary. Would the Government be permitted to introduce the trial transcript before the sentencing court, or if no transcript is available, present the testimony of witnesses? Could the defense present witnesses of its own and argue that the jury might have returned a guilty verdict on some theory that did not require a finding that the defendant committed generic burglary?

Taylor, 495 U.S. at 601, 110 S.Ct. 2143. As this passage indicates, Taylor’s rule — that the sentencing court can “look only to the fact of conviction and the statutory definition of the prior offense,” id. at 602, 110 S.Ct. 2143 — was based on the same practical and constitutional difficulties that the majority holds we can ignore, so long as we do so with “confidence.”

Taylor and Shepard’s square rejection of the majority’s “theory of the case” approach is reason enough to cast it away, even if one manages somehow to put aside Nijhawan and Johnson. But, digging deeper, it becomes apparent that several of the reasons that Taylor and Shepard gave for rejecting any factual approach apply equally to the majority’s purportedly modest proposal: The majority’s “theory of the case” factual analysis will lead to routine violations of the Sixth Amendment right to trial by jury, as articulated in Apprendi; will create massive practical difficulties; and will subject defendants who plead guilty to unfair and unintended consequences.

1.

The majority’s fact-based approach entirely disregards an underlying, essential premise of Shepard — that strict adherence to the Taylor rule is required to avoid “serious risks of unconstitutionality” in light of the Sixth Amendment rule announced in Apprendi v. New Jersey, 530 U.S. 466, 120 S.Ct. 2348, 147 L.Ed.2d 435 (2000). Shepard’s concern was directly put: “If the sentencing court were to conclude, from its own review of the record, that the defendant [who was convicted under a nongeneric burglary statute] actually committed a generic burglary, could the defendant challenge this conclusion as abridging his right to a jury trial?” 544 U.S. at 24, 125 S.Ct. 1254 (plurality op.) (quoting Taylor, 495 U.S. at 601, 110 S.Ct. 2143) (alterations in original). The majority’s factual approach only exacerbates this concern.

The Sixth and Fourteenth Amendments guarantee the right of criminal defendants to have any fact that increases the statutory maximum sentence submitted to a jury and proved beyond a reasonable doubt.16 Apprendi, 530 U.S. at 490, 120 S.Ct. 2348. Before Apprendi, Almendarez-Torres v. United States, 523 U.S. 224, 118 S.Ct. 1219, 140 L.Ed.2d 350 (1998), had held that the fact of recidivism was a sentencing factor, rather than an element, and therefore did not have to be alleged in an indictment or proved to a jury. See id. at 226-27, 118 S.Ct. 1219. To preserve Almendarez-Torres, Apprendi retained, albeit with some hesitation, one “narrow exception” to the general Sixth Amendment rule, holding that “the fact of a prior conviction” is not subject to the same constitutional safeguards that apply to other facts that increase a sentencing range. Apprendi, 530 U.S. at 490, 120 S.Ct. 2348. Accordingly, applications of the modified categorical approach that increase the maximum sentence are permissible under Apprendi, but only if the sentencing court confines itself to finding “the fact of a prior conviction.” Id.

Apprendi did not suggest that finding facts about a prior conviction was permissible; the exception was limited by its terms to the fact of a prior conviction. Moreover, several aspects of Apprendi’s treatment of Almendarez-Torres reinforce that this exception, like the modified categorical approach, is indeed “narrow,” and meant to remain so. Id.

First, Apprendi acknowledged that Almendarez-Torres is in significant tension with its holding, calling Almendarez-Torres “at best an exceptional departure from ... historic practice” and declaring that “it is arguable that Almendarez-Torres was incorrectly decided.” Id. at 487, 489, 120 S.Ct. 2348. Second, Apprendi signaled that it only tolerated the continued vitality of Almendarez-Torres on its “unique facts,” id. at 490, 120 S.Ct. 2348, where the defendant had conceded that his prior convictions were categorical aggravated felonies. See Almendarez-Torres, 523 U.S. at 227, 118 S.Ct. 1219. As Apprendi explained:

Both the certainty that procedural safeguards attached to any “fact” of prior conviction, and the reality that Almendarez-Torres did not challenge the accuracy of that “fact” in his case, mitigated the due process and Sixth Amendment concerns otherwise implicated in allowing a judge to determine a “fact” increasing punishment beyond the maximum of the statutory range.

530 U.S. at 488, 120 S.Ct. 2348. Third, by thrice placing in quotation marks the word “fact” in its discussion of the Almendarez-Torres exception, Apprendi indicated that the fact of a prior conviction is so different in kind from other facts that it can scarcely be so called. See id. Criminal convictions are accompanied by sufficient “procedural safeguards” that the fact of a prior conviction attains a level of “certainty” that other facts do not, and therefore merits special treatment. Id.; see also Shepard, 544 U.S. at 21-22, 125 S.Ct. 1254 (rejecting the Government’s argument that sentencing courts should be permitted to examine police reports “free from any inconsistent, competing evidence on the pivotal issue of fact” because they do not satisfy “Taylor’s demand for certainty”); Wilson v. Knowles, 638 F.3d 1213, 1215 (9th Cir.2011) (“It would be unreasonable to read Apprendi as allowing a sentencing judge to find the kinds of disputed facts at issue here — such as the extent of the victim’s injuries and how the accident occurred. These are not historical, judicially noticeable facts.... [and] Wilson did not have any reason to contest these alleged facts when he was convicted in 1993.” (citations and footnote omitted)); United States v. Von Brown, 417 F.3d 1077, 1079 (9th Cir.2005) (per curiam) (“[T]he categorical and modified categorical analyses ... prohibit inquiry into the facts underlying a prior conviction.... When [these] approach[es] [are] followed, the categorization of a prior conviction as a ‘violent felony’ or a ‘crime of violence’ is a legal question, not a factual question coming within the purview of Apprendi, Blakely [v. Washington, 542 U.S. 296, 124 S.Ct. 2531, 159 L.Ed.2d 403 (2004) ], and Booker.”).

The majority’s approach permits fact-finding that goes well beyond Apprendi’s “narrow” exception for the “fact of a prior conviction,” and so does not meet the “certainty” requirement that justifies the exception. As explained in Part III, the majority extends the Almendarez-Torres exception to instances in which the “procedural safeguards” that undergird that exception are flimsy or entirely absent. By doing so, the majority countenances the violation of the Sixth Amendment rights of criminal defendants.

2.

As Taylor observed, determining from a record of conviction what factual theories and arguments were advanced in the prior proceeding poses a “daunting” practical difficulty. 495 U.S. at 601, 110 S.Ct. 2143. In only “some cases” will “the indictment or other charging paper ... reveal the theory or theories of the case presented to the jury”; other cases would require resort to “the Government’s actual proof at trial” to make this determination. Id. But because Shepard did not include the trial transcript in its list of “records of the convicting court” that are properly considered under the modified categorical approach, 544 U.S. at 20-23, 125 S.Ct. 1254, it may not be consulted. See also Taylor, 495 U.S. at 601, 110 S.Ct. 2143; United States v. Espinoza-Morales, 621 F.3d 1141, 1152 (9th Cir.2010).

Even in the relatively few cases that go to trial,17 the record of a prior conviction itself will often present an incomplete or inaccurate picture of what was argued and disputed in the proceeding underlying the conviction. See Taylor, 495 U.S. at 601, 110 S.Ct. 2143. For example, a charging document may or may not outline the prosecution’s theory in a manner that contains details that are not essential to the conviction but that fit the federal recidivism statute. When the charging document alleges non-elemental facts, a subsequent jury verdict of guilty does not mean that the jury found those non-elements established. And if the defendant has “overwhelming evidence” to dispute an alleged non-elemental fact, he may have reasonably chosen not to present it to the jury, as it “would have been a waste of time and probably excluded as irrelevant, since [the evidence about that fact] was not an element of the offense for which he was being tried.” Li, 389 F.3d at 900 (Kozinski, J., concurring).

Additionally, a defendant’s contrary theory of the case will rarely, if ever, appear from the records of conviction. A question the majority does not address is whether the defendant would be able to introduce his own evidence to show that the jury could have convicted (or he could have pleaded guilty) on an alternate theory. Due process would likely dictate that he have that opportunity, see United States v. Petty, 982 F.2d 1365, 1369 (9th Cir.), amended by 992 F.2d 1015 (9th Cir.1993), but this would lead the district court into the very factfinding Taylor sought to avoid. See Taylor, 495 U.S. at 601, 110 S.Ct. 2143 (“Could the defense present witnesses ... and argue that the jury might have returned a guilty verdict on some theory that did not require a finding that the defendant committed [the] generic [crime]?”). Even if this opportunity is afforded, however, defendants will nearly always be at a disadvantage vis-a-vis the prosecutor in arguing what theories of the case the jury (or judge) could have accepted. Because defendants have no burden in criminal cases, they never have an obligation to convince anyone of anything, and so may have opted not to introduce any evidence into the record. Sentencing courts are thus left to conduct their factual inquiry into what “must have” happened in the prior proceeding based on records that, at best, tell only part of the story.18

Moreover, although the majority focuses on convictions after trial, in practice the overwhelming majority of modified categorical inquiries will consider convictions entered pursuant to a guilty plea, not trial. The “practical difficulties” posed by the majority’s approach when applied to guilty pleas are, if anything, more “daunting” than when applied to convictions after trial. Taylor, 495 U.S. at 600, 110 S.Ct. 2143.

The majority’s limited discussion of the vastly prevalent plea context takes off from the premise that “[w]hen a defendant pleads guilty to a count, he admits the factual allegations stated in that count.” Bybee op. at 945. Once more, the majority is wrong.

For decades, our case law has been clear: “We have declined to treat ‘guilty pleas as admitting factual allegations in the indictment not essential to the government’s proof of the offense.’ ” United States v. Forrester, 616 F.3d 929, 945 (9th Cir.2010) (quoting United States v. Cazares, 121 F.3d 1241, 1247 (9th Cir.1997)). “Any other rule,” Cazares explained, “would be inconsistent with the rationale underlying these decisions that ‘[t]he effect [of a guilty plea] is the same as if [the defendant] had been tried before a jury and had been found guilty on evidence covering all of the material facts.’ ” 121 F.3d at 1247 (quoting United States v. Davis, 452 F.2d 577, 578 (9th Cir.1971) (per curiam) (all but first alteration in original)). In Cazares and Forrester, we held that non-elemental facts recited in an indictment and a signed plea agreement are not admitted. See Forrester, 616 F.3d at 946; accord Malta-Espinoza v. Gonzales, 478 F.3d 1080, 1082 n. 3 (9th Cir. 2007) (“[A] plea of guilty admits only the elements of the charge necessary for a conviction.”); United States v. Thomas, 355 F.3d 1191, 1196-98 (9th Cir.2004); 5 Wayne R. LaFave et al., Criminal Procedure § 21.4(f) n. 171 (3d ed. 2007 & Supp. 2010) (“Though there is some authority that a plea of guilty also admits factual allegations in the indictment not essential to the government’s proof, the Cazares court wisely rejected that view as ‘inconsistent with Rule 11.’ ” (citations omitted)); see also Bargas v. Burns, 179 F.3d 1207, 1216 n. 6 (9th Cir.1999) (“We have repeatedly held that language [in an indictment] that describes elements beyond what is required under [the] statute is surplusage and need not be proved at trial.”).19

So, in federal court, a defendant pleading guilty is only required to admit the elements of the offense, including specifying one or another of any alternative elements. Under our cases, then, an attorney advising his client about whether he should attempt to correct erroneous or disputed non-elemental facts contained in a plea agreement or indictment would be perfectly justified in assuring his client that there is simply no need to correct those misstatements of fact. Not only are they by definition irrelevant to guilt or innocence, but they are also not admitted by a guilty plea, standing alone. In this circumstance, there is little to gain by squabbling over irrelevant facts, and perhaps much to lose, such as the prosecutor’s goodwill or the sentencing judge’s perception that the defendant has accepted responsibility for his actions.20

In state courts — which, of course, handle the majority of criminal prosecutions — whether a guilty plea admits non-elemental facts varies by jurisdiction. Compare State v. Kappelman, 162 Or.App. 170, 986 P.2d 603, 605 (1999) (holding that defendant’s guilty plea “was not an admission of any facts that went beyond the essential elements of the charge”) with Wallace v. State, 308 S.W.3d 283, 286-87 (Mo.Ct.App. 2010) (“A plea of guilty is an admission as to the facts alleged in the information.” (citation and quotation marks omitted)). The majority, despite its purported concern for uniformity, adds an additional layer of disuniformity in the application of the modified categorical approach, contingent on each state’s procedural rules regarding whether non-elemental facts are admitted by a guilty plea.

In sum, the majority’s inquiry, extending to non-elemental facts, goes beyond asking what the defendant necessarily admitted, as Shepard commands. Instead, it embarks on a fact-mining inquiry that is without support in federal case law and ignores the procedural niceties applicable in various state courts.

3.

Finally, application of the majority’s “theory of the case” approach to guilty plea convictions, in particular, bears out “the unfairness of a factual approach.” Taylor, 495 U.S. at 601, 110 S.Ct. 2143.

Defendants often plead, as part of plea “deals” with the prosecution, to less serious crimes than originally charged. See, e.g., Ellis v. U.S. Dist. Court for the W. Dist. of Wash. (Tacoma) (In re Ellis), 356 F.3d 1198, 1210 (9th Cir.2004) (en banc) (discussing the factors that go into a prosecutor’s decision to offer a plea to a lesser charge, including “allocation of prosecutorial resources, ... the relative strengths of various cases and charges,” and the defendant’s particular circumstances (citation and quotation marks omitted)). If the modified categorical approach can be used to find facts not grounded in the elements of the crime of conviction, a defendant can receive a sentencing enhancement for pleading guilty to a generic crime even when the essence of the plea bargain was the surrender of the defendant’s right to a trial by jury in exchange for the assurance that he would not be convicted of that generic crime.

Taylor recognized this very problem and shaped its non-fact-based approach to prior convictions to avoid it: “Even if the Government were able to prove [facts constituting generic burglary], if a guilty plea to a lesser, nonburglary offense was the result of a plea bargain, it would seem unfair to impose a sentence enhancement as if the defendant had pleaded guilty to burglary.” Id. at 601-02, 110 S.Ct. 2143. In other words, Taylor said that no matter how “confident” we might be that the defendant had committed generic burglary, that does not matter unless he was convicted of generic burglary.

As one would expect from the prevalence of pleas to lesser offenses than those originally charged, Taylor’s concern was not hypothetical. Before Navarro-Lopez, at least one panel of this court concluded that a defendant who pleaded guilty to a lesser included offense was actually convicted of the more serious offense with which he was originally charged. In United States v. Guerrero-Velasquez, 434 F.3d 1193 (9th Cir.2006), Guerrero-Velasquez was originally charged with Washington first-degree burglary, a categorical “burglary of a dwelling,” but pleaded guilty to second-degree burglary, “which expressly excludes burglaries of dwellings.” Id. at 1197 & n. 5. Nonetheless, the court held that the guilty plea to second-degree burglary admitted all of the facts in the indictment for first-degree burglary — the crime to which Guerrero-Velasquez did not plead — and accordingly held that Guerrero-Velasquez could be considered convicted of generic burglary for purposes of federal sentencing. Id. Needless to say, Guerrero-Velasquez was in dramatic conflict with Taylor.

The majority’s version of the modified categorical approach invites a parade of Guerrero-Velasquezes. For example, a defendant charged with willful infliction of corporal injury on a spouse, Cal.Penal Code § 273.5(a), who pleaded guilty only to simple battery, Cal.Penal Code § 242, could be found to have been convicted of the more serious “crime of domestic violence,” 8 U.S.C. § 1227(a)(2)(E)(i), and thus ineligible for cancellation of removal, based upon facts recited in a superseding indictment, superseding information, or plea agreement. Or a defendant charged with sexual abuse of a minor, Cal.Penal Code § 261.5(d), who in fact maintained that he engaged in consensual intercourse with his seventeen-year-old girlfriend and thus pleaded guilty only to misdemeanor sexual intercourse with a person under eighteen, Cal.Penal Code § 261.5(b), may nevertheless be found to have admitted to the more serious crime on the same basis.

Such an approach creates the very “potential for unfairness” Taylor warned against. 495 U.S. at 601, 110 S.Ct. 2143. It provides the government with more than it bargained for, the defendant with less. Put another way, it treats the defendant as having conceded that the government met its burden of proving beyond a reasonable doubt an element essential to the federal recidivism statute when, in fact, the government chose to forego the need for such proof by charging a lesser crime as to which the element did not matter. Not only is this unfair, but it will undoubtedly discourage defendants from pleading guilty. What good is a bargain that a later court might rewrite? The majority increases the chances defendants will go to trial, and the corresponding burden on state and federal trial courts.

In sum, the majority’s purported inquiry into what the jury “must have found,” Bybee op. at 935, or what the defendant admitted, see id. at 937-38, creates the very same “practical difficulties and potential unfairness,” Taylor, 495 U.S. at 601, 110 S.Ct. 2143, that led the Supreme Court to reject any fact-based approach, including the majority’s fact-lite approach, in Taylor and Shepard.

III.

The crux of the majority’s reasoning is that the problems identified above, serious though they may be, cannot require that the modified categorical approach be limited to the divisible statute situation. That limit, according to the majority, would effectively collapse the modified categorical approach into the categorical approach. See Bybee op. at 935-38. The majority’s bottom line is that when the Supreme Court instructs us to examine the elements the jury “was actually required to find,” Taylor, 495 U.S. at 602, 110 S.Ct. 2143, or the defendant “necessarily admitted,” Shepard, 544 U.S. at 26, 125 S.Ct. 1254, those instructions cannot mean “as a purely legal matter.” Reading precedent that way, the majority claims, would preclude applying the modified categorical approach to divisible statutes, because we can never know, as a purely legal matter, under which of the divisible statute’s alternatives the jury convicted. See id. at 936-38. Accordingly, the majority posits, the modified categorical approach cannot be limited to asking what elements were necessarily established as a legal matter. See id.

Before explaining the numerous reasons why this assertion is wrong, it is important to be clear that this superfluity premise is the lynchpin of the majority’s justification for its factual approach. Starting from this erroneous premise, the majority casts about for an alternate meaning for Taylor’s “actually required to find,” 495 U.S. at 602, 110 S.Ct. 2143, and Shepard’s “necessarily admitted,” 544 U.S. at 26, 125 S.Ct. 1254, finally settling on its factual, “theory of the case” approach. See Bybee op. at 936-37. So if the majority’s underlying principle — that we can never know, as a legal matter, which statutory alternative of a divisible statute a defendant was convicted of violating — is erroneous, its house of cards collapses.

And indeed, the majority’s premise is quite wrong. In addition to its misunderstanding of Supreme Court precedent, its unfairness to defendants, and its disruption of the guilty plea process, the majority’s account of how a criminal defendant is convicted under a divisible statute misapprehends several fundamental features of our criminal justice system. The majority also ignores the variability amongst the states as to criminal procedure, even though state courts are the source of most convictions relevant to federal recidivist statutes. Once those errors are straightened out, it becomes apparent that the divisible statute approach does leave a role for the modified categorical approach consistent with the one the Supreme Court intended.

A.

As an initial matter, we should not be surprised to find that the modified categorical approach, correctly applied, is both quite narrow and bears a close resemblance to the formal categorical approach. As the name implies, the modified categorical approach is a variant of the categorical approach, not, as the majority would have it, an “exception” to it. Bybee op. at 938; see Nijhawan, 129 S.Ct. at 2298-99 (describing the manner by which a court narrows a crime of conviction by consulting the record of conviction as a “categorical” approach and contrasting it with the “circumstance-specific” approach); Taylor, 495 U.S. at 600-02, 110 S.Ct. 2143 (explaining that the situation in which a statute of conviction is broader than the generic crime calls for a “categorical approach,” not a “factual approach”).

True, many criminal statutes are not divisible in the pertinent sense, so the modified categorical approach will not be universally, or perhaps even broadly, available. But that is why, presumably, the Supreme Court has stated from the outset that the approach is available only “in a narrow range of cases.” Id. at 602, 110 S.Ct. 2143; see also Shepard, 544 U.S. at 23 n. 4, 125 S.Ct. 1254 (“Taylor is clear that any enquiry beyond statute and charging document must be narrowly restricted to implement the object of the statute and avoid evidentiary disputes.”).

B.

The majority makes a fundamental error when it asserts that, even in the divisible statute situation, a court could never conclude with regard to a prior conviction that the earlier trier of fact was required as a purely legal matter to find the “precise elements of the generic crime,” because a trier of fact is always free to convict the defendant under any statutory alternative, “leaving no room for a modified approach.” Bybee op. at 935 (emphasis omitted). In other words, to use the hypothetical posited by the majority, the majority assumes that even if the statute of conviction criminalizes harmful offensive conduct with a gun or an axe, a sentencing court would never be able to ascertain whether a jury’s verdict or guilty plea was predicated on the use of a gun or an axe, because the trier of fact would have been free to convict on either of the two statutory alternatives. See id. at 935-37.

But — still one more time — the majority is wrong. In reality, procedural safeguards governing charging documents prevent a prosecutor from charging a defendant, under the majority’s hypothetical aggravated assault statute, with “the use of a gun or an axe.” As will be explained, the application of the various procedural safeguards is complex, and the outer limits are not always clear. Moreover, there is remarkable heterogeneity amongst the states. However, when taken together, the upshot is that the situation underlying the majority’s entire argument will rarely, if ever, occur. Moreover, while the majority completely ignores all of this nuance, sentencing courts will not have that luxury, as applying the majority’s approach will require them to master each state’s law of criminal procedure.

The constitutional principle that sets the outer boundaries for permissible prosecutorial pleading is the Sixth Amendment’s guarantee to all criminal defendants of the right “to be informed of the nature and cause of the accusation.” See Hamling v. United States, 418 U.S. 87, 117, 94 S.Ct. 2887, 41 L.Ed.2d 590 (1974); Russell v. United States, 369 U.S. 749, 768-69, 82 S.Ct. 1038, 8 L.Ed.2d 240 (1962); United States v. Kurka, 818 F.2d 1427, 1431 (9th Cir.1987). This requirement means, among other things, that a legally-sufficient charging document “must state the elements of an offense charged with sufficient clarity to apprise a defendant of what to defend against.” United States v. Christopher, 700 F.2d 1253, 1257 (9th Cir. 1983). Thus, charging a defendant in the disjunctive — “the use of a gun or an axe” — is generally prohibited, for doing so lacks the requisite clarity. See Confiscation Cases, 87 U.S. 92, 104, 20 Wall. 92, 22 L.Ed. 320 (1874) (“[A]n indictment or a criminal information which charges the person accused, in the disjunctive, with being guilty of one or of another of several offenses, would be destitute of the necessary certainty, and would be wholly insufficient.... [because] [i]t would not give the accused definite notice of the offense charged, and thus enable him to defend himself, and [because] neither a conviction nor an acquittal could be pleaded in bar to a subsequent prosecution for one of the several offenses.”); 5 LaFave et al., supra, § 19.3(a) (“[W]here a statute specifies several different ways in which the crime can be committed, [state courts] hold that the pleading must refer to the particular alternative presented in the individual case.”); 1 Charles Alan Wright & Andrew D. Leipold, Federal Practice and Procedure: Criminal § 125 (4th ed. 2008) (“[I]f the pleading alleges several acts in the disjunctive, it fails to give the defendant notice of the acts he allegedly committed and should be found insufficient.”).
For this reason alone, the majority’s premise that an indictment or information ever would charge a defendant with “the use of a gun or an axe” assumes a reckless prosecutor and denial of the due process challenge likely to follow such charge.21

Instead, a prosecutor has a choice: First, he can charge the “use of a gun” or “the use of an axe,” but not both. If he charges in this manner, that choice will be evident from the charging document itself, illustrating the reason why Shepard permits a later sentencing court to consult the indictment or information to determine the provision of a divisible statute under which the defendant was convicted. See Shepard, 544 U.S. at 26, 125 S.Ct. 1254. More to the point, the jury will have necessarily found, as a purely legal matter, either that the defendant used a gun or that he used an axe.22

The law regarding variances between the charging document’s allegations and the proof at trial will generally prevent the prosecutor from deviating from this choice, at least absent a formal amendment to the charging document — which, of course, should be evident to a later sentencing court. See Berger v. United States, 295 U.S. 78, 82, 55 S.Ct. 629, 79 L.Ed. 1314 (1935) (setting forth the general test for assessing variances); 5 LaFave et al., supra, § 19.6(b) & n. 10 (describing Berger as “the most frequently cited analysis of the law governing variances”).23

Alternatively, and assuming that the statute does not set forth separate offenses but only separate means of committing the same offense — an important assumption examined below — the prosecutor may permissibly charge “the use of a gun and an axe,” so long as the defendant has sufficient notice, consistent with the Sixth Amendment, of the charges he actually faces. See Turner v. United States, 396 U.S. 398, 420, 90 S.Ct. 642, 24 L.Ed.2d 610 (1970); United States v. Renteria, 557 F.3d 1003, 1008 (9th Cir.2009); People v. Moussabeck, 157 Cal.App.4th 975, 68 Cal.Rptr.3d 877, 881-82 (2007). If the prosecutor chooses this charging avenue, consultation of the charging document alone will not reveal the statutory alternative of which the defendant was actually convicted. Instead, a later sentencing court will need to examine the jury instructions or the plea colloquy to ascertain the basis of the conviction.24 See Shepard, 544 U.S. at 26, 125 S.Ct. 1254.

There is a further limit on the prosecutor’s discretion to charge an offense that may be committed by one or more acts set forth in the statute in the disjunctive: The statute’s use of the disjunctive “or” must merely describe different means of committing a single offense, rather than describing different offenses. The Sixth Amendment’s notice requirement and the Fifth Amendment’s protection against double jeopardy prohibit charging documents from containing duplicitous counts — “the joining in a single count of two or more distinct and separate offenses.” United States v. UCO Oil Co., 546 F.2d 833, 835 (9th Cir.1976); see also id. (“One vice of duplicity is that a jury may find a defendant guilty on a count without having reached a unanimous verdict on the commission of a particular offense. This may conflict with a defendant’s Sixth Amendment rights and may also prejudice a subsequent double jeopardy defense. Duplicity may also give rise to problems regarding the admissibility of evidence, including its admissibility against one or more codefendants.”). To avoid duplicity, trial courts have to determine when a statute sets forth separate offenses and when it merely proscribes various means of committing a single offense.

Traditionally, there has been no bright-line rule distinguishing one circumstance from the other. See, e.g., id. at 835-38 (listing four factors to evaluate); cf. Blockburger v. United States, 284 U.S. 299, 304, 52 S.Ct. 180, 76 L.Ed. 306 (1932) (“[W]here the same act or transaction constitutes a violation of two distinct statutory provisions, the test to be applied to determine whether there are two offenses or only one is whether each provision requires proof of a fact which the other does not.”).25 But now, to account for Apprendi, offenses must be considered separate ones if convictions under the various statutory alternatives subject the defendant to different maximum sentences. See Sattazahn v. Pennsylvania, 537 U.S. 101, 111, 123 S.Ct. 732, 154 L.Ed.2d 588 (2003) (“[I]f the existence of any fact (other than a prior conviction) increases the maximum punishment that may be imposed on a defendant, that fact — no matter how the State labels it — constitutes an element and must be found by a jury beyond a reasonable doubt.”). So, if a conviction under the majority’s hypothetical aggravated assault statute punishes convictions involving the “use of a gun” more harshly than those involving the “use of an axe,” whether the defendant used a gun or an axe is an element that must be included in the charging document and found by the jury beyond a reasonable doubt. See Hamling, 418 U.S. at 117, 94 S.Ct. 2887 (holding that all elements must be included in an indictment); United States v. Omer, 395 F.3d 1087, 1089 (9th Cir.2005) (per curiam) (holding that the failure to allege an essential element can lead to the overturning of a conviction upon a timely objection, even without a showing of prejudice), cert. denied, 549 U.S. 1174, 127 S.Ct. 1118, 166 L.Ed.2d 906 (2007); see also United States v. Inzunza, 580 F.3d 894, 903 (9th Cir. 2009).

Nonetheless, there will likely be circumstances involving divisible statutes in which the charging document does not demonstrate that the factfinder necessarily found all the elements of the generic crime. One possible circumstance, already described, is when the prosecutor charges “the use of a gun and an axe” — which, as described, is only permissible if: (a) the use of a gun subjects the defendant to the exact same possible sentence as the use of an axe; (b) the statute’s gun/axe division merely describes different means of committing one offense, rather than two separate offenses; and (c) the defendant is provided sufficient notice of the accusations against him. Another, more likely, possibility is when there is a permissible variance between the allegations in the charging document and the subsequent proof. See United States v. Hartz, 458 F.3d 1011, 1021 (9th Cir.2006) (discussing how a variance between the indictment and the proof is permissible so long as it is not about “an essential element” of the crime charged and “does not alter the behavior for which the defendant can be convicted”). In either situation, resort to the jury instructions or the plea colloquy can assist the sentencing court in determining what elements the defendant “necessarily admitted,” Shepard, 544 U.S. at 26, 125 S.Ct. 1254, or the jury was “actually required to find,” Taylor, 495 U.S. at 602, 110 S.Ct. 2143.

In particular, in federal court, Rule 11 polices the integrity of guilty pleas by requiring that the district court, “[b]efore entering judgment on a guilty plea, ... determine that there is a factual basis for the plea.” Fed.R.Crim.P. 11(b)(3). Courts interpreting Rule 11 have made clear that the essential elements of the crimes admitted must be addressed in a verified “factual basis.” United States v. Alber, 56 F.3d 1106, 1110 (9th Cir.1995) (“Rule 11(f) [now Rule 11(b)(3) ] requires the district court to satisfy itself that there is a factual basis for all elements of the offense charged before accepting a guilty plea.”). District courts in this circuit adhere to the same practice. See, e.g., United States v. McTiernan, 546 F.3d 1160, 1164 (9th Cir.2008); United States v. Vance, 62 F.3d 1152, 1158 (9th Cir.1995). Many states require similar procedures for a valid guilty plea. See generally 5 LaFave et al., supra, § 21.4(f).

The record of the Rule 11 proceedings may enable a sentencing court to assure itself that the defendant was “actually required,” as a purely legal matter, to admit sufficient facts to support the elements of the crime — including, for example (and returning to our hypothetical divisible statute once more), that the defendant used either a gun or an axe in the assault. Here again, while it is possible that the judge would ask merely whether the defendant used “a gun or an axe,” it is surely unlikely that the judge would leave it at that. And, for the reasons explained above, if the statutory alternatives are punished differently, Apprendi requires that the judge ensure that there is a factual basis for any statutory alternative that would increase the defendant’s possible maximum sentence. See Sattazahn, 537 U.S. at 111, 123 S.Ct. 732.

The majority’s supposition that the trier of fact is never required, as a purely legal matter, to convict under any particular statutory alternative is perhaps most puzzling in light of the manner in which juries are typically instructed. Usually, trial judges instruct the jury on the elements of the crime, and “a failure to charge each of the elements may constitute cognizable error on appeal even where the defense failed to object.” 6 LaFave et al., supra, § 24.8(c). Judges are called upon to craft their instructions in light of the charges and the proof at trial. See United States v. Orozco-Acosta, 607 F.3d 1156, 1164 (9th Cir.2010); United States v. Frega, 179 F.3d 793, 806 n. 16 (9th Cir.1999). Because “in all cases, juries are presumed to follow the court’s instructions,” CSX Transp., Inc. v. Hensley, 556 U.S. 838, 129 S.Ct. 2139, 2141, 173 L.Ed.2d 1184 (2009), the modified categorical approach is often available to determine that a conviction covered all the elements of a generic crime when the jury instructions narrow the charge to a particular statutory alternative.

In sum, there are several procedural safeguards — rooted primarily in the Fifth and Sixth Amendments — that apply to charging documents, plea colloquies, and jury instructions. These safeguards can assist the sentencing judge in a later proceeding in determining, with regard to divisible statutes, whether the trier of fact in a prior proceeding “necessarily found,” or the defendant “necessarily admitted” — as a purely legal matter — all of the elements of the generic crime, without having to engage in the sort of factfinding that the majority permits (and which Taylor and Shepard prohibit). Moreover, these procedural safeguards demonstrate why the majority’s contention that limiting the modified categorical approach to divisible statutes adds nothing to the categorical approach is incorrect.

IV.

The short of the matter is this: The Supreme Court has made abundantly clear that the modified categorical approach is employed only to determine under which statutory phrase the defendant was convicted in a prior proceeding. Every circuit to address the issue now agrees. To hold otherwise, even if we could, would create a myriad of practical and constitutional problems, as well as problems of basic fairness to criminal defendants.

So why does the majority strain to conclude otherwise? The majority’s primary concern is that confining the modified categorical approach to divisible statutes “makes [whether] a defendant [is] subject to a sentence enhancement turn entirely on the location in which he committed the prior offense,” which the majority claims is “the precise outcome that Taylor sought to avoid in establishing a uniform definition of burglary.” Bybee op. at 940 n. 19.

This objection is both jurisprudentially inaccurate and practically wrong. First off, it takes Taylor's uniformity discussion out of context, ignoring its simultaneous, explicit limitation on the circumstances in which the categorical approach may be modified, though not abandoned. In other words, Taylor was concerned about uniformity, but not to the unmitigated degree that the majority asserts.

Taylor invoked the uniformity concern to explain why it adopted a “uniform definition” of burglary “independent of the labels employed by the various States’ criminal codes.” 495 U.S. at 592, 110 S.Ct. 2143. The articulated concern was that some defendants would receive sentencing enhancements based on convictions for conduct, such as burglarizing an automobile, that Congress could not have meant to encompass with its designation of “burglary” as a conviction qualifying a defendant for an enhancement under 18 U.S.C. § 924(e). See Taylor, 495 U.S. at 591-92, 110 S.Ct. 2143. So yes, uniformity was a concern in Taylor.

Still, Taylor self-consciously chose a definition of generic burglary that excludes certain states’ statutes (like California’s) from counting as a qualifying conviction. See id. It also squarely, and emphatically, rejected the notion that, when faced with a conviction that did not meet the generic definition of burglary, the sentencing court could nonetheless look to the defendant’s actual conduct to see if it “would fit the generic definition of burglary.” Id. at 601, 110 S.Ct. 2143. The Court was fully aware of the fact that this approach would not achieve full uniformity based on a defendant’s past conduct. Nonetheless, as Taylor explained, this under-inclusiveness was necessary because of the “practical difficulties and potential unfairness” of any other approach. Id.

In subsequent cases, the Supreme Court has repeatedly refused to expand the modified categorical approach even when faced with the majority’s overweening concern — that it could lead to sentencing disparities based on the state in which a particular defendant was convicted. As the Court has explained, uniformity, while an important value, is not the only value at stake here. That is why Shepard rejected the Government’s argument “for a more inclusive standard of competent evidence,” which was based on “the virtue of a nationwide application of a federal statute unaffected by idiosyncrasies of record keeping in any particular State.” 544 U.S. at 22, 125 S.Ct. 1254. “[R]espect for congressional intent and avoidance of collateral trials,” Shepard concluded, “require that evidence of generic conviction be confined to records of the convicting court approaching the certainty of the record of conviction in a generic crime State.” Id. at 23, 125 S.Ct. 1254; see also James, 550 U.S. at 204-05 & nn. 3-4, 127 S.Ct. 1586 (acknowledging that the overbreadth of some states’ attempted burglary statutes means that convictions thereunder have been held not to qualify as “violent felon[ies]” under the Armed Career Criminal Act (ACCA)); United States v. Rodriquez, 553 U.S. 377, 398, 128 S.Ct. 1783, 170 L.Ed.2d 719 (2008) (Souter, J., dissenting) (pointing out that the majority’s holding — that the ACCA’s sentence enhancement provision, under which a state drug-trafficking conviction qualifies as a “serious drug offense” if the “maximum term of imprisonment prescribed by law” was at least 10 years, includes any penalty imposed under the state’s recidivist statute — will lead to “vast disparities” depending on the state where the defendant was convicted); Johnson, 130 S.Ct. at 1273 (“It may well be true, as the Government contends, that in many cases state and local records from battery convictions will be incomplete. But absence of records will often frustrate application of the modified categorical approach — not just to battery but to many other crimes as well.”).

Secondly, the lack of uniformity the majority decries assumes that the mismatch between state and federal law with which we are currently struggling — and with which the Supreme Court has also struggled over the last two decades — is set in stone. It is not. As Shepard pointed out, Congress is free to modify federal law to better reflect variations in state law — for example, by adding privileged entries to the generic definition of burglary, or by altering the categorical approach. See Shepard, 544 U.S. at 23, 125 S.Ct. 1254 (“In this instance, time has enhanced even the usual precedential force, nearly 15 years having passed since Taylor came down, without any action by Congress to modify the statute as subject to our understanding that it allowed only a restricted look beyond the record of conviction under a nongeneric statute.”). Nijhawan demonstrated that Congress can by appropriate drafting take a statute out of the categorical approach entirely, in which case courts are free to look to the underlying facts of the prior crime of conviction (although, depending on the context, with possible procedural consequences in light of Apprendi). See Nijhawan, 129 S.Ct. at 2298-99.

States, too, are free to amend their criminal codes to better match the generic definitions contained in the federal recidivist statutes. If California, for example, is concerned that a conviction under its burglary statute will not qualify for a federal recidivist enhancement, it could remove certain offenses, like shoplifting, from its burglary statute.26 But it is not our role to bend precedent until it breaks simply because we do not like the outcome.

Finally, federal officials can encourage states to alter their prosecutorial practices — for example, to encourage fewer Alford pleas, see North Carolina v. Alford, 400 U.S. 25, 91 S.Ct. 160, 27 L.Ed.2d 162 (1970), which have complicated application of the modified categorical approach in other circuits. See, e.g., United States v. Savage, 542 F.3d 959, 967 (2d Cir.2008) (holding that because defendant’s prior conviction was pursuant to an Alford plea, in which he did not admit the factual basis of the crime, that conviction under a divisible Connecticut statute broader than the generic federal definition could not be sufficiently narrowed to assure the sentencing court that it necessarily rested on the elements of the generic crime).

V.

The application of the majority’s theory to the facts of this case illustrates the theory’s flaws: Although Aguila-Montes pleaded guilty to “unlawfully enter[ing] an inhabited dwelling,” neither Judge Bybee nor Judge Rawlinson endeavors to explain how the (non-elemental) fact that Aguila-Montes allegedly entered the dwelling “unlawfully,” in the generic sense, could ever be “necessary” to his conviction. See Bybee op. at 937 (“It is not enough that an indictment [from the prior conviction] merely allege a certain fact or that the defendant admit to a fact; the fact must be necessary to convicting that defendant.”).

I cannot join in the adventure sanctioned by the majority. I therefore concur only in the overruling of United States v. Rodriguez-Rodriguez, 393 F.3d 849, 857-58 (9th Cir.2005), and our other cases that held that a conviction under California Penal Code § 459 qualifies as generic burglary if the defendant pleaded guilty to entering the building “unlawfully” or if a jury found the defendant guilty as charged in an indictment that recited that allegation. See Bybee op. at 945-46 (listing cases). I agree with Judge Bybee’s understanding that under California law, alleging “unlawful” entry does not meet the generic burglary offense. As Judge Bybee explains — but Judge Rawlinson ignores — the California concept of “unlawful” is much broader than the generic concept of “unlawful or unprivileged.” The California concept includes entry into a building open to the public or as to which one has consent to enter, so long as one does so with the intent to commit a felony therein.

But on my view of the overall limitations on use of the modified categorical approach, Rodriguez-Rodriguez was wrongly decided even aside from the categorical mismatch, simply because the allegation that the defendant entered the building “unlawfully,” in the generic sense, is not an element of the crime of conviction, either exclusively or in the alternative. When a California prosecutor charges a defendant with burglary and alleges that he entered the building in question “unlawfully,” that allegation can be read to mean one of two things: First, it could be simply a shorthand repetition of the allegations that the defendant entered the building with the intent to commit a felony therein. Second, it could be alleging the absence of an affirmative defense that would otherwise make the entry “lawful” (in the sense that the California courts use that word). See People v. Sherow, 196 Cal.App.4th 1296, 128 Cal.Rptr.3d 255, 260 (2011) (“Case law establishes that the lack of consent to enter the building at issue is not an element of burglary.” (citations omitted)). But either way we read California’s allegation that an entry was “unlawful,” as Judge Bybee acknowledges, “the word ‘unlawfully’ in [a California] indictment tells us nothing about whether [a defendant’s] entry was ‘unlawful or unprivileged’ in the generic sense.”27 Bybee op. at 945-46.

I would hold, therefore, that Aguila-Montes’ burglary conviction cannot be used to enhance his sentence because the California burglary statute’s “entry” element does not require an unlawful entry, in the sense that term is used to define generic burglary — that is, an entry in which the premises are not open to the public and the person does not have a privilege or invitation to enter. See Bybee op. at 943-44. Accordingly, a jury is never “actually required” to find that a defendant’s entry was unlawful, Taylor, 495 U.S. at 602, 110 S.Ct. 2143; nor does a defendant pleading guilty ever “necessarily admit[ ]” that he entered unlawfully. Shepard, 544 U.S. at 16, 125 S.Ct. 1254. The district court therefore erred in enhancing Aguila-Montes’ sentence under U.S.S.G. § 2L1.2.

CONCLUSION

The majority wanders well beyond the confines of the Supreme Court’s abundantly clear and narrow modification of the categorical approach and thereby subjects criminal defendants to enhanced punishment on the basis of impermissible and unreliable judicial factfinding. It does so on ephemeral grounds whose validity evaporates upon inspection. For all the reasons surveyed, I would hold, as have the Supreme Court and every other circuit, that only elements of a crime, as defined in the predicate offense statute, are pertinent to the modified categorical approach.

. "Elements” are those necessary and sufficient facts that, if proven (or admitted), support a conviction for a particular crime. See United States v. Beltran-Munguia, 489 F.3d 1042, 1045 (9th Cir.2007) ("To constitute an element of a crime, the particular factor in question needs to be a constituent part of the offense [that] must be proved in every case to sustain a conviction under a given statute.” (citation and quotation marks omitted, alterations in original)); see generally Richardson v. United States, 526 U.S. 813, 817, 119 S.Ct. 1707, 143 L.Ed.2d 985 (1999) ("Calling a particular kind of fact an ‘element’ carries certain legal consequences.”).

. Like the majority, I use the term "divisible statute” as shorthand to refer to a statute that lists alternative ways that one or more elements can be established. Limiting the modified categorical approach to the "divisible statute” situation, as I would do, means that it may only be used to determine under which express statutory alternative the defendant was convicted.

. The majority, joined by Judge Rawlinson and her co-dissenters, suggests that Navarro-Lopez precludes us from applying the modified categorical approach to broad-element statutes. See Bybee op. at 924-25; Rawlinson op. at 975. That’s wrong. See Navarro-Lopez, 503 F.3d at 1073.

. The 2007 case merely rejected a defendant's argument that in calculating his criminal history category under the Sentencing Guidelines, the district court should be prohibited from examining any documents not permitted under Shepard. See United States v. Townley, 472 F.3d 1267, 1277 (10th Cir.2007). In other words, Townley is not about the modified categorical approach at all.

The 2006 case, on the other hand, did use the modified categorical approach to determine that a particular conviction under Colorado Revised Statutes § 18-6-701 (for “induc[ing], aid[ing], or encourag[ing] a child to violate any federal or state law, municipal or county ordinance, or court order commits contributing to the delinquency of a minor”) constituted the aggravated felony of sexual abuse of a minor. See Vargas v. Dep’t of Homeland Sec., 451 F.3d 1105, 1108-09 (10th Cir.2006). As the majority points out, the predicate offense (that the suspect allegedly induced the minor to commit) “could be anything from jaywalking to murder.” Id. at 1109. What the majority omits, however, is that "the specific predicate offense must be charged and proved as an element of the offense of contributing to the delinquency of a minor.” Id. In other words, the statute is divisible; as Vargas explained, “to convict a defendant of contributing to the delinquency of a minor, the jury ‘necessarily ha[s] to find’ a specified predicate offense that the defendant induced, aided, or encouraged the child to violate.” Id. (quoting Taylor, 495 U.S. at 602, 110 S.Ct. 2143) (alteration in original). One element of the offense to which the Vargas defendant pleaded guilty was that he had "induced, aided, or encouraged the minor” to engage in "nonconsensual sexual contact,” in violation of Colorado Revised Statutes § 18-3-404. Id. Vargas thus concluded — upon examination of only the elements of the crime of conviction — that the defendant had been convicted of the generic crime of sexual abuse of a minor. See id. The majority is thus left without support for its contention that the Tenth Circuit applies the modified categorical approach to find non-elemental facts.

. The majority acknowledges that Zuniga-Soto adopted a “divisible-statute-only rule,” but argues that its reach is circumscribed. Bybee op. at 934 n. 15. But even if that is true, no one can contest that the Tenth Circuit applies a "divisible-statute-only rule” in circumstances where the majority would not.

. Subsequent panels of the Sixth Circuit have adhered to Bartee’s approach, maintaining that the modified categorical approach is only appropriate in the divisible statute situation and disavowing earlier Sixth Circuit cases indicating otherwise. See United States v. Young, 580 F.3d 373, 380 n. 8 (6th Cir.2009); see also Kellermann v. Holder, 592 F.3d 700, 703 (6th Cir.2010).

. It is not. Knapik v. Ashcroft, 384 F.3d 84, 92 n. 8 (3d Cir.2004), cited by the majority for its "ambiguous” label, principally addressed a different problem. Specifically, Knapik held (as Nijhawan did later) that the modified categorical approach does not apply where a statute requires an inquiry into a fact underlying the prior conviction. See also Nijhawan, 523 F.3d at 391-92, aff'd, 129 S.Ct. 2294. As to the problem before us, Knapik indicated agreement with the "divisible statute” approach. See 384 F.3d at 92 n. 8. The case on which Knapik relied, Singh v. Ashcroft, 383 F.3d 144 (3d Cir.2004), refused to apply the modified categorical approach to determine whether the alien had been convicted of the aggravated felony of "sexual abuse of a minor,” 8 U.S.C. § 1101(a)(43)(A). Singh explained that although the record was quite clear that the victim "was under sixteen years of age,” 383 F.3d at 147, “a finding of the age of the victim [was] not required for conviction” under the state statute, and therefore the conviction could not be considered an aggravated felony. Id. at 153.

. See, e.g., Lanferman v. Bd. of Immigration Appeals, 576 F.3d 84, 88-89 (2d Cir.2009) (per curiam) (“The modified categorical approach calls for a two-step inquiry: first, we determine if the statute is divisible, such that some categories of proscribed conduct render an alien removable and some do not; second, we consult the record of conviction to ascertain the category of conduct of which the alien was convicted.” (quotation marks omitted)); United States v. Mills, 570 F.3d 508, 511 (2d Cir.2009) (per curiam); Hoodho v. Holder, 558 F.3d 184, 189 (2d Cir.2009); Martinez v. Mukasey, 551 F.3d 113, 120 (2d Cir.2008); Gertsenshteyn v. U.S. Dep't of Justice, 544 F.3d 137, 143 (2d Cir.2008) (same). In fact, Judge Bybee previously wrote that "[t]he Second ... Circuit[ ] appear[s] to require that the statute be divisible; that is, the statute of conviction contain at least one subsection that meets the generic definition, even if another section would not satisfy the definition.” Aguilar-Turcios v. Holder, 582 F.3d 1093, 1109 n. 8 (9th Cir.2009) (Bybee, J., dissenting). The Second Circuit has not changed its rule since Aguilar-Turcios was published.

. See United States v. Palomino Garcia, 606 F.3d 1317, 1336-37 (11th Cir.2010) ("[W]hen the law under which a defendant has been convicted contains different statutory phrases — some of which require the use of force and some of which do not — the judgment is ambiguous and we apply a 'modified categorical approach.’ Under this approach, a court may determine which statutory phrase was the basis for the conviction by consulting a narrow universe of ‘Shepard documents.' ” (citations omitted)).

. The majority maintains that United States v. Fife, 624 F.3d 441 (7th Cir.2010) shows that the Seventh Circuit has wavered from its divisible-statute-only rule by holding that an Illinois statute that states that “[a] person commits armed violence when, while armed with a dangerous weapon, he commits any felony defined by Illinois Law, except [various enumerated crimes]” is a divisible statute. 720 Ill. Comp. Stat. 5/33A-2(a) (2007). The majority argues that, "in our terminology [Fife] defined ‘divisible statute’ in a manner that would encompass missing element statutes.” Bybee op. at 933. That’s wrong. The Illinois crime of "armed violence” has two elements: (1) while armed with a dangerous weapon, the defendant (2) commits any felony under Illinois law, with a few enumerated exceptions. See 720 Ill. Comp. Stat. 5/33A-2(a). It is true that the statute does not incorporate a list of the state-law felonies that would meet the second element. But there is no need, as a comprehensive list of those qualifying crimes was readily ascertainable by referencing the rest of the state-law code. As Fife pointed out, “[t]he point is that the statute itself is ‘divisible’ — that is, it expressly identifies several ways in which a violation may occur.” Fife, 624 F.3d at 446 (citation omitted).

. There is good reason to think that the District of Columbia Circuit would not adopt the "theory of the case” approach advocated by the majority. See, e.g., In re Sealed Case, 548 F.3d 1085, 1091 (D.C.Cir.2008) (“[U]nder Shepard the question is not what [the defendant] probably pled to, but what he necessarily pled to.”).

. The majority acknowledges that this is the rule in the First, Fourth, Fifth, and Eighth circuits. See United States v. Giggey (Giggey I), 551 F.3d 27, 40 (1st Cir.2008) (en banc); see also United States v. Giggey (Giggey II), 589 F.3d 38, 41-42 (1st Cir.2009); United States v. Rivers, 595 F.3d 558, 564 (4th Cir. 2010); United States v. Hughes, 602 F.3d 669, 676 (5th Cir.2010); United States v. Gonzalez-Terrazas, 529 F.3d 293, 297-98 (5th Cir.2008); United States v. Ossana, 638 F.3d 895, 904 (8th Cir.2011); United States v. Webster, 636 F.3d 916, 919 (8th Cir.2011); United States v. Boaz, 558 F.3d 800, 808 (8th Cir.2009).

. As in every modified categorical case that the Supreme Court has considered, the statute at issue in Taylor was divisible in the relevant respect: Missouri had a number of burglary statutes, each of which listed different categories of locations which, if entered, could support a burglary conviction. See 495 U.S. at 578 n. 1, 110 S.Ct. 2143.

. State juries need not agree on non-element facts. See Schad v. Arizona, 501 U.S. 624, 631-32, 111 S.Ct. 2491, 115 L.Ed.2d 555 (1991) (plurality op.) (“[D]ifferent jurors may be persuaded by different pieces of evidence, even when they agree upon the bottom line. Plainly there is no general requirement that the jury reach agreement on the preliminary factual issues which underlie the verdict.” (citation and quotation marks omitted)); Schad, 501 U.S. at 649, 111 S.Ct. 2491 (Scalia, J., concurring) ("[I]t has long been the general rule that when a single crime can be committed in various ways, jurors need not agree upon the mode of commission.”).

. Justice Thomas concurred in all of Shepard except its discussion of Apprendi. He would have gone further, declaring that in light of Apprendi v. New Jersey, 530 U.S. 466, 120 S.Ct. 2348, 147 L.Ed.2d 435 (2000), any judicial factfinding — including that permitted by Taylor and Almendarez-Torres v. United States, 523 U.S. 224, 118 S.Ct. 1219, 140 L.Ed.2d 350 (1998) — "would not give rise to constitutional doubt.... It would give rise to constitutional error.” Shepard, 544 U.S. at 28, 125 S.Ct. 1254 (Thomas, J., concurring).

. Aguila-Montes' sentence was increased under the U.S. Sentencing Guidelines on the basis of his prior conviction, but the applicable statutory maximum was not increased by the district court's fact-finding. If the majority had confined its discussion to the Guidelines, therefore, this case would not trigger any Sixth Amendment concern. See United States v. Booker, 543 U.S. 220, 232, 125 S.Ct. 738, 160 L.Ed.2d 621 (2005). But the majority opinion purports to enunciate new standards for the application of the modified categorical approach generally, see Bybee op. at 922 (“[O]ur conclusion ... will have wide repercussions beyond the limited issue in this case.”), and several federal statutes subject to the categorical approach do impose higher maximum penalties upon a finding of a qualifying prior conviction. See, e.g., United States v. Strickland, 601 F.3d 963, 967 (9th Cir.2010) (en banc) (applying 18 U.S.C. § 2252A(b)); United States v. Garcia-Cardenas, 555 F.3d 1049, 1051 (9th Cir.2009) (8 U.S.C. § 1326(b)); United States v. Durham, 464 F.3d 976, 986-87 (9th Cir.2006) (21 U.S.C. § 844(a)). Apprendi constrains the application of the modified categorical approach under those statutes.

Additionally, while the Apprendi concern does not apply to all Taylor applications, the same federal statutory definitions do apply in various contexts, and the Supreme Court has never countenanced affording different meaning to the same words in different contexts. For example, an alien is deportable under 8 U.S.C. § 1227(a)(2)(A)(iii) (in conjunction with 8 U.S.C. § 1101(a)(43)(F)) if convicted of a "crime of violence,” which is defined in 18 U.S.C. § 16(a) to include "an offense that has as an element the use, attempted use, or threatened use of physical force against the person or property of another.” That same definition appears in the Armed Career Criminal Act, 18 U.S.C. § 924(e)(2)(B)(i), which provides for enhanced sentences for certain defendants previously convicted of a "violent felony.” See also Leocal, 543 U.S. at 6-7, 125 S.Ct. 377 (explaining how the term "crime of violence” in 18 U.S.C. § 16, has "been incorporated into a variety of statutory provisions, both criminal and noncriminal” (footnote omitted)).

. According to the Bureau of Justice Statistics, over 95 percent of criminal convictions obtained in United States district courts in 2005 (the most recent year for which statistics are available) were the result of guilty pleas. See Mark Motivans, Federal Justice Statistics, 2005, Bureau of Just. Stat. Bull. (U.S. Dep’t of Justice, Washington, D.C.), Sept. 2008, at 5, available at http://bjs.ojp.usdoj.gov/content/pub/pdf/fjs05.pdf. Similarly, in 2006, 95 percent of convictions of state-court felony defendants in the seventy-five largest U.S. counties were by guilty plea. See Thomas H. Cohen & Tracey Kyckelhahn, Felony Defendants in Large Urban Counties, 2006, Bureau of Just. Stat. Bull. (U.S. Dep't of Justice, Washington, D.C.), May 2010, at 11, available at http://bjs.ojp.usdoj.gov/content/pub/pdf/fdluc06.pdf.

. This is not the only way in which the majority’s approach effectively serves as a one-way ratchet that always favors the government. The Supreme Court has expressly foreclosed the possibility that defendants could introduce evidence demonstrating that their particular convictions for crimes that are categorical matches for the generic crime did not fit the essential categorical elements— for example, that though most aggravated assaults involve violence, their particular conviction did not. See James, 550 U.S. at 208, 127 S.Ct. 1586 (“[The categorical approach does not require] that every conceivable factual offense covered by a statute ... necessarily present a serious potential risk of injury before the offense can be deemed a violent felony.... Rather, the proper inquiry is whether the conduct encompassed by the elements of the offense, in the ordinary case, presents a serious potential risk of injury to another.” (citation omitted)). The majority permits sentencing courts to go beyond the elements of the prior crime of conviction in order to expand the crimes qualifying for recidivist enhancements, but under James, those same courts may not go beyond the elements when doing so would benefit defendants.

. The only circumstance in which a defendant pleading guilty admits non-elemental facts is if he does so explicitly during the Rule 11 colloquy. See Forrester, 616 F.3d at 946; Cazares, 121 F.3d at 1247-48 ("[T]o attribute to a defendant an admission which was never subject to a plea colloquy under Fed. R.Crim.P. 11 would undermine the rule’s prophylactic purposes.... The appropriate course is ... for the government at the plea colloquy to seek an explicit admission of any unlawful conduct which it seeks to attribute to the defendant.”).

. Under the federal Sentencing Guidelines, defendants’ recommended sentences are lowered if they "accept responsibility” for their crimes. U.S.S.G. § 3E1.1; but see United States v. Green, 346 F.Supp.2d 259, 271 (D.Mass.2004) ("Under the Guidelines, an offender is eligible for a discount on his sentence if he 'accepts responsibility’ for his crime. Actually, this discount has nothing whatsoever to do with true acceptance of responsibility for one's acts.... What we mean by acceptance of responsibility is simply the discount offered for pleading guilty (earlier is better), thus saving the Department [of Justice] the trouble, expense, and uncertainty of a jury trial.”) (footnotes omitted), aff'd in part and vacated in part sub nom., United States v. Yeje-Cabrera, 430 F.3d 1 (1st Cir.2005) (vacating in light of United States v. Booker, 543 U.S. 220, 125 S.Ct. 738, 160 L.Ed.2d 621 (2005)).

. Even an allegation that the defendant "used a gun and/or an axe” would likely be found duplicitous, see, e.g., People v. Bauman, 12 N.Y.3d 152, 878 N.Y.S.2d 235, 905 N.E.2d 1164, 1165 (2009), a concept discussed later.

. It is this legal certainty that makes the following statement from the majority incorrect:

[T]he same reasons that motivate Judge Berzon to express confidence in the modified categorical approach in divisible statute cases suggest that we should have similar confidence in applying it to broad and missing element cases, so long as we are relying on the documents approved in Shepard. It is unclear why, according to Judge Berzon, these conviction records are unreliable when the conviction rests on a missing element statute, yet are perfectly reliable in determining under which part of a divisible statute a defendant was convicted.

Bybee op. at 935-36 n. 18. It is not that the conviction records are not "reliable” in the broad element or missing element statute— it’s that they aren’t useful, because they do not (and cannot) demonstrate that the factfinder was "actually required to find,” as a legal matter, the elements of the generic crime. Taylor, 495 U.S. at 602, 110 S.Ct. 2143. See United States v. Lewis, 405 F.3d 511, 515 (7th Cir.2005) ("The list in Shepard is designed to identify documents that illuminate what crime the defendant committed.... What matters is the fact of conviction, rather than the facts behind the conviction.”) (Easterbrook, J.).

. Application of the informal (or "constructive”) amendment doctrine varies a great deal amongst jurisdictions, but courts are generally consistent in finding a variance when the discrepancy between the indictment and the proof effects a shift from one statutory alternative to another — that is, in the divisible statute situation. For example, in Gray v. Raines, 662 F.2d 569 (9th Cir.1981), we considered the claim of a defendant convicted of statutory rape under former Ariz.Rev.Stat. § 13-611(B), covering statutory rape, when the information alleged only forcible rape under § 13-611(A). Although the Arizona courts had held that § 13-611 “merely stat[ed] the different circumstances under which sexual intercourse constitutes the crime of rape,” Raines, 662 F.2d at 571 (citation and quotation marks omitted), we held that the pivot from one statutory alternative to another violated the Sixth Amendment because statutory rape was not a lesser included offense of forcible rape. Id. at 573; see also, e.g., United States v. Figueroa, 666 F.2d 1375, 1379-80 (11th Cir.1982) (reversing a conviction where the indictment alleged actual force, but the defendant was convicted for seizing an aircraft by threat of force); United States v. Bizzard, 615 F.2d 1080, 1081-82 (5th Cir.1980) (reversing a conviction when the indictment alleged that the defendant had put two bank tellers' lives in jeopardy, but the proof and jury instructions required merely an assault). Thus, a constructive amendment of the charging document that prevents a later court from determining the precise elements a jury was required to find is least likely in the divisible statute situation.

. If neither the charging document nor the jury instructions/plea colloquy establish the statutory alternative under which the defendant was convicted, the inquiry under the modified categorical approach is, as Shepard and subsequent cases instruct, at an end. See Johnson, 130 S.Ct. at 1273 (“But absence of records will often frustrate application of the modified categorical approach....”). The sentencing court would then be unable to determine from those documents that the trier of fact "necessarily found” all the elements narrowing the crime of conviction to the generic crime. That is because juries need not agree "which of several possible sets of underlying brute facts make up a particular element.” As explained in Richardson:

Where, for example, an element of robbery is force or the threat of force, some jurors might conclude that the defendant used a knife to create the threat; others might conclude he used a gun. But that disagreement — a disagreement about means — would not matter as long as all 12 jurors unanimously concluded that the Government had proved the necessary related element, namely that the defendant had threatened force.

526 U.S. at 817, 119 S.Ct. 1707; accord Schad, 501 U.S. at 631, 111 S.Ct. 2491.

. Many states apply the Blockburger test to measure whether the prosecution’s proof or the jury instructions create impermissible variance from the charging document. See, e.g., Bell v. State, 296 Ark. 458, 757 S.W.2d 937, 942 (1988); People v. Jefferson, 934 P.2d 870, 872 (Colo.Ct.App.1996); State v. Matautia, 81 Hawai'i 76, 912 P.2d 573, 578 (Haw.Ct.App. 1996); Commonwealth v. Souza, 42 Mass.App.Ct. 186, 675 N.E.2d 432, 437 (1997); Thanos v. State, 282 Md. 709, 387 A.2d 286, 290 (1978); State v. Reed, 737 N.W.2d 572, 580-81 (Minn.2007); Wolfe v. State, 743 So.2d 380, 384 (Miss.1999); State v. Brown, 172 Mont. 41, 560 P.2d 533, 535-36 (1976); State v. Erickson, 129 N.H. 515, 533 A.2d 23, 25 (1987); State v. Woody, 29 Ohio App.3d 364, 505 N.E.2d 646, 646-47 (1986); Commonwealth v. Brown, 556 Pa. 131, 727 A.2d 541, 544 (1999); State v. Markle, 118 Wash.2d 424, 823 P.2d 1101, 1105-06 (1992); but see State v. Matson, 260 Kan. 366, 921 P.2d 790, 796 (1996); State v. Noltie, 116 Wash.2d 831, 809 P.2d 190, 197 (1991); see generally 5 LaFave et al., supra, § 19.6(b). Additionally, many state courts hold that when a defendant is convicted under a statutory alternative different from that alleged, the conviction is invalid. See, e.g., Fleming v. State, 814 So.2d 310, 311 (Ala.Crim.App.2001); State v. Mencer, 798 S.W.2d 543, 546 (Tenn.Crim.App.1990).

. The majority’s assertion that the states have no incentive to amend their statutes to better match federal recidivist statutes is unconvincing. See Bybee op. at 940 n. 20. Congress and state legislatures have concentric constituencies; insofar as Congress has decided that longer sentences for repeat offenders are warranted, one would expect that at least a substantial number of state legislatures would agree.

. Judge Rawlinson professes to be “puzzled” by the assertion that "unlawful entry is not an element of burglary under California law,” stating that "California law is expressly to the contrary.” Rawlinson op. at 982. But her contention is just a play on words. When used in the generic sense, an "unlawful” entry is one that is trespassory. In the California burglary context, however, "unlawful” just means that the defendant entered with a "larcenous or felonious intent.” See generally B.E. Witkin, 2 Witkin Cal.Crim. Law Crimes— Property § 123 (3d ed.2010). Of course, it’s the meaning, not the label, that matters. See United States v. Bowen, 527 F.3d 1065, 1077 n. 9 (10th Cir.2008) ("Abraham Lincoln once posed the following riddle: ‘How many legs does a dog have if you call the tail a leg?' The answer is, of course, 'four' because ‘calling a tail a leg doesn't make it a leg.' ”).

. California Penal Code § 459 provides:

Every person who enters any house, room, apartment, tenement, shop, warehouse, store, mill, barn, stable, outhouse or other building, tent, vessel, as defined in Section 21 of the Harbors and Navigation Code, floating home, as defined in subdivision (d) of Section 18075.55 of the Health and Safety Code, railroad car, locked or sealed cargo container, whether or not mounted on a vehicle, trailer coach, as defined in Section 635 of the Vehicle Code, any house car, as defined in Section 362 of the Vehicle Code, inhabited camper, as defined in Section 243 of the Vehicle Code, vehicle as defined by the Vehicle Code, when the doors are locked, aircraft as defined by Section 21012 of the Public Utilities Code, or mine or any underground portion thereof, with intent to commit grand or petit larceny or any felony is guilty of burglary. As used in this chapter, 'inhabited' means currently being used for dwelling purposes, whether occupied or not. A house, trailer, vessel designed for habitation, or portion of a building is currently being used for dwelling purposes if, at the time of the burglary, it was not occupied solely because a natural or other disaster caused the occupants to leave the premises.