Opinion by Judge FLETCHER, joined by Chief Judge HUG and Judges BROWNING, SCHROEDER, REINHARDT, DAVID R. THOMPSON, and GRABER. Dissent by Judge TROTT, joined by Judges BRUNETTI, O’SCANNLAIN, and KLEINFELD.
FLETCHER, Circuit Judge:
Miguel Sanchez-Rodriguez pled guilty to reentering the United States illegally following a felony conviction, 8 U.S.C. § 1326(a). The district court sentenced him under 8 U.S.C. § 1326(b)(1) to thirty months in custody after departing downward on three bases: the minor nature of the underlying felony conviction; the fact that Sanchez-Rodriguez stipulated to deportation and waived deportation proceedings; and the fact that, because of delays in indicting and sentencing Sanchez-Rodriguez, he lost the opportunity to serve a greater portion of his state sentence concurrently with his federal sentence. The government appealed. We granted en banc review sua sponte to consider whether two of our previous decisions, United States v. Rios-Favela, 118 F.3d 653 (9th Cir.1997), cert. denied, — U.S. —, 118 S.Ct. 730, 139 L.Ed.2d 668 (1998), and United States v. Mendoza, 121 F.3d 510 (9th Cir.1997), are inconsistent, and, if so, what law should govern this appeal. Concluding that they are inconsistent, we overrule our previous decision in Rios-Favela to the extent that it is inconsistent with this opinion, and affirm the district court.
I.
In May 1996 Sanchez-Rodriguez was sentenced in state court to a three year prison term for the sale of a controlled substance. The Immigration and Naturalization Service (INS) was notified of his arrest in April 1996. Seven months later the government charged Sanchez-Rodriguez with violating 8 U.S.C. § 1326(b)(1) by illegally reentering the United States subsequent to a felony conviction. The underlying felony was a 1988 conviction for the sale of heroin valued at $20 for which Sanchez-Rodriguez received three years probation (later revoked) and 90 days in county jail.
Sanchez-Rodriguez pled guilty to illegal reentry. No plea bargain was entered into. Pursuant to section 2L1.2 of the United States Sentencing Guidelines (U.S.S.G.), Sanchez-Rodriguez’ base level offense was eight.1 A 16-level enhancement was imposed because the predicate felony, the $20 heroin sale, was an aggravated felony.2 See U.S.S.G. § 2L1.2(b)(2). The Probation Office recommended a three-level reduction for acceptance of responsibility pursuant to U.S.S.G. § 3E1.1, bringing Sanchez-Rodriguez’ offense level to 21. His criminal history category was VI. The sentencing range for an offense level of 21 and a criminal history category of VI was 77 to 96 months. The Probation Office, however, recommended three bases for departure: (1) the small amount of drugs involved in the predicate offense used as the basis for the 16-level increase in Sanchez-Rodriguez’ offense level; (2) the lost opportunity to reduce the total time in custody due to the delay in charging and sentencing Sanchez-Rodriguez on the federal charge; and (3) Sanchez-Rodriguez’ agreement to waive a deportation hearing. The Probation Office recommended a 30-month sentence.
The government agreed that a departure for stipulating to deportation was appropriate, but argued that departure based on the other two factors listed was unwarranted.3 The district court adopted the recommendation of the Probation Office and imposed a 30 month sentence.4 The government timely appealed.
II.
We have jurisdiction pursuant to 18 U.S.C. § 3742(b) (Sentencing Guidelines) and 28 U.S.C. § 1291 (final judgments). We review a district court’s departure decision for an abuse of discretion. See United States v. Sablan, 114 F.3d 913, 916 (9th Cir.1997) (en banc) (citing Koon v. United States, 518 U.S. 81, 98-100, 116 S.Ct. 2035, 135 L.Ed.2d 392 (1996)), cert. denied, — U.S. —, 118 S.Ct. 851, 139 L.Ed.2d 752 (1998). In doing so, we give “substantial deference” to the district court’s decision to depart, “for it embodies the traditional exercise of discretion by a sentencing court.” Koon v. United States, 518 U.S. 81, 98, 116 S.Ct. 2035, 135 L.Ed.2d 392 (1996). Whether a factor is a permissible ground for departure is a matter of law, although “[l]ittle turns on whether ... we label review of this particular question abuse of discretion or de novo, for an abuse of discretion standard does not mean a mistake of law is beyond appellate correction.” Id. at 100, 116 S.Ct. 2035.
III.
In Rios-Favela, we addressed the precise issue presented by this appeal — whether the district court may depart downward based on the nature of the defendant’s felony that is predicate to his current conviction. Relying on two cases5 that had been decided before Koon, the Rios-Favela panel held that a district court may never depart downward based on the nature of the predicate felony, “[b]ecause the Commission adequately considered the nature of the predicate aggravated felonies warranting the sixteen-level adjustment.” Rios-Favela, 118 F.3d at 658.
In Mendoza, we considered whether a district court could depart based on the fact that the defendant, a middle-man between methamphetamine suppliers and their buyers, was unaware of the purity of the methamphetamine that he was delivering. The district court held that it had no authority to depart, because the Guidelines Commission already had taken into account the purity of the methamphetamine when it designed the Guidelines. We reversed, holding that the district court was not precluded from considering whether a downward departure was warranted based on the defendant’s lack of knowledge of and control over the purity of the methamphetamine that he was delivering. The panel noted that “[w]e are not at liberty, after Koon, to create additional categories of factors that we deem inappropriate as grounds for departure in every circumstance.” Mendoza, 121 F.3d at 513 (citing United States v. Cubillos, 91 F.3d 1342, 1344 (9th Cir.1996)).
We conclude that the reasoning and holding of Mendoza are consistent with the Supreme Court’s approach in Koon, while Rios-Favela is antithetical. The district court may depart in its discretion based on the nature or circumstances of an underlying aggravated felony. In so holding, we join (albeit for different reasons) the Eighth Circuit, the only other circuit court to consider this issue since Koon was decided. See United States v. Diaz-Diaz, 135 F.3d 572 (8th Cir.1998).
A.
Koon made clear that we cannot categorically forbid a district court from departing downward on any basis except for those specifically proscribed in the Guidelines.6 The government raised the exact argument in Koon that it raises to us in the instant case — that certain factors simply are not proper bases for departure. See 518 U.S. at 106, 116 S.Ct. 2035 (“As an initial matter, the Government urges us to hold each of the factors relied upon by the District Court to be impermissible departure factors under all circumstances.”). The Court rejected that argument, holding that
Congress did not grant federal courts authority to decide what sorts of sentencing considerations are inappropriate in every circumstance.... The Commission set forth factors courts may not consider under any circumstances but made clear that with those exceptions, it “does not intend to limit the kinds of factors, whether or not mentioned anywhere else in the guidelines, that could constitute grounds for departure in an unusual case.”
Id. (citing 1995 U.S.S.G. ch. I, pt. A, intro, cmt. 4(b)) (emphasis added). In deciding whether a factor may never be used as a basis for departure, we must ask only “whether the Commission has proscribed, as a categorical matter, consideration of the factor” at issue. Id. at 109, 116 S.Ct. 2035. Here, the Commission has not precluded consideration of the nature of a defendant’s predicate felony. We may not categorically forbid departure on this basis.
B.
The government insists that the district court nevertheless may not depart in the instant case, because the Guidelines provide for a 16-level enhancement if a defendant has been convicted of a felony, defined to include “any illicit trafficking in any controlled substance.” U.S.S.G. § 2L1.2, cmt. (n.7). The government argues that this definition demonstrates Congressional intent to limit the district court’s ability to consider, in its discretion, whether the nature of a defendant’s underlying drug trafficking offense warrants departure. We disagree.
As noted, the Court has held that the Commission in fact did not intend to preclude consideration of any factor except those that the Guidelines specifically forbid. Further, while it is true that any drug trafficking offense will trigger the 16-level enhancement for illegal reentry following an aggravated felony, the district court’s inquiry does not end there. After applying the enhancement, the district court must then determine whether a downward departure is warranted. See U.S.S.G. § 1B1.1 (outlining the steps taken in determining an appropriate sentence under the Guidelines). As the Commission itself has made clear, a departure may be warranted even if a specific guideline “linguistically applies” to the defendant’s actions, and even if the basis for departure is mentioned specifically in the Guidelines. See U.S.S.G. ch. 1, pt. A, intro, cmt. 4(b); Koon, 518 U.S. at 93, 116 S.Ct. 2035.
Koon tells us that when determining whether departure based on a particular factor in a specific case is warranted, the district court, and subsequently we, should assess whether that factor is an encouraged factor, a discouraged factor, or a factor unmentioned in the Guidelines. Id. at 94-95, 116 S.Ct. 2035. If it is an unmentioned factor, as it is here, a district court contemplating whether to depart, after considering the “structure and theory of both the individual guideline and the Guidelines taken as a whole,” must determine whether the factor takes the case out of the heartland of the Guidelines. Id. at 96, 116 S.Ct. 2035. Here, the district court did just that when it determined that the $20 heroin sale committed by Sanchez-Rodriguez was not comparable to, and not proportional to, the typical crimes of defendants who receive the 16-level enhancement. See Koon, 518 U.S. at 104-05, 116 S.Ct. 2035 (holding that the proper comparison is among all defendants who are sentenced pursuant to the same Guideline).7 Whether or not a factor makes a case unusual is a determination particularly suited to the district court, “informed by its vantage point and day-to-day experience in criminal sentencing.” Id. at 98, 116 S.Ct. 2035. According due deference to the district court, we cannot say that it was an abuse of discretion to hold that a $20 heroin sale is different in kind and degree from, or outside of the norm of, other offenses, including murder and large-scale drug operations, that similarly trigger the 16-level enhancement. The district court has an “institutional advantage” in making this assessment, for it “see[s] so many more Guideline cases than [we] do”.8 Koon, 518 U.S. at 98, 116 S.Ct. 2035.
Our conclusion is bolstered further by a previous decision of this court in which we held that a similar disproportionality among offenses triggering the same sentencing outcome may warrant a downward departure. In United States v. Reyes, 8 F.3d 1379 (9th Cir.1993), a pre-Koon decision, we reviewed a sentence involving a substantial downward departure under circumstances closely analogous to those presented by the instant case. The defendant was convicted of illegal reentry and distribution of marijuana and cocaine. The applicable Guideline range was 33 to 41 months. However, because of his previous convictions, Reyes qualified as a career offender. See U.S.S.G. § 4B1.1 (1995). The Guidelines provide that a defendant is a career offender if he was at least 18 at the time of the instant offense, the instant offense resulted in a felony conviction for a crime of violence or a controlled substance offense, and the defendant had at least two prior convictions involving either crimes of violence or controlled substance offenses. Id. Because Reyes qualified as a career offender, his sentencing range increased to 210 to 262 months. As with the enhancement for an aggravated offense pursuant to U.S.S.G. § 2L1.2, the Guidelines were silent as to what consideration, if any, the district court could give to the nature of the previous controlled substance offenses.
The district court departed from that range on the ground that although Reyes had committed two previous controlled substance offenses, both involved very small amounts of drugs and thus were minor in nature as compared to other offenses that would trigger the career offender enhancement. Reyes, 8 F.3d at 1384. We held that this disproportionality justified the district court’s decision to depart. Id. at 1387. We specifically rejected the same argument that the government makes in this case — that if the district court believes that the defendant’s previous offenses were relatively minor, the district court is restricted to departing vertically only, that is, along the criminal history axis. Id. at 1388-89. Rather, we approved a departure along the base offense level axis.9
C.
In reaching our decision, we reject the contention of both parties that the recent revisions to section 2L1.2 of the Guidelines affect or control the outcome of this case. In 1995, commentary to that section provided that “ ‘aggravated felony’ as used in subsection (b)(2) means ... any illicit trafficking in any controlled substance.” U.S.S.G. § 2L1.2, cmt. (n.7). In 1997, subsection (b)(2) was replaced by new subsection (b)(1)(A). See U.S.S.G., App. C., Amendment 563 (1997). Those who reenter the United States illegally after having been convicted of an aggravated felony still receive a 16-level enhancement of their base offense levels. However, note seven in the commentary (defining “aggravated felony”) was replaced by two new notes. The new note five provides:
Aggravated felonies that trigger the adjustment from subsection (b)(1)(A) vary widely. If subsection (b)(1)(A) applies, and (A) the defendant has previously been convicted of only one felony offense; (B) such offense was not a crime of violence or firearms offense; and (C) the term of the imprisonment imposed for such offense did not exceed one year, a downward departure may be warranted based on the seriousness of the aggravated felony.
U.S.S.G. § 2L1.2, cmt. (n.5) (1988). The parties debate the significance of this amendment and disagree about whether it is substantive or merely clarifying.10 The Eighth Circuit relied on this amendment in its decision permitting the district court to depart based on the nature of the underlying felony. See Diaz-Diaz, 135 F.3d at 581. The Eighth Circuit held that the amendment clarified the 1995 version of the Guidelines and established that the seriousness of the predicate felony was an encouraged ground for departure. Id.
Although we agree with the ultimate decision reached by the Eighth Circuit, we reach the same conclusion without reference to the new amendment, and without deciding whether the amendment is clarifying or substantive.11 For the reasons previously stated, we hold that section 2L1.2, as drafted in 1995 and as applied to Sanchez-Rodriguez, does not preclude a district court from considering the nature of the aggravated offense when deciding whether to depart from the Guidelines’ sentencing range. The new amendment does not affect our decision.12
IV.
The government also argues that the district court erred in departing downward based on the fact that, because of the delay in indicting and sentencing Sanchez-Rodriguez with illegal reentry,13 he lost the opportunity to serve a greater portion of his state sentence concurrently with his federal sentence.14 Although the government failed to raise this argument below, we choose to exercise our discretion to address this issue. See Bolker v. Commissioner, 760 F.2d 1039, 1042 (9th Cir.1985) (holding that we generally will not consider an issue not first raised below but may do so if the issue is purely legal and does not depend on development of the record).
The government insists that departure based on time served in state custody is an impermissible ground for departure, relying on United States v. Huss, 7 F.3d 1444 (9th Cir.1993), and United States v. Daggao, 28 F.3d 985 (9th Cir.1994). In Huss, we affirmed the district court’s holding that it had no authority to depart downward to compensate for the time the defendant already had spent in state custody. 7 F.3d at 1448. We relied on United States v. Wilson, 503 U.S. 329, 112 S.Ct. 1351, 117 L.Ed.2d 593 (1992), in which the Court held that the district court lacks the authority at sentencing to grant credit for time served in detention before sentencing. Id. at 333, 112 S.Ct. 1351. The Court reasoned that pursuant to 18 U.S.C. § 3585(b), only the Attorney General, through the Bureau of Prisons, could determine whether credit for time served is appropriate. Id. Wilson did not address the propriety of granting a downward departure for time served. We, however, interpreted Wilson to preclude the possibility of a downward departure for time served in state custody. See Huss, 7 F.3d at 1449; see also Daggao, 28 F.3d at 987 (holding that the district court may not depart downward to take into account the time defendant spent in in-house detention prior to sentencing).
The absolute bar to downward departure that Huss and Daggao pronounced is no longer appropriate given the Supreme Court’s intervening decision in Koon. See United States v. Cubillos, 91 F.3d 1342, 1344 (9th Cir.1996) (“After Koon, federal courts can no longer categorically proscribe a basis for departure — unless the Commission has proscribed, as a categorical matter, consideration of the factor.”). Cases that proscribe categorically consideration of a factor not specified as one of the forbidden factors were impliedly overruled by Koon. See United States v. Sherpa, 110 F.3d 656, 661-62 (9th Cir.1996); see also United States v. Brock, 108 F.3d 31, 35 (4th Cir.1997) (overruling earlier Fourth Circuit precedent that had categorically forbidden departure based on post-offense rehabilitation because of the intervening effect of the Koon decision). To the extent that Huss and Daggao can be interpreted as categorically forbidding departure based on time already served in state custody, or the lost opportunity to serve more of one’s state term concurrently with one’s federal term, they are overruled.15
The lost opportunity to serve more of one’s state term concurrent with one’s federal term is a factor unmentioned by the Guidelines. Departure based on an unmentioned factor is permissible if the factor takes the case out of the heartland of the Guidelines. Koon, 518 U.S. at 94, 116 S.Ct. 2035. We cannot say that it was an abuse of discretion for the district court to conclude that Sanchez-Rodriguez’s lost opportunity takes this case out of the heartland of the Guidelines and to grant departure on this basis in this case. The district court noted that the delay in charging and sentencing Sanchez-Rodriguez resulted in a lost opportunity to reduce his total time in custody and was “entirely arbitrary,” a circumstance warranting departure. We have held in analogous circumstances that departure is warranted if a harsher sentence is imposed because of the “fortuity of delay.” See United States v. Martinez, 77 F.3d 332, 337 (9th Cir.1996) (noting that a harsher sentence imposed because of a fortuitous delay in charging the defendant with offenses that would have been grouped together for sentencing purposes, but for the delay in bringing those charges, is a mitigating circumstance not taken into consideration by the Guidelines); see also United States v. Saldana, 109 F.3d 100, 104 (1st Cir.1997) (noting that it was “possible” that a departure might be granted “where a careless or even an innocent delay produced sentencing consequences so unusual and unfair that a departure” would be warranted). According the district court the deference that it is due and relying on our precedent for departure under analogous circumstances, we conclude that the district court did not abuse its discretion in departing downward, in part, for the lost opportunity to serve state and federal time concurrently.
V.
We affirm the sentencing decision of the district court. We overrule United States v. Rios-Favela, 118 F.3d 653 (9th Cir.1997), to the extent that it is inconsistent with this opinion. We overrule United States v. Huss, 7 F.3d 1444 (9th Cir.1993), and United States v. Daggao, 28 F.3d 985 (9th Cir.1994), to the extent that they categorically forbid departure based either on time already served in state custody, or on the lost opportunity to serve more of one’s state term concurrently with one’s federal term.
AFFIRMED.
. All references to the Guidelines are to the November 1, 1995, edition unless otherwise specified.
. An aggravated felony includes "any drug trafficking crime as defined in 18 U.S.C. § 924(c)(2).” U.S.S.G. § 2L1.2(b), cmt. (n.7).
. The government was willing to concede that two points may be deducted for stipulation to deportation and argued that Sanchez-Rodriguez’ offense level would thus be 19, and that he should be sentenced to 63 months, the low end of the applicable 63-to-78-month range.
. The district court departed nine levels. While the court failed to specify how many levels it was departing for each basis, it did note that it was departing between one and four points for the minor nature of the underlying felony.
. See United States v. Amaya-Benitez, 69 F.3d 1243 (2d Cir.1995); United States v. Maul-Valverde, 10 F.3d 544 (8th Cir.1993). Maul-Valverde implicitly was overruled by the Eighth Circuit's subsequent decision in United States v. Diaz-Diaz, 135 F.3d 572 (8th Cir.1998), when it considered the effect of the intervening decision in Koon and the new amendments to U.S.S.G. § 2L1.2.
. These factors include race, gender, national origin, creed, religion, socio-economic status, lack of guidance as a youth, drug or alcohol dependence, and economic hardship. Koon, 518 U.S. at 95, 116 S.Ct. 2035. See U.S.S.G. §§ 5H1.10, 5H1.12, 5H1.4, 5K2.12.
. The dissent claims that the district court erred because the reference to "any" drug-trafficking offense in U.S.S.G. § 2L1.2 means that any drug-trafficking offense is within the heartland of that guideline. The dissent here employs an inappropriate, linguistically-bound reading of the Guidelines.
The Guidelines provide for departure where "a particular guideline linguistically applies but where conduct significantly differs from the norm.” U.S.S.G. ch. I, pt. A, intro, cmt. 4(b). According to the dissent, the use of the word “any” in § 2L1.2 means that any drug-trafficking offense fits within the norm, or heartland, of that provision. By this rationale, the scope of activities linguistically encompassed by § 2L1.2 is coterminous with the scope of activities normally falling under the provision. This reading renders meaningless the Guidelines’ distinction between acts to which a guideline "linguistically applies” and those constituting the "norm” or "heartland” of the guideline, and is precisely the sort of reading against which the Guidelines themselves warn.
Moreover, the dissent’s argument misapplies Koon. In Koon, the Supreme Court specifically held that the determination of whether a factor takes a case outside the heartland is not made "as a general proposition.” Koon, 518 U.S. at 99, 116 S.Ct. 2035. Rather, the sentencing court must consider whether the "particular factor is within the heartland given all the facts of the case." Id. at 100, 116 S.Ct. 2035 (emphasis added). The dissent’s reading of § 2L1.2 would make such case-specific analysis impossible.
. "In 1994, for example, 93.9% of Guidelines cases were not appealed. Letter from Pamela G. Montgomery, Deputy General Counsel, United States Sentencing Commission (Mar. 29, 1996).” Koon, 518 U.S. at 98-99, 116 S.Ct. 2035.
. The Eighth Circuit has made a similar determination. In United States v. Smith, 909 F.2d 1164 (8th Cir.1990), the defendant qualified as a career offender pursuant to the Guidelines. The court held that a downward departure was warranted given "the relatively minor nature of Smith's [previous] crimes, the briefness of his career, and his age at the time the crimes were committed.” Id. at 1169. The court noted that although factors making Smith a career offender were present, they were "only barely present.” Id. at 1170.
. If the amendment substantively affects the interpretation of a guideline and increases the punishment, the amended portion does not apply to preenactment conduct; if the amendment merely clarifies the interpretation of a guideline, it may be applied retroactively. See United States v. Washington, 66 F.3d 1101, 1103 (9th Cir.1995).
. The Commission has indicated that the new amendment is clarifying, not substantive. While we give some deference to the Commission's conclusion, we are not bound by it. Washington, 66 F.3d at 1104.
. If the amendment is substantive and adversely affects this defendant, it would not apply. If it is clarifying, it would apply and would support our conclusion that the district court may depart downward based on the nature of a defendant's predicate felony. Thus, we reject the government’s argument that the new note five to section 2L1.2 limits the circumstances in which departure is warranted.
. The Probation Office determined that Sanchez-Rodriguez would have been able to serve ten more months of his state term concurrently with his federal term had he been indicted and sentenced in a more timely fashion. The INS was informed of his presence in the United States in April 1996. He pled guilty very soon after being charged, and was sentenced in March 1997.
. Sanchez-Rodriguez would have lost this opportunity only if the district court determined that he could serve what remained of his state sentence concurrently with the beginning of his federal sentence. Having concluded that concurrent sentences were appropriate pursuant to U.S.S.G. § 5G1.3, the district court departed for this lost opportunity.
. We note that the basis for departure in the instant case is somewhat different from that asserted in Huss. There, the defendant asked for a departure based simply on time served in state custody. Here, Sanchez-Rodriguez makes a slightly different argument. He asks for a departure based on the lost opportunity to serve more of his state term concurrent with his federal term because of the arbitrary delay in charging and sentencing him in regard to the federal charges. That distinction does not affect our holding, however, because we overrule the primary holding of Huss on the basis of Koon.