except for footnote 12.
I respectfully dissent from the result reached by the majority. The majority has affirmed the District Court’s admission into evidence of Bonjo and Martin’s guilty pleas,1 over the defendants’ objection, despite the defendants’ agreement not to mention the guilty pleas on cross-examination or to raise any inference which these guilty pleas might rebut. I believe that in doing so the majority deviates from the result mandated by Federal Rules of Evidence 403 and 608. Moreover, the majority’s holding would now make it possible for the government in a criminal case to introduce the guilty plea of a defendant’s accomplice simply by claiming that this evidence must be admitted for the jury to properly assess the testifying accomplice’s credibility. Because I conclude that the District Court abused its discretion by admitting the guilty pleas into evidence, I would reverse the convictions of Lukesh and Universal and remand this case to the District Court for a new trial.
I. A.
To demonstrate how the majority’s opinion deviates from our existing precedent, I will first place this case in a historical context. In 1949, in United States v. Toner, we first considered whether the guilty plea of a conspirator was admissible as evidence in the criminal trial of an alleged co-conspirator. See United States v. Toner, 173 F.2d 140 (3d Cir.1949). In Toner, we ultimately held that the trial court’s admission of an alleged co-conspirator’s guilty plea, combined with a defective limiting instruction, required reversal of the defendant’s conviction. See id. at 142. The Toner Court’s reasoning, articulated by Judge Goodrich, forms the foundation upon which the present case must be decided:
From the common sense point of view[,] a plea of guilty by an alleged fellow conspirator is highly relevant upon the question of the guilt of another alleged conspirator. If A’s admission that he conspired with B is believed, it is pretty hard to avoid the conclusion that B must have conspired with A. This is one of the cases, therefore, where evidence logically probative is to be excluded because of some countervailing policy. There are many such instances in the law.
The foundation of the countervailing policy is the right of every defendant to stand or fall with the proof of the charge made against him, not against somebody else. Acquittal of an alleged fellow conspirator is not evidence for a man being tried for conspiracy. So, likewise, conviction of an alleged fellow conspirator after a trial is not admissible as against one now being charged. The defendant had a right to have his guilt or innocence determined by the evidence presented against him, not by what has happened with regard to a criminal prosecution against someone else. We think that the charge given upon this point was contrary to that rule and inadvertently, of course, deprived the defendant of a very substantial protection to which he was entitled.
See id. (citations omitted). As Toner highlighted, the danger of unfair prejudice when admitting the guilty plea of a co-defendant is more acute if the charge in question is conspiracy because a conspiracy requires an agreement between two or more individuals. See, e.g., United States v. Davis, 183 F.3d 231, 244 (3d Cir.1999) (“A conspiracy requires agreement between at least two people to the illegal object of the conspiracy, though other participants need not be indicted.”) (citing United States v. Delpit, 94 F.3d 1134, 1150 (8th Cir.1996); United States v. Krasovich, 819 F.2d 253, 255 (9th Cir.1987)). If two defendants allegedly conspired, and one defendant has been convicted or has pleaded guilty, the clear implication is that the other defendant is also guilty. This point has been re-emphasized in subsequent Third Circuit case law:
The guilty plea to a conspiracy charge carries with it more potential harm to the defendant on trial because the crime by definition requires the participation of another. The jury could not fail to appreciate the significance of this and would realize, as the court said in a similar case, United States v. Harrell, 436 F.2d 606, 614 (5th Cir.1970), that “it takes two to tango.”
United States v. Gullo, 502 F.2d 759, 761 (3d Cir.1974). Consistent with our holding in Toner, we have subsequently held on many occasions that a witness’s guilty plea cannot be admitted for the purpose of proving a defendant’s guilt. See United States v. Cohen, 171 F.3d 796, 801 (3d Cir.1999) (“[T]he plea agreements of co-conspirators are not admissible to prove the defendant’s guilt.”); United States v. Gaev, 24 F.3d 473, 476 (3d Cir.1994) (“It is well established that the plea agreements of co-conspirators cannot be used as evidence of a defendant’s guilt.”); Government of the Virgin Islands v. Mujahid, 990 F.2d 111, 115 (3d Cir.1993) (“It is well-established that a co-defendant’s guilty plea is not admissible to prove the defendant’s guilt.”); United States v. Werme, 939 F.2d 108, 113 (3d Cir.1991) (“We have long recognized that evidence of another party’s guilty plea is not admissible to prove the defendant’s guilt.”).2
Implicit in, and necessary to, the reasoning of Toner and subsequent cases is the principle that if a witness’s guilty plea is to be admissible at all, it must be admissible for some purpose other than proving the defendant’s guilt. See Cohen, 171 F.3d at 801 (holding that an alleged co-conspirator’s plea agreement is admissible for “some purposes”); Gaev, 24 F.3d at 476 (holding that an alleged co-conspirator’s guilty plea is admissible for “some valid purpose[s]”); United States v. Thomas, 998 F.2d 1202, 1205 (3d Cir.1993) (holding that an alleged co-conspirator’s guilty plea is admissible for “limited purposes”); Mujahid, 990 F.2d at 115 (holding that an alleged co-conspirator’s guilty plea is admissible for “other[ ] permissible purposes”); Werme, 939 F.2d at 113 (holding that another party’s guilty plea is admissible for “other[ ] permissible purposes”); United States v. Gambino, 926 F.2d 1355, 1363 (3d Cir.1991) (holding that an alleged co-conspirator’s guilty plea is admissible for “some valid purpose[s]”). Thus, the guilty plea is inadmissible, as a matter of law, unless presented for a valid or permissible evidentiary purpose. See, e.g., Thomas, 998 F.2d at 1203-06.
We have then, despite this general rule against the introduction of a witness’s guilty plea, recognized three valid, permissible purposes for which a guilty plea can be admitted into evidence. First, it may be admitted “in order to rebut defense counsel’s persistent attempts on cross-examination to raise an inference that the co-conspirators had not been prosecuted and that [the defendant] was being singled out for prosecution.” United States v. Inadi, 790 F.2d 383, 384 n. 2 (3d Cir.1986).
Second, a guilty plea may be admitted “on direct examination” in order “to dampen subsequent attacks on credibility, and to foreclose any suggestion that the party producing the witness was concealing evidence.” Gambino, 926 F.2d at 1364. This situation arises most often when the defense plans to attack an accomplice’s testimony as being fabricated so that he might receive a less severe punishment in return for testifying.
Finally, although not relevant to this case, a guilty plea may be admitted “to rebut the defense assertion that [the witness] was acting as a government agent when he engaged in the activities that formed the basis for [his guilty] plea.” Werme, 939 F.2d at 114.
In addition, some Third Circuit cases have suggested (in dicta) a fourth permissible or valid purpose. For example, in Gaev we suggested that, “[i]t may also be proper to introduce a witness’s guilty plea to explain his first-hand knowledge of the defendants’ misdeeds.” Gaev, 24 F.3d at 476 (emphasis added) (citing United States v. Halbert, 640 F.2d 1000, 1005 (9th Cir.1981)). I am left wondering, however, how the introduction of a witness’s guilty plea into evidence establishes the basis for his or her firsthand knowledge of the crime. Presumably, all that the introduction of the guilty plea establishes is that the witness pleaded guilty. It is the witness’s testimony itself that establishes the basis for his or her firsthand knowledge of the crime — the witness has firsthand knowledge because s/he was present during or participated in the crime, not because s/he pleaded guilty to the crime.
B.
In the present case, because the defendants agreed not to challenge the witnesses’ credibility based on their plea agreements, we are presented with a more focused question than we met in Toner: Whether and under what circumstances a trial court can admit into evidence the guilty plea of an alleged accomplice, over the defendant’s objection, when the defendant agrees not to mention the guilty plea on cross-examination and not “to raise any inference on which the accomplices’ pleas of guilty would be admissible to rebut.”3
In United States v. Thomas, 998 F.2d 1202 (3d Cir.1993), we first considered this more restricted issue. The District Court in Thomas had admitted two co-conspirators’ guilty pleas into evidence, concluding that admission was proper for the limited purposes of “aid[ing] the jury in assessing [the witnesses’] credibility,” “establish[ing] the [witnesses’] acknowledgment of their participation in the offense,” and “counter[ing] the inference that [the witnesses] had not been prosecuted.” Thomas, 998 F.2d at 1204. In reviewing the trial court’s decision to admit the guilty pleas into evidence, we noted that the Third Circuit had recognized two relevant, valid or permissible purposes for which an alleged co-conspirator’s guilty plea could be introduced into evidence, “to blunt the impact on a government witness’s credibility of having evidence of a guilty plea brought out on cross examination by the defense,” and “to prevent any improper inference by the jury that the defendant has been singled out for prosecution while the co-conspirators have not been prosecuted.” Id. at 1205. We reasoned, however, that neither purpose justified admitting the guilty pleas into evidence, because the defendant had agreed not to challenge the witnesses’ credibility based on their guilty pleas, and because the defendant had not suggested he was being selectively prosecuted. See id.4. We rejected the District Court’s claim that the alleged co-conspirators’ guilty pleas were admitted into evidence in order to establish their acknowledgment of their participation in the crime, pointing out that defense counsel did not challenge the witnesses’ assertion that they participated in the crime. See id.
Balancing the danger of unfair prejudice associated with the admission of the guilty pleas against their probative value pursuant to Federal Rule of Evidence 403, we ultimately held in Thomas that, “[i]n the absence of a proper purpose for the admission of the guilty pleas, the curative instructions of the district court were not sufficient to remove the prejudice to Thomas presented by the evidence of his co-conspirators’s [sic] guilty pleas.” Id. at 1206. We concluded that we were “not left with the requisite ‘sure conviction that the error did not prejudice the defendant’ ” and thus concluded that “the introduction at trial of evidence of Thomas’s co-conspirators’s [sic] guilty pleas was reversible error.” Id. at 1207 (quoting United States v. Jannotti, 729 F.2d 213, 219-20 (3d Cir.1984)).
Judge Rosenn filed a vigorous dissent in Thomas, arguing that the alleged co-conspirators’ guilty pleas were properly admitted “(1) to bolster the credibility of the co-conspirators as prosecution witnesses; (2) to quell the inference that the co-conspirators were not prosecuted and that Thomas was thus ‘singled out’ for punishment; and (3) to establish the basis for the co-conspirators’ firsthand knowledge of the crime about which they testified.” Id. at 1208 (Rosenn, J., dissenting). Contending that the alleged co-conspirators’ credibility would be at issue regardless of the defense’s assurance that it would not attack the witnesses’ credibility with respect to their guilty pleas, Judge Rosenn acknowledged that his dissent was at odds with the Third Circuit’s holding in Toner. “One could argue that credibility is always at issue and that my position thus effectively overrules Toner.” Id. at 1209. However, Judge Rosenn argued that his position was in fact consistent with the holding in Toner.
[A] witness’s credibility is only at issue when he or she testifies about a relevant and disputed fact. Moreover, Toner merely states that a guilty plea of a witness cannot be used to establish the guilt of the defendant. Thus, even if the guilty plea is always admissible for the purpose of establishing the credibility of the witness, that does not overrule Toner: Toner would still require a limiting instruction, similar to the ones given by the trial judge in the present case, to insure that the jury understands that the guilty plea cannot be used to establish the guilt of the defendant.
Id.
I cannot, however, accept the implications of this explanation, just as I cannot accept the majority’s position, unless there has been a meaningful weighing of the probative value of the guilty pleas against the danger of unfair prejudice, as required by Federal Rule of Evidence 403.
C.
Subsequent to our decision in Thomas, we again addressed whether the trial court erred by admitting into evidence the guilty plea of an alleged co-conspirator, even though the defendant agreed not to challenge the alleged co-conspirator’s credibility nor to raise any inference that would make the guilty plea admissible. See United States v. Gaev, 24 F.3d 473, 474-79 (3d Cir.1994). On facts nearly identical to those in Thomas, we held in Gaev that the alleged co-conspirators’ guilty pleas had been properly admitted into evidence. See id. at 479. In conducting the requisite Rule 403 balancing, we concluded, consistent with Judge Rosenn’s dissent in Thomas, that “[w]hen a co-conspirator testifies that he took part in the crime with which the defendant is charged, his credibility will automatically be implicated.” Gaev, 24 F.3d at 477 (emphasis added). Ultimately, in Gaev we went beyond the confines of Judge Rosenn’s dissent in Thomas, holding that a witness’s credibility in a case like Thomas will “automatically” be at issue. Prior to Gaev, this proposition, that “[w]hen a co-conspirator testifies that he took part in the crime with which the defendant is charged, his credibility will automatically be implicated,” had not arisen in Third Circuit jurisprudence. The consequences of the extension of such an expansive interpretation of our prior case law are illustrated by the majority’s opinion in this case.
II. A.
Federal Rule of Evidence 403 states that:
Although relevant, evidence may be excluded if its probative value is substantially outweighed by the danger of unfair prejudice, confusion of the issues, or misleading the jury, or by considerations of undue delay, waste of time, or needless presentation of cumulative evidence.
Thus, evidence that is otherwise admissible and probative of guilt must sometimes be excluded because of the danger of unfair prejudice to the defendant. See, e.g., United States v. Sriyuth, 98 F.3d 739, 746 (3d Cir.1996).
The District Court, in balancing the danger of unfair prejudice associated with Bonjo and Martin’s guilty pleas against their probative value, concluded that the probative value was not substantially outweighed by the danger of unfair prejudice. The majority, endorsing this conclusion, states:
The District Court heard argument on the defendant’s [sic] motion in limine and accompanying arguments concerning Bonjo and Martin plea agreements and guilty pleas at three separate instances during this criminal proceeding: (1) on May 3, 1995, prior to the testimony of FBI Agent Cook (App. at 806); (2) on May 9, 1995, prior to the testimony of Dr. Paul C. Moock, Jr. (App. at 1768); and (3) subsequent to trial in ruling upon the defendants’ post-trial motions. At each instance, the District Court carefully and meticulously weighed the ... factors of credibility, selectivity, and witness knowledge that inform the probative value versus prejudicial effect standard required by Federal Rule 403. At each instance, the District Court’s balancing was careful and comprehensive in concluding that the probative value of Bonjo and Martin’s plea agreements and guilty pleas outweighed any prejudicial effect.
Majority Opinion at 669-70. The record, however, belies this contention.
On May 3, 1995, prior to the testimony of FBI Agent Cook, the District Court first heard argument on the defendants’ motion in limine. See App. at 806-17.5 After hearing argument on the motion, the District Court did not “carefully and meticulously weigh[ ] the ... factors of credibility, selectivity, and witness knowledge that inform the probative value versus prejudicial effect standard required by Federal Rule 403,” nor did the District Court “careful[ly] and comprehensive[ly]” conclude “that the probative value of Bonjo and Martin’s plea agreements and guilty pleas outweighed any prejudicial effect.” Rather, the District Court simply stated: “I’ll take all the time I have available to think about this.” App. at 816.
On May 9, 1995, prior to Dr. Paul C. Moock’s testimony, the District Court ruled on the defendants’ motion in limine. The District Court did not hear further argument on the motion, nor did the District Court “carefully and meticulously weigh[ ] the ... factors of credibility, selectivity, and witness knowledge that inform the probative value versus prejudicial effect standard required by Federal Rule 403.” The District Court simply made the following statement:
All right, I have weighed all of the factors and I think in the context of this case we have had and from what I know of or have heard by way of reference to Julia Blum [Bonjo] and Penny Martin, I think it sounds to me as if they are somewhat higher up in the structure. And if they testify the jury is going to certainly wonder whether or not they have been charged. It’s going to wonder perhaps what they have been promised by the prosecutor if anything and what they may be getting in return for their testimony.
I think in weighing all those factors with the possible prejudice that I am going to allow the Government to bring out the fact of the guilty plea and the fact of the guilty plea agreement....
I think this is exactly like the Gave [sic] case, only there are more reasons here, because there are so many people who have testified and in their testimony have indicated a certain amount of wrong doing. And they — it’s pretty obvious haven’t been charged and I think it raises a very serious question in the minds of the jury, especially as to people who are as I said before, higher up in the structure. What are they getting for their testimony, how is it that these people haven’t been charged and it’s better in my opinion that the jury know it all. That’s the basis of the reason.
App. at 1768, 1771-72. The language quoted above clearly indicates that the District Court did little if any balancing but instead simply concluded that Bonjo and Martin’s guilty pleas were admissible. In fact, the District Court mentioned only two of the factors that the majority highlights, glossing over them in cursory form: first, credibility, “what are they getting for their testimony,” and, second, selective prosecution, “how is it that these people haven’t been charged.” Moreover, no mention is made by the District Court of the defendants’ commitment not to raise these issues or of the possibility of admitting the pleas on rebuttal if the defendants reneged on their commitment. The majority’s characterization of the District Court’s Rule 403 analysis as “careful,” “meticulous” and “comprehensive” is undermined by this cursory Rule 403 analysis.
B.
As set forth in Federal Rule of Evidence 403, and as the majority acknowledges, this case turns on whether the District Court properly weighed the probative value of Bonjo and Martin’s guilty pleas against the danger of unfair prejudice to the defendants. Because a proper Rule 403 analysis must consider both the probative value of the guilty pleas, as well as the danger of unfair prejudice associated with the pleas, I will first assess their probative value.
The District Court concluded that the probative value of Bonjo and Martin’s guilty pleas was limited to eliminating the appearance of selective prosecution and to informing the jury what the witnesses were receiving in exchange for their testimony. It is beyond question, however, that the probative value of this type of information would have been minimized by the defendants’ commitment not to “raise the guilty plea/plea agreements on cross examination nor[ ] to raise any inference on which the accomplices’ pleas of guilty would be admissible to rebut.” The credibility attack, based on any quid pro quo that the witnesses derived from the plea agreements, would not take place if the defendants refrained from employing this line of attack in their cross-examination.
I am firmly convinced, moreover, that the evaluation of probative value cannot be made without a consideration of the defendants’ commitment. The majority disregards the commitment, however, and focuses on the probative value associated with assisting the jury in assessing the credibility of the accomplices in response to jury speculation or in response to the defense’s cross-examination attacking a witness’s credibility — a stage of the trial which need not occur if the defendants lived up to their commitment.
In adopting this focus, the majority skirts the line between pointing out that these guilty pleas may have probative value and declaring that the guilty pleas themselves constitute substantive evidence of the defendants’ guilt. It is black letter law, as the majority acknowledges, that a witness’s guilty plea cannot be admitted as substantive evidence of a defendant’s guilt. See Cohen, 171 F.3d at 801 (“[T]he plea agreements of co-conspirators are not admissible to prove the defendant’s guilt.”); Gaev, 24 F.3d at 476 (“It is well established that the plea agreements of co-conspirators cannot be used as evidence of a defendant’s guilt.”); Mujahid, 990 F.2d at 115 (“It is well-established that a co-defendant’s guilty plea is not admissible to prove the defendant’s guilt.”); Werme, 939 F.2d at 113 (“We have long recognized that evidence of another party’s guilty plea is not admissible to prove the defendant’s guilt.”). Nevertheless, by ignoring the defendants’ agreement not to “raise the guilty plea/plea agreements on cross examination nor [ ] to raise any inference on which the accomplices’ pleas of guilty would be admissible to rebut” the majority fails to appreciate that, in light of defendants’ commitment not to raise the issue of the pleas, the probative value of Bonjo and Martin’s guilty pleas is negligible. Moreover, the jury will then be presented with evidence that has minimal probative value but which may improperly imply that because Bonjo and Martin pled guilty, Lukesh and Universal are also guilty.6
C.
Having considered the probative value of Bonjo and Martin’s guilty pleas, we must next assess the danger of unfair prejudice associated with admitting their guilty pleas into evidence. As the majority acknowledges, and as we have previously noted, “[t]he guilty plea to a conspiracy charge carries with it more potential harm to the defendant on trial because the crime by definition requires the participation of another.” United States v. Gullo, 502 F.2d 759, 761 (3d Cir.1974). It is true that the defendants were convicted of mail fraud and not of conspiracy. Nevertheless, the offense of conviction, as it was presented at trial, in many respects was similar to a conspiracy. In order to obtain a mail fraud conviction under 18 U.S.C. § 1341, the government must prove that the defendant devised a scheme to defraud, that the defendant participated in the scheme with the specific intent to defraud and that the defendant could reasonably foresee use of the mails. See United States v. Feola, 420 U.S. 671, 693, 95 S.Ct. 1255, 43 L.Ed.2d 541 (1975); Pereira v. United States, 347 U.S. 1, 8, 74 S.Ct. 358, 98 L.Ed. 435 (1954); United States v. Pflaumer, 774 F.2d 1224, 1233 (3d Cir.1985). As the scheme to defraud was described in the indictment and presented at trial, defendants, including Lukesh, Universal, and Bonjo, participated together in the scheme to defraud and obtain money from the Medicare program. Indeed, it would appear that the government could have elected to indict the defendants on a conspiracy count as well as on the substantive mail fraud counts.
I find, however, that the majority trivializes the heightened danger of unfair prejudice presented by this type of situation, a situation that requires closer scrutiny of the Rule 403 balance. See Majority Opinion at 669. In the context of this case, the majority’s characterization of the offense to which Bonjo and Martin pleaded guilty as a “substantive count[ ]” while legally accurate, is also misleading. In the case of Universal and Lukesh, section 1341 criminalized what was essentially a successful conspiracy to commit Medicare fraud. In fact, the jury found that Lukesh and Universal had devised a scheme to defraud Medicare by fraudulently re-writing and altering patient evaluations to increase the likelihood that Medicare would reimburse Universal for medical services that were not otherwise reimbursable. Bonjo and Martin pled guilty to participating in this scheme. Ultimately, on the facts before us, the distinction that the majority attempts to draw, between the “substantive” count of mail fraud under section 1341 and the “non-substantive” count of conspiracy to commit mail fraud under section 371, is a distinction without a difference. Thus, the danger of unfair prejudice associated with the District Court’s decision to admit Bonjo and Martin’s guilty pleas into evidence is not only significant but also virtually identical to the danger of unfair prejudice associated with admitting into evidence the guilty pleas of two alleged co-conspirators.
D.
Having considered both the probative value of and the danger of unfair prejudice associated with Bonjo and Martin’s guilty pleas, we must next determine whether the probative value of these guilty pleas is substantially outweighed by the danger of unfair prejudice to the defendants. The probative value of Bonjo and Martin’s guilty pleas is negligible — the defendants agreed not to “raise the guilty plea/plea agreements on cross examination nor [ ] to raise any inference on which the accomplices’ pleas of guilty would be admissible to rebut.” The principal effect of this agreement is a reduction in the probative value of this evidence. On the flip side, the danger of unfair prejudice associated with Bonjo and Martin’s guilty pleas is significant — mail fraud, as a matter of law, involves a scheme or artifice to defraud, and Bonjo and Martin allegedly participated in this scheme with and under the direction of Universal and Lukesh. Thus, if Bonjo and Martin’s admission that they committed mail fraud is believed, it is difficult not to conclude that Universal and Lukesh committed mail fraud as well. As we noted in Toner, “[a] defendant ha[s] a right to have his guilt or innocence determined by the evidence presented against him, not by what has happened with regard to a criminal prosecution against someone else.” Toner, 173 F.2d at 142. Clearly, Bonjo and Martin’s guilty pleas create a significant danger of unfair prejudice. This significant danger of unfair prejudice substantially outweighs the minimal probative value of Bonjo and Martin’s guilty pleas. For that reason, Federal Rule of Evidence 403 mandates that their guilty pleas be excluded. Thus, the District Court’s decision to admit Bonjo and Martin’s guilty pleas into evidence was an abuse of discretion.
III.
By concluding that Bonjo and Martin’s guilty pleas were properly admitted into evidence, and by endorsing the holding in Gaev, the majority ignores the fact that, over time, Toner and its progeny have come to stand for the proposition that guilty pleas of co-conspirators are not admissible to establish the guilt of the defendant and can only be introduced into evidence for a proper evidentiary purpose. See, e.g., United States v. Gambino, 926 F.2d 1355, 1363 (3d Cir.1991); Werme, 939 F.2d at 113-14; Mujahid, 990 F.2d at 115. Following the majority’s reasoning, unless a defendant is willing to refrain from cross-examining a witness entirely, the witness’s credibility will always be at issue, and his or her guilty plea will always be admissible. While this may be the rule of law in other circuits, it is definitely not the rule of law in the Third Circuit. Compare, e.g., United States v. Mealy, 851 F.2d 890, 899 (7th Cir.1988) (“The well established rule in this circuit is that, on direct examination, the prosecutor may elicit direct testimony regarding the witness’s plea agreement and actually introduce the plea agreement into evidence.”) with Gambino, 926 F.2d at 1363 (holding that an alleged co-conspirator’s guilty plea can be admitted into evidence only for a proper evidentiary purpose). The majority’s holding effectively overrules Toner and its progeny without acknowledging this fact or providing a reason for doing so.7
The majority apparently concludes that Toner stands for the proposition that an alleged co-conspirator’s guilty plea cannot be offered as proof of the defendant’s guilt; therefore, when an alleged co-conspirator’s guilty plea is admitted into evidence, the jury must be instructed that the guilty plea cannot be used to establish the guilt of the defendant. While this may be a proper interpretation of Toner read alone, subsequent cases in the Third Circuit have recognized that, absent a proper purpose, guilty pleas of an alleged co-conspirator are inadmissible. Framed in terms of the balancing approach required by Federal Rule of Evidence 403, absent a proper purpose, the probative value of an alleged co-conspirator’s guilty plea is substantially outweighed by the danger of unfair prejudice to the defendants.
In light of our established precedent, I believe that the trial court in Universal erred by admitting into evidence the guilty pleas of two alleged co-schemers in face of the defendants’ commitment that they would not, on cross-examination, challenge the credibility of the government’s witnesses or raise any inferences that would make the guilty pleas admissible. Absent a proper evidentiary purpose, a trial court’s decision to admit an alleged co-conspirator’s guilty plea is improper and an abuse of discretion. An alleged co-conspirator’s guilty plea cannot be admitted for the purpose of proving a defendant’s guilt. See Cohen, 171 F.3d at 801 (“[T]he plea agreements of co-conspirators are not admissible to prove the defendant’s guilt.”); Gaev, 24 F.3d at 476 (“It is well established that the plea agreements of co-conspirators cannot be used as evidence of a defendant’s guilt.”); Mujahid, 990 F.2d at 115 (“It is well-established that a co-defendant’s guilty plea is not admissible to prove the defendant’s guilt.”); Werme, 939 F.2d at 113 (“We have long recognized that evidence of another party’s guilty plea is not admissible to prove the defendant’s guilt.”). Contrary to the majority’s claim that “Federal Rule of Evidence 403 creates a presumption of admissibility,” an alleged co-conspirator’s guilty plea is only admissible for a limited number of valid, permissible purposes. See United States v. Inadi, 790 F.2d 383, 384 n. 2 (3d Cir.1986) (“[A co-conspirator’s guilty plea may be admitted] in order to rebut defense counsel’s persistent attempts on cross-examination to raise an inference that the co-conspirators had not been prosecuted and that [the defendant] was being singled out for prosecution.”); Gambino, 926 F.2d at 1364 (“[A co-conspirator’s guilty plea may be admitted] on direct examination [in order] to dampen subsequent attacks on credibility, and to foreclose any suggestion that the party producing the witness was concealing evidence.”); Werme, 939 F.2d at 114 (“[A witness’s guilty plea may be admitted] to rebut the defense assertion that [the witness] was acting as a government agent when he engaged in the activities that formed the basis for [his guilty] plea.”).
As our analysis above demonstrates, when a defendant agrees not to “raise the guilty plea/plea agreements on cross examination nor to raise any inference on which the accomplices’ pleas of guilty would be admissible to rebut,” the Rule 403 balance clearly tips in favor of excluding the evidence.8 If an alleged co-conspirator’s guilty plea is to be admissible at all, it must be admissible for some purpose other than proving the defendant’s guilt. See Cohen, 171 F.3d at 801 (holding that an alleged co-conspirator’s plea agreement is admissible for “some purposes”); Gaev, 24 F.3d at 476 (holding that an alleged co-conspirator’s guilty plea is admissible for “some valid purpose[s]”); United States v. Thomas, 998 F.2d 1202, 1205 (3d Cir.1993) (holding that an alleged co-conspirator’s guilty plea is admissible for “limited purposes”); Mujahid, 990 F.2d at 115 (holding that an alleged co-conspirator’s guilty plea is admissible for “other[ ] permissible purposes”); Werme, 939 F.2d at 113 (holding that another party’s guilty plea is admissible for “other[ ] permissible purposes”); United States v. Gambino, 926 F.2d 1355, 1363 (3d Cir.1991) (holding that an alleged co-conspirator’s guilty plea is admissible for “some valid purpose[s]”). Allowing the government, when prosecuting a criminal case, to introduce the guilty plea of a defendant’s alleged co-conspirator simply by claiming that the guilty plea must be admitted into evidence so that the jury can assess the witness’s credibility creates an exception that swallows the rule. The government will always be able to claim that a witness’s guilty plea must be admitted into evidence so that the jury can assess the witness’s credibility, and thus the guilty plea will always be admissible. It is impossible to reconcile this result with our prior jurisprudence or with the result mandated by Rule 403.
*680 IV.
Focusing primarily on the jury’s need to assess the credibility of Bonjo and Martin, and relying on a statement in Gaev and cases in other circuits, the majority concludes that the government may seek to introduce a witness’s guilty plea even in the absence of a challenge to the witness’s credibility. The majority’s holding not only deviates from the outcome mandated by Rule 403 but is also at odds with Federal Rule of Evidence 608(a) and (b). Rule 608(a) states:
The credibility of a witness may be attacked or supported by evidence in the form of opinion or reputation, but subject to these limitations: (1) the evidence may refer only to character for truthfulness or untruthfulness, and (2) evidence of truthful character is admissible only after the character of the witness for truthfulness has been attacked by opinion or reputation evidence or otherwise.
Arguably, under Rule 608(a), absent an attack on Bonjo and Martin’s credibility, their guilty pleas are inadmissible. The Advisory Committee Notes to Rule 608(a), which summarize the policy behind the rule, indicate:
Character evidence in support of credibility is admissible under the rule only after the witness’ character has first been attacked, as has been the case at common law. Maguire, Weinstein, et al., Cases on Evidence 295 (5th ed.1965); McCormick § 49, p. 105; 4 Wigmore § 1104. The enormous needless consumption of time which a contrary practice would entail justifies the limitation.
Thus, even prior to the enactment of Rule 608(a), as a matter of common law, evidence was admissible to bolster a witness’s credibility only after the witness’s credibility had been attacked. See, e.g., Perkins v. United States, 315 F.2d 120, 123 (9th Cir.1963) (highlighting “the general rule that until the credibility of a witness has been attacked by evidence pertaining to credibility, evidence tending to establish credibility is inadmissible”) (citing Homan v. United States, 279 F.2d 767, 772 (8th Cir.1960)).
Not only is the majority’s holding contrary to Rule 608(a), its holding is also at odds with Rule 608(b). Rule 608(b) states:
Specific instances of the conduct of a witness, for the purpose of attacking or supporting the witness’ credibility, other than conviction of crime as provided in Rule 609, may not be proved by extrinsic evidence. They may, however, in the discretion of the court, if probative of truthfulness or untruthfulness, be inquired into on cross-examination of the witness (1) concerning the witness’ character for truthfulness or untruthfulness, or (2) concerning the character for truthfulness or untruthfulness of another witness as to which character the witness being cross-examined has testified.
Because Bonjo and Martin’s guilty pleas (or more specifically their decision to plead guilty) could be considered conduct under Rule 608(b), to the extent that the government introduced Bonjo and Martin’s guilty pleas to support their credibility, their admission is barred, as a matter of law, by Rule 608(b). See Fed.R.Evid. 608(b); cf., e.g., United States v. Anderson, 859 F.2d 1171, 1178 (3d Cir.1988) (“To the extent that [the probation officer’s] testimony was an attempt to attack [the witness’s] credibility by extrinsic evidence, it is strictly prohibited by Federal Rule of Evidence 608(b).”).9 Indeed, the government argued *681 in its briefs and during oral argument that Bonjo and Martin’s guilty pleas should be admitted into evidence in order better to allow the jury to assess their credibility. Certainly, since Bonjo and Martin were government witnesses, their guilty pleas were not being introduced into evidence to attack their credibility but rather to bolster it.
Consequently, the majority’s conclusion that Bonjo and Martin’s guilty pleas were properly admitted into evidence is not only contrary to the result mandated under Federal Rule of Evidence 403 but is also at odds with Federal Rule of Evidence 608.10
V.
While the majority’s conclusion, that the District Court did not abuse its discretion by admitting into evidence Bonjo and Martin’s guilty pleas, is disturbing, equally disturbing is the majority’s conclusion that “the detailed limiting instructions provided by the District Court cured the prejudicial effect, if any, flowing from the introduction of Bonjo and Martin’s guilty pleas and plea agreements.” Majority Opinion at 669.
It is beyond dispute that when an alleged co-conspirator’s guilty plea is admitted into evidence, even if the trial court has given a proper cautionary instruction to the jury, the prejudice to the defendant may be serious enough to constitute reversible error. See, e.g., Thomas, 998 F.2d at 1206 (“In the absence of a proper purpose for the admission of the guilty pleas, the curative instructions of the district court were not sufficient to remove the prejudice to Thomas presented by the evidence of his co-conspirators’ guilty pleas.”); Gaev, 24 F.3d at 478 (“There may also be cases where the inference of guilt from the co-conspirator’s plea agreement is sufficiently strong that even limiting instructions will not effectively contain it.”). The majority not only concludes that this prejudicial effect is typically cured by *682 a limiting instruction to the jury but also dismisses the defendants’ contention that juries cannot comprehend or follow such limiting instructions.
Moreover, the majority’s analysis obscures what I consider to be the key issue: The District Court abused its discretion by admitting into evidence Martin and Bonjo’s guilty pleas, over the defendants’ objection, despite the fact that the defendants agreed not to “raise the guilty plea/plea agreements on cross examination nor [ ] to raise any inference on which the accomplices’ pleas of guilty would be admissible to rebut.” While a limiting instruction given by a District Court may render an otherwise erroneous evidentiary ruling harmless, a limiting instruction cannot transform an otherwise erroneous evidentiary ruling into a legally proper evidentiary ruling. Ultimately, we must decide whether the District Court, at the time it ruled on the defendants’ motion in limine, abused its discretion by admitting Bonjo and Martin’s guilty pleas into evidence. To do so, we must focus on the probative value and danger of unfair prejudice associated with Bonjo and Martin’s guilty pleas and not on whether the District Court’s limiting instructions cured any resulting unfair prejudice.
Moreover, the danger of unfair prejudice highlighted above renders both baffling and confounding the District Court’s decision to instruct the jury “that it may not consider the guilty plea and/or plea agreement as evidence that the defendant is guilty of the offenses with which he is charged,” rather than to instruct the jury that it need not concern itself with the possibility of selective prosecution or what the witnesses have been promised in return for their testimony. See, e.g., Thomas, 998 F.2d at 1205. If, as the majority contends, juries comprehend and follow limiting instructions such as those given by the District Court in this case, surely the better approach, and the one most consistent with Third Circuit jurisprudence, is to exclude Bonjo and Martin’s guilty pleas and to instruct the jury members that they should concern themselves only with the guilt or innocence of defendants and not with the possibility of selective prosecution or the involvement of any other persons in any alleged scheme. See supra, Majority Opinion at 667-68, note 13; Thomas, 998 F.2d at 1205; cf. Spencer v. Texas, 385 U.S. 554, 562-63, 87 S.Ct. 648, 17 L.Ed.2d 606 (1967) (“[T]his type of prejudicial effect is acknowledged to inhere in criminal practice, but it is justified on the grounds that ... the jury is expected to follow instructions in limiting this evidence to its proper function.”). Instead of following our holding in Thomas, the majority relies on precedent in other circuits, citing one case from the Fifth Circuit and one case from the Eleventh Circuit, to support its conclusion that Bonjo and Martin’s guilty pleas are presumptively admissible. See Majority Opinion at 664-65.11 Ultimately, the majority’s conclusion that Federal Rule of Evidence 403 “creates a presumption of admissibility” with respect to an alleged accomplice's guilty plea, a conclusion that is crucial to the majority’s holding, is unsupported by Third Circuit precedent.12
*683 VI.
Although the District Court abused its discretion by admitting into evidence Bonjo and Martin’s guilty pleas, I must also consider whether its evidentiary ruling amounts to harmless error. See, e.g., United States v. Werme, 939 F.2d at 111 (“We also conclude that it was harmless error to introduce the [witnesses’] guilty pleas.”). An error at trial is harmless if an appellate court concludes that there is a “high probability” that the error did not affect the defendant’s substantial rights. Id. at 116-17. Phrased differently, an appellate court must have “a sure conviction that the error did not prejudice the defendant, but need not disprove every reasonable possibility of prejudice” to conclude that the error was harmless. United States v. Jannotti, 729 F.2d 213, 219-20 (3d Cir.1984).
Reviewing the record, it is clear that the District Court’s erroneous evidentiary ruling was not harmless error. Of the thirty-nine counts that the defendants were charged with, they were acquitted on thirty-eight counts and were convicted on only one count, the count to which government witness Judy Blum Bonjo pleaded guilty. Further suggesting the likelihood of prejudice, the count on which the defendants were convicted involved a patient named Mildred Hynes, but Mildred Hynes was involved in four other counts on which the defendants were acquitted. Lastly, and perhaps most importantly, discarding Bonjo’s and Martin’s guilty pleas, the evidence against the defendants on Counts Two through Thirty-Nine was virtually identical to the evidence presented on the single count on which the defendants were convicted. In light of these facts, I believe that the error here could not be harmless.
VII.
For the above reasons, I would reverse the defendants’ convictions and remand the case to the District Court for a new trial.
. Like the majority, I believe that the distinction between guilty pleas and plea agreements is, in the context of this case, a distinction without a difference. See supra Majority Opinion at 660. As such, I use the term "guilty plea(s)” to refer to guilty plea(s) and/or the corresponding plea agreement(s).
. These cases alone refute the majority's claim that an accomplice's guilty plea is presumptively admissible. See supra Majority Opinion at 664-65.
. Both defendants joined the motion in limine to exclude the guilty pleas of the two alleged accomplices. The motion stated in relevant part, “Defendant asserts that at trial of this action he will not raise the guilty plea/plea agreements on cross examination nor seek to raise any inference on which the accomplices' pleas of guilty would be admissible to rebut.” *673 Supplemental Brief for the Appellants at 23; see supra Majority Opinion at 662, note 6.
. In Thomas, we concluded that the case at hand differed from United States v. Inadi where the alleged co-conspirator's guilty plea was admitted only "to rebut defense counsel’s persistent attempts on cross-examination to raise an inference that the co-conspirators had not been prosecuted and [that] the defendant was being singled out for prosecution." Thomas, 998 F.2d at 1205 (citing United States v. Inadi, 790 F.2d 383, 384 n. 2 (3d Cir.1986)). We noted that if the defendant violated the agreement and "attempted to raise an inference on cross-examination that [the defendant] was being unfairly singled out for prosecution, additional remedial steps could [then] have been taken.” Id. at 1205 n. 1. Presumably, "additional remedial steps” would have included introducing the alleged co-conspirator's guilty plea into evidence on rebuttal.
. The record suggests that May 3, 1995, was actually the last time that the District Court heard arguments on the defendants’ motion in limine. The District Court stated:
All right, I asked you to come at this point so that we could have a last opportunity to argue the motion in limine and I addressed your attention to the Gave [sic] case. Anyone wish to make any additional arguments, you may do so.
App. at 806. Regardless, this exchange on May 3, 1995, is the first point in the record at which the District Court heard arguments on the defendants’ motion in limine.
. I will deal further with two other aspects of the probative value of the guilty pleas in my discussion of Rule 608 in Section IV and of limiting instructions in Section V.
. The majority is quick to focus on the following statement in Gaev: "While plea agreements have often been admitted in response to actual or anticipated attacks on a witness's credibility, an attack is not always necessary to justify their introduction,” Gaev, 24 F.3d at 477-78. To support this proposition, the Gaev Court cites the following passage in Gambino: "In this case, the defendants began their attack on the credibility of the government’s witnesses in their opening statement. Yet, even in the absence of this attack, the [introduction of the witnesses' guilty pleas] was proper here.” Gambino, 926 F.2d at 1363. This statement, which is clearly dictum, is made without any supporting cite to case law in the Third Circuit or any other circuit. Such a statement is without support or foundation in Third Circuit jurisprudence, and since it is merely dictum, it alone should not provide the basis for affirming the District Court’s decision to admit Bonjo and Martin’s guilty pleas into evidence.
. As discussed below, the jury's verdict confirms that the defendants were in fact prejudiced by the District Court’s erroneous evidentiary ruling. See infra Section VI. I note moreover that if a defendant reneges on a commitment not to impeach a witness's credibility on the basis of the guilty plea, the government will have the opportunity to introduce the guilty plea on rebuttal.
. Extrinsic evidence under Rule 608(b) is admissible for purposes other than supporting or attacking a witness's credibility. See, e.g., Lamborn v. Dittmer, 873 F.2d 522, 528 (2d Cir.1989) ("[Rule 608] is inapplicable in determining the admissibility of evidence introduced to impeach a witness’s testimony as to a material issue.”). While the majority concludes that Bonjo and Martin’s guilty pleas are admissible for purposes other than evaluating their credibility, i.e., avoiding the appearance of selective prosecution and establishing a basis for the witness’s knowledge of the crime, that the guilty pleas were admitted to allow the jury to evaluate the witnesses’ *681 credibility is the cornerstone of the majority's holding.
. As the majority points out, it is arguable whether Federal Rule of Evidence 608 governs the admission of Bonjo and Martin’s guilty pleas. See supra Majority Opinion at 667-68, note 13. However, even if one were to conclude that Rule 608 does not govern the admission of Bonjo and Martin's guilty pleas, it is clear that Rule 608 provides insight into the appropriate balancing required under Rule 403. Specifically, Rule 608 allows a party to introduce “evidence in the form of opinion or reputation” to attack or support the credibility of a witness only after the credibility of the witness has been attacked. The majority contends that Bonjo and Martin's guilty pleas are admissible to bolster their credibility despite the defendants’ agreement not to attack Bonjo and Martin's credibility. This contention is at odds with the framework set forth in Rule 608. To admit Bonjo and Martin's guilty pleas, absent a prior attack on their credibility, when similar evidence would, as a matter of law, be admissible under Rule 608 only after a testifying witness's credibility had been attacked, undermines the majority's entire Rule 403 analysis.
Moreover, the majority's analysis of Old Chief v. United States is also at odds with the framework set forth in Rule 608. In arguing that the introduction of Bonjo and Martin’s guilty pleas has less probative value than the defendants' agreement not to mention the guilty pleas on cross-examination or to raise any inference which these guilty pleas might rebut, the majority overlooks the fact that “evidence in the form of opinion or reputation” is admissible only after the credibility of a witness has been attacked. See Majority Opinion at 667. Thus, under Rule 608, the comparison of probative value required under Old Chief and alluded to by the majority would be purely hypothetical and unnecessary; absent a prior attack on credibility, "evidence in the form of opinion or reputation” is, as a matter of law, inadmissible.
Finally, contrary to the majority's suggestion, the potential applicability of Rule 608 was not only discussed at the en banc oral argument, the government filed a supplemental brief after oral argument to address the issue. See Supplemental Brief of Appellee United States of America, Filed November 22, 1999 ("At oral argument before the en banc Court on November 8, 1999, the Court raised two issues which had not previously been addressed in this appeal: (1) The relevance of Rule 608 of the Federal Rules of Evidence; and (2) the applicability of Luce v. United States, 469 U.S. 38, 105 S.Ct. 460, 83 L.Ed.2d 443 (1984).”).
. The Eleventh Circuit case that the majority cites, Hendrix v. Raybestos-Manhattan, Inc., 776 F.2d 1492 (11th Cir.1985), is a civil tort case. Arguably, there exists a heightened concern associated with the “danger of unfair prejudice” in the context of a criminal case.
. The majority attempts to draw support for its holding from a recent Supreme Court case, Old Chief v. United States, 519 U.S. 172, 117 S.Ct. 644, 136 L.Ed.2d 574 (1997). In Old Chief, the Supreme Court held that a trial court abuses its discretion when, in a prosecution pursuant to 18 U.S.C. § 922(g)(1) for possession of a handgun by a convicted felon, it admits into evidence the name or nature of the defendant’s prior conviction despite the defendant’s offer to stipulate to his status as a felon under section 922(g)(1). See id. at 190-91, 117 S.Ct. 644. While the issue addressed in Old Chief is not entirely unrelated to the issue presented in this case, a careful reading of Old Chief confirms that it provides no support to either the majority or the dissent in this case.