ORDER
JOHN W. POTTER, District Judge: This cause was transferred to the United States Magistrate for a Report and Recommendation. Magistrate James G. Carr filed his Report and Recommendation on October 26, 1990. Defendants filed objections to the Report and Recommendation on November 9, 1990 and, since the government responded to defendants’ objections on November 19, 1990, the cause is now decisional in this Court. This Court has once before addressed pretrial matters in this case wherein the Magistrate issued a Report and Recommendation. In a Memorandum and Order dated September 21, 1990, this Court adopted the Recommendation of the Magistrate to deny defendant Yee’s motion to suppress evidence obtained in the search of his car. The Court declined then, as it does now, to attempt a detailed recitation of the facts of this case. However, for a better explanation of the proceedings that led to this point see the opinion issued by Magistrate Carr in United States v. Yee, 129 F.R.D. 629 (N.D.Ohio 1990).
The central issue which the Court confronts today involves the government’s motion to admit, and defendants’ corresponding motion to exclude, the results of a deoxyribonucleic acid (DNA) test undertaken by the F.B.I. laboratory in connection with the investigation and prosecution of this case. As the Magistrate correctly pointed out, Section 636(b)(1)(A) of 28 U.S.C. excepts rulings on this type of issue from those where this Court may review the Magistrate’s findings under a clearly erroneous standard. Instead 28 U.S.C. § 636(b)(1)(B) requires that this Court make a de novo review of the Magistrate’s findings. Since a magistrate has no authority to make a final and binding disposition, the final resolution of this type of issue must be made by the district court. United States v. Raddatz, 447 U.S. 667, 674, 100 S.Ct. 2406, 2411, 65 L.Ed.2d 424 (1980).
Section 636(b)(1) of 28 U.S.C. and L.Civ.R. 19.04 require an objecting party to specifically identify those portions of the Report and Recommendation to which he objects. In the Memorandum and Order issued by the Court on September 21, 1990, the Court said the following:
defendant appears to object to the Magistrate’s report in its entirety by submitting to this Court the memorandum of law which he submitted to the Magistrate to support his motion to suppress. The United States has likewise chosen to rest on the brief which it filed with the Magistrate opposing defendant’s motion to suppress. Such trial practice makes the issues presented even more difficult for the Court to resolve. However, given the time constraints which both parties and the Court are under in getting this case prepared for trial, the Court will deem defendant’s objections in compliance with the statute and local rule rather than ordering defendant to object with the required specificity.
In the pretrial conference which this Court held with counsel after the Magistrate issued his instant Report and Recommendation, the Court not only vacated the trial date initially set for November 26, 1990 and reset it for January 22, 1991, but the Court also requested that the parties submit specific objections to the Report and Recommendation and specific responses to those objections so that the Court could better focus on the issues at the heart of the DNA test.
The defendants have presented three objections and the government has responded. Defendants’ objections are: (1) the Magistrate’s conclusion with respect to the meaning of “general acceptance” is flawed; that there must be consensus before the theory can be admissible for jury consideration; (2) defendants object to the Magistrate’s findings on reliability; and (3) the Magistrate failed to consider the final prong of the Green test for the admission of novel scientific evidence in a criminal case—the “FRE 403” question.
Were it not for the Magistrate’s keen attention to detail and dedication to providing the Court with a most extensive and thorough discussion of the issues, the Court would, in all likelihood, need to conduct its own hearing to resolve the issues. The Court is not required to do so, in light of the United States Supreme Court’s decision in Raddatz:
the statute [28 U.S.C. § 636(b)(1) ] calls for a de novo determination, not a de novo hearing. We find nothing in the legislative history of the statute to support the contention that the judge is required to rehear the contested testimony in order to carry out the statutory command to make the required “determination.”
Raddatz, 447 U.S. at 674, 100 S.Ct. at 2411. The Raddatz Court went on to add that “in providing for a ‘de novo determination’ rather than de novo hearing, Congress intended to permit whatever reliance a district judge, in the exercise of sound judicial discretion, chose to place on a magistrate’s proposed findings and recommendations.” Raddatz, 447 U.S. at 676, 100 S.Ct. at 2413. The Court has reviewed the material submitted to the Magistrate in sufficient detail to come to its own conclusions. After doing so, the Court is satisfied with the Magistrate’s Report and Recommendation. Though the Magistrate was required to come to conclusions on a number of hotly contested issues, the conclusions he reached are supported by both the testimony and the case law cited.
The Magistrate devoted pages 169-173 of his Report to the task of explaining the forensic application of the DNA technology. The Court must now determine whether the government will be allowed to present evidence at trial that relates to the government’s use of a procedure called Restriction Fragment Length Polymorphism analysis (RFLP analysis). Through this use of RFLP analysis, the government maintains that it will be able to show that the DNA patterns present in the human genome of defendant John Ray Bonds match the DNA patterns found in blood samples collected from the automobile of homicide victim David Hartlaub. The government also maintains that by using additional procedures implemented in the FBI laboratory, it can use the results from this RFLP analysis to conclude that the probability that a pattern of matches, like the pattern of matches found in the comparison of Bonds’ DNA to the blood DNA found at the crime scene, would be found in the United States Caucasian population is 1/35,000. Defendants ask this Court to exclude all evidence relating to the RFLP analysis and the probability estimate.
Pages 173-187 of the Report contain the Magistrate’s overview of the testimony he heard during the six weeks of what this Court will call his Frye hearing. Neither the defendants nor the government accuse the Magistrate of incorrectly relaying the substance of the testimony heard during those six weeks. No one claims that the substance of a witness’ testimony is different from how the Magistrate relays it, nor does anyone claim that a witness’ testimony stood for something more than what the Magistrate credited that witness with saying. Given that the Court has received no objection from either party relating to pages 173-187 of the Report, the Court will not attempt a comparison of the transcripts taken from the hearing to the overview provided by the Magistrate. The Court does, however, credit the Magistrate with having done a very thorough job of synthesizing and relaying the very intricate testimony from witnesses who appear to have been extremely impressive.
Starting on page 187, the Magistrate begins the most extensive review of Sixth Circuit case law that a court has ever done on the standard for admitting into evidence in a criminal case novel scientific theories and procedures. This discussion culminates on page 194 with the Magistrate concluding:
Thus, in light of my examination of the Sixth Circuit’s decisions, I conclude that in the context of this case, the government must show that the principles and procedures on which its proposed DNA evidence is based are generally accepted in the scientific community. In other words, those principles and procedures must be shown to conform to generally accepted explanatory theories about molecular biology and population genetics.
Magistrate’s Report and Recommendation at 194. With this language, the Magistrate expressed his belief that the standard for admitting this type of evidence in the Sixth Circuit is no different than that standard formulated by the United States Court of Appeals for the District of Columbia Circuit and announced in Frye v. United States, 293 F. 1013 (D.C.Cir.1923). The Magistrate went to great lengths to come to this conclusion. He considered every case coming from the Sixth Circuit Court of Appeals which mentioned, in any way, the standard being employed by the Court of Appeals when confronted with the issue of admitting evidence of a novel scientific theory or procedure.
The Court now considers defendants’ first objection with special attention to the contention of lack of “consensus.” The Court finds that the holding in United States v. Green, 548 F.2d 1261 (6th Cir.1977), as follows,
In recognition of the outcome determinative impact of “opinion evidence clothed with the weight of expertise,” Bridger [v. Union Ry. Co.], supra [355 F.2d] at 388 [ (6th Cir.1966) ], we adopt for use in criminal appeals the four criteria proposed in [ United States v.] Amaral [488 F.2d 1148 (9th Cir.1973)] for review of trial court decisions concerning expert testimony: “1. qualified expert; 2. proper subject; 3. conformity to a generally accepted explanatory theory; and 4. probative value compared to prejudicial effect.” 488 F.2d at 1153.
does not vary the standard expressed in Frye.
The Court must now confront the task of determining whether there is general acceptance within the pertinent scientific community of the government’s principles and procedures regarding DNA. The Magistrate found that the pertinent scientific community is made up of “scientists from the fields of molecular biology and population genetics who have expertise in either or both of those fields and a reasonably comprehensive understanding about the F.B.I.’s DNA testing protocol and procedures.” Magistrate’s Report and Recommendation at 195. Neither party objected to this assessment of the pertinent scientific community. The Court likewise agrees with it and adopts the Magistrate’s findings in this regard.
The Magistrate found that the standard of proof required of the government was proof by a preponderance of the evidence. Again, neither party objects to this finding, and the Court adopts it. See United States v. Enright, 579 F.2d 980 (6th Cir.1978).
The Magistrate’s definition of general acceptance is what has drawn the most criticism from the defendants. At pages 196 through 202 of the Report, the Magistrate embarked upon an extremely careful formulation of the factors which denote general acceptance. The Magistrate quoted Professor Gianelli at page 198 of the Report and Recommendation to the effect that the better reasoned approach was to define general acceptance by first explicitly saying what general acceptance does not mean. He then goes through an extensive examination of case law in order to distill what factors have persuaded courts in the past to either admit or exclude evidence of novel scientific theories.
Defendants assert that the definition of general acceptance that this Court should adopt is to require the government to demonstrate that a consensus exists among “the most distinguished scientists in the pertinent fields about the reliability of a method.”
As the government has pointed out, general acceptance can be achieved without “unanimity” or “certainty.” See also, United States v. Kozminski, 821 F.2d 1186, 1200 (6th Cir.1987) (en banc), as follows: “absolute certainty of result and unanimity of scientific opinion [are] not required so long as the conflicting testimony concerning the conclusions drawn by the experts are based on generally accepted and reliable scientific principles.” United States v. Kozminski, 821 F.2d 1186, 1200 (6th Cir.1987) (en banc). For the above mentioned reasons, the Court rejects the definition proffered by defendants and accepts the Magistrate’s careful delineation of the factors the Court should use in finding general acceptance.
The Magistrate summarized the most crucial findings expressed in the cases dealing with the admissibility of novel scientific evidence:
In summary, I have not encountered, and the parties have not cited, a case applying the Frye standard rejecting the admissibility of evidence where a set of experts, such as in this case, have testified that the procedure was generally accepted. Where such experts have testified, the evidence has been admitted despite the firmly held countervailing views of the opponent’s experts.
Magistrate’s Report and Recommendation at 199 (emphasis added). Further, the Magistrate said:
That [the testimony in support of the technique was that which was being offered by persons who developed and implemented it], in my view, was the crucial consideration, because, as the cases cited above indicate, testimony solely by the developer of the novel technique almost never has been held to have shown that a procedure enjoys general acceptance. In this case, there is extensive testimony by experts other than F.B.I. employees about the scientific acceptability in each of these areas.
Id. at 200 (emphasis added). The Court agrees with the assertion that the presence of expert testimony, both within the community of scientists who helped develop the technique and outside of that community, is crucial to a finding of general acceptance.
In light of this agreement, the Court further adopts the Magistrate’s finding that the government’s four principal experts, who had knowledge of the F.B.I.’s practices, were more persuasive on the issue of whether the government’s laboratory has designed and implemented a program whereby multiple loci matches can reliably be ascertained. The Court also adopts the Magistrate’s finding that the testimony of Drs. Kidd and Caskey was more persuasive on the issue of whether the pertinent scientific community generally accepted the F.B.I.’s method for computing an estimate of the likelihood of encountering a match of the type found in this case in the Caucasian population. Therefore, the Court wholly adopts the Magistrate’s finding that the government has met its burden of showing by a preponderance of the evidence that the general scientific community accepts the F.B.I. protocol and procedures for determining a match of DNA fragments and estimating the likelihood of encountering a similar pattern. In so doing, the Court finds not well taken the points raised by the defendants at pages two through four of their memorandum in opposition to the Magistrate’s Report. The Court finds that the points made by defendants merely establish that there are firmly held beliefs in the community of scientists opposed to the government’s use of F.B.I. Such firmly held beliefs, however, as noted above, do not prevent the novel scientific evidence from being found generally accepted in the pertinent scientific community.
Defendants’ first objection is not well taken.
The Court now considers defendants’ objection as to the issue of reliability. The issue concerns the line between challenges to weight and challenges to admissibility and also the traditional function of the jury. The last time an en banc panel of the Sixth Circuit addressed this type of an issue, the standard it used was phrased as follows:
For expert testimony to be admissible under Rule 702 [of the Federal Rules of Evidence], a four part test must be met: (1) a qualified expert; (2) testifying on a proper subject; (3) in conformity to a generally accepted explanatory theory; (4) the probative value of which outweighs any prejudicial effect. United States v. Green, 548 F.2d 1261, 1268 (6th Cir.1977) (emphasis supplied); United States v. Smith, 736 F.2d 1103, 1105 (6th Cir.), cert. denied, 469 U.S. 868, 105 S.Ct. 213, 83 L.Ed.2d 143 (1984).
United States v. Kozminski, 821 F.2d 1186, 1194 (6th Cir.1987).
While the majority in Kozminski simply applied the announced four-part test without commenting on the third requirement, Judge Guy sought to flesh out the meaning of the third prong in his dissent. There, Judge Guy begins with the observation that “[t]he [United States v.] Green test is actually derivative of Frye v. United States, 293 F. 1013 (D.C.Cir.1923).” Kozminski, 821 F.2d at 1215 (Guy, J. dissenting). Judge Guy speaks of the meaning of the third prong in the test as follows:
The third inquiry is, of course, the crucial one here; that is, whether the testimony is in “conformity with a generally accepted explanatory theory.” 548 F.2d at 1268. As stated above, this element has its genesis in Frye. At the same time, it has undergone a subtle but important change.
Id. at 1216. Judge Guy elaborates on what he perceives to be the difference between Frye and Green by saying:
The distinction is important because in Frye the emphasis was placed upon the technique or methodology employed to reach a result, yet in Green the focus has been transferred to the theory. In other words, rather than assessing the “thing from which the deduction is made,” we have focused on the deduction itself.
Id. at 1217.
Though it appears that the dissenting opinion provided the impetus for Judge Krupansky’s concurrence, Judge Krupansky, like the majority opinion, made no effort to explicitly address Judge Guy’s concerns that the Court had altered the Frye standard. Judge Krupansky begins his concurrence with a quote from United States v. Brown, 557 F.2d 541 (6th Cir.1977):
A necessary predicate to the admission of scientific evidence is that the principle upon which it is based “must be sufficiently established to have gained general acceptance in the particular field to which it belongs,” Frye v. United States, 54 App.D.C. 46, 293 F. 1013, 1014 (1923). In United States v. Franks, 511 F.2d 25, 33 n. 12 (6th Cir.1975), we equated general acceptance in the scientific community with a showing that the specific principles and procedures on which the expert testimony is based are reliable and sufficiently accurate.
Kozminski, 821 F.2d at 1199 (Krupansky, J. concurring). However, given what follows in Judge Krupansky’s opinion, the Court is persuaded that neither Judge Krupansky nor the rest of the Sixth Circuit judges would equate the Court’s performance of a reliability determination with a general acceptance determination. The Court is further convinced, as was the Magistrate, that the only issue a court may permissibly focus upon when deciding the third prong of the Green test is whether the pertinent scientific community generally accepts the novel scientific evidence.
The following quote taken from Judge Krupansky’s concurrence in Kozminski demonstrates the points made above.
The third element enumerated in [United States v.] Amaral [, 488 F.2d 1148 (9th Cir.1973)] provides that such expert testimony must be in “conformity to a generally accepted explanatory theory.” Implicit in the language is the predicate that the theory be firmly anchored in sound, reliable, and sufficiently accurate scientific principles, and sufficiently established to the point of having achieved general acceptance within the particular field to which it belongs. Stated differently, the scientific explanatory theory must have (a) received at least some exposure within the scientific peerage to which it belongs; (b) received peer evaluation to determine its scientific validity and reliability; and (c) achieved general acceptance within the scientific community to which it belongs.
Kozminski, 821 F.2d at 1201 (Krupansky, J. concurring). Further in the opinion, Judge Krupansky discloses the crucial reasons for the Court’s finding that the scientific evidence involved in that case was not admissible.
Dr. Stock’s damaging, if not fatal, admissions condemned his opinion testimony as a hypothetication that had not “attained general acceptance within the scientific community.” (citation omitted.) He readily conceded that he was not aware of any literature, let alone published research, that addressed his theory.... Accordingly, the record disclosures make it evident that if, as Dr. Stock testified, his instant testimony was the first public presentation of his theory, it necessarily followed that it never received peer evaluation or validation, let alone recognition as an explanatory theory that had attained general acceptance within the scientific community to which it belonged within the mandates of existing precedent. See Green, Brown [United States v.] Brady [595 F.2d 359 (6th Cir.1979)], Franks.
Kozminski, 821 F.2d at 1202 (Krupansky, J. concurring) (emphasis added). While the term “reliable” has been used in the Kozminski opinion, Judge Krupansky specifically found that the key to admissibility was the opinion of the scientific community as to the acceptability of the explanatory theory, not the opinion of the court as to the theory’s reliability.
For the reasons stated, defendant’s second objection is not well taken.
Finally, as to defendant’s third objection, the Court also agrees with the Magistrate that a determination of the Federal Rule of Evidence 403 question, i.e. the fourth part of the Frye/Green test, is a decision best ruled upon during trial as the evidence comes in.
Upon a review of the Magistrate’s Report and Recommendation, the record in this case, and said objections, this Court finds that the objections are not well taken.
THEREFORE, for the foregoing reasons, good cause appearing, it is
ORDERED that said Report and Recommendation be, and hereby is, adopted as the Order of this Court; and it is
FURTHER ORDERED that defendants’ motion to exclude evidence be, and hereby is, DENIED; and it is
FURTHER ORDERED that the government’s motion to admit the DNA evidence be, and hereby is, GRANTED.
MAGISTRATE’S REPORT AND RECOMMENDATION
JAMES G. CARR, United States Magistrate Judge. This is a criminal case in which pretrial matters have, pursuant to a Standing Order, been referred to the undersigned for initial hearing and determination. Pending are oral motions by the government for leave to admit, and by the defense to exclude, the results of DNA testing undertaken by the F.B.I. laboratory.
According to statements by counsel for the government, the results at issue in this case show that samples of blood taken from the defendant Johnny Ray Bonds contain DNA fragments that match with DNA fragments from blood found at the scene of a homicide. The government’s counsel also represents to the Court that the probability that such a pattern of matches would be found in the United States Caucasian population is 1/35,000.
Following a series of preliminary hearings and arguments (which resulted, inter alia, in an order granting a motion by the defendants for discovery pursuant to Fed. R.Crim.P. 16, United States v. Yee, 129 F.R.D. 629 (N.D.Ohio 1990)), hearings were held for approximately six weeks during the period June 26—September 12, 1990. Thereafter, briefs were filed by the parties, and this matter has become decisional.
For the reasons that follow, I conclude, applying the standard of Frye v. United States, 293 F. 1013 (D.C.Cir.1923), that the government’s motion to admit should be granted, and the defendants’ motion should be denied. Because, in my view, the decision that I reach in this opinion is “dispositive” under the Federal Magistrate’s Act, 28 U.S.C. § 636(b)(1)(B), this opinion is submitted as Report and Recommendation, thereby ensuring de novo review by the Article III Judge to whom this cause is assigned for trial.
This Report and Recommendation consists of the following sections:
1. Forensic Application of DNA Technology
A. DNA Technology
i. Determining a Match
ii. Probability Estimate
B. Overview of the Experts’ Testimony
i. Determining a Match
ii. Probability Estimate
2. The Frye Standard in the Sixth Circuit
A. General Acceptance in the Scientific Community/Generally Accepted Explanatory Theory
B. Pertinent Scientific Community
C. Determination of General Acceptance
i. Standard of Proof
ii. Scope of the Inquiry
iii. Definition of General Acceptance
iv. Findings re. General Acceptance
v. Alternative Findings re. Reliability
3. Rule 403 Issue
To summarize my conclusions at the outset: I conclude that there is general acceptance in the pertinent scientific community that the procedures developed and implemented by the F.B.I. for determining that the DNA patterns from a known (i.e., a criminal suspect) source match with DNA patterns from a “questioned” (i.e., crime scene) source are reliable. I also conclude that there is general acceptance in the pertinent scientific community of the process implemented by the F.B.I. for estimating the probability that such match would be encountered in the United States Caucasian population.
I am persuaded that controlling Sixth Circuit precedent forecloses an adjudication by me of the merits of the underlying scientific disputes between the parties concerning the ability of the F.B.I. to determine a pattern of matches over multiple loci and the reliability of its probability estimates. Nonetheless, in the alternative to my findings about the general acceptance in the scientific community that the F.B.I.’s procedures reach reliable results, I also find that the government has met its burden of showing by a preponderance of the evidence that it can reliably make multi-loci matches and its population estimates are likewise reliable.
Though I discuss, I take no position with regard to issues of admissibility pursuant to Rule 403 of the Federal Rules of Evidence, because a ruling in that regard, in my opinion, depends on evidence that is not of record in this proceeding.
1. Forensic Application of DNA Technology
A. DNA Technology
Attached to this report is a copy of a description, taken from a recent Report of the Congressional Office of Technology Assessment (Exh. 73), of basic DNA structure in the human cell and how the characteristics of that structure are employed for forensic purposes to enable comparisons between a known sample of DNA (typically from a suspect) and an unknown sample (usually collected at a crime scene). The technical steps by which the F.B.I. undertakes to compare DNA samples are also depicted in the attachment.
The human genome is composed of twenty-three pairs of chromosomes containing approximately six billion individual nucleotide bases comprising approximately three billion nucleotide base pairs. Each chromosome consists of two long chains of deoxyribonucleic acid (DNA) linked together by hydrogen bonding between complementary pairs of nucleotide bases. The overall physical structure of the DNA molecule, otherwise called a double helix formation, has been likened to a ladder the sides of which are twisted or coiled along its longitudinal axis.
The complementary bases bond only with each other. That is, among the four bases which comprise the DNA double helix, the base adenine (A) will bind only with thymine (T), and the base guanine (G) will bind only with cytosine (C). Thus where the order of bases on one strand of the DNA molecule is GGACAATGTCAT the order of bases on the corresponding portion of the other strand, i.e. the complement to this string of nucleotide bases, will be CCTGTTACAGTA.
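The pairing rule just described can be illustrated with a short sketch. The Python below is not part of the record; the helper name `complement_strand` is hypothetical, and it assumes a simple position-by-position substitution (A with T, G with C) as in the example above, without reversing strand direction.

```python
# Illustrative sketch only; complement_strand is a hypothetical helper name.
# Each base pairs with exactly one partner: A with T, and G with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRING[base] for base in strand.upper())


# The strand quoted in the text and its complement.
print(complement_strand("GGACAATGTCAT"))  # prints CCTGTTACAGTA
```

Applying the function twice returns the original strand, which reflects the fact that each strand fully determines the other.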
The long strands of DNA, the biomolecular basis of the chromosome, carry the functional unit of heredity, the gene. The genetic information encoded in the human genome contains the essential instructional material for the assemblage and maintenance of biological life. The genetic information contained in the chromosomes is ultimately responsible for the biosynthesis of the thousands of proteins and enzymes which regulate all the minute biochemical functions of the body.
The nucleus of virtually every cell in the body contains a complete copy of a person’s genetic material. Some biological material, e.g. red blood cells, urine, and feces, contains little or no DNA. White blood cells do contain nucleic DNA, thus allowing blood samples to be used in DNA typing.
Most of the DNA belonging to a species is identical. In humans 99% of the genes are the same for all persons, thereby accounting for the abundant shared characteristics of all human beings. Some DNA is, however, different from person to person, population to population, race to race. These differences, which account for our unique characteristics as individuals, as well as the differences between ethnic groups and races, are the result of variation in the base sequences of the genes that encode for these individualizing characteristics. The portions of the genetic material which differ are called polymorphic to indicate that the base sequences that comprise these regions of the genome occur in varying forms.
Just as polymorphic regions of the genome produce physical characteristics in the organism which individualize that organism, so too is the DNA chain itself distinct from person to person. It is this individualized character of the polymorphic regions of the DNA polymer that is the basis upon which the several DNA based genetic identification technologies have evolved.
The technique utilized by the F.B.I. to perform DNA identification testing is referred to as Restriction Fragment Length Polymorphism analysis, or RFLP analysis. The F.B.I. employs the RFLP technology to isolate and analyze regions of the human genome known as Variable Number Tandem Repeats, or VNTRs.
VNTRs are regions of the human genome for which, at least to date, no biological function has been discovered. The physicochemical structure of a VNTR is implied in its name.
VNTRs are regions of the DNA molecule that are composed of segments of nucleotide bases that repeat in tandem many, many times. A base pair sequence that forms one of the many segments of a VNTR that are repeated over and over again is composed of an arrangement of nucleotide bases (for example, AGTTAAGCCGGCAGAGCCT). This sequence of base pairs is bonded to its corresponding complementary segment.
A single segment of a VNTR may be composed of just a few or as many as several dozen nucleotide bases. This sequence of bases (i.e. the segment) constitutes a unit of a VNTR and repeats itself over and over again. The repeated segments or units are positioned one after the other in tandem, like boxcars of a long train where each boxcar represents a single unit of the VNTR and all the boxcars are identical (i.e. composed of the same arrangement of base pairs).
The number of repeated, tandem sequences comprising a VNTR can vary among persons. VNTRs are polymorphic in that not all individuals possess the same number of repeat sequences for a VNTR at a given gene locus. Thus, for example, in a given individual, a VNTR may be composed of only sixty repeat sequences while for another individual that same VNTR may be composed of two hundred repeat sequences. This example is intended for illustrative purposes only in that VNTRs usually occur in two forms in a person’s genome.
RFLP technology, described below, when applied to VNTRs, permits the isolation and identification of the different VNTRs that form part of an individual’s genome by a method that represents and distinguishes the VNTRs by length. Thus individuals can be differentiated from each other using this technology because the RFLP technique enables the molecular geneticist to identify the different forms, i.e. lengths, of VNTRs as they occur in different persons. VNTR based RFLP technology may be used to compare genetic material derived from known and unknown samples to determine whether those samples may have come from an identical source.
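The length differences described above can be sketched in a few lines of Python. This is an illustration only and is not drawn from the record: the 16-base repeat unit and the helper names `build_vntr` and `repeat_count` are hypothetical, the repeat counts of sixty and two hundred are the illustrative figures used in the text, and the sketch assumes a simplification in which the VNTR is measured by itself, ignoring any surrounding DNA.

```python
# Illustrative sketch only; UNIT, build_vntr, and repeat_count are hypothetical.
UNIT = "AGTTAAGCCGGCAGAG"  # assumed 16-base repeat unit, not from the record


def build_vntr(unit: str, repeats: int) -> str:
    """Place identical units one after another, like boxcars in a train."""
    return unit * repeats


def repeat_count(vntr: str, unit: str) -> int:
    """Recover the number of tandem repeats from the VNTR's length."""
    if not unit or len(vntr) % len(unit) != 0:
        raise ValueError("sequence is not a whole number of repeat units")
    count = len(vntr) // len(unit)
    if unit * count != vntr:
        raise ValueError("sequence is not a pure tandem repeat of the unit")
    return count


person_a = build_vntr(UNIT, 60)   # sixty repeats, as in the text's example
person_b = build_vntr(UNIT, 200)  # two hundred repeats

# The same unit repeated a different number of times yields different lengths.
print(len(person_a), len(person_b))
print(repeat_count(person_a, UNIT), repeat_count(person_b, UNIT))
```

Because every unit is identical, the two sequences differ only in length, which is the property a length-based comparison relies on.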
i. Determining a Match
As depicted in the attached materials from the OTA Report, the following steps are undertaken in a forensic laboratory to determine whether one sample of DNA matches with another.
1) DNA is extracted from a sample of biological material like blood or semen. The specimen is dissolved in a solution that breaks down unwanted chemical contents and allows the DNA to be separated from the remaining biological materials. The DNA is precipitated out of solution.
2) The resultant DNA is then digested by an enzyme called a restriction endonuclease. The enzyme reacts with specific sequences of nucleotide bases. The effect of this enzyme on the DNA is to cut it at specific sites, producing numerous DNA fragments of varying lengths. Different enzymes will cut the DNA at different locations depending on the sequence of bases the enzyme reacts with. The resultant DNA fragments vary in length from a few base pairs up to several thousand base pairs.
3) The residual mixture of DNA fragments is then separated by size in a procedure known as gel electrophoresis. During electrophoresis, a solution of DNA fragments is placed at one end of a thin slab of a gelatinous semi-solid like agarose, and an electrical current is applied. DNA fragments are negatively charged molecules. When the current is activated the DNA fragments move across the gel toward the positively charged end. The distance these fragments travel, while subject to influence from other factors, is primarily a function of the fragment’s size, mass, and electrical charge. The larger and heavier fragments *171 move more slowly, and thus, a shorter distance than the smaller, lighter fragments. The end result of electrophoresis is that DNA fragments are arrayed across the gel according to fragment size. The longer fragments are located closer to the top of the gel, the shorter fragments toward the bottom. After the electrophoretic separation process is completed the DNA is denatured and neutralized. Denaturation of DNA occurs when the original double helix structure is “unzipped” and the two complementary strands of DNA are separated.
4) Because the gel on which the DNA fragments have been separated is relatively unstable and is not a convenient medium for permanent storage, the array of DNA fragments is transferred to a more stable matrix. This process is known as Southern transfer or Southern blotting. The arrayed DNA fragments are caused to move by capillary action from the gel onto a more stable nylon membrane. A buffered electrolyte solution is placed beneath the gel, and an absorbent material is placed over the nylon membrane, which is sandwiched between the gel and the absorbent material. As the solution is absorbed upwards the DNA fragments are carried onto the nylon membrane.
5) The DNA is then hybridized to a radioactive probe. Hybridization is a process in which the single strands of DNA bind to complementary sequences to reform the double helix structure. Hybridization in effect “zips” the DNA molecule back into its original double helix form. The probe hybridizes to only those DNA fragments which contain base sequences complementary to the base sequences of which the probe is composed. The DNA probe is a segment of DNA cloned by recombinant DNA technology to produce thousands of identical sequences. Probes are radioactively labeled with an isotope of phosphorus. Probes differ according to their size and the composition of the repeating base sequences. Several different types of probes are available. The F.B.I. employs single locus probes. Such probes isolate and hybridize to polymorphic regions of the human genome that occur at only one locus of the genome. After hybridization, the nylon membrane is washed to remove excess, unbound probes.
6) The probe hybridized membrane is then exposed to a piece of X-ray film in a process known as autoradiography. The radioactive phosphorus in the probe will react with the film, serving to locate the DNA fragments to which the probe had hybridized. The X-ray film will not react where there is no radioactive probe; therefore the location of fragments that have not been hybridized by the probe will not be indicated on the film.
7) The final step in the process is interpretation. Interpretation is done visually or with the assistance of a computer imaging and measuring system. The primary function of interpretation, whether visual or computer assisted, is to assess the quality of the final product and, most importantly, to compare DNA band patterns from known and unknown samples to determine if they are in alignment. Ultimately, the interpreter declares that it is likely that a known sample and an unknown sample come from an identical source, do not come from an identical source, or that the results of the tests are inconclusive.
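The digestion and size-separation steps described above can be sketched in a few lines of code. This is a simplified illustration only, not the F.B.I.'s laboratory protocol: the recognition site `GGCC`, the cut position, the repeat unit `AGTCA`, the flanking bases, and the repeat counts are all hypothetical values chosen for the example. It shows why two forms of the same VNTR, cut at the same flanking sites, yield fragments of different lengths.

```python
def locus(repeat_count, unit="AGTCA", site="GGCC"):
    """Build a toy DNA strand: a VNTR flanked by one recognition site on each side.
    The unit, site, and flanking bases are hypothetical."""
    return "TT" + site + unit * repeat_count + site + "AA"


def digest(sequence, site="GGCC", cut_offset=2):
    """Cut the strand inside every occurrence of the recognition site and
    return the resulting fragments in their original order."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def gel_order(fragments):
    """Fragment lengths from the top of the gel (longest) to the bottom (shortest)."""
    return sorted((len(f) for f in fragments), reverse=True)


# Two forms of the same VNTR that differ only in the number of repeats.
print(gel_order(digest(locus(3))))  # [19, 4, 4]
print(gel_order(digest(locus(8))))  # [44, 4, 4]
```

In this toy model only the fragment containing the VNTR changes length between the two forms, which is the property that the probe and autoradiograph later make visible.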
ii. Probability Estimate
The F.B.I. uses the following method to estimate the probability that a person picked randomly from the population would have a DNA profile identical to the DNA profile generated from the forensic sample.
First, the Bureau developed a table of allele frequencies. The frequencies of the alleles corresponding to the DNA sample that is being tested are then determined by reference to this table. Finally, the frequencies of the individual alleles from the DNA samples are multiplied together according to the method of calculation developed by the F.B.I. to compute an aggregate estimate of the probability that the combination of alleles found in the sample DNA would be encountered in the Caucasian population.
The F.B.I. uses what it has called the fixed bin method to construct its table of allele frequencies. The F.B.I. ran DNA profiles for approximately 225 randomly *172 chosen agents. Each agent was profiled for the five or six probes the F.B.I. uses, or had intended to use, in its casework. Relative to each probe, the allele or alleles resulting from the profile run on each agent were assigned to a predetermined bin.
The bins, as they are called by the F.B.I., were established with reference to the size markers that were run with each test. The size markers, or sizing ladders as they are also called (which are also, according to F.B.I. protocol, run with all casework tests) are commercially available solutions composed of DNA fragments of known, predetermined fragment lengths. The size markers appear on the final autorad as an array of bands relatively evenly distributed along the length of the gel. Because the fragments of the size markers are of known lengths, the size of an individual’s DNA bands can be determined by comparison to the band of known length on the sizing ladder that is nearest in location to the sample band of unknown length.
In the fixed bin procedure, the size markers define the boundaries of the bins. The frequencies associated with the bins were established by assigning the bands generated from the profiles of the F.B.I. agents to the bins into which the bands fell. After all the profiles of the agents were completed, and the bins into which their bands fell were determined, the total number of bands located in each bin was counted. As to each probe, the frequency for each bin was calculated by the simple procedure of dividing the total number of bands located in a bin by the total number of bands resulting from the profiling of all the agents tested for that probe.
The F.B.I. applied a standard statistical safety measure used in data collection studies of this sort by “collapsing” into each other, bins that contained fewer than five occurrences. That is, if a particular bin displayed fewer than five bands it would be merged into the adjacent bin and this process was continued until there was a total of at least five bands. The resultant bin would be larger in size. The frequency associated with this bin would be calculated in the same manner described above.
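The construction of the bin frequency table described above can be illustrated with a minimal sketch. The function names, the bin boundaries, and the band counts are hypothetical. The text does not say which neighboring bin receives a sparse bin, so the sketch assumes that a bin with fewer than five bands is merged with the next larger bin, and that a sparse final bin is folded into the bin before it.

```python
from bisect import bisect_right


def count_bands(boundaries, band_sizes):
    """Count bands per bin; bin i covers boundaries[i] <= size < boundaries[i + 1]."""
    counts = [0] * (len(boundaries) - 1)
    for size in band_sizes:
        i = bisect_right(boundaries, size) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    return counts


def collapse_bins(boundaries, counts, minimum=5):
    """Merge sparse bins with a neighbor until every bin holds at least `minimum` bands.
    Returns a list of [low, high, count] entries."""
    merged = []
    for low, high, n in zip(boundaries, boundaries[1:], counts):
        if merged and merged[-1][2] < minimum:
            # Assumption: the sparse bin on the left absorbs the next bin.
            merged[-1][1] = high
            merged[-1][2] += n
        else:
            merged.append([low, high, n])
    # A sparse final bin has no larger neighbor, so fold it into the previous bin.
    while len(merged) > 1 and merged[-1][2] < minimum:
        _, high, n = merged.pop()
        merged[-1][1] = high
        merged[-1][2] += n
    return merged


def bin_frequencies(merged):
    """Frequency of a bin = bands in that bin / all bands observed for the probe."""
    total = sum(n for _, _, n in merged)
    return [(low, high, n / total) for low, high, n in merged]


boundaries = [0, 1000, 2000, 3000, 4000]   # hypothetical size-marker positions
counts = [2, 6, 3, 9]                      # hypothetical band counts per bin
print(bin_frequencies(collapse_bins(boundaries, counts)))
# [(0, 2000, 0.4), (2000, 4000, 0.6)]
```

Because merged bins are wider and hold more bands, collapsing can only raise the frequency assigned to a band that falls in a sparse region.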
Once the bin frequencies are established for the probes used in casework, these numbers, which are referred to as the “Caucasian database,” can be applied to determine the estimate of the probability that a person picked randomly from the population would have a DNA profile identical to the DNA profile generated from the forensic sample. The first step in this process involves identifying the bins in which are located the various bands that comprise the sample’s DNA profile. This “binning” process involves a visual and computer assisted assessment of the location of the band on the autorad.
When a casework band is located between two adjacent size marker bands, that casework band is said to lie in the bin defined by those two adjacent size marker bands. The frequency assigned to the casework band is the frequency that had previously been determined for that bin through the frequency study described above.
If a casework band is found to lie on the border of two bins, the F.B.I. deems that band to belong to the bin that has the higher frequency, i.e. the more common bin. This, according to the F.B.I., is a conservative measure that favors the defendant because the subsequent calculations will use a number that is of a greater magnitude, thereby raising the final frequency and diminishing the degree of rareness associated with the occurrence of that DNA profile in the population.
Because the F.B.I. has ascribed a size range to a band (i.e. +/- 2.5%), a casework band is said to fall on the border of a bin where any portion of the range of possible sizes for the band within the plus or minus two and one-half percent window falls on a bin border. In that case also the frequency assigned to that band would be the higher of the frequencies of the two bins in which that casework band lay.
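The border rule just described can be sketched as a lookup. The function name, the bin table, and the band sizes below are hypothetical, and the sketch assumes that the plus or minus 2.5% window is computed as a fraction of the measured band size. A band whose window touches more than one bin is assigned the largest frequency among the bins it touches.

```python
def band_frequency(size, bins, window=0.025):
    """Return the bin frequency assigned to a casework band.

    `bins` is a list of (low, high, frequency) tuples. The band is treated as
    the size range size*(1 - window) .. size*(1 + window); if that range
    touches several bins, the highest (most common) frequency is used.
    """
    low_size = size * (1 - window)
    high_size = size * (1 + window)
    touched = [f for low, high, f in bins if low <= high_size and low_size <= high]
    if not touched:
        raise ValueError("band lies outside every bin")
    return max(touched)


# Hypothetical bin table for one probe.
bins = [(1000, 2000, 0.10), (2000, 3000, 0.25), (3000, 4000, 0.05)]

print(band_frequency(1500, bins))  # 0.1  (window lies wholly inside the first bin)
print(band_frequency(2980, bins))  # 0.25 (window crosses the 3000 border)
print(band_frequency(3500, bins))  # 0.05
```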
After the frequencies for the various bands are determined, the overall frequency for the DNA profile is calculated. In *173 performing this calculation frequencies must be calculated one probe at a time. Thus, the frequency of the band or bands identified by the first probe is determined, and the overall frequency associated with that probe is determined. Then the frequency of the band or bands for the second probe is ascertained, and the overall frequency associated with the second probe is calculated, and so on. The frequencies of all the different bands comprising the sample’s DNA profile are not, however, all individually multiplied together.
When done correctly the DNA profile at a single probe will display either one or two bands. Those profiles that display only one band are called homozygotes (which simply means that the polymorphic form of the VNTR identified by the probe was the same in both parents of the person from whom the sample was obtained). Those profiles that display two bands are called heterozygotes (which means that the polymorphic forms of the VNTR identified by the probe were different in the two parents of the person from whom the sample was obtained).
For probings that display a heterozygotic, i.e. two banded, pattern the F.B.I. calculates the frequency of occurrence for that particular band pattern at that probe using a formula derived from classical Mendelian genetics, 2PQ, where “P” is the bin frequency associated with one of the bands and “Q” is the bin frequency associated with the other band.
For probings that display a homozygotic, i.e. single banded, pattern the F.B.I. calculates the frequency of occurrence for that band pattern at that probe using a formula modified from the one normally associated with determining the frequency of homozygotes in classical genetics. Traditional genetics would calculate the frequency for a homozygote with the formula of P × P, i.e. P squared. Because of a variety of considerations, related to issues involving whether single banded patterns are true homozygotes, issues too complex to address in the present context, the F.B.I. decided to apply a formula to calculate the frequency of single banded patterns that would compensate for these various problems and would protect the defendant. Based on these considerations, the F.B.I. chose to use the 2P formula when calculating the frequency of occurrence of a single banded pattern, where “P” is the bin frequency associated with that band.
Finally, after the frequencies of occurrence for the profiles of the different probings are calculated, the aggregate frequency for the overall DNA profile can be computed by simply multiplying together the frequencies determined for the several probings. The reason that these frequencies associated with the different loci (as the area of the gene identified by a probe is called) can be multiplied together is that the occurrence of the genetic events at any one locus is considered to be independent of the occurrence of the genetic events at any other locus. Given this assumption, one of the most rudimentary principles of probability theory then follows: that the frequency of occurrence of independently occurring events may be multiplied by one another to determine the frequency of occurrence of the aggregate of those events.
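The arithmetic described above (2PQ for a two banded pattern, 2P for a single banded pattern, then multiplication across probes) can be illustrated with a short sketch. The function names and the bin frequencies are hypothetical and are not drawn from the F.B.I. database. The multiplication step is valid only under the independence assumption stated in the text.

```python
from math import prod


def locus_frequency(bin_freqs):
    """Frequency of the band pattern seen at one probe.

    Two bands (heterozygote): 2 * P * Q.
    One band (apparent homozygote): 2 * P, the modified formula described in
    the text, used in place of the classical P * P.
    """
    if len(bin_freqs) == 2:
        p, q = bin_freqs
        return 2 * p * q
    if len(bin_freqs) == 1:
        (p,) = bin_freqs
        return 2 * p
    raise ValueError("a single-locus probe should show one or two bands")


def profile_frequency(loci):
    """Multiply the per-probe frequencies, assuming the loci are independent."""
    return prod(locus_frequency(bands) for bands in loci)


# Hypothetical three-probe profile: two banded, single banded, two banded.
profile = [(0.10, 0.25), (0.05,), (0.20, 0.10)]
freq = profile_frequency(profile)
print(f"about 1 in {1 / freq:,.0f}")  # about 1 in 5,000
```

In this example the per-probe frequencies are 0.05, 0.10, and 0.04, whose product is 0.0002. Because P is always less than one, 2P is always larger than P squared, so the single band rule can only make a profile appear more common than the classical formula would.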
With this final calculation, the frequency of the sample’s DNA profile has been determined, and an estimate of the probability that a person picked randomly from the population would have a DNA profile identical to the DNA profile generated from the forensic sample can be made.
B. Overview of the Expert’s Testimony
In addition to two employees of the F.B.I. who testified (Dwight Adams, Agent Examiner, and Bruce Budowle, chemist principally responsible for developing the F.B.I.’s protocol and procedures for its DNA forensic casework), the prosecution called three witnesses in its direct case: Dr. Patrick Conneally (Distinguished Professor of Medical Genetics and Neurology at the Indiana University School of Medicine); Dr. Stephen P. Daiger (Professor, Graduate School of Biomedical Sciences at the University of Texas Health Science Center); and Dr. C. Thomas Caskey (Henry and Emma Meyer Chair in Molecular Genetics at the Baylor College of Medicine; Chairman, Advisory Panel, OTA Report, *174 Genetic Witness: Forensic Uses of DNA Tests; Member, National Academy of Sciences Committee on DNA Technology and Forensic Science).
At that point, the defense called three witnesses: Dr. Peter D’Eustachio (Associate Professor, Department of Biochemistry, New York University Medical Center), Dr. Paul J. Hagerman (Associate Professor of Biochemistry, Biophysics and Genetics, University of Colorado Health Sciences Center), and Dr. Richard C. Lewontin (Alexander Agassiz Professor of Zoology and Professor of Population Sciences, Harvard University).
The next witness was Dr. Eric S. Lander, (Associate Professor, The Whitehead Institute for Biomedical Research, Massachusetts Institute of Technology; Member, OTA Advisory Panel, OTA Report, Genetic Witness: Forensic Uses of DNA; Member, National Academy of Sciences Committee on DNA Technology and Forensic Science), who was called as a court’s witness.
The defense then called Dr. T. Conrad Gilliam (Assistant Professor of Neurogenetics, Department of Genetics and Development, College of Physicians and Surgeons, Columbia University) and Dr. Daniel L. Hartl (James S. McDonnell Professor and Head, Department of Genetics, Washington University School of Medicine). The prosecution called Dr. Kenneth K. Kidd (Professor of Human Genetics, Psychiatry, and Biology, Yale University School of Medicine) and recalled Dr. Caskey as rebuttal witnesses. Somewhat more than 200 exhibits were introduced into evidence.
Despite the complexity of much of the evidence, the issues about which the experts testified can be fairly easily described. The first of these issues relates to the F.B.I. protocol and procedures for determining that DNA fragments, generally referred to in the testimony as bands or alleles, from the known and unknown sources match. In order for its DNA evidence to be admissible, the government must show that there is general acceptance in the scientific community with regard to the F.B.I.’s ability reliably to declare matches over several loci.
The principal thrust of the defense challenge regarding that issue involved challenges to the design of the standards for declaring a match, quality of basic scientific work required to ensure that a laboratory can perform its work in a reliable and reproducible manner, adequacy of the F.B.I.’s research into the effects of environmental insults and other outside forces on the DNA fragments that it was testing, failure to implement a program of proficiency testing of its examiners, and ability to perform basic scientific procedures in an acceptable manner.
The government’s experts testified that the F.B.I. met standards of acceptable scientific practice with regard to the various challenges being made by the defendants’ experts. The prosecution experts also testified that, in any event, the flaws to which the defendants’ experts were pointing did not affect the ability of the F.B.I. reliably to declare matches over multiple loci and that the F.B.I.’s ability to do so was generally accepted in the scientific community.
The second issue related to the ability of the F.B.I. to make a reliable and scientifically acceptable estimate of the probability that a match once observed, would be encountered within the American Caucasian population. The government’s witnesses described how its database for such purposes was designed and implemented, and the relationship between the database and its use and pertinent aspects of the protocol and procedures for declaring a match. They contended that the F.B.I. could develop a scientifically acceptable estimate of probability, and that its ability to do so was generally accepted in the scientific community.
The defense experts, along with the court’s witness, Dr. Lander, contended that the basic design of the F.B.I. Caucasian database was flawed because it failed to take into account the likelihood that there is no such thing as an American Caucasian population. Instead, in the view of the defense experts, there was a significant likelihood of “substructure,” whereby the frequency of particular alleles might vary on the basis of the ethnic ancestry of *175 particular subpopulations within the overall American Caucasian population.
Because the impact, in terms of both frequency and magnitude of occurrence, of such substructure on the accuracy of the F.B.I.’s database is unknown, the defense experts asserted that any estimate of probability that might be generated on the basis of the Caucasian database was too speculative to be acceptable scientifically. That speculative quality, the defense witnesses asserted, caused the F.B.I.’s probability estimates to be unacceptable within the general scientific community.
Government experts testified in rebuttal with reference to both the match declaration and probability estimate issues, though the principal focus was on the issue of probability estimates. They contended that the condition of substructure was unlikely to occur, and if it did, it was infrequent, of insignificant magnitude, and as likely to favor the defendant as it was to cause prejudice. The government’s witnesses asserted in both direct and rebuttal that “conservative” (i.e. defendant favorable) aspects of the fixed bin structure and procedures compensated for any effect of substructure. They asserted that the F.B.I.’s method for estimating probabilities was generally acceptable in the scientific community.
1. Determining a Match
With regard to the general acceptance and reliability of the F.B.I.’s application of DNA identification technology to forensics, Dr. Conneally testified that it was acceptable to apply the theories and methods of DNA testing to the forensic arena (IVa 74-5). He asserted that it was acceptable for different laboratories to apply different methods and techniques (IVa 82-3) and stated that using a quasi-continuous allele system with VNTRs was an acceptable practice in forensics (IVa 84-5).
When asked whether there was an unacceptable level of subjectivity involved in the forensic application of DNA technology, especially with respect to the declaration of a match, Dr. Conneally responded that he believed there was not (IVa 75). In his view, the RFLP technique could be applied in a reliable manner in the forensic setting using the F.B.I. probes: “I have read about them [the probes] in the literature, and I believe you can [use them in forensics], yes” (IVa 86-9). Dr. Conneally stated that he had never heard of an instance of a “false positive,” where a person was wrongly identified in a DNA test (Vb 211, 268-69).
After opining that the RFLP technique applied to VNTRs was a reliable technique that was generally accepted in the relevant scientific community, Dr. Daiger concurred with the statement that the technique was basically the same technique employed by the forensic scientist (VIa 38). Furthermore he stated, “in my opinion the forensic applications of DNA laboratory procedures and DNA concepts are simply a specific application of a very broad general category of techniques and concepts which are used throughout the scientific community” (VIa 39). Dr. Daiger also stated that he believed that this opinion was shared by the relevant scientific community (Id.).
Dr. Daiger testified that he was quite familiar with the F.B.I. laboratory and with the written protocols and controls used there (VIa 40-42). When asked whether the F.B.I.’s procedures could result in a DNA pattern that appeared to be a true pattern but was actually different from the correct pattern, Dr. Daiger said:
I don’t see a credible biological or laboratory scenario, short of say sabotage, that would lead to a false positive in the laboratory. Essentially, all of the kinds of damage, degradation, laboratory mishaps that relate to samples would lead to either an [inconclusive] or in fact a false [exclusion] (IVa 42-43).
When again asked about the possibility of a false positive he stated, “I think it’s virtually nil” (IVa 43).
The F.B.I., Dr. Daiger said, was using appropriate protocols and standards “with great conservativism and caution, and highly reliably, yes” (IVa 48-9). Dr. Daiger responded affirmatively to a question from the Court about the reliability of the F.B.I. laboratory (IVa 50).
*176 With respect to the criteria used at the F.B.I. for declaring a match, Dr. Daiger stated that he believed that they were “cautious in their interpretation” (IVa 52). In his view, the nature of the subjective judgment involved in DNA forensic analysis was perfectly acceptable (IVa 52). After testifying about his familiarity with the tests and procedures utilized by the F.B.I. to establish its match criteria Dr. Daiger stated, “the empirically derived match criteria developed by the F.B.I. has indeed been generated in an appropriate and scientifically conservative manner” (IVa 62). Regarding the environmental insult studies performed by the F.B.I. Dr. Daiger testified:
I think the conclusion is essentially for all of the environmental insults that have been described, one of three things can happen. Either it has no effect on the outcome of the analysis, or it leads to essentially the difference [destruction] of the DNA in its entirety. Or under some circumstances it leads to a pattern on the gel which is so obviously distorted and inappropriate that it leads to an [inconclusive] (IVa 63).
Dr. Daiger stated that he had never seen an example of a false positive (IVa 64). He responded negatively when asked whether there was anything about a quasi-continuous allele system that would suggest that it was unreliable (IVa 78).
When asked on cross-examination about the match window chosen by the F.B.I., Dr. Daiger stated that the window was selected to be “extremely generous” and “conservative” to the defendant (VIIa 12-3). In response to a question posed to him by the court, Dr. Daiger said that the +/- 2.5% match window used by the F.B.I. was conservative with respect to potential for prejudice to the defendant (VII 14-5).
On cross-examination Dr. Daiger pointed out that even if an F.B.I. examiner would conclude from a visual inspection that bands matched, if the subsequent computer assisted quantitative method indicated that the bands did not fall within the F.B.I.’s matching window, the Bureau would not declare a match but would declare it uncertain (VIIa 26). Dr. Daiger stated that the number of articles published in scientific journals that address the application of DNA in a forensic setting is on the order of two dozen (VIIb 223).
Dr. Caskey testified, when asked to compare the RFLP process in medical diagnostics with the process in forensics, that:
The procedures that are used in arriving at the answer, the result, are identical. There’s no variation that exists in the fundamental process. However in the case of forensics, what one is doing is developing a generic test system which has a high [out]put and great simplicity and high reproducibility ... (IX 242).
Dr. Caskey indicated that he was not troubled by the fact that different forensic DNA laboratories utilize different match criteria (IX 269).
In Dr. Caskey’s view, there was nothing unusual in the fact that the match criteria used at his forensics laboratory differed from the match criteria used by the F.B.I. (IX 270).
With regard to questions posed concerning the F.B.I.’s use of molecular weight size markers, human cell line controls, yield gels, test gels, and other quality control measures used by the F.B.I., Dr. Caskey asserted:
Its my ... opinion that the F.B.I. has set up a ... very safe and conservative system and that the quality controls that they’ve put in place at the laboratory level are very adequate (IX 279).
Dr. Caskey stated that the F.B.I. used appropriate match criteria and that the size variations that the F.B.I. obtained when they did their repetitive sampling were about the same as those obtained at his laboratory (IX 279-80). According to Dr. Caskey, the concern about false positives was greater than necessary, and the only real source of potential false positives to be concerned with was possible human error, not system error (IX 286).
Dr. Caskey asserted that the theory of DNA forensic profiling is generally accepted in the scientific community and there are reliable procedures to implement that theory that have been generally accepted *177 (IX 290). Dr. Caskey further commented that the F.B.I.’s method constituted an example of a generally accepted implementation of the theories underlying forensic DNA profiling (IX 290, 291).
In response to questions posed on cross-examination regarding the differences between the diagnostic and the forensic applications of DNA technology, Dr. Caskey stated that he believed that the “methods that are employed ... overlap considerably,” that there was a “greater simplicity” in the forensic application because it is “highly repetitive” and “narrower in its scope of technology” (Xa 22). He disagreed that forensic applications were more demanding to interpret (Xa 22).
With regard to the use of highly polymorphic VNTRs for forensic application, Dr. Caskey stated that they were selected for that reason (Xa 38). The various conditions imposed on DNA technology, especially those resulting from limited quantities of crime scene samples, do not in his view, call for compromises in the methods used to perform the tests. He stated, it is, “not [a] compromise, but the most optimized analytic method to be able to give you a result from that precious sample” (Xa 45).
During cross-examination, the defense attorneys posed a series of questions to Dr. Caskey concerning the potential effects of “expectation bias” on the execution and interpretation of forensic DNA profiling. When the Court asked for clarification on this issue the witness responded, “there’s no way to jimmy the system to get an expected result” (Xa 55). In response to a series of questions presented to him on cross-examination addressing an issue characterized by the defense as the F.B.I.’s resistance to permitting its DNA laboratory to be evaluated by external agencies through blind external proficiency testing, Dr. Caskey thought that the F.B.I.’s response to such efforts was one of “caution rather than resistance” (Xa 110).
During cross-examination, when Dr. Caskey addressed the issue of band shifting, the Court asked, “in other words, is the risk of error in interpretation from band shifting a more likely phenomenon in terms of declaring no match rather than match,” to which the witness responded, “That’s absolutely the case” (Xa 188). Dr. Caskey elaborated on this answer by remarking that the phenomenon of band shifting “works to the defendant’s advantage” (Xa).
The match criteria used by the F.B.I., in comparison to his own, Dr. Caskey related, were “surprisingly not terribly different” (Xa 198), and what difference there was was not very big (Xa 210-11).
In answer to a series of questions about the potential problems of environmental insults, band shifting, and false matches or exclusions Dr. Caskey stated:
Well, you know, I think what you’re doing here is really hammering very hard on gel shifting, which turns out to be a relatively low frequency event, and one which is not in any way fooled—I mean gel shifting does not influence the pattern of the occurrence of the molecular weights across probes (Xa 223).
When asked by the Court what would be his opinion as to the general acceptance by the scientific community of the F.B.I. laboratory’s DNA processes and the reliability of their results if the Bureau’s validation studies were to be discredited, Dr. Caskey stated that he believed that the F.B.I.’s processes would not be invalidated because they have additional ways to execute internal controls on their casework (Xa 318-21). Dr. Caskey expected there to be improvements in DNA forensic technology in the future, but that the likelihood of future improvements did not invalidate processes being used presently (Xa 327). The lack of standardized certification and accreditation procedures for forensic laboratories did not mean, in his view, that there should be a moratorium imposed on DNA forensic profiling (Xa 339).
Dr. D’Eustachio, the first defense witness, submitted an expert’s report in which he evaluated the F.B.I.’s validation studies on environmental insults and mixed body fluids and experiments from which the F.B.I. derived its quantitative matching rule (Exh. 44). The substance of Dr. D’Eustachio’s testimony at the hearing is *178 contained in his expert’s report. Dr. D’Eustachio’s opinions were primarily based on his examination of the F.B.I.’s environmental insult validation studies (Exh. 17), the F.B.I.’s fixed bin paper (Exh. 13), twenty-four autorads and the corresponding laboratory notebooks generated by experiments performed by the F.B.I., and, of course, his own knowledge and expertise as a molecular biologist.
Dr. D’Eustachio cataloged a variety of problems he observed with the F.B.I.’s validation studies on environmental insults and mixed body fluids: multiple gels were scored as successes even though the relevant positive control tracks failed; the sizing standards used by the F.B.I. in casework were not used in these studies, thereby undermining claims as to reproducibility in forensic work; on two occasions band shifts were ignored; most experiments were conducted using only a single probe, thereby suggesting that these studies are inconclusive with respect to other probes because each probe is distinctive. Dr. D’Eustachio concluded that the F.B.I. had not developed a reliable and sensitive procedure for identification of forensic DNA specimens and that the validation procedures were badly flawed (Defendant’s Exh. 44, 6-7).
As to the environmental studies, after noting various discrepancies, Dr. D’Eustachio concluded that the F.B.I. had serious problems with reproducibility. He stated that the F.B.I.’s methods and conclusions evince a flawed scientific procedure (Id. 8). With respect to the F.B.I.’s study of the effects of chemical contaminants he stated that the experiments produced some unexpected results and that these could have been resolved by conducting additional experiments (Id. 10). With respect to the bacterial and yeast contamination studies his conclusions paralleled the conclusions he drew from his examination of the F.B.I.’s study of chemical contaminants (Id. 11). He stated that only portions of the experiment were of adequate quality to be interpreted (Id. 13). Concerning the mixed body fluid studies he opined that the reproducibility of band intensity between replicate tracks was poor, there was significant band shifting with some of the samples, and on occasion extra bands appeared (Id. 13-14). As to the overall quality of the study he concluded that the problems of failed controls, inadequate sizings, failure to assess all probes, and significant misinterpretation or misreporting indicated that the effects of environmental insults and mixed body fluids were unresolved.
Dr. D’Eustachio expounded on the F.B.I.’s quantitative match criteria. He described the standards that a forensic DNA laboratory had to meet in formulating its match criteria as: 1) understanding the factors that alter band migration, and 2) choosing a match window that does not exceed an acceptable level of risk of false positives (Id. 16). Dr. D’Eustachio asserted that the F.B.I. failed to meet either standard (Id.). He referred to the change in the size of the match window used by the F.B.I., the unreliability of the data, the size of the match window compared to the match windows of other forensic DNA laboratories, and the generally poor quality of the laboratory as reasons for his refusal to validate the F.B.I.’s match window (Id.).
Dr. D’Eustachio indicated that the comparison of the data underlying Table III of the Fixed Bin paper and the data he had received from the F.B.I. showed that the F.B.I.’s match window is not a reliable and reproducible basis for making a match (Id. 17). He analyzed the data underlying Table III and the data from the casework autorads provided by the F.B.I. to assess the quality of work done by the F.B.I. in establishing its match window.
Dr. D’Eustachio concluded that there was a significant measurement bias in the data underlying the fixed bin paper and a significant measurement bias in the data underlying the casework autorads he had received from the F.B.I. He also observed that, significantly, the measurement biases from these two sets of data ran in opposite directions. His analysis led him to conclude that under the conditions used by the F.B.I. there were anomalies in the way bands migrated on the gels and that the factors affecting band mobility are not understood (Id. 18-21).
*179He also commented on the size of the F.B.I.’s match window in comparison to the match windows used by other forensic DNA laboratories (Id. 21-25). Finally, Dr. D’Eustachio concluded that the methods and studies used by the F.B.I. to develop its matching criteria did not reflect good science and that the various studies should be done again and done right (Id. 26).
Dr. Hagerman initially reviewed the quality and reliability of the F.B.I.’s DNA laboratory by addressing the issues of DNA loading variability and the use of ethidium bromide. Like Dr. D’Eustachio, Dr. Hagerman submitted an expert’s report that embodies the testimony he gave at the hearing. Dr. Hagerman stated that the problems that he observed with the F.B.I.’s ability to reliably quantitate DNA and the F.B.I.’s use of ethidium bromide in its analytic gels seriously compromise the reliability of the F.B.I.’s casework and population database analysis (Hagerman, Exh. BBB). Dr. Hagerman stated that the procedures followed by the F.B.I. for isolating and quantifying DNA preclude accurate determination of the amount of DNA present in sample extracts (Id. 2). He also stated that the use of ethidium bromide in the F.B.I.’s analytic gels causes unpredictable effects on the mobility of DNA fragments which in turn compounds the problems that the F.B.I. can have with band shifting. These factors, he stated, make the F.B.I.’s DNA system unreliable (Id.).
Dr. Hagerman observed that variations in F.B.I. DNA loading mass commonly approach a ten-fold variation in samples with purportedly the same amounts of DNA (Id. 4). He alluded to various sources for the errors that could lead to the observed variations in loading mass (Id. 5). In his view, the F.B.I. protocols should incorporate some additional means for assessing the amount of nonhuman DNA that might be mixed in with the forensic sample (Id. 6). He concluded that the variation in DNA loading mass, both in conjunction with the effects of ethidium bromide and independent of the effects of ethidium bromide, leads to differential band shifting (Id. 6).
Dr. Hagerman also described and analyzed the band shifting effects of ethidium bromide (Id. 7-13). He asserted that the F.B.I. did not adequately understand the band shift problem caused by ethidium bromide (Id. 13). He stated that the most serious cause of ethidium-related band shifting, and one that makes addressing the problem difficult, if not impossible, is the inability of the F.B.I. to accurately determine DNA concentration (Id. 14). He also cited other sources of error in the F.B.I.’s own ethidium bromide experiments including loading mass inaccuracy, the unnecessary use of increased amounts of restriction endonuclease, and the persistent interpretation of autorads that displayed heavily overexposed bands (Id. 15).
Dr. Hagerman drew the following conclusions: 1) the presence of ethidium bromide in agarose gels represents an unacceptable complicating factor in the F.B.I.’s forensic DNA analysis; 2) bacterial contamination is likely in forensic samples leading to unexpected band shifting; and 3) the use of ethidium bromide in the F.B.I.’s analytic gels is unjustified (Id. 16-17).
Dr. Hagerman indicated that the unpredictability of band shifting at the F.B.I. laboratory adversely affects the population database work, thereby undermining the reliability of these studies. He stated that the autorads comprising the population database are of poor quality, and show many faint bands that are difficult to interpret, lanes in which it is difficult to determine if the probing identified a homozygote or a heterozygote, band positions that are difficult to assess or assign a location to, “doublets,” or closely spaced bands, that could easily be mistaken for single banded patterns, and instances of extra bands (Id. 18-20). Finally, Dr. Hagerman stated that the agarose gels themselves could constitute an additional cause for altered DNA mobility and band shifting if the agarose was not concentrated uniformly throughout the gel (Id. 20-21).
Dr. Gilliam stated, in discussing the various differences that distinguish the *180discrete allele systems generally used in medical diagnostics from the quasi-continuous allele systems used by forensic laboratories, that the forensic laboratories were struggling to come up with a matching rule and that the task of the forensic DNA scientist is “different from the gene mapping community” (XVIII 42). He considered the problem of developing quantitative match criteria to be one that has not been dealt with by the medical genetics community, stating, “it’s only come up in forensic laboratories” (XVIII 42).
Addressing the development of matching rules by the various forensic laboratories Dr. Gilliam concluded that the proponents of the forensic application of DNA technology are, in using a quasi-continuous allele system, taking DNA electrophoresis methods about as far as they can go (XVIII 44), and stated that it was a “very technically demanding problem” (XVIII 44).
Dr. Gilliam agreed with Dr. D’Eustachio that the change in direction of the bias observed when the Fixed Bin Table III data was compared to the data the F.B.I. had supplied Dr. D’Eustachio was quite disturbing (XVIII 46-50). Proper environmental insult studies are important, especially for forensic laboratories doing DNA profiling using quasi-continuous allele systems (XVIII 51-2). Dr. Gilliam indicated that he felt that more validation studies should be done (XVIII 54-5). He also regarded the discrepancies between the old and new Caucasian databases to be “unacceptable” and indicated that it only causes confusion with casework (XVIII 60).
On cross-examination Dr. Gilliam indicated that the larger F.B.I. match window would increase the statistical likelihood that a match would be declared (XVIII 76). He did state, however, that, “there could be more than one matching rule that applies to a given set of data. In fact, maybe there should be” (XVIII 79).
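The relationship Dr. Gilliam described between the size of a match window and the likelihood of a declared match can be illustrated with a minimal sketch. It assumes, purely for illustration, that two bands are declared a match when their measured sizes differ by no more than a stated percentage of their mean size. The function names, the window values, and the fragment sizes are hypothetical and are not drawn from the record or from any laboratory's protocol.

```python
def bands_match(size_a, size_b, window_pct):
    """Declare a match when two measured fragment sizes (in base pairs)
    differ by no more than window_pct percent of their mean size.
    This matching rule is an illustrative assumption."""
    mean_size = (size_a + size_b) / 2.0
    return abs(size_a - size_b) <= mean_size * window_pct / 100.0


def count_matches(pairs, window_pct):
    """Count how many pairs of measurements are declared matches."""
    return sum(1 for a, b in pairs if bands_match(a, b, window_pct))


# Hypothetical pairs of band measurements (known sample, questioned sample).
PAIRS = [(2000, 2010), (2000, 2050), (2000, 2090), (2000, 2200)]

narrow = count_matches(PAIRS, 1.0)  # matches declared under a 1% window
wide = count_matches(PAIRS, 5.0)    # matches declared under a 5% window
print(narrow, wide)
```

On these invented measurements the wider window declares more matches than the narrower one, which is the direction of the effect the witness acknowledged; the sketch says nothing about which window size is appropriate.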
Dr. Gilliam was cross-examined about certain statements he had made in the Castro case concerning the effects and detectability of DNA degradation where he had said that it was relatively easy to distinguish an autorad lane containing degraded DNA from one that contained DNA that had not degraded. In response to being confronted with his earlier testimony he explained:
I guess what I was trying to say is there are times when you might not be able to make that distinction, and that’s ... where I stand today. If anything, I’m more circumspect now than I was at Castro. I’ve seen greater things happen to DNA molecules on gels than I thought would have happened. I think it’s basically ... the Castro case testimony is right. You can distinguish degraded from undegraded when you’re probing with a marker, but I certainly wouldn’t say you always can, and I wouldn’t be surprised if someone showed me a difference due to degradation (XVIII 85).
On re-direct Dr. Gilliam asserted that he considered ethidium bromide caused band shifting to be a problem that could have an unpredictable effect on casework, and that the validation studies addressing these problems have not yet been done (XVIII 103). Dr. Gilliam concluded by asserting that he was sure that investigators could discover probes that identified discrete alleles and that a forensically useful DNA identification technology could be developed based on a discrete allele system and this “would put the forensic scientist laboratories back into the realms of established technology, and it would eliminate, if this ... line of experimentation proved successful, ... a lot of problems, matching rules and binning systems that they now have to deal with” (XVIII 112).
ii. Probability Estimate
Once the observation has been made that there is a match between the known (i.e., suspect’s) and questioned (i.e., crime scene) samples, the significance of that determination must be ascertained and expressed to the jury. As described above, this is accomplished by computation of an estimate of the likelihood that the particular set of matching patterns would be found in the population.
Most commentators consider the ability to express this probability to be crucial to the admissibility of DNA-derived evidence: *181“without being informed of such background statistics, the jury is left to its own speculations.” McCormick, Evidence, 655 (Cleary ed.). During the evidentiary hearing in this case, the parties implicitly assumed, and witnesses testified, that a probability estimate is an essential prerequisite to the admissibility of DNA evidence. (Lander XVII 62-66, 155 (“no numbers, no knowledge”)).
Dr. Caskey, testifying for the prosecution, likewise underscored the necessity for a probability estimate:
I think if you have a match over three specific probings, that that’s a very informative match, so ... one has to pay attention to it. Now you have to look at it in detail, and if you go back to your population database and you find that by occurrence each of the matches you got were incredibly common alleles, then you say, “Wait a minute,” you know. Three—three-probe match is an interesting match, but it certainly doesn’t give us a high power number. Therefore it could occur by chance, and I’m not going to quote numbers but give you the feel for it.
Now let’s take the circumstance in which we go back to the database and we say, “Well, we were quite fortunate here in that each of the matching alleles that we’ve identified are relatively rare alleles.” Now the match significance becomes quite high, and so just to get a match without taking into account its numerical significance, I think ... I’m unwilling just [to] look at a match without numbers. (IX 287-88).
Without the probability assessment, the jury does not know what to make of the fact that the patterns match: the jury does not know whether the patterns are as common as pictures with two eyes, or as unique as the Mona Lisa.
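The contrast Dr. Caskey drew between a match on common alleles and a match on rare alleles can be shown with a short arithmetic sketch. It assumes the textbook form of the computation: a two-band pattern at one locus is assigned the frequency 2pq, and the per-locus figures are multiplied across loci on the assumption that they are independent. All allele frequencies below are hypothetical and are not taken from any database in the record.

```python
from math import prod


def locus_frequency(p, q):
    """Estimated frequency of a two-band pattern at one locus (2pq),
    where p and q are the estimated frequencies of the two alleles."""
    return 2.0 * p * q


def profile_frequency(loci):
    """Multiply the per-locus estimates across loci.
    This step assumes the loci are statistically independent."""
    return prod(locus_frequency(p, q) for p, q in loci)


# Hypothetical three-probe matches.
COMMON = [(0.30, 0.25)] * 3  # each matching allele is common
RARE = [(0.02, 0.03)] * 3    # each matching allele is rare

common_est = profile_frequency(COMMON)
rare_est = profile_frequency(RARE)
print(f"common alleles: about 1 in {1 / common_est:,.0f}")
print(f"rare alleles:   about 1 in {1 / rare_est:,.0f}")
```

The same three-probe match yields estimates that differ by several orders of magnitude depending on the allele frequencies, which is why a bare statement that the patterns match conveys little without the accompanying number.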
According to the defendants, flaws in the F.B.I.’s Caucasian database create a possibility of substantial understatement in the ultimate probability estimate. That possibility, they contend, has resulted in an inability of the general scientific community to accept the F.B.I.’s process of making probability estimates.
The centerpiece of the defendants’ challenge to the scientific acceptance of the Caucasian database is the testimony of Dr. Lewontin and Dr. Lander (who testified as a court’s witness), as supported and supplemented by reports (Exhs. HHH & III), an affidavit (Exh. YYY) prepared by Dr. Lewontin for this case, a report prepared by Dr. Lander for the Castro case (Exh. 28), articles authored by Dr. Lander (Exh. Y, EE), and the testimony of and a report (Exh. ZZZ) prepared by Dr. Hartl.
Dr. Lewontin is highly esteemed by the other witnesses (Lander XVII 188 (“probably regarded as the most important intellectual force in population genetics alive”)); (Daiger VII 85 (fair to say that Dr. Lewontin of Harvard was one of the pre-eminent theoreticians in the area of molecular population as early as the 1960s)). Dr. Lander is likewise highly regarded by other witnesses (Conneally IV 232 (a genius with whom it would be hard to argue)); (Daiger VII 173 (very prestigious and respected population geneticist in the field of human genetics)).
The basis for the defendants’ challenge to the reliability of the Caucasian data base is the theory that, because the frequencies of blood types, which are a kind of genetic marker, vary among European Caucasians by nationality, there will be to some degree a retained variation in genetic markers, including VNTRs, in that portion of the American Caucasian population that is descended from immigrants from Europe during the period of major immigration prior to 1924 (Exh. HHH). Such variation in frequencies within a definable segment of a larger heterogeneous population is called substructure.
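The concern behind the substructure theory can be illustrated with a hypothetical calculation. The sketch assumes two subgroups of equal size whose allele frequencies differ, a database that pools them, and the textbook 2pq form for a two-band pattern. Every number is invented for illustration; none reflects data in the record.

```python
def het_freq(p, q):
    """Frequency of a two-band pattern with allele frequencies p and q (2pq)."""
    return 2.0 * p * q


# Hypothetical subgroups of equal size: (frequency of allele A, frequency of allele B).
GROUP_1 = (0.10, 0.10)
GROUP_2 = (0.01, 0.01)

# A pooled database would record the average frequency of each allele.
pooled_a = (GROUP_1[0] + GROUP_2[0]) / 2
pooled_b = (GROUP_1[1] + GROUP_2[1]) / 2

pooled_estimate = het_freq(pooled_a, pooled_b)  # estimate from the pooled database
in_group_1 = het_freq(*GROUP_1)                 # pattern frequency within subgroup 1
in_group_2 = het_freq(*GROUP_2)                 # pattern frequency within subgroup 2

print(pooled_estimate, in_group_1, in_group_2)
```

In this invented example the pooled estimate understates how often the pattern occurs within the first subgroup and overstates it within the second. Whether variation of any such magnitude exists for VNTRs among American Caucasians is the very point on which the witnesses disagree.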
Dr. Lewontin believes that substructure is present in the North American Caucasian population because of the relatively recent arrival of the European ancestors of a sizable segment of the American Caucasian population. Moreover, he states that interethnic group mating has not been extensive, due in part to the fact of propinquity (i.e., most people marry within their own *182neighborhood) (Lewontin XVI 51-53, 117-30) (Exh. HHH).
The likely existence of variation in VNTR frequencies due to substructure was also acknowledged by Dr. Kidd, who was cross-examined about variations in VNTR frequencies encountered during his studies of African pygmy populations (XXIV 65-92). Frequency variations of varying magnitudes were observed by Dr. Kidd and described as “significant” (XXIV 65, 83). Dr. Kidd also responded to a question about whether he would “expect to see some significant difference among subpopulations” in VNTR frequencies. He said that among the populations that he has studied, that would be his “expectation a priori” (XXIV 89).
While acknowledging substructure in the North American Caucasian population, Dr. Kidd also expressed the view that any variation would be of insignificant effect and consequence (XXIV 34-35). Dr. Conneally likewise acknowledged that it was “conceivable” that the frequencies of some VNTRs might vary in subpopulations (V 221).
There is no disagreement on the fact that the extent of ethnic-dependent variation of VNTRs among Europeans is unknown, as are the magnitudes of any such frequency variations. The same is true, the witnesses agree, for Caucasian Americans, at least to the extent that such variation continues to exist. Dr. Lewontin believes (Exh. HHH) and Drs. Lander (XVII 96, 176) and Hartl (Exh. NNN) agree, that it is likely that the frequency of some of the VNTR alleles that have been incorporated in the F.B.I. Caucasian data base may vary depending on the ethnic background of the contributors of the samples.
In Dr. Lewontin’s opinion, moreover, no scientifically acceptable compensation factor has been or could be built into the F.B.I.’s Caucasian database that would adequately respond to and ameliorate the potential effects of possible substructuring. When asked whether some factor could be applied to the probability estimate to overcome any error in the estimate that might have resulted from substructuring, Dr. Lander stated that there was no such factor (XVII 177-81), as did Dr. Conneally (V 314, 317).
The government and its witnesses contend, however, that the F.B.I.’s method intrinsically corrects for any distortion that may result from substructuring by means of certain “conservative” aspects: the fixed bin structure, which overstates the true frequency of any allele (Daiger, VI 88-91, 134, 199-200; Kidd, XXIV 216 (“grossly overestimating”)); use of bins that are wider than the match window; clustering of alleles into a single bin when fewer than five alleles have been registered in any single bin; allocation of “borderline bins” to the bin with the larger frequency; and the use of a “2P” factor in the calculation when a single band is encountered (Conneally V 222-29, 290; Daiger VII 78-100, 137, 182-87, 198-201). In the government’s view, these techniques, acting separately and together, result in an overestimation of frequencies in the population database and during the process of multiplying the frequencies, which favors the defendant.
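Two of the “conservative” features listed above lend themselves to a brief sketch: the clustering of sparsely populated bins and the “2P” factor. The merging procedure shown here (a left-to-right sweep that folds a sparse bin into its neighbor) is an assumption adopted for illustration, not a description of the F.B.I.’s actual procedure. The sketch likewise assumes that 2p is used for a single-band pattern in place of the p squared that a true homozygote would imply. The bin counts are hypothetical.

```python
def merge_sparse_bins(counts, minimum=5):
    """Merge bins holding fewer than `minimum` alleles with a neighbor.
    Sweeps left to right, accumulating sparse bins until the running
    total reaches the minimum; a sparse remainder at the end is folded
    into the last merged bin. The sweep order is an assumption."""
    merged = []
    carry = 0
    for count in counts:
        carry += count
        if carry >= minimum:
            merged.append(carry)
            carry = 0
    if carry:
        if merged:
            merged[-1] += carry
        else:
            merged.append(carry)
    return merged


def single_band_frequency(p):
    """Frequency assigned to a single-band pattern under the 2P convention."""
    return 2.0 * p


COUNTS = [2, 1, 12, 40, 3, 30, 4]  # hypothetical allele counts per bin
MERGED = merge_sparse_bins(COUNTS)
print(MERGED)
print(single_band_frequency(0.05), 0.05 ** 2)
```

Merging can only raise the count, and hence the estimated frequency, of the bin into which an allele falls, and 2p exceeds p squared for any frequency below one. Both features therefore push the estimate upward, which is the sense in which the government’s witnesses call them conservative. Whether that upward push answers the substructure objection is a separate question, as Dr. Lander’s testimony indicates.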
Dr. Lewontin was unpersuaded about the ameliorating effects of the “2P” factor, which he notes would be a factor with only 13% of the alleles, i.e. single banded alleles. In any event, he believes that the 2P factor would not reduce the result by a sufficient magnitude (Exh. III).
Dr. Lewontin is unequivocal in his view that, as a result of the failure to take substructure into account, no scientifically acceptable estimate of probability can be made on the basis of the F.B.I.’s database: “I would not with the present databases available be able to make a probability statement” (XVI 200); no numerical value should be placed on the significance of a match (XVI 206); would not accept that as a scientifically useful number for characterizing the population variation (XVI 210) or as basis for making a probability statement (XVI 211); an unacceptable estimate ... an unacceptable procedure in science to float numbers for which we have such uncertainty (XVI 299); any number you give is of unknown relationship to the correct *183value (XVI 301); when you don’t know the range of uncertainty and there is no way to quantify that uncertainty, it is scientifically unacceptable even to give an estimate (XVI 301).
Dr. Lander testified with a like degree of conviction about the existence of substructure (XVII 202): “I think everyone who knows the facts says there’s evidence of population substructure” (XVII 92). There’s “lots of good evidence of substructuring amongst American Caucasians” (XVII 93).
In light of that view, Dr. Lander, like Dr. Lewontin, expressed the opinion that the method used by the F.B.I. to estimate the probability of a match is not generally acceptable in the scientific community: “is it fair to say that ... reasonable scientists are of the opinion that at this point we can multiply the frequencies”—“No, I don’t think it would actually be fair to say that” (XVII 125); “there is consensus amongst us that we do not have anything in particular that we could defend right now if we had to” (XVII 126); “if you drew a random sample of the appropriate population, you could rely on those numbers ... [But] you have a consensus at least amongst those people who have now seen the facts that we are deeply in doubt about what is and is not defensible” (XVII 130); no currently generally accepted scientific method for estimating the probability of a genotype (XVII 202); in the relevant scientific community of population geneticists, the method used by the F.B.I. to calculate its estimate of the significance of a match is not generally acceptable (XVII 208).
Dr. Lander’s views about the possible existence of substructure and the ability to make a scientifically acceptable estimate of probability without taking such substructure into account have been expressed elsewhere than in his testimony in these proceedings (Exhs. Y, EE, 28).
Like Dr. Lewontin, Dr. Lander rejects the contention that the F.B.I.’s “conservative” approach provides a corrective remedy that cures any uncertainty that results from the suggestion of substructure and resulting miscomputation of frequencies in the Caucasian database (XVII 137). The testimony of the government’s witnesses indicates that these features of its protocol were implemented initially in response to the resolution difficulties encountered with VNTRs, which, unlike discrete alleles, cannot be observed with sufficient detail and distinctiveness to enable a separate identification of each individual allele (Conneally, IVa 124-35, V 56, V 185; Daiger, 243; Caskey X 152; Kidd Doc 391, 74).
When asked whether the F.B.I.’s approach to constructing its bins for the Caucasian database provided a suitable corrective device for disregarding the effects of substructure on VNTR frequencies, Dr. Lander responded:
No. It is a good idea to be conservative about those individual frequencies, and I support the fact that the F.B.I. intends to be conservative about the individual frequencies, but they are apples and oranges. One pertains to whether there’s a correlation. The other pertains to your estimate of individual facts, and you can’t, you know, penalize yourself on “A” to make up for a problem of “B”; it’s apples and oranges, .... (XVII 137).
Dr. Lander pointed out the difference between the purpose of the F.B.I.’s fixed bin approach and the problem created by the possibility of insufficiently acknowledged substructure: “I’ve never understood the reason for a larger bin ... to be the ability to multiply.... I’ve understood the reason for a larger bin to be a desire to be cautious, careful.... But not to guarantee multiplication” (XVII 137). Being “cautious and careful” in the use of the bin approach, Dr. Lander indicated, may have referred to “calculating a frequency given a database by adding up everything within it,” but not to serve “as a finesse to a different question” (XVII 137-38). Thus, even though the end result might be “right,” that would not make the method by which that result was attained scientifically acceptable: “the fact that ... it might turn out to be right doesn’t mean that it’s got valid scientific method underlying it” (XVII 153).
*184In addition to expressing his concerns about substructure, the unacceptability of the F.B.I.’s method of estimating probabilities, and the absence of deliberate or coincidental devices that effectively and reliably overcome those problems, the testimony by Dr. Lander underscores his view that “anybody who’s looked at the whole range of facts recognizes how little we know” (XVII 125-26) about the issues related to the computation of a probability assessment.
When asked about whether “there are issues with respect to multiplying the frequencies,” he responded, “Absolutely.” Then, in response to the suggestion that “you may come up with a number which is not at all what you say it is if you have substructure,” Dr. Lander observed:
That’s the nature of the concern as it affects lawyers and Courts. The nature of the concern as it affects scientists, I think I’d put more basically. It’s that we do not have the proofs that would allow us to know what procedure to use. (XVII 89).
Thus, if
you are asking us do we have a procedure that we can use to take data, plug it into the procedure and give you ... an estimate, something for which we have scientific reliability that you could use for any purpose. The problem is we do not have such a procedure right now, and people are very concerned about it. (XVII 92).
Dr. Hartl concurred in the views of Drs. Lewontin and Lander concerning the likelihood of substructure and possible consequent variation in the frequencies of VNTRs (Exh. NNN at 7-8; Exh. ZZZ). Although his reliance on his examination of the MN blood type figures in the Mourant compendium was shown on cross-examination to have been misplaced, his testimony and experts’ reports serve to endorse the views of Drs. Lander and Lewontin, and provide support for the defendants’ contention about the degree of disagreement in the scientific community regarding whether the F.B.I.’s ability to compute probabilities is or is not generally acceptable.
The government’s witnesses reject the principal assumption on which the defendants’ witnesses rest their opinion that the F.B.I. technique does not enjoy general scientific acceptability: namely, the likelihood that ethnic-dependent substructure, to the extent that it may exist, can significantly distort the computation of the frequencies in the Caucasian database. They also express confidence that the effect of any inaccuracy in the frequencies has been overcome or significantly ameliorated by the corrective devices noted above.
In addition, the government’s witnesses are persuaded that the resulting probability estimate is acceptable to the scientific community. Each stated that the F.B.I.’s method for calculating probabilities is generally accepted in the scientific community (Conneally, V 110, 116-18; Daiger VI 192). Dr. Caskey’s view on the level of general acceptance of the F.B.I.’s methods and approaches is manifest in the fact that he has, to a large extent, incorporated the Bureau’s protocol and procedures into his own DNA laboratory (IX 302-03) and initially used the F.B.I. database until his laboratory’s database was developed (X 289).
When, however, Dr. Caskey was asked “whether there has been substantial controversy as to when it comes to calculating probabilities, what is scientifically acceptable and what is not, that that’s been a hotly debated issue,” he answered, “Population geneticists have had considerable controversy in the calculation area” (IX 271). In addition, Dr. Caskey stated that “the debate is still open” when asked if “there is still a debate whether you need to have separate data bases for various ethnic or national subgroups of the larger racial population?” (IX 271).
With regard to the significance of a match between several loci, Dr. Conneally noted that “the possibility that they’re from different individuals is very, very, very small” (IVa 79-80), and the existence of a match across three, four, or five loci is, to a reasonable degree of scientific certainty, very significant (V 300-01). The same view was expressed by Dr. Daiger, who *185observed that “the chance that two individuals would match by coincidence alone, at three or four or more loci is extremely small” (VI 68). Dr. Conneally also noted that the outcome of computing the probabilities is to express an estimate, rather than an “accurate or precise probability” (V 208).
The object of making such estimate, as Dr. Kidd noted, is “to make some sort of statement that this is an uncommon pattern,” rather than make an identification of a specific individual (XXIV 40). Despite the objections that have been made by the defense and other critics, Dr. Kidd is confident that, in light of the general rarity of VNTR alleles and the corrective measures built into the F.B.I. system, the value will be “robustly uncommon,” although he would not “place any strong reliance on any absolute frequency” (XXIV 107).
With reference to the issue of independence, Dr. Conneally expressed the view that the American Caucasian population is in Hardy-Weinberg equilibrium (V 105), and thus, on the basis of the Hardy-Weinberg principle, the alleles were independent (IVa 107-09). On the question of the likelihood of substructuring, Dr. Conneally expressed an entirely different view from that expressed by Dr. Lewontin, stating his perception that Americans inter-marry outside their neighborhood and across ethnic lines (IVa 163). That view, he believed, is widely shared (V 221). Dr. Kidd concurred in the description that most of the American Caucasian population above the age of puberty manifests an ethnically diverse background (XIII 426-27). In his view, the existence of frequency variations among “classical markers” in the European populations has not been established, so that there is little likelihood of substructure in the American Caucasian population (XIV 120).
The government’s witnesses also expressed confidence that the F.B.I. agents who contributed to the Caucasian database represented diverse ethnic backgrounds, though none of those witnesses gave a demographic or similar basis for his assumption (Conneally V 87); (Daiger V 189-90); (Kidd XIII 375).
Dr. Caskey was apparently not troubled by the possibility of substructure when he composed the database for his DNA laboratory from samples obtained from blood donors at a Houston hospital (IX 277). Upon visual comparison, Dr. Caskey concluded that his database looked the same as the F.B.I.’s Caucasian database and other Caucasian databases (X 314), though he acknowledged that to obtain an answer to the question of whether there is variation in the frequencies among the databases would require “a lot more analytic time” than the visual inspection that he undertook (X 314).
Dr. Daiger stated that the issue of substructure is “not relevant” (VII 215). In his view, differences in VNTR frequencies “are not reflected in the United States population, certainly not in a statistically significant and substantial manner” (VII 191), because there are neither geographical nor cultural barriers inhibiting random mating (VII 214). In his view, an empirical basis exists, at most, to believe that substructuring occurs with polymorphic genetic markers (i.e., like VNTRs) only in very large populations, and then only to a “relatively small” extent (VI 184).
Though a study to ascertain the degree of substructure “would be scientifically interesting,” Dr. Daiger does not believe “it is essential or critical for the F.B.I. to proceed with the particular methodology it uses” (VII 198), because “I don’t believe that there is major and profound genetic substructuring within Caucasians in the United States” (VII 218). Dr. Kidd likewise is of the opinion that it is not necessary for the F.B.I. to collect “ethnic data” (XXIV 198) before its methodology can be applied in forensic cases (XXIV 200) because he is “satisfied with the data and the methodologies for compensating for the uncertainty that are already in place” (XXIV 201).
In Dr. Daiger’s view, the only indication of a lack of independence (such as would result from substructuring of the kind postulated by the defendants) is manifested in *186the problem of excess homozygosity (which witnesses generally attribute to a variety of factors including the loss of lower molecular weight alleles that “ran off” the gel (e.g., Conneally, IV 164)) (VII 213). He expressed the view that if such variation in the frequency of VNTRs exists, the corrections instituted by the F.B.I. are sufficient (IV 128-47; VI 183, 216-18).
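The notion of “excess homozygosity” can be made concrete with a short sketch. Under Hardy-Weinberg equilibrium the expected proportion of homozygotes at a locus is the sum of the squared allele frequencies, and an excess exists when the observed proportion of single-band patterns is higher. The bin frequencies and the observed proportion below are hypothetical values chosen for illustration and are not figures from the record.

```python
def expected_homozygosity(freqs):
    """Expected proportion of homozygotes under Hardy-Weinberg
    equilibrium: the sum of the squared allele (bin) frequencies."""
    return sum(p * p for p in freqs)


# Hypothetical locus with twenty equally frequent bins (frequencies sum to 1).
FREQS = [0.05] * 20

expected = expected_homozygosity(FREQS)
observed = 0.10  # hypothetical observed proportion of single-band patterns
excess = observed - expected
print(f"expected {expected:.3f}, observed {observed:.3f}, excess {excess:.3f}")
```

An excess of this kind does not by itself identify its cause. As the witnesses noted, apparent homozygotes may reflect a second band that ran off the gel rather than any departure from random mating.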
The only government witness whose entire testimony came in rebuttal to that of Drs. Lewontin and Lander was Dr. Kidd. At the outset of his testimony, Dr. Kidd stated that he did not agree with the contentions that the possibility of substructure invalidates the process of multiplication to obtain a probability estimate, the introduction of conservative features into the F.B.I. procedures is insufficient to overcome any distortion that may result from substructure, there is a need to collect data about subpopulations before database frequencies can be multiplied to determine a probability estimate, or a structured random sample of a “true ethnic mix” of North American Caucasians must be obtained before meaningful statements about the frequency of a pattern can be made (XIII 341-42).
In Dr. Kidd’s view, the method of bin construction and allocation of alleles resulted, for each of the bins in the database, in “an estimate of the frequency of that class of patterns where the alleles fall into the bins that were observed, and that is not the estimate of the frequency of any one pattern. It’s a frequency of a class of similar patterns” (XIII 376). This approach, in his view, is “based on very sound scientific theories with a major component of practicality added into it because there is, for example, no way one can actually construct a sample that represents in any stratified way the ethnic mixture of the United States” (XIII 377).
With reference to the effects of substructure, Dr. Kidd testified that from his research in VNTRs it did not appear that there were “dominant alleles” (i.e., alleles of substantially disproportionate frequency). Instead, all alleles appeared to be relatively infrequent, with the result that a ten-fold difference (as might result from substructure) “isn’t nearly as important ... as whether there is any allele that reaches a frequency of fifty or sixty percent” (XIII 384).
Like Dr. Caskey, Dr. Kidd made a visual examination of various Caucasian databases, and on the basis of that review, concluded that they were all “remarkably similar” (XIII 387-88). On that basis, he concluded that in those databases there “is no evidence of major substructure affecting these VNTRs because in the various samples there has been a very similar frequency distribution in virtually all of them so that where there are bins that have very high frequencies, they tend to be high frequencies” (XIII 388). To the extent that Dr. Kidd observed some variation in “precise frequency,” he concluded that “there’s sampling error in all of them” (XIII 388). In a portion of his testimony that has been submitted under seal in order to preserve the publishability of certain numerical data, Dr. Kidd noted that it would not be scientifically acceptable to compare populations by eyeballing the data (Doc 391, 80).
In Dr. Kidd’s view, the important question to be asked in light of his examination of the databases, and given the existence of substructure (“of course there is substructure in the U.S. population”), is “does that substructure affect significantly the frequencies here” (XIII 391). The answer to that question, he stated, is “no” because “what substructuring exists is not a major problem because the frequencies are not that different” (XIII 392). He discounted observations concerning substructure made by Prof. Cohen (Exh. FF) and Dr. Acton (Exh. NNN-2). In his view, the Cohen work had not been subjected to adequate peer review (XXIV 27-28) and the Acton paper manifested frequencies that were “reasonably similar” (XXIV 31) and whose variations were likely attributable to “the kind of systematic difference you’d find when two different laboratories are independently measuring their own data” (XXIV 30).
Dr. Kidd expressed a similar view (“there is no more variation between the measure*187ments than I might expect on two replicate measurements” (XXIV 32)) on the basis of visual observation of the results of a scatter plot (Exh. 71) that he prepared with regard to one of the markers in the Caucasian database that had been compared unfavorably by Dr. Hartl.
The government’s witnesses, therefore, uniformly discounted the likelihood of both the existence of and significant effect from substructure and attendant frequency variations in VNTRs. In addition, they shared the view that, even if substructure existed and caused substantial variation in VNTR frequency, there was no reasonable or troublesome likelihood, as Dr. Conneally stated, that the difference between the observed frequency, as used in the Caucasian database, and the actual, but unknown frequency would be disfavorable to the defendant or not average out (Va 88).
Thus, as Dr. Daiger observed, experts who fear the effects of substructure, such as Dr. Lander, consistently
speculate that as you multiply across a number of alleles, ... the error you’ve made is always in the extreme direction and in only one direction. That is, they never take into consideration that if a calculation, even with exceptional sub-structuring, ... leads to a high allele frequency in one case, it is just as likely to lead to a low frequency calculation in another case, and those effects should average out. (VII 215).
In other words, Dr. Daiger views any danger from substructure as being self-correcting over multiple alleles: even with substructuring, the likelihood is as great that frequency will be overstated as it is that it will be understated (i.e., contrary to the defendant’s interests); and, in any event, even if the frequency is understated for one allele, it necessarily must be overstated for some other allele.
Dr. Kidd expressed the same view, beginning with the observation that “for every allele that is more frequent, there has to be an allele in the subpopulation that is less frequent because frequencies still have to sum to one” (XIII 393). Thus, in response to a question about the impact on the product of multiplying frequencies, Dr. Kidd stated that “the expectation is that [such variations] will differ in different directions for the different loci, even for the same two pairs of populations, and hence the expectation is that they will tend to cancel out rather than magnify” (Doc 391, 85).
There is “no reason to expect,” Dr. Kidd stated, “that a given pattern will fall only in bins that differ in one direction.” The expectation, rather, “is that sometimes a band will fall in the bin deviating in one direction and other times it will fall in a bin deviating in the other direction” (Doc 391, 87). Thus, from a statistical standpoint, it would be a “rare exception” in which deviations in frequencies involving multiple loci would work in combination against the defendant (Doc 391, 97). Dr. Kidd noted, however, that “it can happen” that the deviations at one locus of two alleles could disfavor the defendant (Doc 391, 99), but the probability that deviations across loci would have that effect is very slight (Doc 391, 98-100).
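The arithmetic underlying this testimony can be illustrated with a minimal sketch. The bin frequencies below are hypothetical and are not drawn from the record or from any F.B.I. database; the function name `profile_frequency` is likewise an assumption made for illustration. The sketch simply multiplies per-band bin frequencies and compares the product obtained from an assumed database with the products obtained from two assumed subpopulations, one whose deviations run in both directions and one whose deviations all run in the same direction.

```python
from math import prod


def profile_frequency(bin_frequencies):
    """Multiply per-band bin frequencies into one pattern-frequency estimate."""
    return prod(bin_frequencies)


# Hypothetical database bin frequencies for four bands (not from the record).
database = [0.10, 0.05, 0.08, 0.04]

# Hypothetical subpopulation: two bins twice as common, two bins half as common.
mixed_deviation = [0.20, 0.025, 0.16, 0.02]

# Hypothetical subpopulation in which every bin is twice as common.
one_direction = [0.20, 0.10, 0.16, 0.08]

db_estimate = profile_frequency(database)

# Ratio of each subpopulation product to the database product.
ratio_mixed = profile_frequency(mixed_deviation) / db_estimate
ratio_one_direction = profile_frequency(one_direction) / db_estimate

print(f"mixed deviations:         ratio = {ratio_mixed:.2f}")
print(f"one-direction deviations: ratio = {ratio_one_direction:.2f}")
```

With these assumed figures, deviations running in opposite directions leave the product essentially unchanged, while deviations that all run in one direction change it sixteen-fold. The sketch shows the arithmetic only; it does not establish how often either situation arises in an actual population.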
2. The Frye Standard in the Sixth Circuit
A. General Acceptance in the Scientific Community/Generally Accepted Explanatory Theory
In 1923 the District of Columbia Circuit considered the admissibility of polygraph evidence in Frye v. United States, 293 F. 1013 (D.C.Cir.1923). In the course of rejecting polygraph results, the court stated that the standard for accepting novel scientific evidence was whether the proffered technique has “gained general acceptance in the particular field in which it belongs.” Id. at 1014.
In many federal and state courts, the Frye standard has undergone permutations and changes as the legal system has become less suspicious of scientific change and more tolerant of technological advances that aid in the search for truth. See generally Giannelli, The Admissibility of Novel Scientific Evidence: Frye v. United States, a Half-Century Later, 80 Colum. L.Rev. 1197 (1980). Thus, an approach that accepts novel scientific evidence on the ba*188sis that its reliability has been shown (without regard to the level of acceptance within the scientific community) has been endorsed as “consistent with the underlying policies” of Article VII of the Federal Rules of Evidence, which relates to opinion testimony. Weinstein & Berger, 3 Weinstein’s Evidence 702-36 (1988).
Although, as the following discussion of the applicable cases demonstrates, the path in the Sixth Circuit has not been entirely consistent and clear, the court has arrived at a formulation of the test for admitting novel scientific evidence that abides by the Frye standard while concurrently embracing an additional formulation that, in my opinion, neither supplants nor serves as a substitute for the basic doctrine of “general acceptance in the scientific community.”
That formulation, which was first expressed in United States v. Green, 548 F.2d 1261 (6th Cir.1977), provides in pertinent part that novel scientific evidence is admissible if the proponent establishes that it conforms “to a generally accepted explanatory theory.” Id. at 1268. Although this phrasing is unique to our Circuit, see United States v. Kozminski, 821 F.2d 1186, 1217 (6th Cir.1987) (Guy, J., dissenting), the following extensive review of the pertinent cases makes clear that the Green formulation, along with the more orthodox “general acceptance in the scientific community” standard from Frye, has become the test for admitting or rejecting novel scientific evidence in this Circuit.
The Frye/Green standard adopted by the Sixth Circuit is to be distinguished from the reliability standard that is applied in many other courts, which view “the validity of the underlying principle and the validity of the technique as aspects of relevancy.” Giannelli, supra, 80 Colum.L.Rev. at 1203. Under Frye, in contrast, “it is not enough that a qualified expert, or even several experts, believes that a particular technique has entered the demonstrable stage. Frye imposes a special burden—the technique must be generally accepted by the relevant scientific community.” Id. at 1205.
The importance of the distinction between the “general acceptance in the scientific community” standard and a standard that bases admissibility on a demonstration of reliability was underscored by Professor Giannelli: to equate “general acceptance with reliability,” he wrote, “would represent an abandonment of Frye because the reliability of a scientific technique could be established notwithstanding its lack of general acceptance in the scientific community.” Id. at 1220. It has been held, accordingly, that reversible error occurs in a Frye jurisdiction when a trial court focuses more on the reliability of a technique than its general acceptance within the scientific community. United States v. Addison, 498 F.2d 741, 744 (D.C.Cir.1974).
Although the government makes an interesting assertion that questions about the appropriate standard for admitting novel scientific evidence need not be asked or answered in light of the Supreme Court’s 1983 decision in Barefoot v. Estelle, 463 U.S. 880, 895, 103 S.Ct. 3383, 3395, 77 L.Ed.2d 1090 (1983), its speculations about the applicability of Barefoot overlook the consideration that in that case the issue was the admissibility of opinion evidence that came from a field whose underlying principles, theories, and practices were already well established. The issue in Barefoot was due process, not a departure from or redefinition of the standard for admitting opinion evidence in cases in which, like this one, the underlying scientific principles, theories, and practices have yet to undergo adversarial challenge and judicial scrutiny in the federal courts.
The government’s claims about the supervening authority of the statements in Barefoot likewise overlook the fact that the contentions that it makes about that case, and the case itself, have not been cited, discussed, or adopted by any intervening Sixth Circuit case involving the Frye or related standards. That consideration gives further reason to disregard the government’s attempt to leapfrog over the Sixth Circuit’s cases that may stand in the way of its effort to introduce DNA evidence.
*189The line of cases leading to the Sixth Circuit’s current formulation of the standard for admissibility of novel scientific evidence begins with United States v. Stifel, 433 F.2d 431, 438-441 (6th Cir.1970). In that case the court quoted the Frye standard verbatim, and, applying that formulation, concluded that the results of neutron activation analysis had been admitted properly.
The clear applicability in our Circuit of the orthodox Frye standard came, however, to be somewhat in doubt as a result of some statements in United States v. Franks, 511 F.2d 25 (6th Cir.1975), in which the court upheld a lower court’s decision admitting voice spectrographic evidence. The basis for Franks appears to be the fact that the trend among other courts was towards admissibility and the defense had not “produced a witness rebutting the government’s claim that voiceprint analysis is sufficiently accurate to be admissible.” Id. at 33.
The conventional Frye standard was not discussed with reference to the holding in Franks. The court relied on Stifel, and emphasized the discretion of the trial judge to admit or reject novel scientific evidence. Id.
Had the court refrained from further comments on the issue, there would have been little doubt that it had sub silentio abided by the Frye doctrine. But cause to doubt whether that was the case was strewn in a footnote, in which, the court, after noting that Stifel had applied the Frye standard, observed that it deemed “general acceptance,” as used in Frye, to be “nearly synonymous with reliability. If a scientific process is reliable, or sufficiently accurate, courts may also deem it ‘generally accepted.’ ” Id. at 33 n. 12.
This dictum suggested that the court was repudiating the conventional Frye doctrine, and supplanting it with a more flexible and liberal standard. As noted above, the Frye standard of general acceptance in the community is more demanding and clearly distinct from a reliability standard, and to adopt a reliability standard necessarily constitutes an abandonment of Frye. Giannelli, supra, 80 Colum.L.Rev. at 1220.
The suggestion in Franks that there is no difference between Frye and a reliability standard has not been further discussed or developed in subsequent Sixth Circuit cases, although the Franks dictum is cited on occasion along with Frye and Green, and there are occasional references to reliability in applying those standards. In light, however, of the intervening en banc decision in United States v. Kozminski, 821 F.2d 1186 (6th Cir.1987) (en banc), which reconfirms the Green standard, and subsequent cases that make no mention of Franks, I conclude that there could be little doubt that the Franks dictum is not controlling precedent, and that the reliability standard plays no role in the admissibility of novel scientific evidence in this Circuit.
The Sixth Circuit’s decision in Green came next. Although its statement of a standard has become a component of the current standard for admissibility of novel scientific evidence in our Circuit, the admissibility of such evidence was not at issue in Green. Instead, that case involved a challenge under Rule 702 to the admission of opinion evidence by D.E.A. agents concerning the chemical properties of an hallucinogenic chemical, the quantity of the substance that could be compounded from precursor chemicals in the defendants’ possession, and the likely street value per unit of such quantity. Consequently, the Frye standard had nothing directly to do with the case.
The focus for decision, rather, was on the sufficiency of the factors prescribed for the admissibility of opinion evidence generally in criminal cases. The standard established by Rule 702, the court stated, “is deficient when applied to criminal cases [because] it fails to include among the factors to be balanced by the trial court the one which is unquestionably most important from the point of view of the criminal defendant: the potential prejudicial impact of the expert testimony on the substantial rights of the accused.” Id. at 1268.
*190Quoting a Ninth Circuit decision, United States v. Amaral, 488 F.2d 1148, 1152 (9th Cir.1973), the court stated further in Green that “ ‘[s]cientific or expert testimony particularly courts the second danger [of undue prejudice or of confusing the issues or misleading the jury] because of its aura of special reliability and trustworthiness.’ ” 548 F.2d at 1268. To guard against the danger of undue prejudice from opinion evidence generally, the court adopted the following standard:
In recognition of the outcome determinative impact of ‘opinion evidence clothed with the weight of expertise,’ Bridger [v. Union Rwy. Co., 355 F.2d 382, 388 (6th Cir.1966)], we adopt for use in criminal appeals the four criteria proposed in Amaral for review of trial court decision concerning expert testimony: ‘1. qualified expert; 2. proper subject; 3. conformity to a generally accepted explanatory theory; and 4. probative value compared to prejudicial effect.’
As with the reference in Franks to the equation of “general acceptance” with reliability, the court in Green took no note of the fact that its adoption of the four-part test from Amaral may have represented a reworking of the standard for admissibility of novel scientific evidence. Unlike its statements in Franks, however, the position taken in Amaral was clearly the deliberate holding of the court on the issue before it, although, as noted, that issue was not related to the admission of novel scientific evidence.
The court’s reformulated standard for evaluating opinion evidence generally soon came, however, to be applied to novel scientific evidence. In United States v. Brown, 557 F.2d 541 (6th Cir.1977), the court held that the results of ion microprobic testing involving the comparison of a hair sample from the defendant and an unknown crime scene sample had been admitted improperly at trial.
The court in Brown restated its adoption of the Amaral standard, again emphasizing the potentially prejudicial impact that can result if scientific evidence, with its “ ‘aura of special reliability and trustworthiness,’ ” is allowed into evidence on the basis “of an unproved hypothesis in an isolated experiment which has yet to gain general acceptance in its field.” Id. at 556. The question in Brown was whether Green’s third element, “conformity to a generally accepted explanatory theory” had been satisfied.
The court recited several factors in support of its conclusion that that requirement had not been met. The prosecution’s witnesses were not expert in the requisite fields; their procedures had not been duplicated elsewhere and support for them was not to be found in published writings; no reported case had accepted microprobic analysis; and their conclusions were based on comparisons that were subjective and dependent on visual calculation rather than standards on which their assessments could be evaluated. Id. at 557-58.
Thus, rather than being based on a “generally accepted explanatory theory,” the novel scientific evidence in Brown, the court concluded, rested on “nothing in the record, except unsupported assertions by the prosecution’s expert, to indicate that ion microprobic analysis may achieve a reliable and meaningful result.” Id. at 558. Absent an “absolute standard by which to gauge the accuracy of the tests,” the court stated, it had been “incumbent on the proponents to demonstrate that the relative standards of comparison they employ in their experiments are sufficiently reliable and accurate.” Id.
Although Green had been cited and its standard quoted and used as the apparent basis for the court's holding in Brown, the opinion in Brown included an additional reference to the general acceptance standard of Frye. “Expert testimony on a critical fact relating to guilt or innocence,” the court stated, “is not admissible unless the principle upon which it is based has attained general acceptance in the scientific community and is not mere speculation or conjecture.” Id. at 558.
The court then concluded its discussion of the inadmissibility of ion microprobic test results by stating its holding in terms that linked the Frye and reliability *191standards: it was not persuaded, the court stated, that such evidence “has yet reached the level of general acceptance in its field, or that the experiments conducted in this case have been shown to be sufficiently reliable and accurate, to provide an acceptable basis for expert identification in a criminal trial.” Id. at 559.
The next case involving novel scientific evidence likewise related to the admissibility of testimony about comparisons of hair samples. In United States v. Brady, 595 F.2d 359 (6th Cir.1979), the court recited the Green standard, and found that three of the factors (qualified expert, proper subject, and probative value) had been decided properly in favor of admission of the evidence.
With regard to the remaining factor, the court, again using the Frye terminology despite its recitation of the Green standard (which the court referred to as the standard adopted in Brown), concluded that “the Government presented no evidence as to the general acceptance of microscopic hair analysis in the scientific community,” though such acceptance had been “implicit in the expert’s testimony.” Id. at 363. The court’s discussion of this issue was, however, truncated, as it held that the objection on that ground had not been preserved in the trial court, and that, in any event, any error in the admission of the evidence had not been prejudicial.
The court’s conjunction of the standards from Green and Frye was repeated in United States v. Distler, 671 F.2d 954 (6th Cir.1981), another case involving comparison of forensic samples (in that case, oil samples) on the basis of a novel scientific technique. Though quoting the Green test, Id. at 960, the court in Distler, reciting the Frye standard verbatim, gave greater attention to whether the oil matching techniques at issue in that case had “reached the level of general acceptance that establishes an acceptable basis for the introduction of expert testimony in a criminal trial.” Id. at 961. The court in Distler also noted in passing its earlier dictum in Franks about the equation between reliability or accuracy and general acceptance. Id.
In concluding that the evidence was admissible, the court in Distler began its inquiry by noting that “the question of whether the oil matching procedures involved in this case are generally accepted in their scientific field [requires] two separate inquiries.” These included, first, the question of whether the methods were “generally accepted in the oil matching field in general,” and, second, if so, whether the methods were “properly transferable to an analysis of the sewer samples tested in the instant case.” Id. at 961-62. In Distler, accordingly, the court appears to be placing equal, if not somewhat greater emphasis on the conventional Frye standard as the basis for admitting novel scientific evidence.
In its next case, however, the Sixth Circuit referred only to the Green standard and its “generally accepted explanatory theory” requirement. United States v. Smith, 736 F.2d 1103, 1107 (6th Cir.1984). The court, without elaborate discussion, upheld the trial judge’s exercise of discretion in excluding expert testimony about the vagaries of eyewitness identification.
In contrast to Smith, in which the court referred only to the Green standard, the court used the Frye formulation in United States v. Metzger, 778 F.2d 1195, 1203 (6th Cir.1985), which involved testimony by an A.T.F. chemist about the source of a trace compound found at the scene of an explosion. Rejecting the defendant’s claim that the requisite degree of general acceptance had not been established, the court pointed out that the witness had been qualified to testify in fifty other cases and had co-authored the only article relating to the test at issue, and that the test was in use by law enforcement agencies around the country. Stating that the witness’s testimony was properly admitted over the defendant’s Frye objection, the court stated that “articles in professional journals are ... of great value in that the basis of the article is subjected to close scrutiny by other experts in the field” and “we are dealing with the testimony of an expert in the area of his expertise, *192whose work in the narrow area at issue has been published and adopted by local, state and federal agencies across the nation.” Id. at 1204.
Discussion by the Sixth Circuit of the standards for determining the admissibility of scientific opinion next occurred in United States v. Kozminski, 821 F.2d 1186 (6th Cir.1987) (en banc), an en banc decision reversing the defendants’ conviction for engaging in involuntary servitude. Among the reasons for reversal was the court’s determination that the trial court had improperly admitted testimony by an expert about the victims’ “ ‘involuntary conversion’ to complete dependency akin to ‘captivity syndrome.’ ” Id. at 1194.
The majority quoted the Green standard, and italicized the requirement that scientific opinion evidence must be shown to be “in conformity to a generally accepted explanatory theory. ” Id. Without reference to Frye or a description of how a court is to handle conflicting testimony about the general acceptance of an explanatory theory, the majority held that the criticisms of defense witnesses showed that the expert’s testimony had not been supported by the requisite explanatory theory.
As Judge Krupansky noted in a concurring opinion in Kozminski, there had been neither literature nor published research that addressed his theory, and the expert’s testimony had represented the first public presentation of his theory. Thus, he stated, “it necessarily followed that it never received peer evaluation or validation, let alone recognition as an explanatory theory that had attained general acceptance within the scientific community, to which it belonged within the mandates of existing precedent.” Id. at 1202.
In an opinion that was joined in by then Chief Judge Lively and Judges Martin and Jones, and which was criticized by Judge Krupansky as an effort “to overrule overwhelming legal precedent in this circuit and throughout the nation,” Id. at 1209 n. 10, Judge Guy dissented. He observed that, although the Green standard was “derivative of” the Frye doctrine, Id. at 1215, the factor of “conformity with a generally accepted explanatory theory” represents “a subtle but important change” in the Frye standard. Id. at 1216. Whereas in Frye “the emphasis was placed upon the technique or methodology employed to reach the result, ... in Green the focus has been transferred to the theory. In other words, rather than assessing the ‘thing from which the deduction is made,’ we have focused on the deduction itself.” Id. at 1217.
This shift of focus from the methodology to the deduction, Judge Guy wrote, caused the majority, in its rejection of the expert’s testimony, improperly to find that the testimony of the other experts was more credible and to base its ruling on admissibility on an assessment of credibility (and thus, implicitly, reliability). His discussion did not make, however, any connection between his concerns that the trial court had overstepped its proper bounds and the Franks dictum.
Following Kozminski, the Sixth Circuit confirmed the four-factor test in Sterling v. Velsicol Chemical Corp., 855 F.2d 1188, 1208 (6th Cir.1988). In doing so, the court stated that, “with respect to the third criterion [of “a generally accepted explanatory theory”], the principles upon which the scientific evidence is based must be sufficiently established to have gained wide acceptance in the field to which it belongs. ” Id. at 1208 (emphasis added). As thus formulated, with the introduction of the notion of “wide” acceptance, this element of the four-factor test was held not to have been satisfied with regard to expert testimony by “clinical ecologists” asserting that chemicals produced by the defendant caused personal injury to the plaintiffs.
The basis for the court’s finding was that the “leading professional societies in the specialty of allergy and immunology ... have rejected clinical ecology as an unproven methodology lacking any scientific basis in either fact or theory.” Id. In addition, “while numerous other professional organizations and societies, ..., have not discredited completely the potential usefulness of clinical ecology, few have endorsed either its scientific methodology or the re-*193sults of any experiments conducted under the guise of clinical ecology.” Id.
The court in Sterling also emphasized the fact that plaintiffs’ experts had not conducted tests in support of their conclusions nor had they examined or interviewed the plaintiffs on whose behalf they had testified. “Without the requisite clinical tests and a widely accepted medical basis for reaching its conclusions,” the court stated, “plaintiffs’ expert opinions are insufficient to sustain plaintiffs’ burden of proof that the contaminated water damaged their immune system.” Id. at 1209.
The Sixth Circuit’s most recent formulation of the Green test appears in Novak v. United States, 865 F.2d 718 (6th Cir.1989), a “swine flu” case in which the court held that plaintiff should not have recovered a judgment in the district court because the trial court had erroneously permitted an expert to give an opinion that a swine flu shot had proximately caused the death of plaintiff’s husband.
“An expert’s opinion,” the court stated in Novak, “must be based on a theory that is generally accepted in the relevant scientific community.” Id. at 721. “Since Green,” the court continued, “we have required that in order for the testimony of experts to be admissible under Federal Rule of Evidence 702, it must conform to a ‘generally accepted explanatory theory’ accepted or recognized by the relevant scientific or medical community.” Id. at 722.
This formulation, which again involves conjunction of Green and Frye, was held not to have been met in that case, because “the plaintiff's experts conceded that the scientific and medical community was unsure about, and could not state with any degree of medical certainty” what had caused the condition that caused the death. Consequently, the court stated, “the medical theories about a direct connection between the flu shot and [the cause of death] advanced by [plaintiff’s three experts] are neither ‘widely accepted’ nor ‘generally accepted’ by the medical community,” Id. at 725, and could not, accordingly, serve as the basis for allowing the experts’ testimony.
The foregoing overview of the Sixth Circuit’s treatment of the standard for admissibility of scientific opinion evidence generally and novel scientific evidence in particular indicates that the court has continued to recite and rely on the Frye standard while concurrently using the Green “conformity to a generally accepted explanatory theory” test. The court has not clearly delineated the relationship between the two standards, although it seems apparent that it has not given primacy to one over the other.
There is a question, accordingly, about how to interpret the conjunctive recitation of Green and Frye that has been a fairly consistent feature in the Sixth Circuit’s cases. On the one hand, the Green standard could be viewed as either an available substitute for Frye, as may have occurred with the majority opinion in Kozminski, or it could be viewed, as by Judge Guy’s dissent in that case, as a modification of the Frye standard, whereby greater latitude is given to evaluate, if not adjudicate the merits of scientific disputes.
Alternatively, the third criterion (“conformity to a generally accepted explanatory theory”) of Green could be viewed as a reworking of the “general acceptance in the scientific community” standard of Frye that has no distinct significance, but is just another way of trying to express the same doctrine.
There is, as well, a question about the meaning of the recently introduced term, “widely accepted,” which has been recited in Sterling and Novak. Is that term, as its location in those cases seems to suggest, to be read as synonymous with the term “generally accepted,” as used in both the Green and Frye formulations? Or is it to be viewed as providing an alternative, and quite possibly more flexible basis by which to measure whether the proposed scientific principle or practice has reached the requisite level of acceptance?
Turning to the last of these questions first, and knowing the answers that I postulate to each of these questions may misinterpret the Sixth Circuit’s meaning, I conclude that no significant alteration in the *194standard for evaluating the admissibility of novel scientific evidence has been undertaken by inclusion of the term “widely accepted” in the court’s two most recent cases. Unlike the adoption of the Green standard, in which the court discoursed at some length on the desirability of avoiding prejudice in criminal cases, 548 F.2d at 1268, no explanatory discussion accompanied the introduction of the term “widely” into the court’s terminology for its standard for admitting opinion and scientific evidence. At most, therefore, the phrase “widely accepted” appears to have been interjected as a synonym for the traditional term, “generally accepted.”
Turning to the question of the relationship and inter-workings of the Frye and Green standards, I conclude that complete supplantation of Frye by Green has been neither intended nor accomplished. Despite Judge Guy’s perception in Kozminski that Green has worked a “subtle change” in Frye, so that judges can base their admissibility decisions on reliability, rather than general acceptance in the scientific community, I find no such effect to be manifest in either that decision or the many other decisions prior to and after Kozminski that have referred to both standards in the course of adjudicating the admissibility of novel scientific evidence.
I also conclude, however, that Green is not to be used simply as an alternative for the Frye requirements, whereby a court can pick and choose whichever formulation suits its perception of the novel evidence. Continued recitation of those requirements in conjunction with the Green standard indicates that the Court of Appeals views the two standards as complementary, rather than exclusive.
In light of that view, and the court’s conjoining of the two standards, I conclude that Green manifests an effort to express the meaning of the Frye standard in equivalent terms. To a considerable extent, therefore, I share the view manifest in Judge Krupansky’s concurrence in the court’s en banc decision in Kozminski, in which he interprets Frye and the four-factor standard as jointly requiring
that the theory be firmly anchored in sound, reliable, and sufficiently accurate scientific principles, and sufficiently established to the point of having achieved general acceptance within the particular field to which it belongs. Stated differently, the scientific explanatory theory must have (a) received at least some exposure within the scientific peerage to which it belongs; (b) received peer evaluation to determine its scientific validity and reliability; and (c) achieved general acceptance within the scientific community to which it belongs. 821 F.2d at 1201.
As I understand the Judge’s views, the requirement of conformity to a generally accepted explanatory theory is simply another way of seeking to express the Frye standard. The Green formulation represents an effort, all too rarely attempted, to give some additional, but not inconsistent or competing content, to the vagueness and ambiguity of the term, “general acceptance.”
Thus, in light of my examination of the Sixth Circuit’s decisions, I conclude that in the context of this case, the government must show that the principles and procedures on which its proposed DNA evidence is based are generally accepted in the scientific community. In other words, those principles and procedures must be shown to conform to generally accepted explanatory theories about molecular biology and population genetics.
This conclusion, though achieving an accommodation between Frye and Green, and stating the standard for evaluating the government’s motion to admit its DNA evidence in this case, does not answer all the questions about applying the standard. Among these are: a) specification of the pertinent scientific community whose general acceptance must be manifest; b) the standard of proof; and c) the meaning of the term “general” acceptance.
B. Pertinent Scientific Community
Professor Giannelli points out that in order to apply the Frye standard, “courts must decide who must find the procedure acceptable, they must define exactly what must be accepted, and they must determine what methods will be used to determine general acceptance.” 80 Colum.L.Rev. at 1208. Too narrow a definition of the pertinent scientific community can render the Frye standard meaningless and ineffective. See Jonakait, Will Blood Tell? Genetic Markers in Criminal Cases, 31 Emory L.J. 833, 852-54 (1982).
In defining the pertinent scientific field, courts must first identify the field in which the underlying principle falls, and next determine whether that principle has been accepted by scientists in that field. Giannelli, supra, 80 Colum.L.Rev. at 1208. In this case neither party has undertaken to define expressly its views about the identity of the pertinent scientific community. The government, most noticeably in the conduct of its cross-examination of many of the defendants’ witnesses and its emphasis on the frequency with which the F.B.I. DNA test results have passed muster in other courts, appears to suggest that DNA testing should be found to meet the Frye standard if the F.B.I.’s protocol and procedures enjoy the approval of other forensic scientists. The defendants, by their selection of expert witnesses, implicitly assert that approval must come from a broader scientific community made up of persons familiar with molecular biology and population genetics.
I agree with the perception of the defendants that, in order to meet the Frye standard, the F.B.I.’s DNA principles and procedures must be shown to be generally acceptable to scientists beyond the forensic users of such techniques. This was the approach taken in the voice spectrogram cases. Thus, in Reed v. State, 283 Md. 374, 391 A.2d 364, 377 (1978), error was held to have occurred when a trial court limited its consideration to the testimony of members of “ ‘the group actually engaged in the use of [the] technique and in the experimentation with this technique.’ ”
As the court noted in Reed, “the purpose of the Frye test is defeated by an approach which allows a court to ignore the informed opinions of a substantial segment of the scientific community which stands in opposition to the process in question.” Id. Accord, Jonakait, op. cit.; State v. Gortarez, 141 Ariz. 254, 686 P.2d 1224, 1233 (1984) (“experts in many fields, possibly including acoustical engineering, acoustics, communications electronics, linguistics, phonetics, physics, and speech communication”); Cornett v. State, 450 N.E.2d 498, 503 (Ind. 1983) (“linguists, psychologists, and engineers, in addition to the people who use voice spectrography for identification purposes”). See also People v. Reilly, 196 Cal.App.3d 1127, 242 Cal.Rptr. 496, 503 (1987) (scientists in “broader disciplines ... knowledgeable about bloodstain typing ... should be considered as part of the relevant scientific community”).
To the extent that the government seriously intended to contend that scientists from the broader fields of molecular biology and population genetics, including theorists in those fields, were not credible if they had not had experience with the forensic application of DNA and genetic theories, I reject that contention. In my opinion, the scientific community to which we must turn in order to assess whether general acceptance has been attained is composed of scientists from the fields of molecular biology and population genetics who have expertise in either or both of those fields and a reasonably comprehensive understanding about the F.B.I.’s DNA testing protocol and procedures.
In light of that definition of the pertinent scientific community, it is clear that both parties produced competent and articulate representatives of that community at the hearing in this case. Those witnesses were, moreover, clearly aware of the current views within the pertinent scientific community toward the F.B.I.’s protocol and procedures.
C. Determination of General Acceptance
i. Standard of Proof
Rule 104(a) of the Federal Rules of Evidence provides that “preliminary questions concerning ... admissibility of evidence shall be determined by the court.” This rule is applicable to determination of the admissibility of novel scientific evidence. United States v. Kozminski, 821 F.2d 1186, 1194 (6th Cir.1987) (en banc). Though not discussed in the cases applying the Frye/Green standard or by the parties, the standard of proof appears to be a preponderance of the evidence.
The basis for this conclusion is the Sixth Circuit’s decision in United States v. Enright, 579 F.2d 980 (6th Cir.1978). In that case, which involved the question of the standard of proof to be applied to predicate facts relating to the admission of coconspirator statements, the court held that the preponderance of the evidence standard was to be applied to preliminary fact questions that had to be resolved by the judge before evidence could be admitted for the jury’s consideration. Id. at 984-86.
There appears to be no basis, legal or otherwise, on which this Court could deviate from the Enright holding. Accordingly, the government’s burden of proof is a preponderance of the evidence.
ii. Scope of the Inquiry
The court in Enright emphasized that adjudication of predicate facts “calls for the exercise of judicial fact-finding responsibilities by the trial judge, responsibilities which require him to evaluate both credibility and the weight of the evidence.” Id. at 985. This is true, the court noted, even if to some extent the particular preliminary question “may also coincide with an ultimate question of fact for the jury.” Id.
In light of Enright, this court must identify the nature of the factual disputes that it can adjudicate in determining the admissibility of novel scientific evidence. To be sure, in some instances, as was the case in Enright, there will be an overlap of facts that will be decided by the judge at a preliminary stage and the jury at trial. But the court must take care not to expand the scope of its adjudication, or permit it to exceed the bounds delineated by the applicable criteria for admissibility.
In this case, that criterion is the “general acceptance” of the protocol and procedures; the criterion is not reliability. Application of the Frye/Green standard does not involve consideration of the validity of the underlying scientific principles or reliability of the scientific methodology or results. Under Frye/Green, assessment of the ultimate validity and reliability of the evidence is for the jury; the scope of the inquiry under that standard for admissibility is limited to the question of general acceptance of the protocol and procedures.
This understanding of the distinct elements and viewpoints of the Frye/Green standard, on the one hand, and the reliability standard, on the other, provides a basis for accommodating the concerns of Judge Guy in his dissent in Kozminski, in which he suggested that the majority had improperly encouraged trial courts to weigh the opinions of experts on the merits of the scientific principles at issue. Id. at 1217.
Judge Guy described his view of the limited scope of a Frye/Green hearing and the need to refrain from intruding into areas of fact-finding that belong to the jury with a quote from Ibn-Tamas v. United States, 407 A.2d 626, 638-39 n. 24 (D.C.App.1979): “The judge’s role is properly limited to verifying credentials, including findings that the scientific field is generally recognized and that the methodology proffered is generally accepted by the expert’s colleagues in the field. The judge is not to take over the jury’s function of weighing the persuasiveness of the testimony.”
As the majority’s description, Id. at 1194, and Judge Krupansky’s dissection of the record, Id. at 1202-03, make clear, however, Judge Guy misperceived what had happened in Kozminski. The issue of admissibility was viewed from the standpoint of the level of acceptance of the novel theory in the pertinent scientific community. On that basis, it was found wanting. Thus, the danger feared by Judge Guy, namely, that application of the Green standard fosters usurpation of the jury’s role and right to make factual findings on ultimate fact issues by permitting adjudication of reliability issues, did not arise in that case.
Nonetheless, Judge Guy’s views underscore the limited role of the judiciary during the admissibility phase, and the need for judges to limit their focus to the issue of general acceptance, rather than expand their inquiry and decisional basis into the area of reliability. When applying the Frye/Green standard, matters of reliability are neither relevant nor material.
If in making its determination about the level of acceptance, therefore, a court ventures into adjudicating the merits of any underlying scientific disputes, it necessarily will be required to reach conclusions about the validity of the scientific principles and reliability of the procedures and results. At that point, what should be a Frye/Green hearing, limited solely to the question of whether the proponent has shown general acceptance, would improperly become converted into a hearing whose outcome is dependent on the court’s determination of the validity and reliability of the scientific method employed by the proponent. The effect of adjudication of the merits of the scientific dispute is, therefore, unavoidably to abrogate the Frye/Green standard and substitute in its place the reliability standard.
iii. Definition of General Acceptance
The issue at a Frye/Green hearing is, accordingly, whether the proponent has shown by a preponderance of the evidence that the proffered novel scientific evidence is generally accepted in the scientific community. In the context of this case, that requires a finding that the pertinent scientific community generally accepts the ability of the F.B.I.’s protocol and procedures to reliably determine the existence of a match and provide a scientifically acceptable estimate of the relative rarity of the particular pattern in the Caucasian population.
There appears to be no dispute that in order to find that a principle or practice is generally acceptable in the scientific community and conforms to a generally accepted explanatory theory, a court need not find that there is unanimity or consensus within the scientific community concerning such acceptability. Thus, in United States v. Stifel, 433 F.2d 431, 438 (6th Cir.1970), the court observed that “every useful new development must have its first day in court. And court records are full of the conflicting opinions of doctors, engineers and accountants, to name just a few of the legions of expert witnesses.”
In reaching its conclusion that the government had not met its burden of proof on the admissibility of the novel scientific evidence in United States v. Brown, 557 F.2d 541 (6th Cir.1977), the court, after noting that absolute unanimity was not required for admission of novel scientific evidence, Id. at 556, noted in dictum (there having been no rebuttal experts offered by the defense) that “conflicting testimony concerning the conclusions drawn by experts, so long as they are based on a generally accepted and reliable scientific principles, ordinarily go to the weight of the testimony rather than to its admissibility.” Id. at 557.
The Sixth Circuit likewise has not conditioned the admissibility of novel scientific evidence on a showing that the results that it purports to reach are absolutely sure and certain. As observed in Stifel, supra, at 441, “ ‘conclusiveness’ is not the requirement for the admissibility of scientific evidence,” and “neither newness nor lack of absolute certainty in a test suffices to render it inadmissible in court.” Id. at 438. Thus, despite these circumstances, the court in Stifel, citing and quoting from Frye, held that the evidence had “gained ‘general acceptance in the particular field in which it belongs.’ ” Id.
The court again expressed its view that the “lack of certainty [goes] to the weight to be assigned to the testimony of the expert, not its admissibility,” in United States v. Brady, 595 F.2d 359, 363 (6th Cir.1979) (upholding testimony based on microscopic examination of hair samples). And Judge Krupansky, concurring in the court’s en banc decision in United States v. Kozminski, 821 F.2d 1186, 1200 (6th Cir.1987) (en banc), noted that “absolute certainty of result and unanimity of scientific opinion [are] not required so long as the conflicting testimony concerning the conclusions drawn by the experts are based on generally accepted and reliable scientific principles.” Accord, Brown, supra, 557 F.2d at 556.
Although neither consensus nor certainty is an element of the proponent’s burden of proof, that does not mean that the absence of consensus is immaterial, or that there are not other factors to take into account. As Professor Giannelli notes:
The percentage of those in the field who must accept the technique has never been clearly delineated. Most courts applying Frye have not addressed the issue adequately; they have either ignored it altogether or offered rather general statements. For example, one court has defined general acceptance as “widespread; prevalent; extensive though not universal.” Another court has conceded that “a degree of scientific divergence of view is inevitable,” without elaborating on how much divergence would be dispositive. Again, the latitude allowable to a court under the malleable Frye standard could yield the admission of evidence that a large segment of the scientific community would find unacceptable. Giannelli, supra, 80 Colum.L.Rev. at 1210-11 (footnotes omitted).
Judge Weinstein, who is a critic of the Frye standard, does not discuss the meaning of “general” in any detail. He does, however, note that the determination of whether novel scientific evidence has been generally accepted may take into account such factors as “the expert’s qualifications and stature, the use which has been made of the new technique, the potential rate of error, the existence of a specialized literature, and the novelty of the new invention.” Weinstein, supra, 702-41 to 702-42 (footnotes omitted).
The Sixth Circuit has pointed to a number of factors that have led to findings that novel scientific evidence was admissible. In Stifel the court pointed to the decision in State v. Coolidge, 109 N.H. 403, 260 A.2d 547 (1969), which it described as having “dealt with many of the same problems with which we deal in this case.” Id. at 439. The court in Stifel quoted a lengthy excerpt from Coolidge, including a reference to disputes about matching tests and theories of probability, which the Coolidge court held had properly been for the jury. Id. at 440 (quoting Coolidge, supra, 260 A.2d at 561). On this basis, the challenge of three “well-qualified” defense experts to the evidence was found not to have overcome the showing of acceptance by the scientific community. Id. at 438.
In Franks, with its dictum about the equivalence between general acceptance and reliability, the court upheld admission of voice spectrograph evidence on the basis of a “twenty-five page inquiry” into the qualifications of the witness and “reliability of the scientific process,” which had been followed by cross-examination of the witness about “his purported role as an advocate of the process and some other courts’ refusals to admit voiceprint evidence.” The court also noted that no rebuttal witnesses had been produced by the defendant. 511 F.2d at 33.
Testimony regarding the matching of oil samples drawn from sites operated by the defendant with samples causing pollution at nearby sites was held to have been properly admitted in United States v. Distler, 671 F.2d 954 (6th Cir.1981), on the basis of expert testimony by the persons who tested the samples and subjected them to chromatographic analysis and by experts on mass spectroscopic analysis and organic analysis. There was, as well, evidence that the procedures had conformed to standards published by a professional society. In light of that testimony, the court found the requisite degree of scientific acceptance of the method. Id. at 962.
In dictum, the court suggested in United States v. Smith, 736 F.2d 1103, 1107 (6th Cir.1984), that testimony by an expert psychologist about the inadequacies of eyewitness identification may qualify for admission under the Green/Frye test, but that no error had occurred when it had been excluded. In support of its dictum, the court noted that the evidence had come increasingly to be viewed as reliable, and had been favorably reviewed in Scientific American. Id. at 1106-07.
Several factors were recited in United States v. Metzger, 778 F.2d 1195 (6th Cir.1985), as supporting admissibility of expert opinion that the chromatography showed that the source of a trace chemical at an explosion site was a particular brand of dynamite. The defendant contended that the application of chromatography to the detection of the particular trace chemical was “new, untested, and not accepted in the scientific community.” Id. at 1203. This challenge, the court held, was overcome by the testimony of an A.T.F. chemist who had an advanced degree in polymer chemistry, attended courses and seminars dealing with explosives, and previously been qualified as an expert in fifty cases. In addition, several crime laboratories had adopted the procedure, which had been developed by the government witness.
Without otherwise rebutting the witness’s testimony, the defendant challenged the degree of acceptance on the basis that the only article on the process had been authored by the witness. Rejecting this contention, the court stated, “the implication that ... a publication supports a witness only where the view held by the witness is widely shared by other experts in the field ... is too limited. Articles in professional journals are also of great value in that the basis of the article is subjected to close scrutiny by other experts in the field.” Thus, principally on the basis of the witness’s expertise, his publication, and the adoption of his process by other law enforcement agencies, the court upheld the admission of his testimony.
Thus, the Sixth Circuit has looked to the testimony of experts in the particular field, the acceptance of the proponent’s writings in professional journals, and the absence of rebuttal testimony as providing a basis for admission of novel scientific evidence. It should also be noted that acceptance of the process by other courts was a factor in Stifel, 433 F.2d at 438, Franks, 511 F.2d at 33 n. 12, and Distler, 671 F.2d at 962. Other courts have, however, tended to discount the significance of judicial approval of a technique with the observation that the requisite acceptance “is that of scientists, not courts.” People v. Reilly, 196 Cal.App.3d 1127, 242 Cal.Rptr. 496, 500 (App.1987).
In summary, I have not encountered, and the parties have not cited, a case applying the Frye standard rejecting the admissibility of the evidence where a set of experts, such as in this case, have testified that the procedure was generally accepted. Where such experts have testified, the evidence has been admitted despite the firmly held countervailing views of the opponent’s experts.
Some understanding about what constitutes general acceptance in our Circuit may also be gained from considering those cases in which general acceptance has not been found. The cases in which the Sixth Circuit has held that novel scientific evidence, or opinion evidence based on challenged scientific theories, should not have been admitted show that the court has found a want of general acceptance only where the evidence has been manifestly unsupported outside the proponent’s own laboratory.
Thus, in United States v. Brown, 557 F.2d 541 (6th Cir.1977), in which the court rejected the results of microprobic testing of hair samples, the court pointed out that the prosecution’s witnesses were not experts in the requisite fields; their procedures had not been duplicated elsewhere and support for them was not to be found in published writings; no reported case had accepted the results of their analysis; there were no standards by which to gauge the accuracy of their tests; and no effort had been made to compare the test samples against a statistically valid test group. Id. at 557-58. The court concluded that there “appears to be nothing in the record, except unsupported assertions by the prosecution’s experts, to indicate that ion microprobic analysis may achieve a reliable and meaningful result.” Id. at 558.
The defendants might contend that each of the conditions described in Brown with regard to the deficiencies of the government’s principles and practices exists in this case: involvement of inexpert participants; insufficient experimentation and validation; lack of published results; and challenged standards for comparison of results.
Such a reading of Brown, in my opinion, overlooks basic distinguishing features that went to the issue of general acceptance of the procedure at issue in that case. In that case, there was “no authority in the field in support of their positions,” Id. at 557, so that the only testimony in support of the technique was that offered by the persons who had developed and implemented it. Id. at 555. That, in my view, was the crucial consideration, because, as the cases cited above indicate, testimony solely by the developer of the novel technique almost never has been held to have shown that a procedure enjoys general acceptance. In this case, there is extensive testimony by experts other than F.B.I. employees about the scientific acceptability in each of these areas. This distinction is crucial, because the government’s evidence does not simply stand alone and unsupported.
A similarly deficient record was also the basis for the en banc court’s finding that an expert’s opinion about the “involuntary conversion” variant of the “captivity syndrome” should not have been admitted at the defendant’s trial in Kozminski, supra. One of the defense witnesses stated that he had never heard of the theory, and that, in any event, it was not applicable to the case at hand. The explanatory theory (i.e., captivity syndrome) on which the plaintiff relied, the court concluded, was inapplicable, as none of the ten elements of that syndrome were established at trial. Id. at 1194.
Most importantly, as Judge Krupansky noted in his concurring opinion in Kozminski, the expert’s “own admissions render his conclusions inadmissible.” Id. at 1201, so that his “first public presentation of his theory” in that case, Id. at 1202, of “a theory of first impression,” Id. at 1203, could be described as “hypothecation that had not ‘attained general acceptance in the scientific community.’ ” Id. at 1202. Thus, it appears that the plaintiff’s expert in Kozminski was projecting an idiosyncratic theory that had no acceptance beyond his own claims.
Similarly, in Sterling v. Velsicol Chemical Corp., 855 F.2d 1188 (6th Cir.1988), the Sixth Circuit found a want of acceptance in the pertinent scientific community with regard to the theories of “clinical ecology.” The court particularly emphasized the facts that “the leading professional societies in the specialty of allergy and immunology ... have rejected clinical ecology as an unproven methodology lacking any scientific basis in fact or theory,” Id. at 1208, the inability of the putative experts to point to studies supporting their views, and their failure to have personally examined the plaintiffs before expressing their opinions. Id. at 1208-09.
Finally, in Novak v. United States, 865 F.2d 718 (6th Cir.1989), the court held that in view of the fact that the proponent’s own witnesses conceded that “the scientific and medical community was unsure about, and could not state with any degree of medical certainty” what caused the condition regarding which the experts were seeking to testify, the evidence was not admissible. Id. at 723. In that case, the issue was not the merits of the basis on which the proponent’s experts expressed their opinions, or the validity of data customarily used to support conclusions such as they were asserting, but the fact that there was “no such support for [their] theory in the instant case.” Id. Plaintiff’s experts, the court stated, themselves “conceded” that, with regard to “the very heart of [their] theory,” there was “no proof.” Id. at 725 n. 7.
Additional understanding of the meaning of general acceptability may also be gained from examining cases from other courts in which the proffered evidence was found not to have met that requirement. Such has occurred where the evidence was truly novel, as in United States v. Tranowski, 659 F.2d 750, 756 (7th Cir.1981), in which neither the witness “nor anyone else to his knowledge had ever attempted this procedure before”; Kropinski v. World Plan Executive Council, 853 F.2d 948, 957 (D.C. Cir.1988), where there was no evidence of “a significant following in the scientific community, let alone general acceptance”; Robertson v. McCloskey, 680 F.Supp. 408, 412 (D.D.C.1988), in which the court stated that “a single scholarly article” on the topic “hardly qualifies” as “sufficient evidence of the general acceptance” of the witness’s field; and United States v. Shorter, 809 F.2d 54, 61 (D.C.Cir.1987), in which the proponent’s experts disagreed amongst themselves about various aspects of the proposed evidence.
Of the cases that I have uncovered in which novel scientific evidence was evaluated on the Frye standard and either admitted or rejected, those dealing with the voice spectrographic evidence seem most comparable to the case at hand. At issue in those cases, as here, was a novel technique for identifying criminal suspects by means of a characteristic believed to be unique. With voice spectrographs, however, the underlying theory (uniqueness of speech patterns) was disputed, People v. Law, 40 Cal.App.3d 69, 114 Cal.Rptr. 708, 712-13 n. 8. With DNA evidence there is universal acceptance of the fundamental premise that everyone except identical twins has portions of his or her DNA that are unique.
Aside from the distinction that the supposed unique quality on which the analysis was based was not nearly as well established with speech patterns as it is for the composition of each individual’s DNA, the voice spectrograph cases had features that were otherwise generally similar to the issues raised in this case: the opponents contended that only the developer of the technique and his pupils considered it reliable; there was substantial countervailing opinion and dispute in professional scientific and legal journals; and experts frequently testified in opposition to claims of general reliability.
In at least four jurisdictions the challenges to voice spectrographic evidence led reviewing courts to repudiate the Frye standard. See United States v. Williams, 583 F.2d 1194 (2d Cir.1978); State v. Williams, 388 A.2d 500 (Me.1978); State v. Williams, 4 Ohio St.3d 53, 446 N.E.2d 444 (1983). The issue of voice spectrography was the occasion for the Sixth Circuit’s temporary acquiescence in the reliability standard in Franks, supra. Thus, many of the courts avoided the problems presented by a split among the experts by changing the standard for assessing the admissibility of the evidence.
In only three reported cases was the Frye standard found to have been satisfied. One of these was the district court decision in Williams, which was later affirmed on different grounds (i.e., changed standard). In United States v. Williams, 443 F.Supp. 269, 273 (S.D.N.Y.1977), aff'd on other grounds, 583 F.2d 1194 (2d Cir.1978), the court upheld admission of voice spectrographic evidence under Frye in the face of defense claims of lack of acceptance on the basis of its finding that the technique “has been accepted by a substantial section of the scientific community concerned.”
The same conclusion was recently reached in United States v. Maivia, 728 F.Supp. 1471 (D.Haw.1990). In that case the defendant sought admission of voice spectrographic evidence, and the government opposed his request. The evidence was admitted on the basis of the testimony of a single expert for the defense, which had been opposed for the government only by the testimony of an F.B.I. agent. In the other reported case applying the Frye standard where the record contained evidence of a dispute about the acceptance of voice spectrography in the scientific community, Commonwealth v. Lykus, 367 Mass. 191, 327 N.E.2d 671, 675-78 (1975), the court resolved the dispute by limiting the applicable scientific community to “those who would be familiar with its use.”
The other Frye challenges to the admissibility of voice spectrographs were successful: courts, when, as we are here, confronted with a split of scientific opinion on the issue, declined to admit the evidence. But a reading of those cases shows clear factual distinctions between those cases and this. In those cases the proponents usually produced testimony only by the developer of the technique or a pupil, whereas the opponents produced the testimony and opinions of experts from the broader scientific community. See People v. Tobey, 401 Mich. 141, 257 N.W.2d 537, 539 (1977); Commonwealth v. Topa, 471 Pa. 223, 369 A.2d 1277, 1281 (1977). On occasion, the developer or his pupil himself acknowledged a “substantial division of opinion among those who [had] done work or performed experiments relating to the voice-print process.” Reed v. State, 283 Md. 374, 391 A.2d 364, 377 (1978). Where an outside expert testified for the proponent, courts emphasized his reservations about the extent of acceptance among his professional colleagues. See United States v. Addison, 498 F.2d 741, 744-45 (D.C.Cir.1974); People v. Law, 40 Cal.App.3d 69, 114 Cal.Rptr. 708, 715, 718 (App.1974); Reed, supra, 391 A.2d at 374 n. 14.
The purpose of this review is to show that, at least in the instance of voice spectrographs, defendants tended to prevail in their contentions that the method was not generally accepted in the scientific community where the prosecution failed to marshal testimonial support from outside experts. Indeed, as noted, in several instances, courts confronted with that situation altered their standards for admissibility rather than excluding the evidence. This set of cases is an indication of the reluctance with which reviewing courts have found a want of general acceptance within the scientific community.
iv. Findings re. General Acceptance
In light of the evidence of record, as heard and reviewed by me and summarized above, and on the basis of the Frye/Green standard, I find that the government has met its burden of showing by a preponderance of the evidence that the general scientific community, but by no means the entire scientific community, accepts the F.B.I. protocol and procedures for determining a match of DNA fragments and estimating the likelihood of encountering a similar pattern.
With regard to the issue of the ability to determine that bands match as provided in the protocol, I am persuaded by the testimony of the prosecution’s four principal experts that, in their view and based on their knowledge of the F.B.I.’s practices, the government’s laboratory has designed and implemented a program whereby multiple loci matches can reliably be ascertained.
In making my determination, I take note of the relative professional standings of the prosecution witnesses and the defense witnesses regarding the band shift issue. Drs. Conneally, Caskey, and Kidd have been selected by invitation for membership in the Human Genome Organization, and Dr. Caskey is President of the American Society of Human Genetics. These professional accomplishments are a manifestation of esteem on the part of professional colleagues at the highest level of their disciplines. In addition to the scientific stature and judgment that such election implies, participation in the activities of such organizations and affiliation with fellows of that rank give such individuals, I believe, a somewhat better basis on which to gauge the views of those colleagues about the acceptability of new developments related to their discipline.
This is not to question for a moment the scientific competence of any of the witnesses, including the defense witnesses who spoke to the issues pertinent to possible band shifting and problems with the Bureau’s validation, mixed body fluid, and environmental insult studies. This finding simply reflects my judgment in light of the entire record on the question of which of the experts is more likely to have a better general understanding of the level of acceptance within the scientific community. I find that the stature and professional standing of the government’s witnesses on that issue place them in a position in which they are somewhat better able to assess the sense of the scientific community on the Bureau’s ability to make reliable multiple loci matches.
Another important gauge of the general acceptance of the scientific community is the fact that Dr. Caskey, who is the director of one of the country’s major genetics laboratories, has adopted the F.B.I. protocol almost in its entirety. The fact that he uses a smaller match window, his own population databases, and may have made some other alterations in the F.B.I. methodology does not detract from the significance of the fact that, after considering the protocols and procedures of the private laboratories, he chose, in light of his knowledge of those practices and those of the F.B.I., and, as well, in light of his expertise, the F.B.I. protocol.
Though Dr. Caskey may currently be within the community of forensic DNA scientists, he remains, as he was at the time that he was making his decision to adopt the F.B.I. protocol, a pre-eminent academic and clinician. His views, accordingly, reflect those of someone who may be viewed as being both “inside” and “outside” the forensic community.
During the course of this hearing, Dr. Caskey became aware of the challenges being made about the quality of much of the underlying work performed by the Bureau in implementing its protocol. He expressed reservations about the scientific acceptability of some of that work. Nonetheless, he remained confident, with reference to the ability of the Bureau to determine multiple loci matches, that his confidence in the reliability of such findings would be shared by the general scientific community.
I give particular weight to Dr. Caskey’s testimony about the level of acceptance within the scientific community in light not only of his professional standing, activities, and adoption of the F.B.I. protocol and procedures. His continued affirmation in the face of the challenges being made here and elsewhere to the Bureau’s scientific methodology is also a factor in my assessment of the accuracy of his testimony that the general scientific community would accept the adequacy of the Bureau’s procedures.
Dr. Caskey no doubt hopes that the ruling in this case will favor the F.B.I., as it will provide a judicial imprimatur to his own program. But from his perspective, the legal contest is less important, in terms not only of his professional standing, but, as well, for the long-term welfare of his forensic laboratory. Regardless of how his views are received in this or any other court, it is clear that Dr. Caskey has more to lose if his decisions are judged to have been erroneous from a scientific perspective, and if his confidence in the quality of the procedures is proven to have been mistaken. To some extent, therefore, Dr. Caskey has staked his professional reputation on the accuracy of his judgment regarding the scientific quality of the F.B.I.’s procedures.
I conclude that, to the extent that that stake affects his testimony, it lends greater, not lesser, credibility to that testimony. I am persuaded that Dr. Caskey would not, at this point, abide by his confidence in the system and acceptability of its casework among his professional colleagues unless he were confident that his estimate of their view was not only accurate, but as importantly, will be borne out by future developments in this area of scientific development.
If he had doubts in that regard, I find that he would express them because to fail to qualify his assessment at this point, if he had concerns about either the quality of the Bureau’s results or the views of his colleagues, exposes him to a greater loss of scientific standing and stature than if he withholds any such reservations and abides, most unscientifically, by his assessment merely on the hope that somehow it will all work out.
I note that two of the witnesses were confronted with significant errors of professional judgment during the course of these proceedings, and both acknowledged readily that those errors had occurred. Dr. Hartl was shown through cross-examination to have made a mistaken choice of blood types as the basis for his examples in his report and direct examination. Dr. Kidd was shown to have published results of some of his work that likewise were later shown to have been in error.
Both these witnesses acknowledged those errors. I am unable to conclude that Dr. Caskey is any less of a scientist, or that he would be less willing to acknowledge a mistake in his views if he deemed some part of those views to be in doubt or error. Dr. Caskey, in my opinion, would not continue to endorse the F.B.I. program if he concluded, after the flaws described by its critics had been called to his attention on cross-examination, that those criticisms had significant scientific merit or might, in time, either be proven to have such merit or would diminish the level of acceptance in the scientific community.
I find that the testimony of Drs. Conneally and Daiger, in addition to expressing their own views about the level of acceptance of the F.B.I.’s protocol and procedures, provides as well a further measure of support to the opinions expressed by Dr. Caskey about the level of acceptance within the scientific community of the F.B.I.’s ability reliably to declare multi-loci matches. They also to some degree have a professional stake, in terms of their future standing among their professional colleagues, in the ultimate outcome of the scientific disputes regarding the F.B.I.’s protocol, procedures, and practices.
Many times witnesses have an interest in the outcome of the proceeding, and consideration of that fact is an important means of evaluating the accuracy of their testimony. The government’s witnesses find themselves in a position of defending a process that is under vigorous attack. As they entered this hearing and throughout their testimony, which touched on all the pertinent subjects of scientific dispute, they had the ability to express reservations about the extent of their colleagues’ approval of the Bureau’s performance of its casework. In that way, they could minimize the potential damage to their professional reputations and stature that will result if they are ultimately shown to have been in error. They chose not to do so, and that decision on their part is a factor in my determination that they accurately express the views of the general scientific community.
Finally, I find Dr. Kidd's comments about the level of acceptance for the F.B.I.’s methods for determining a match also to be persuasive. He, like the other government experts, has had an opportunity during the course of these proceedings to take the criticisms being made by the defense into account, and has formulated responses that satisfy him that the flaws in the Bureau’s system, particularly in regard to band shifting, can and do have no effect on either the ability of the Bureau to make accurate matches or the general acceptance of its procedures among the scientific community.
The fact that other law enforcement laboratories have implemented the F.B.I.’s program and procedures is, as well, some further indication that that program is generally viewed as acceptable by persons concerned with implementing a forensic DNA methodology that can operate reliably. In addition, though I discount in large part the lengthy listing of courts that have, on the basis of substantially less thorough inquiries than occurred in this case, upheld the admissibility of forensic DNA evidence, I give some small measure of weight to the fact that a state court, which was conducting a Frye hearing and heard many of the same witnesses concurrently with our hearing, has concluded that the prosecution met its burden of showing general acceptance in the scientific community. State v. Jobe, SIP No. 33903565 (9/6/90, Hennepin Co., Minnesota) (Gov’t Brief, Doc. 385 Appendix Exh. N).
I likewise find that the F.B.I. method for computing an estimate of the likelihood of encountering such a match in the Caucasian population is generally accepted in the scientific community. In reaching this finding, I give principal weight to the testimony of Dr. Kidd and Dr. Caskey, though I am fully cognizant of the pre-eminent status and stature of Drs. Lewontin and Lander. I do not disregard their testimony or discount its pertinence and persuasive power.
The determination of which of these experts most accurately describes the level of acceptance or rejection in the scientific community of the F.B.I.’s ability to estimate probabilities is the most difficult single decision in this case. Each of the principal witnesses on this issue is within the first rank of his profession; indeed, it is fair to say that each occupies a primary spot within that rank. But that is not the exclusive or determinative factor in assessing the accuracy of their understanding of the views of their professional colleagues about the acceptability of the F.B.I.’s method of computing probability estimates.
In trying to make this determination, it is worth noting that the possibility that substructure might exist and thus subject the F.B.I.’s database to challenge is not a suggestion that arose for the first time during this proceeding. Although this was the first occasion that Dr. Lewontin testified about the possibility of substructure and its effects on the accuracy and acceptability of the database, Dr. Lander has made his views known through his writings, such as the Nature article (Exh. Y at 501, 503), Banbury paper (Exh. EE at 148-49), and testimony and report (Exh. 28 at 31-33) in the Castro case. Thus, those views have been available for consideration by scientists attentive to the application of RFLP procedures to forensic uses.
That the views expressed prior to this hearing by Dr. Lander have been considered and taken into account, at least with regard to the formulation of Dr. Caskey’s opinion about the level of continued acceptance in the scientific community, is apparent from his acknowledgement of the existence of substantial dispute about calculating probabilities, his statement that “population geneticists have had considerable controversy in the calculation area,” and his acknowledgement that “the debate is still open” when asked if “there is still a debate whether you need to have separate data bases for various ethnic and national subgroups of the larger racial population” (IX 271).
I am persuaded, accordingly, that Dr. Caskey has been sufficiently aware of the issues of substructure and their possible effect on the ability to estimate probabilities with a scientifically acceptable degree of accuracy to enable him to consider those issues in the course of both formulating his own course of action in his forensic laboratory and assessing, despite those problems, the level of acceptability within the scientific community of the F.B.I.’s database and method of computing probabilities.
To be sure, Dr. Caskey is principally a molecular biologist and geneticist, who, though familiar with the applicable theories and principles of population genetics, is not as expert in that area as Drs. Lewontin, Lander, and Hartl. And Dr. Lander is also, like Dr. Caskey, extensively involved at the national level with the effort of the National Academy of Science of Office of Technology to evaluate the implementation of DNA technology to forensic purposes. Thus, Dr. Lander has had considerable opportunity to familiarize himself with the views of the scientific community about these issues, as has Dr. Caskey.
Nonetheless, I remain persuaded that the views of Dr. Caskey, with reference to the issue of the degree of acceptance of the reliability of the F.B.I.’s probability estimates, more correctly describe the level of acceptance within the general scientific community. In making this determination, I take into account three principal factors: first, those factors mentioned above regarding Dr. Caskey’s stake in the ultimate outcome of all the disputes concerning the forensic application of DNA; second, his awareness of the debate about the ability to calculate frequencies accurately; and, third, his continuing reliance on his own, essentially similar database and confidence in the general acceptability of the F.B.I.’s method of calculating probability estimates despite that debate and the disputes on that issue. In light of these factors, I am persuaded that Dr. Caskey’s assessment is somewhat more likely to reflect an accurate understanding of the view of the scientific community than that of Dr. Lander, though I have no doubt that Dr. Lander is equally convinced of the accuracy of his assessment and seeks to testify with the highest possible measure of accuracy.
I am also taking into account the fact that Dr. Kidd shares Dr. Caskey’s perception, though he too is well aware of the challenges to that position being made by Drs. Lewontin, Lander, and Hartl. Dr. Kidd, like Dr. Daiger, is convinced of the efficacy of the F.B.I.’s “conservative” approach in the fixed bin and related techniques and the potential of that approach for ameliorating, though not entirely overcoming, the effects of substructure. They remain persuaded that the general scientific community, upon taking all those factors into account, would accept the ultimate number generated by the F.B.I., despite the questions about the accuracy of its data base.
The views of Drs. Daiger and Kidd take the practical aspects of the F.B.I.’s fixed bin approach into account in assessing whether their fellow scientists would share their approval of the Bureau’s ability to estimate probabilities accurately. Dr. Lander, in contrast, dismisses reference to the “conservative” features of the Bureau’s fixed bin approach as “apples and oranges” (XVII 137), and on that basis discounts the potential of that approach to ameliorate the effects of substructure (XVII 137-38, 153). In my opinion, the government’s witnesses, by assuming that their colleagues would, like them, take the “conservative” features into account (while likewise considering the views of Drs. Lewontin (Exh. III) and Hartl (Exh. NNN)), are in this regard somewhat more likely to perceive the views of those colleagues more accurately.
Finally, I am taking into account the reservations that Drs. Conneally (V 105; V 203) and Daiger (VII 20) acknowledge that they would have if they were persuaded that the “referent” population were not in equilibrium. This suggestion that they would be willing to change their conclusions if they were convinced of the error of their underlying assumption adds a measure of plausibility to their views about the attitudes of their professional colleagues.
I conclude, consequently, that, despite the prestige, standing, and expertise of the witnesses who share the view that the scientific community could not and would not find the F.B.I.’s database and resulting probability estimates acceptable, the view of the government's witnesses about the level of acceptance, when all factors are taken into account, is more likely to be the accurate view.
Thus, I conclude that as to both pertinent issues concerning the F.B.I.’s procedures it is more likely than not that the general scientific community accepts the reliability and scientific suitability of the F.B.I.’s protocol and practices. To be sure, there is present in this case a degree and intensity of disagreement on that issue that is not encountered in any of the cases in the Sixth Circuit, and which is approximated, if at all, only by the voice spectrogram cases. Scientists of indisputable national and international repute and stature, aided and confronted by lawyers of unusual skill and understanding of the issues, took diametrically opposed views on the issue of general acceptability, and those views reflected the division of opinion on the merits of the underlying scientific disagreements.
In my effort to comprehend the disagreements and come to some assessment about the extent to which the community of molecular biologists and population geneticists would view the scientific acceptability of the F.B.I.’s protocol and procedures for its limited but important purposes, I have tried to give careful attention to each of the disputants, and to reach a decision that accurately reflects the level of that acceptance.
In the last analysis, I am persuaded that the views of the prosecution’s witnesses more accurately project the extent to which their professional colleagues would concur that, despite the unfortunate, and to some extent unjustifiable flaws, the F.B.I. is able to declare matches accurately and provide a scientifically acceptable estimate of the resulting probabilities.
v. Alternative Findings re. Reliability
Throughout this opinion, I have expressed the firm view, which I believe is mandated by the law of the Sixth Circuit, that this court’s function is not to adjudicate the merits of the underlying scientific disputes. To do so would be to disregard our Circuit’s consistent recitation of and reliance on the Frye/Green standard, because to adjudicate the merits of those disputes necessarily results in a determination of admissibility that is based on reliability.
In the event that my interpretation of Sixth Circuit doctrine is in error, I will express the following factual findings on the scientific disputes in the alternative, for the court’s consideration if it deems such findings to be an element of the Frye/Green inquiry.
With regard to the issues relating to the ability reliably to determine a match, I am persuaded by a preponderance of the evidence that the Bureau’s procedures, even with all their flaws and defects, can, in fact, reliably discern matches across multiple loci. I reach this determination despite being also persuaded of the possible occurrence of band shifting.
The defendants argue that the F.B.I.’s procedure is flawed at its very basis, due to the use of a quasi-continuous allele system for forensic DNA typing. I disagree that the Bureau’s selection of this form of DNA typing, rather than a discrete allele system, constitutes a systemic flaw that affects its ability reliably to determine matches.
Unquestionably, clinicians who use discrete allele systems for diagnostic work are not confronted with the problems of discerning a match that arise with quasi-continuous allele systems with their high number of genetic fragments per sample and the limitations of the resolution power found in the F.B.I. laboratory.
These objections, in my opinion, go to weight and not admissibility. The F.B.I.’s fixed bin approach accommodates weaknesses in the power of resolution by establishing categories of bands. Use of the fixed bin structure and the +/- 2.5% match window compensate adequately, in my opinion, for the problems caused by the initial decision to use VNTRs. I heard no testimony and have seen no exhibit sufficient to persuade me that selection of the quasi-continuous allele system was such a mistaken choice that, either standing alone or in conjunction with the other problems to which the defendants directed their attention, the system is thereby rendered incapable of producing reliable results.
The defendants raise several challenges to the scientific adequacy of the F.B.I.’s validation studies. It is clear that Dr. D’Eustachio’s comparison of the data from Table III of the Fixed Bin Paper and the May, 1990, casework raises troublesome questions about the quality of the Bureau’s work with either or both of those sets of gels. Dr. Budowle did not respond persuasively to Dr. D’Eustachio’s criticisms, and he refused to acknowledge the potential significance or merit of a competent scientist’s critique and to consider the desirability for further experimentation and confirmation.
I understand the importance, from a scientific standpoint, of adequate validation studies that prove the ability reliably and reproducibly to obtain like results in like circumstances. But the issue for determination at this stage is whether the defects pointed to by the defendants on the basis of Dr. D’Eustachio’s comparison of casework with the Table III materials undercut the other evidence that establishes by a preponderance of the evidence that the F.B.I. can reliably determine true matches and avoid false positives. Regardless of the qualms raised by Dr. D’Eustachio’s evaluation, report, and testimony, I am persuaded that the defects in the validation study, like the other deficiencies in the operation of the Bureau’s laboratory, do not affect its ability reliably to make accurate determinations of matches and avoid false positives. The defects perceived by Dr. D’Eustachio go to weight and not admissibility.
I reach the same determination with regard to the similar criticisms by Dr. D’Eustachio of the Bureau’s mixed body fluid and environmental insult studies, although Dr. D’Eustachio’s review of the deficiencies with those studies is cogently, comprehensively, and correctly critical of the Bureau’s design and implementation of these studies. The issue, with regard to admissibility, is whether this Court can be persuaded, despite the perceived and putative flaws in the implementation of these important studies, that, nonetheless, the Bureau can reliably declare a match with multiple probes. I am persuaded by a preponderance of the evidence that it can.
The defendants challenge the F.B.I.’s selection of a larger match window than other practitioners, including Dr. Caskey, have selected. Particularly in light of the government’s acknowledgement that all but a small minority of casework matches fall within a smaller range (85% within a 1.75% range (Exh. 46)), the F.B.I.’s use of the larger window invites the criticisms made of it by the defendants.
I conclude that those criticisms do not affect the ability of the F.B.I. to make reliable matches and avoid false positives across multiple loci. As I read the record, aside from the obvious fact that the larger window results in a capture of a wider span of alleles, none of the witnesses testified that an unnecessarily large window increased the likelihood of erroneous matches with multiple probes.
Moreover, as I understand the technology, defendants' objections to the match window, and applicable law, defendants who would be outside a smaller window but are within the F.B.I.’s larger window can make that point clear at trial. Thus, the issue of the match window clearly goes to weight and not admissibility.
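The practical effect of window size on a declared match can be illustrated with a short sketch. It is a simplification under stated assumptions, not a description of the Bureau's actual computation: the function name, the choice to measure the difference against the mean of the two fragment sizes, and the sample fragment sizes are all hypothetical. Only the 2.5% window and the 1.75% range are figures taken from the record.

```python
def within_match_window(size_a: float, size_b: float, window: float = 0.025) -> bool:
    """Return True if two measured fragment sizes fall within the match window.

    Assumption (not from the record): the window is applied as a fraction of
    the mean of the two measurements.
    """
    if size_a <= 0 or size_b <= 0:
        raise ValueError("fragment sizes must be positive")
    mean_size = (size_a + size_b) / 2.0
    return abs(size_a - size_b) <= window * mean_size


# Hypothetical measurements, in base pairs, for one band from each sample.
evidence_band = 2000.0
suspect_band = 2040.0

# Declared a match under the 2.5% window described in the record.
print(within_match_window(evidence_band, suspect_band))

# Not a match under a narrower, hypothetical 1.75% window.
print(within_match_window(evidence_band, suspect_band, window=0.0175))
```

A pair of bands such as this one, inside the larger window but outside a smaller one, is the kind of result a defendant could bring to the jury's attention at trial.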
The problems manifest in the “Repeat Caucasian Database,” as outlined in Dr. Hartl’s report, are not satisfactorily explained by the government’s witnesses. Nor are they overcome by the government’s successful cross-examination of Dr. Hartl on his mistaken selection of the MN blood group as a basis for his examples in other portions of his report.
But with regard to the impact of this information on the issue of the ability to declare matches across multiple loci, I again conclude that this data and its analysis by Dr. Hartl do not justify a finding that the F.B.I. does not have such ability. The question is whether the government has shown by a preponderance of evidence, where multiple probes are used, that there is no significant risk of a declaration of a false match or positive.
The flaws to be inferred from Dr. Hartl’s report show, in contrast, that there may be a likelihood of false exclusion. Such potential for error should be troublesome from a law enforcement standpoint, but in view of the limited issue before this Court—the reliability of multiple loci matches—it is not persuasive from a legal standpoint.
The F.B.I.’s failure to implement a comprehensive program of effective proficiency testing likewise goes at most to weight rather than admissibility, where the issue is the ability reliably to declare matches with multiple probes. The defendants have persuasively established, and the government has not rebutted, the fact that the F.B.I. program of proficiency testing has serious deficiencies, even without consideration of the troubling hint in the record of an impulse at one point to destroy some of the small amount of test data that had been accumulated earlier. The contentions about the absence of a meaningful proficiency testing program seem to be the sort of dispute about “technique” that the Sixth Circuit in United States v. Stifel, 433 F.2d 431, 438 (6th Cir.1970), stated “went to the quality of the evidence and were for the jury.”
With regard to the testimony of Dr. Hagerman about the effects of ethidium bromide, I find that there can be little doubt that there is a likelihood of band shifting that can result from the use of ethidium bromide, just as the defects in the validation, mixed body fluid, and environmental insult studies suggest that band shifts can occur from other causes. However, even accepting the likelihood of band shifting in some instances, I find that the likelihood of multiple shifts resulting in a match to be so slight as to be a matter of weight and not admissibility.
Like the F.B.I.’s selection of a wider match window than any other forensic DNA laboratory, its continued use of ethidium bromide invites sound scientific criticism of the sort provided by Dr. Hagerman. On the other hand, that criticism does not overcome the unlikelihood of a false match being declared over multiple loci. Accordingly, the ethidium bromide issue is also a matter of weight, not admissibility.
Though the issue of the effect of band shifts on the Bureau’s ability to declare multiple loci matches reliably may be, as Dr. Lander testified, a question of population genetics rather than of molecular biology, I find Dr. Kidd’s testimony to be persuasive in this regard, along with that of Dr. Daiger and Dr. Caskey. I note, as well, that an article by experts who generally are acknowledged as critics of the F.B.I.’s procedures states that “there is a possibility, albeit low, that band shift will bring non-matching bands into alignment.” Thompson & Ford, The Meaning of a Match: Source of Ambiguity in the Interpretation of DNA Prints, Forensic DNA Technology (In Press) (Exh. WV at 21) (emphasis supplied). I disagree that the forcefulness of the testimony and rationale advanced by the government’s experts is overcome by the analysis found in the Addendum # 1 (“The Multiple Locus Ploy”).
In that addendum, defendants present five problems that, they argue, negate the impact of the government’s claim that DNA profiling using several probes will minimize the possibility of false matches.
Defendants claim that if the F.B.I.’s proposition is taken seriously, there should be a restriction on the minimum number of probes or the minimum number of matched bands that the F.B.I. must use before being entitled to declare a match. Yet, by its very formulation, defendants’ argument on this point goes to the question of the weight of the evidence and not its admissibility. That there needs to be a minimum number of probes or bands that must be run before the evidence can be presented to the jury is a proposition of dubious distinction. Indeed, if defendants’ worst case scenario were to occur, and the proponents introduced into evidence a DNA profile resulting from only one probe, the statistical impact of that profile would be of only marginal significance, and the likelihood that the defense could cast doubt on the credibility of the evidence would be enhanced. Defendants’ argument is a sword that cuts in both directions.
Defendants next present Dr. Lander’s assertion concerning the issue of the likelihood of a false match over a larger number of probes compared to the likelihood of a false match over a fewer number of probes. Dr. Lander stated that this issue is implicitly a question of population genetics. As the Court understands Dr. Lander’s analysis, he appears to be saying that because the significance attached to a match is primarily a function of population genetics, without knowing more about the statistical significance to be attached to the location of a given band, the statistical likelihood of band shifting due to the various factors that affect band mobility, and the interrelationship of band shifting and band frequency, it is impossible to say whether a match across one or two loci is more likely than a match across three or more loci. Thus, he asserts, it is impossible to say that the statistical significance of band shifting is greater when fewer probes are run than when more probes are run.
Yet the government’s witnesses have stated that false positives over two, three, or more probes are highly unlikely to the same extent that the random occurrence of two different persons (other than identical twins) having identical DNA profiles is unlikely. Drs. Conneally and Caskey testified about the negligible possibility of band shifting and other electrophoresis anomalies producing identical profiles with multiple probes in individuals whose DNA patterns are actually different.
Dr. Kidd addressed the question of the likelihood of band shifts resulting in a false match over several probes by graphically illustrating the operative conditions that would have to prevail for this to happen. Dr. Kidd expanded upon the issues implicit in the false-match-over-multiple-loci problem by describing how the true underlying DNA pattern of the actually-different-but-apparently-identical suspect would have to be proportionately distributed over several loci in a manner that the same degree of band shifting that moved one of the suspect’s bands into a false match with the sample band would have to move all other of the suspect’s bands into a false match with the equivalent sample band.
In other words, all the bands in the forensic and suspect’s profiles would have to be so uniquely located that the band shifting—which caused an alteration of all the band positions to an equal extent and in an equal direction—resulted in a false match. According to Dr. Kidd, this can happen only when the “true” band positions across all the loci are all equally distant above or *210are all equally distant below the bands into whose position they move. This according to Dr. Kidd is as unlikely as the possibility of the random occurrence of a “true” match.
Furthermore, even if, as defendants argue, band shifting does not cause all the shifted bands to alter their positions to an equivalent extent along the gel, and some shifted bands move further than other shifted bands, the same initial (but now more complex) distribution of “true” positions relative to apparent/shifted positions would have to obtain. In either case, Dr. Kidd’s analysis would be applicable. And, in either case, the question of band shifting into a false match would be an issue of the credibility of the evidence for the jury to resolve.
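The geometric condition Dr. Kidd describes can be illustrated with a minimal sketch. It is an illustration only and assumes a simplified model that is not drawn from the record: band positions are plain numbers in arbitrary units, the shift is the same for every band, and a match is declared when a shifted band falls within a hypothetical tolerance `window` of the sample band. The function name and all values are hypothetical.

```python
def uniform_shift_can_align(true_bands, sample_bands, window):
    """Return True if one common shift moves every true band to within
    `window` of its corresponding sample band (simplified, hypothetical model)."""
    # Offset each band would need in order to land exactly on the sample band.
    offsets = [s - t for t, s in zip(true_bands, sample_bands)]
    # A single shift works only if the per-band acceptable ranges
    # [offset - window, offset + window] all overlap.
    lowest_allowed = max(d - window for d in offsets)
    highest_allowed = min(d + window for d in offsets)
    return lowest_allowed <= highest_allowed


# Every band is displaced by the same amount: one shift aligns all loci.
print(uniform_shift_can_align([1000, 2000, 3000], [1050, 2050, 3050], 10))  # True
# Bands are displaced by different amounts: no single shift aligns all loci.
print(uniform_shift_can_align([1000, 2000, 3000], [1050, 2010, 2960], 10))  # False
```

Under this simplified model, each additional locus adds another range that must overlap with all the others, which is the sense in which a false match across several probes requires the "true" positions to be equally distant from the bands they appear to match.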
Defendants’ third objection to the “multiple locus ploy” comes in the form of the necessity of doing proper reproducibility studies to determine the effects of DNA concentration, partial digestion, environmental insults, and other features of the technique of forensic DNA profiling on band shifting.
As discussed in the preceding paragraph, I conclude that these defects go to weight rather than admissibility. Defendants next suggest that the size of the F.B.I.’s match window and possible operator error create an increased likelihood of false matches.
I conclude, therefore, that the government has met its burden of proving by a preponderance of the evidence that its procedures can reliably determine matches over multiple loci. In reaching this finding, I do not either disregard or discount the accuracy of many of the criticisms about the remarkably poor quality of the F.B.I.’s work and infidelity to important scientific principles.
I likewise am persuaded that it is more likely than not that the F.B.I.’s probability estimates are reasonably accurate, and that the potential impact of substructure on the accuracy of such estimates is, in the final analysis, a matter of weight for the jury to consider, rather than of admissibility. Unquestionably, the testimony by Drs. Lewontin and Lander was powerful and persuasive, and there can be little doubt that population substructure exists in the United States and has not been taken directly into account in assembling the F.B.I. population database.
On this issue, and despite my acceptance of the proposition that research must be undertaken to devise a means of responding more fully to the possibilities of substructure, I simply conclude in light of all the evidence of record that the ultimate likelihood that substructure will prejudicially distort a probability estimate is sufficiently slight that such potential distortion is a matter of weight, not admissibility.
I am persuaded that the “conservative” approach and “corrective measures” of the fixed bins, wider bin than match window, collapsing bins, and allocation of borderline alleles to the bins with the larger frequencies ameliorate, though they do not fully overcome, the impact of ethnic-dependent VNTR variation when and if it occurs. I find that the defendant-favorable factors that these features represent provide a compensating effect, despite the fact, as noted by Dr. Lander, that that may not have been the purpose in their original design. What matters in this regard is not purpose but outcomes, and these features contribute to the overall reliability of the results reached by the F.B.I.
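One of the corrective measures named above, allocating a borderline allele to the adjacent bin with the larger frequency, can be sketched as follows. This is a minimal illustration under stated assumptions and not the F.B.I.'s actual procedure: the bin edges, bin frequencies, and the `window` that defines "borderline" are all hypothetical, as is the function name.

```python
from bisect import bisect_right


def conservative_bin_frequency(size, edges, freqs, window):
    """Return the frequency used for a band of the given size.

    edges: sorted bin boundaries, len(edges) == len(freqs) + 1 (hypothetical).
    freqs: frequency of each bin in the reference database (hypothetical).
    window: distance from a boundary within which a band counts as borderline.
    """
    # Locate the bin that contains the measured size, clamped to valid bins.
    i = bisect_right(edges, size) - 1
    i = min(max(i, 0), len(freqs) - 1)
    candidates = [freqs[i]]
    # Near the lower boundary: the neighbouring lower bin is also considered.
    if i > 0 and size - edges[i] <= window:
        candidates.append(freqs[i - 1])
    # Near the upper boundary: the neighbouring upper bin is also considered.
    if i < len(freqs) - 1 and edges[i + 1] - size <= window:
        candidates.append(freqs[i + 1])
    # The larger frequency makes a coincidental match look less rare.
    return max(candidates)


edges = [0, 1000, 2000, 3000]
freqs = [0.05, 0.12, 0.03]
print(conservative_bin_frequency(2010, edges, freqs, 25))  # 0.12
print(conservative_bin_frequency(2500, edges, freqs, 25))  # 0.03
```

In the first call the band sits in the rarer bin but close to its boundary, so the larger neighbouring frequency is used; that choice favors the defendant because it raises the estimated chance of a coincidental match.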
I am persuaded, in light of the testimony of Dr. Kidd about the probable infrequency of a defendant-adverse effect of substructure and the possibly low magnitude of any such effect when it occurs, that the issue of substructure is one of weight, not admissibility. I am not persuaded by the testimony of Drs. Lewontin and Lander, despite their clear pre-eminence and their manifest integrity and conviction, that the legal issue on the question of substructure is one of admissibility rather than weight.
In that regard, I note the distinction between the scientific concern with a very high degree of accuracy and the lower legal standard for permitting evidence to be considered by a jury. As the Sixth Circuit has noted consistently since its decision in United States v. Stifel, 433 F.2d 431, 438 *211(6th Cir.1970), a “lack of absolute certainty in a test” does not render it inadmissible. Accord, United States v. Brown, 557 F.2d 541, 556 (6th Cir.1977) (“the lack of certainty went to the weight to be assigned to the testimony of the expert, not its admissibility”); United States v. Brady, 595 F.2d 359, 363 (6th Cir.1979) (same).
To be sure, the potential effect of substructure cannot be known or even estimated in any given case, and there is no factor that rationally and indisputably will compensate for its presence. That fact invites citation of and reliance on State v. Sneed, 76 N.M. 349, 414 P.2d 858, 862 (1966), involving rejection of an estimate that a particular name would occur once in 3,000,000 telephone entries on the basis that the evidence required the court “to speculate on the validity of the estimates.” Refusing to do so, the court stated that “mathematical odds are not admissible as evidence to identify a defendant in a criminal proceeding so long as the odds are based on estimates, the validity of which have not been determined.” Id. 414 P.2d at 862.
Probability testimony was also condemned in Miller v. State, 240 Ark. 340, 399 S.W.2d 268, 270 (1966) as “unsubstantiated” and “speculative” without an adequate foundation, and likewise in People v. Collins, 68 Cal.2d 319, 66 Cal.Rptr. 497, 438 P.2d 33 (1968), because the expert was unable to give any basis for his estimate of the frequencies of the allegedly independent events of a biracial couple, where the man had a beard and the girl blond hair with a ponytail, and the couple had departed the scene of a crime in a partially yellow car.
Other courts have expressed reservations about probability estimates due to the possibility of prejudice by misleading or confusing the jury. United States ex rel. DiGiacomo v. Franzen, 680 F.2d 515, 518 (7th Cir.1982); People v. Harbold, 124 Ill. App.3d 363, 79 Ill.Dec. 830, 845, 464 N.E.2d 734, 749 (1984) (testimony based on blood marker frequencies excludable); Davis v. State, 476 N.E.2d 127, 134 (Ind.App.1985); Commonwealth v. Drayton, 386 Mass. 39, 434 N.E.2d 997, 1005 (1982) (error to admit testimony re. probability of fingerprint being one out of 387 trillion); State v. Carlson, 267 N.W.2d 170, 176 (Minn.1978) (probability estimate excluded re. blood markers); State v. Woodall, 385 S.E.2d 253, 261 (W.Va.1989). In Minnesota, these concerns have led, as a result of a series of cases that includes Carlson, supra, to a ruling that no probability estimate is permissible with regard to DNA evidence. State v. Schwartz, 447 N.W.2d 422, 428 (Minn.1989).
The consistent basis for rulings that either exclude probability estimates or express reservations about such evidence is that the estimate of frequencies on which the computation is made is speculative. As the court stated in Davis, supra, “when unsubstantiated estimates are used in probability calculations, speculation is presented to the jury clothed in scientific accuracy,” so that “the prejudicial impact clearly outweighs the probative value.” 476 N.E.2d at 134.
This is, of course, the concern in this case: namely, that the inability to assign a numerical value to the likelihood of substructure deprives the F.B.I. Caucasian population database of any claim to a reasonable degree of accuracy, and shifts its figures from the realm of the reliable into a region of speculation.
The Rule 403-style analysis referred to in Davis is, in my opinion, an appropriate consideration. But, as noted in the ensuing subsection, a Rule 403 analysis can and should be undertaken only in light of the facts of each particular case, and should not be looked to as a basis for exclusion of all probability estimates.
In light of Dr. Kidd’s testimony about the improbability that substructure will significantly prejudice any given defendant, I conclude that the Caucasian database remains a reasonable basis for computation. On the basis of Dr. Kidd’s testimony, I conclude that in most cases substructure will not play any role. In many of the minority of cases in which it will play a role, that role will be in the defendant’s favor (though neither he nor anyone else will know that). In the remaining cases, *212the impact will disfavor the defendant, but the magnitude of the distortion appears reasonably likely to be slight because the degree of ethnic-dependent variation may be minimal or modest and the conservative and corrective consequences of the fixed bin structure and its components may buffer its impact. Though in some small number of cases there may be a distortion of considerable proportion and resultant prejudice, I conclude that admissibility should not be foreclosed on the basis of that possibility.
Limitations on the state of our understanding of the presence and effect of ethnic-dependent variations among VNTRs are, I conclude, a matter relating to certainty, and not a circumstance that causes the F.B.I.’s database to produce probability estimates on the basis of speculation. Consequently, I conclude that the issue of the effect of population substructure on the frequencies of the alleles in the Caucasian database on the F.B.I.’s probability estimates is an issue of weight, not admissibility.
This conclusion is supported by cases involving similar uncertainties in statistical evidence. A decision of this Court held that questions about the adequacy of a database for statistical testimony (albeit of a type unrelated to the probability evidence in this case) goes to weight and not admissibility. Old West End Ass’n v. Buckeye Fed’l Savings & Loan, 675 F.Supp. 1100, 1106 (N.D.Ohio 1987) (McQuade, J.). More pertinently, cases involving the somewhat similar use of blood type frequencies as a basis for probability estimates have consistently taken the position that challenges to the statistical foundation of such evidence go to weight. See State v. Washington, 229 Kan. 47, 622 P.2d 986, 995 (1981); Commonwealth v. Gomes, 403 Mass. 258, 526 N.E.2d 1270, 1280 (1990) (issue of substructure goes to weight); State v. Chavez, 100 N.M. 730, 676 P.2d 257, 260 (N.M.App.1983); Plunkett v. State, 719 P.2d 834, 841 (Okla.Crim.App.1986).
I understand the differing numerical magnitudes that are reached with probability estimates in cases involving blood markers, so that cases relating to such markers are not entirely controlling. But the principle that issues about statistical evidence, once such evidence has been shown to “be based on empirical scientific data, rather than unsubstantiated estimates,” Davis, supra, 476 N.E.2d at 135, are matters of weight, not admissibility, remains the same. And, on the basis of Dr. Kidd’s testimony, I conclude that it is more likely than not that population substructure will play such an insignificant role in the F.B.I.’s probability estimates that its database consists of the requisite “empirical scientific data, rather than unsubstantiated evidence.”
3. Rule 403 Issue
The fourth factor of the Green expression of the Frye standard is “probative value compared to prejudicial effect.” United States v. Green, 548 F.2d 1261, 1268 (6th Cir.1977). “This balancing” under Green, the court stated in United States v. Smith, 736 F.2d 1103, 1107 (6th Cir.1984), “is identical to Rule 403 balancing.” Under Rule 403 relevant evidence may be excluded “if its probative value is substantially outweighed by the danger of unfair prejudice.”
The Frye doctrine developed, as the Sixth Circuit noted in United States v. Brown, 557 F.2d 541 (6th Cir.1977), out of the same concerns that led to the adoption of Rule 403: namely, the concern that lay jurors might be misled by testimony that was unfairly prejudicial, confusing, or misleading. “In recognition of the inherent danger that expert testimony,” absent a proper foundation “may tend to confuse or mislead the trier of fact and thus defeat a defendant’s right to a fair trial,” the court stated in Brown, “we adopted the four criteria for reviewing a district court’s decision to admit expert testimony” in Green. Id. at 556.
Such precautions, the court noted, have historically been viewed as necessary when a defendant is confronted with scientific evidence that bears an “ ‘aura of special reliability and trustworthiness,’ ” Id., as “possible prejudicial dangers [are] inherent in any expert scientific testimony at trial.” United States v. Brady, 595 F.2d 359, 362 *213(6th Cir.1979). When the Sixth Circuit adopted the Green formulation, it noted that the factor that is “unquestionably the most important from the point of view of the criminal defendant [is] the potential prejudicial impact of the expert testimony upon the substantial rights of the accused.” 548 F.2d at 1268.
With regard to DNA evidence, one court has taken the absolutist position that the prejudicial impact of probability estimates is so substantial that such estimates are not admissible. State v. Schwartz, 447 N.W.2d 422, 428 (Minn.1989). Otherwise, courts appear to treat DNA evidence like other forms of highly persuasive scientific proof. See Martinez v. State, 549 So.2d 694, 697 (Fla.App.1989). See also Doc. 385 Appendix (Exhs. J, K, P, R, U) (unreported state trial court decisions).
In this case, there is no evidence in the record concerning the government’s DNA evidence against the defendant Bonds. Thus, in my opinion, the record is not ready for Rule 403 analysis, which would appear of necessity to be case- and fact-specific.
I make, accordingly, no recommendation with reference to the balance between the probative evidence and its prejudicial impact in this case.
Conclusion
Upon review of the testimony, exhibits, arguments of counsel, and applicable legal doctrines, it is
RECOMMENDED THAT the government’s motion to admit DNA evidence be granted; and that the defendants’ motion to exclude such evidence be denied.
*214APPENDIX
Chapter 1
Summary, Policy Issues, and Options for Congressional Action
Genetic uniqueness is a fact of life. From generation to generation, characteristics are inherited, combined, assorted, and reassorted among individuals through a common denominator: the chemical deoxyribonucleic acid, or DNA. And, except in the case of identical twins, no two humans share the same DNA sequence.
This report is about technologies used to distinguish the DNA among individuals. It is about techniques to identify and prosecute violent criminals, as well as exonerate innocent persons who are suspects in criminal cases. To a lesser extent, it is about applications that use the same techniques to determine parentage or identify and reunite missing children with relatives. Undertaken at the request of the Senate Committee on Labor and Human Resources, this assessment evaluates the scientific, legal, and ethical issues surrounding forensic applications of DNA tests: the validity and reliability of DNA tests for forensic casework, quality assurance and standards for DNA analysis by forensic laboratories, the legal basis for the admissibility of such tests in courts of law, privacy and civil liberties concerns about collecting, using, and storing genetic information and material, and criminal justice interest in employing DNA tests at the Federal, State, and local level.
TERMINOLOGY
Forensic science involves the application of many scientific expertises (e.g., biology, chemistry, toxicology, medicine) to situations concerned with courts of justice or public debate. This report uses the term forensic applications to refer to potential uses of recombinant DNA technologies to identify individuals.
The increased acceptance and popularization of recombinant DNA techniques for forensic uses, especially criminal investigations, have led to some confusing terminology. In particular, some commentators have adopted the terms “genetic fingerprinting,” “DNA fingerprinting,” or “DNA prints” as generic phrases to describe all techniques, while others use the terms to describe specific techniques by specific companies. This report uses the terms DNA testing, DNA identification, DNA analysis, DNA typing, and DNA profiling to describe the two current and any future technologies, the practical goal of which is unique association or exclusion determined by DNA-based tests.
DNA AND HOW IT DIFFERS FROM PERSON TO PERSON
As the chemical dispatcher of genetic information, DNA’s structure resembles a twisted ladder, referred to as a double helix (figure 1-1). DNA in all organisms consists, in part, of four chemical subunits commonly called bases. These four bases—guanine (G), adenine (A), thymine (T), and cytosine (C)—are the genetic alphabet. Their unique order, or sequence, in the DNA helix serves as the blueprint for an organism. Of the 3.3 billion base pairs making up a human
Figure 1-1—The DNA Double Helix
[[Image here]]
SOURCE: Office of Technology Assessment, 1990.
*215Figure 1-2—DNA Patterns From 12 Individuals
[[Image here]]
In this mock-up to demonstrate that DNA patterns differ among individuals, blood samples were obtained from 12 different people and RFLP analysis performed using 1 single-locus probe. Although some individuals do share 1 band in common, all 12 exhibit different patterns overall.
SOURCE: Federal Bureau of Investigation, 1989.
blueprint, only a fraction—approximately 3 million—differ between any two individuals.
Several methods to detect DNA differences exist; the majority of DNA tests currently used in forensic applications detect some of these differences through DNA probes that reveal size variations. Scientists measure these size distinctions between people through a process called restriction fragment length polymorphism (RFLP) analysis (figures 1-2 and 1-3)1. Although the specific protocols used for RFLP analysis vary from laboratory to laboratory, the vast majority of forensic casework carried out today involves this basic approach.
Another technology, polymerase chain reaction (PCR), can be thought of in some respects as molecular photocopying (figure 1-4). PCR itself is not used to directly analyze DNA; rather, it makes possible the application of other techniques when only minute biological specimens are available. PCR allows a scientist to take a sample of what ordinarily would be insufficient DNA to assess, and reproduce it until enough DNA copies are available for examination by a number of technologies, including RFLP analysis. Chapter 2 describes details of RFLP analysis and PCR.
DNA is found in all body cells except red blood cells. (Blood contains many cell types in addition to red blood cells, such as white blood cells, and it is from these cells that DNA can be obtained when forensic evidence is a bloodstain.) With few exceptions, the composition of a person’s DNA does not vary from cell to cell, except in egg and sperm cells, which have half the complement of DNA present in other body cells. (Although DNA content differs from sperm to sperm, a DNA profile of semen—e.g., from evidence in a rape case—is a composite of thousands of DNA molecules from thousands of
*216Figure 1-3—Detailed Schematic of Single-locus Probe RFLP Analysis
[[Image here]]
*217Figure 1-4—The Polymerase Chain Reaction
[[Image here]]
SOURCE: Office of Technology Assessment, 1990.
sperm and therefore reflects a man’s overall profile (figure 1-5).) Thus a scientist can examine DNA from blood or tissue from a hair root and, if the specimens are from the same person, find the same DNA banding pattern. Similarly, patterns can be matched between DNA isolated from sperm on a vaginal swab or a semen stain and a known blood sample from a suspect.
THE ROLE OF DNA TYPING IN FORENSIC IDENTIFICATION
Traditional genetic markers, such as ABO blood groups, have been used in forensic casework since the turn of the century. Conventional markers available to forensic analysts provide the potential for a high degree of discrimination among different individuals, but the upper limit is attained infrequently, in part because of the instability of some of these markers in dried and aged evidence stains. Thus, in practice, the individualization of many evidentiary stains cannot be carried out to any great extent given the present array of conventional serological landmarks. In general, traditional genetic tests used in forensic casework also, at best, can associate an unknown sample with a suspect specimen at a level of 90 to 95 percent inclusion.
Forensic applications of DNA tests involve two components: molecular biology and population genetics. Molecular biological techniques allow analysts to directly examine the material responsible for heritable differences among humans, i.e., DNA. Population genetics, also a part of traditional forensic genetic testing, is used to interpret DNA tests to approximate the degree to which two samples are associated by greater than random chance. Like traditional genetic tests, DNA typing is used in the forensic context to determine whether biological material from a known individual can be linked to a sample from an unidentified specimen (i.e., whether the individual can be included in or excluded from the population of humans who could have deposited the biological material). Yet unlike traditional genetic testing, DNA typing technologies—one of which was first used in a criminal case in the United Kingdom
*218Figure 1-5—Example of One DNA Pattern in a Rape Case
[[Image here]]
[[Image here]]
Known Evidence
Biological evidence from this rape case was separated by laboratory techniques into separate male and female fractions. After RFLP analysis of these fractions and known samples obtained from the victim and suspect, the results reveal that—for this particular probe—the DNA pattern of the male fraction matches the pattern of the suspect.
SOURCE: Federal Bureau of Investigation, 1989.
(box 1-A)—have been heralded as forensic tools that will change the judicial landscape.
It is the population dynamics of DNA markers that separates it from the use of conventional genetic markers in forensic analysis. With DNA markers, much greater variation exists and can be detected—hence their potential for what amounts to statistical individualization when a combination of markers is examined. That is, because the assortment of genetic markers detected by DNA tests is great, a sufficiently detailed examination of DNA patterns can yield a result that effectively amounts to a positive identification between a questioned sample and a suspect sample. By the same token, because DNA markers do vary so much, exclusion of innocent suspects can be easier to achieve.
Forensic DNA analysis can provide more definitive and objective evidence to ascertain the innocence or guilt of an individual— especially compared to subjective evidence such as eyewitness testimony.
Forensic applications of DNA techniques are not limited to criminal investigations. Their use in parentage testing (figure 1-6), the identification of unknown remains, human rights abuses, and immigration has been successful. And as more information is gained through genetic research, including efforts to map and sequence the human genome, the range of applications, of information gained, and of technologies involved in forensic uses of DNA tests is likely to increase.
. Frye v. United States, 293 F. 1013 (D.C.Cir. 1923).
The illustration of RFLP analysis involves the use of a type of DNA probe called a single-locus probe. A similarly performed analysis uses a different type of DNA probe, called a multilocus probe, which is only applied in some paternity cases in this country. Chapter 2 describes the differences between the two approaches. For purposes of this report—with the exception of chapter 2—RFLP analysis refers only to the use of single-locus probes.