United States v. Jakobetz

747 F. Supp. 250 (1990)

UNITED STATES of America
v.
Randolph B. JAKOBETZ.

Cr. A. No. 89-65.

United States District Court, D. Vermont.

September 20, 1990.

Charles A. Caruso, Asst. U.S. Atty., Rutland, Vt., for U.S.

William K. Sessions, III, Sessions, Keiner, Dumont & Barnes, Middlebury, Vt., for defendant.

OPINION AND ORDER

BILLINGS, Chief Judge.

The singular issue before the court is whether DNA profiling is admissible in a criminal case when proffered by the prosecution to prove identity.[1] The defendant is charged with kidnapping. The United States claims that he abducted a woman from an Interstate 91 rest area in Westminster, Vermont, forced her into the back of a tractor-trailer truck, drove to an unknown location, raped her, and ultimately released her in the New York City area. The defendant has moved to prohibit any evidence of DNA profiling on the basis that it is unreliable and unfairly prejudicial. For the following reasons, the court concludes that DNA profiling is a reliable scientific technique that was properly applied in this particular case and that its probative value is not outweighed by the danger of unfair prejudice. The defendant's motion to prohibit introduction of DNA profile evidence is therefore DENIED.

*251 I. INTRODUCTION

Deoxyribonucleic acid (DNA) is found in all nucleated cells and contains coded information that provides the genetic blueprint for all living things.[2] DNA is contained in packages called chromosomes. An individual has twenty-three pairs of chromosomes, one-half of which are inherited from each parent. Every cell of a particular individual contains the same configuration of DNA.[3] The important feature of DNA for forensic purposes is that, with the exception of identical twins, no two individuals have identical DNA.

A molecule of DNA is shaped like a double helix and thus resembles a twisted ladder. The rails of the ladder are comprised of repeated sequences of phosphate and deoxyribose sugar. For the purposes of DNA profiling, the critical components of the ladder are the rungs. Each rung is composed of one pair of the following four organic bases: Adenine (A), Guanine (G), Cytosine (C), and Thymine (T). Because of their chemical composition, A will only attach to T and C will only attach to G — no other combination will normally occur. Thus, the order of the bases on one side of the rung of the DNA ladder determines the order on the other side.
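The pairing rule described above can be illustrated with a minimal Python sketch. The name `complement_strand` is hypothetical and does not come from the record; the sketch only shows how the base order on one side of the ladder fixes the order on the other.

```python
# Pairing rule from the text: A attaches only to T, and C attaches only to G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement_strand(strand: str) -> str:
    """Return the base on the opposite side of each rung, in the same rung order."""
    return "".join(PAIRS[base] for base in strand.upper())

if __name__ == "__main__":
    print(complement_strand("ATCG"))
```

The same rule is what would let a single stranded probe bond only to a fragment carrying the corresponding half of its sequence.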

Each rung on the DNA ladder is called a "base pair" or "base sequence." The sequence of these base pairs as they occur up and down the DNA ladder provides distinct instructions to the cell. A specific sequence of base pairs that is responsible for a particular trait is called a gene. There are roughly three billion base pairs in a single molecule of DNA and if the tightly coiled DNA molecule were unraveled, it would measure approximately six feet in length.

Because human beings share more biological similarities than differences, our DNA molecules — that is, our base pair sequencing — are in large part (99%) the same. Consequently, it is the areas of the DNA molecule that differ from individual to individual that are significant in the forensic setting. These areas of variation are called "polymorphisms" and are the basis for DNA identification. The length of each polymorphism is determined by the number of repeat core sequences of base pairs. The core sequence is called a Variable Number Tandem Repeat (VNTR) while the total fragment length is called a Restriction Fragment Length Polymorphism (RFLP). Alternative forms of RFLPs are called alleles.[4] A site or locus of a DNA molecule is polymorphic when the number of core sequences of base pairs varies from individual to individual. A locus is a particular location on the DNA molecule where a specific VNTR or core sequence occurs. Of the approximately three billion base pairs contained in one DNA molecule, roughly three million are thought to be polymorphic. RFLPs have no known function and are sometimes referred to as "anomalous" or "junk" DNA.

While some RFLPs exhibit only two alternative forms (specific sequences of base pairs), others are hypervariable and have many alternative forms. Because it is impractical to examine all the polymorphic regions of the DNA molecule, DNA profiling focuses on several highly polymorphic or hypervariable segments of DNA. A hypervariable locus will have the same core sequence of base pairs but will differ in length because varying numbers of the VNTRs are linked together. Though a person does not have a unique polymorphic area at any one locus, the frequency with which two people exhibit eight or ten of these alleles at four or five different loci is significantly lower.

The Federal Bureau of Investigation (FBI) and other labs that employ DNA *252 profiling utilize a process called Restriction Fragment Length Polymorphism Analysis to obtain a DNA profile from forensic samples. RFLP analysis involves essentially six steps:

1. Extraction of DNA. The DNA is first extracted from the evidentiary sample and purified. Eighty percent of the samples the FBI analyzes are vaginal swabs from rape cases. Accordingly, the sample must be fractionated to separate the female component from the male component.

2. Restriction or Digestion. The DNA is then "cut" by the use of chemical scissors that are also known as restriction endonucleases. This process severs the DNA molecule at all sites along the three billion base pair length of the molecule where the targeted base pair sequence occurs. To reiterate, the length of these cut segments may or may not differ among individuals, depending upon how many core sequences or VNTRs an individual has at a specific locus.

3. Gel Electrophoresis. At this point, the fragments of molecules created by the restriction enzymes are sorted by length by a process known as gel electrophoresis. This procedure entails placing the sample in an agarose gel that is then electrically polarized. Because DNA is negatively charged, the RFLPs will migrate toward the positive end of the gel. The distance traveled will be dependent upon length; thus, the shorter fragments, which are lighter and less bulky, will travel further in the gel. Fragments of known base pair lengths, called molecular weight markers, are placed in separate lanes to allow the measurement of RFLPs in units of base pairs. Several different samples are run on the same gel but in different tracks or lanes.

4. Southern Transfer. Because the agarose gel is cumbersome to work with, it is necessary to transfer the RFLPs to a more functional surface. This method is known as Southern Transfer. A sheet of nylon membrane (nitrocellulose sheet) is placed in contact with the gel and through capillary action, the RFLPs move on to the membrane. The end result is that the RFLPs are permanently fixed in their respective positions. During this step, the RFLPs are also split through denaturation, a process that in effect cuts the fragments of the DNA molecules lengthwise along each base pair so that the base pairs are separated into two strands.

5. Hybridization. Next, a radioactive probe or marker is used to locate a specific locus of a polymorphic region of the DNA. The probe is a single stranded segment of DNA that is designed to complement a single stranded base sequence of a RFLP. Because the probe contains the corresponding half of the core sequence for the RFLP that was split in two, it will bond with RFLPs of all sizes containing that particular core sequence or VNTR.

6. Autoradiography. The nylon membrane is then placed against a piece of x-ray film where the radioactive probes will expose the film at their respective locations. After the film is processed, black bands will appear where the radioactive probes bonded to the RFLPs. Usually, two bands will appear for each probe because most people are heterozygous; that is, they inherit a different length allele or RFLP from each parent. If, however, one band appears, the person to whom the sample belongs may or may not be a homozygote; that is, they may or may not have inherited the same length allele from each parent.[5] The film is referred to as an autoradiograph, or autorad for short. In theory, the autorad provides a depiction of the allele sizes for a particular locus.

The above process is then partly repeated by introducing additional probes that hybridize with different VNTRs. The FBI ordinarily uses four or five different probes. The use of several probes is necessary because the degree of individualization is not high for the two alleles (one from each parent) that occur at one locus. However, according to proponents of DNA profiling, it is exceedingly rare for two *253 people to share eight or ten alleles occurring across four or five loci.

The next step involves the interpretation of the autorads to determine if in fact a match exists in the two lanes of the autorad between a known sample of a suspect and the unknown sample extricated from the crime scene or victim. The FBI uses a two stage procedure for deciding whether two autorads match. First, the FBI determines if a visual match exists. If one does not, the FBI will determine whether the non-match should be interpreted as inconclusive or as excluding the suspect. If a visual match is declared, however, the FBI takes a mechanical measurement to verify that a match indeed exists. This mechanical determination is obtained through a computer imaging process that references the bands to the molecular weight markers on the autorad. Because these reference points have a known value in base pair units, a measurement of the base pairs can be determined for respective bands on each autorad. If two bands from the two separate lanes of the autorad fall within plus or minus 2.5% in base pairs, the FBI will proclaim a match for that particular RFLP. If the bands are over plus or minus 2.5%, the autorad is considered either inconclusive or as an exclusion.
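The mechanical confirmation step can be sketched as a simple size comparison. This is only one possible reading of the "plus or minus 2.5%" criterion: it assumes the tolerance is taken relative to the mean of the two measured band sizes, which the opinion does not specify. The function name `bands_match` and the sample sizes are hypothetical.

```python
def bands_match(size_a_bp: float, size_b_bp: float, tolerance: float = 0.025) -> bool:
    """Return True when two band sizes (in base pairs) fall within the tolerance.

    Assumption: the tolerance is applied to the mean of the two measurements.
    The opinion states the 2.5% figure but not the exact formula.
    """
    mean_size = (size_a_bp + size_b_bp) / 2
    return abs(size_a_bp - size_b_bp) <= tolerance * mean_size

if __name__ == "__main__":
    print(bands_match(1000, 1020))  # sizes differ by about 2%
    print(bands_match(1000, 1060))  # sizes differ by about 6%
```

Under this reading, a pair of bands that fails the check would be treated as inconclusive or as an exclusion rather than as a match.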

Once matches are declared for the respective RFLPs, the FBI determines the statistical significance of a match between two DNA profiles. This process largely involves the field of human population genetics because the central issue is the frequency with which a pattern of alleles (genotype) occurs in a specific population. Therefore, it is first necessary to determine the frequency with which each individual allele occurs in a particular population.

To achieve this end, the FBI uses an approach called fixed bin analysis. A bin is an arbitrarily defined range of base pairs. For example, one bin that the FBI uses has as its boundaries 872 and 963 base pairs; thus, any allele having a base pair length within that range is classified as belonging to that bin. The FBI then samples a targeted population to establish a data base of allele frequencies. The FBI has compiled, or is in the process of compiling, data bases for Caucasians, Blacks, Asians, and Hispanics. The FBI's Caucasian data base at issue here was derived from blood samples of approximately 225 FBI agents from throughout the United States. The FBI produced autorads for each blood sample, measured the alleles, and categorized them within the appropriate bin. The frequency of occurrence for alleles falling within their respective bins can thus be calculated. The FBI then uses these frequencies to predict the frequency with which the entire pattern of alleles produced from the forensic sample would occur in the target population. To arrive at this final calculation, the FBI assumes that the alleles are randomly occurring and applies the product rule. This assumption, as will be discussed, is vigorously contested by the defendant's experts.
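Fixed bin analysis can be illustrated with a short sketch. Only the 872 to 963 base pair bin comes from the opinion; the other bin edges, the sample allele lengths, the half-open boundary convention, and the function names are assumptions made for illustration and do not reproduce the FBI's actual bins or data base.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bin edges in base pairs; only the 872-963 bin appears in the opinion.
BIN_EDGES = [0, 872, 963, 1077, 1196]

def bin_index(length_bp: float, edges=BIN_EDGES) -> int:
    """Return i such that edges[i] <= length_bp < edges[i + 1] (assumed convention)."""
    if not edges[0] <= length_bp < edges[-1]:
        raise ValueError(f"{length_bp} bp is outside the defined bins")
    return bisect_right(edges, length_bp) - 1

def bin_frequencies(allele_lengths, edges=BIN_EDGES) -> dict:
    """Return the fraction of sampled alleles that falls in each bin."""
    counts = Counter(bin_index(length, edges) for length in allele_lengths)
    total = len(allele_lengths)
    return {i: counts.get(i, 0) / total for i in range(len(edges) - 1)}

if __name__ == "__main__":
    sample = [900, 950, 1000, 500]  # hypothetical measured allele lengths
    print(bin_frequencies(sample))
```

Each frequency is simply the share of sampled alleles whose measured length falls inside that bin's range.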

Almost 90% of all VNTRs are double-banded or heterozygous. To calculate the frequency of a double-banded pattern, the FBI multiplies the frequencies of the two alleles together and then multiplies that number by two to reflect the fact that each allele could have come from either parent.[6] This step is repeated for every probe. Finally, the frequencies of each probe are multiplied together to reflect the total frequency with which that genotype or pattern of alleles will appear in the relevant population. In this particular case, the FBI calculated that the frequency with which the defendant's genotype occurs in the Caucasian population is one in 300 million.
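The product rule for double-banded patterns can be sketched as follows. The allele frequencies in the example are hypothetical and are not the figures from this case; the sketch covers only heterozygous (two-band) loci, does not model the treatment of single-band patterns, and assumes the alleles occur independently, which is the assumption the defendant's experts contest.

```python
from math import prod

def heterozygous_locus_frequency(p: float, q: float) -> float:
    """Frequency of a two-band pattern: the two allele frequencies multiplied, then doubled."""
    return 2 * p * q

def profile_frequency(loci) -> float:
    """Multiply the per-probe frequencies together (product rule).

    `loci` is a sequence of (p, q) allele frequency pairs, one pair per probe.
    """
    return prod(heterozygous_locus_frequency(p, q) for p, q in loci)

if __name__ == "__main__":
    # Hypothetical bin frequencies for two probes.
    example = [(0.1, 0.2), (0.05, 0.1)]
    print(profile_frequency(example))
```

With more probes the product shrinks quickly, which is why a profile built from four or five probes can yield a very small estimated frequency.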

The general theories of genetics which support DNA profiling are unanimously accepted within the scientific community. The debate before the court thus centers on the question of whether adequate technology and knowledge now exists to allow *254 DNA profiling to pierce the protective evidentiary boundaries of the criminal trial.

II. ANALYSIS

The legal standard is not in dispute; both parties acknowledge that United States v. Williams, 583 F.2d 1194 (2d Cir.1978), cert. denied, 439 U.S. 1117, 99 S. Ct. 1025, 59 L. Ed. 2d 77 (1978), provides the applicable guidepost. There, the Second Circuit rejected a strict application of Frye v. United States, 293 F. 1013 (D.C.Cir.1923), in favor of the flexible approach afforded by the relevancy test. Because Williams apparently settled the Frye debate for this circuit, it is unnecessary to discuss the relative merits of the Frye versus the relevancy test.[7]

In Williams, 583 F.2d at 1198, the court concluded that the appropriate considerations for the admission of novel scientific evidence were the same as those used to determine the admissibility of other evidence. Thus, the test is inherently a balancing one that weighs the probativeness, materiality, and reliability of the evidence against the tendency to mislead or confuse the jury, or unfairly prejudice the defendant. See Fed.R.Evid. 401-03, 701-03.

In applying this inquiry, the Williams court easily concluded that the government's proffer of voice spectrograph evidence went directly to the issue of identity and thus was both material and probative. The court then faced the more difficult issue of reliability and noted that the defendant had presented a list of ten scientists who favored the admission of voice spectrographs and seventeen who opposed it. The court rejected at once the significance of this Harris poll of scientists in its oft-quoted expression: "A determination of reliability cannot rest solely on a process of `counting (scientific) noses.'" Id. The court then incisively observed:

[U]nanimity of opinion in the scientific community, on virtually any scientific question, is extremely rare. Only slightly less rare is a strong majority. Doubtless, a technique unable to garner any support, or only minuscule support, within the scientific community would be found unreliable by a court. In testing for admissibility of particular type of scientific evidence, whatever the scientific `voting' pattern may be, the courts cannot in any event surrender to scientists the responsibility for determining the reliability of that evidence.

Id.

The court proceeded to delineate factors to be considered when assessing whether a particular scientific technique is reliable: (1) the potential rate of error; (2) the existence and maintenance of standards; (3) the care with which the scientific technique has been employed and whether it is susceptible to abuse; (4) whether there are analogous relationships with other types of scientific techniques that are routinely admitted into evidence; and (5) the presence of failsafe characteristics.[8] Id. at 1198-99. Other factors relevant to admissibility, *255 which Williams did not expressly endorse but nonetheless are congruous with the court's reasoning, are: (1) the expert's qualifications and stature; (2) the existence of specialized literature; (3) the novelty of the technique and its relationship to more established areas of scientific analysis; (4) whether the technique has been generally accepted by experts in the field; (5) the nature and breadth of the inference adduced; (6) the clarity with which the technique may be explained; (7) the extent to which basic data may be verified by court and jury; (8) the availability of other experts to evaluate the technique; and (9) the probative significance of the evidence. See, e.g., J. Weinstein and M. Berger, 3 Weinstein's Evidence ¶ 702[03] (1988); McCormick, Scientific Evidence: Defining a New Approach to Admissibility, 67 Iowa L.Rev. 879, 911-12 (1982). The essential question is not whether the technique is infallible, but rather whether the scientific technique exhibits "a level of reliability sufficient to warrant its use in the courtroom." Williams, 583 F.2d at 1198.

The balancing test's counter-weight to the reliability of the scientific technique is the tendency of the technique to mislead or confuse the jury or unfairly prejudice the defendant. This balancing entails a traditional Fed.R.Evid. 403 analysis. Thus, the court must consider the effectiveness of limiting instructions and allowing the weight of the evidence to be challenged through cross-examination. The potential for the jury to be awed by the "mystic infallibility" often surrounding scientific techniques should also be appraised. Id. at 1199-1200.

An important characteristic of the relevancy test adopted by Williams is that it is a flexible standard that adapts to the exigencies of a particular scientific technique and case. Thus, when a scientific technique is more likely to mislead or confuse the jury, the test requires that a proportionally stronger showing of reliability must be made. See McCormick, supra at 909-10. The defendant contends that reliability must be proven beyond a reasonable doubt. The court, however, is hesitant to ascribe a specific standard of proof that the proponent of the scientific evidence must meet. See McCormick, supra at 908 (stating that specific standard of proof not necessary as long as courts are cautious in applying relevancy test). The difficulty in choosing a standard arises because the degree of reliability that the proponent must establish depends in large part on the character of the scientific technique at issue. Moreover, the analysis entails mixed questions of law and fact that do not fit easily into a standard of proof. See United States v. Downing, 753 F.2d 1224, 1240 (3rd Cir.1985) ("balancing analysis incorporates important policy elements ... which render the determination something more than a fact-finding"). Nonetheless, we do hold that the proponent of novel scientific evidence must make more than a prima facie showing; consequently, it is necessary for "the court to exercise to some degree an evidentiary screening function."[9] Downing, 753 F.2d at 1240 n. 21.

There currently are no written federal court opinions addressing the admissibility of DNA profiling. Almost all of the state trial and appellate courts that have confronted the issue have held that DNA profiling is generally admissible.[10] Some *256 of these decisions, however, are of limited force because the defense could not proffer any expert witnesses to counter the prosecution's claims.[11] Of those courts favoring the admissibility of DNA evidence, at least two have refused to admit DNA profiling in a particular case because of slipshod application of the procedures.[12] Only one court has ruled that DNA profile evidence is generally inadmissible under the relevancy test.[13] Another trial court that applied the relevancy test ruled that only the RFLP evidence of a match was admissible because the genotype frequency methods had "not been demonstrated to rest on a sound scientific basis."[14] Despite the abundance of precedent favoring the admission of DNA profiling, this court is obligated to make independent findings; thus, we place limited probative value on these decisions.

A. Reliability of DNA Profiling

DNA profiling is an amalgamation of primarily two disciplines: molecular biology and population genetics. The fields of molecular biology, biochemistry and related disciplines are largely responsible for the RFLP laboratory procedures used for determining whether a match exists between the unknown sample and a suspect; that is, whether the allele patterns are "identical." The fields of population genetics and human population genetics are in turn responsible for determining the significance of a declared match; that is, the probability that there is a coincidental match. Accordingly, it is helpful to analyze separately the reliability of each process.

1. Reliability of RFLP Analysis

Government experts Dr. Kenneth Kidd, a molecular geneticist and human population geneticist, Dr. C. Thomas Caskey, a molecular biologist and population geneticist, Dr. Bruce Budowle, FBI research chemist and population geneticist, and Mr. (soon to be Dr.) Lawrence Presley, FBI research chemist and geneticist, all testified that FBI procedures for RFLP analysis are entirely acceptable. Each testified that every step of RFLP analysis is both reliable and generally accepted in the scientific community for forensic purposes. RFLP techniques have been utilized for genetic research and medical diagnostics for several years. Defense expert Dr. Joseph Nadeau, a molecular biologist and population geneticist, did not dispute that in a non-forensic setting each phase of the RFLP is generally accepted by the scientific community and is reliable for the purposes for which it is used. Dr. Nadeau, however, argued that the forensic setting typically involves contaminated samples rather than the pristine samples used in research and diagnostics.

Although Dr. Nadeau is correct to observe that the forensic setting is much more demanding than diagnostic and experimental utilization of the RFLP procedure, the court disagrees that the FBI has failed to compensate for these difficult conditions. The court finds as credible the position of the government's experts that the potential rate of error (false positive) in the determination of a match is at worst remote and at best inconceivable. The court also finds that the government has sufficiently established that erroneous application of the FBI procedures or the degradation of a sample of unknown origin will result in either inconclusive results or a false negative and thus will redound to the defendant's benefit. See Williams, 583 F.2d at 1199 ("[A] convincing element in determining reliability is the presence of `fail-safe' characteristics.").

*257 The FBI has formulated strict protocols to ensure that RFLP procedures are implemented in a consistent manner. Moreover, the FBI protocols provide for several controls that indicate whether the RFLP process is producing accurate results. First, the failure of molecular weight markers to align properly on the gel indicates a malfunction of the electrophoresis process. Second, a cell line marker, which is an allele of a known number of base pairs, must produce a band at the appropriate molecular weight marker to validate the particular electrophoresis. Furthermore, if the test is conducted properly and the sample is not overly deteriorated, the autorad from the known sample of the victim will match the autorad produced from the female component of the vaginal swab sample. Finally, the FBI protocols mandate the testing of each lot of restriction endonuclease. According to Mr. Presley, test results are thrown out if these control measures indicate unreliability. The court finds that the control measures, as well as the numerous other procedures mandated by the FBI protocols, collectively ensure detection of most of the laboratory and sample degradation problems that could conceivably occur.

Mr. Presley also testified that he had conducted two proficiency tests at the FBI lab and that neither one resulted in error.[15] Although blind proficiency testing would undoubtedly provide the most objective scrutiny of FBI procedures, the court does not believe the lack thereof substantially undermines the FBI procedures currently in place.

Another point of contention concerned the FBI's matching rule that requires that in order for a visual match to be confirmed, the number of base pairs of the alleles measured on the two autorads must be within plus or minus 2.5%. The FBI derived this 5% window through an empirical analysis based upon the total variation of matches from known samples rather than a statistical approach that utilizes confidence intervals. Defense expert Dr. Nadeau testified that this distinction renders the FBI's mathematical approach scientifically unacceptable. All of the government experts, who the court notes are at least as renowned as Dr. Nadeau, testified to the contrary.

Although Dr. Nadeau raises some well-reasoned objections, he does not substantially undermine the FBI's basis for its match criteria. Dr. Nadeau conceded that if the autorad matches in this particular case were within plus or minus 1% of the number of base pairs, he would have more confidence in the conclusion that there was in fact a match. Indeed, in this case, all sixteen band matches (eight alleles from each of the victim and the suspect on four different autorads) were within plus or minus 1%. Dr. Budowle testified that 96% of all conclusive matches fall within plus or minus 1.25%. The court concludes that although a match may in some instances be declared for an empirically based window that would not constitute a match for a statistically based window, that discrepancy does not, in itself, render the matching criteria unreliable as a whole but rather provides fodder for effective cross-examination when that condition occurs. Lastly, the fact that a match must be verified by physical measurement eliminates inconsistencies inherent whenever subjective judgments are involved.

The court also finds that in this particular case the FBI applied the RFLP technique in an exacting and reliable manner.[16] *258 Indeed, Dr. Nadeau conceded on cross-examination that three of the difficulties that cause him some concern about RFLP analysis (band shifting, sample degradation, and partial digestion by the restriction enzyme) did not occur in this case. Moreover, as previously noted, all sixteen matches of alleles in this particular case were within plus or minus 1%; thus, Dr. Nadeau's concerns about the 5% matching window are not implicated here. In addition, the molecular weight markers and the cell line markers all aligned properly in this particular test. Finally, the known sample of the victim matched the female fraction of the vaginal swab sample. This last control not only provides further verification that the test was run properly, but it indicates that the female component of the vaginal swab sample was not significantly degraded by bacteria or other environmental insults and that the male fraction of the sample was likely of equal quality.

The United States has proffered no strong analogies between DNA profiling and other types of scientific techniques that are routinely admitted into evidence in criminal cases. Although protein gel electrophoresis is at first blush similar, it involves different techniques and court decisions concerning its admissibility are therefore not particularly helpful. It is the enormous degree of identity that DNA profiling provides (300 million to one in this case) that makes DNA profiling the most important advance in forensic science since the advent of fingerprinting.[17] Because this power of identity is considerably greater than most other forensic techniques provide (with the exception of fingerprinting) strong analogies are difficult to establish. Nonetheless, the court does not believe that the lack of persuasive analogies, standing alone, is enough to preclude admission. To hold otherwise would, in effect, erect a permanent barrier against the admission of any truly "novel" scientific technique. It is noteworthy that the defendant's experts concede that RFLP techniques are reliable and generally accepted in the scientific community for non-forensic uses. Since the court has found that the FBI has sufficiently considered and compensated for the more rigorous demands of forensic use of RFLP analysis, we find probative the analogy to the more established non-forensic uses of RFLP analysis.

In addition, the court finds that the government's experts are eminent and highly regarded in their respective fields and the court finds as credible their testimony as to the reliability and acceptance of the forensic use of RFLP. The court also believes that the specialized literature that has been published concerning RFLP laboratory techniques in forensics indicates that its theory and practice have a solid foundation within the scientific community.[18] Although the publications by no means unequivocally endorse DNA profiling,[19] *259 neither unanimity of scientific opinion nor a strong majority is a prerequisite to finding a scientific technique reliable. Williams, 583 F.2d at 1198.

In summary, the court finds that: the potential rate of error (false positive) for RFLP matching is exceedingly low; the FBI has maintained rigorous standards; the RFLP protocols established by the FBI preclude abuse of the technique; the FBI has meticulously applied its RFLP protocols in this case; and the RFLP matching procedure is exceptionally "fail-safe." Furthermore, the court finds that the use of RFLP matching in a forensic setting is generally accepted within the molecular biology scientific community. In light of these findings, we conclude that RFLP is a reliable technique for determining whether alleles from two DNA samples match. It thus becomes necessary to examine the scientific technique used to calculate the frequency with which a particular pattern of alleles occurs in the relevant population.

2. Genotype Frequency Reliability

The FBI established its frequencies for particular alleles through fixed bin analysis and has recently submitted for publication its procedures and reasoning in a manuscript entitled: Fixed Bin Analysis For Statistical Evaluation of Continuous Distributions of Allelic Data From VNTR Loci for Use in Forensic Comparisons. Two of the nine coauthors are expert witnesses in this case — Dr. Budowle and Mr. Presley. All of the government's experts testified that the fixed bin approach to determining allele frequencies is a reliable and generally accepted method to calculate allele frequencies. The experts further testified that the FBI's fixed bin analysis is a very conservative estimate of allele frequency that more than compensates for potential errors that might result from limitations in technology, limited sample population data, substructure or linkage disequilibrium, and sampling error.

Dr. Budowle, for example, testified that the bin boundaries the FBI established are at least twice the measurement error calibrated in number of base pairs; therefore, two alleles that fall within the same bin might not satisfy the matching criteria. Consequently, according to Dr. Budowle, the frequencies of the alleles in each bin are on average at least two times greater (more frequent) than the actual frequencies observed in the sample populations for an allele (defined with the +/- 2.5% matching criterion) occurring within the bin. Moreover, certain bins have frequencies that are up to ten times more frequent than the actual observed frequency for an allele occurring within that bin.

In addition, Dr. Caskey testified that when alleles fall within a region of the gel that has reduced discrimination capacity and the allele is very rare, the FBI will not consider that autorad when calculating frequency, despite the fact that there may be a good match between the known and unknown samples. Indeed, in the case before us, the FBI declined to use the frequency data for two alleles on one probe because the alleles fell within this rare area of the gel.

Finally, when the FBI calculates the frequency of occurrence for a forensic sample, it categorizes any alleles for that sample that fall on the boundary of a bin (boundary is defined by the measurement error) as belonging to the bin with the higher frequency. Because higher frequency estimates make it more likely that the match is purely a coincidence, the FBI procedures redound to the benefit of the defendant. See Williams, 583 F.2d at 1199 (emphasizing that error should benefit the defendant).

The defendant's experts claim that there is no factual basis for claiming that substructure or subgroups do not exist within the Caucasian race for these alleles and that, therefore, the use of the product rule to estimate genotype frequencies is wholly *260 inappropriate.[20] Defense expert Dr. Richard C. Lewontin, a human population geneticist, testified that just because individuals do not choose their mates based on a specific gene or trait does not mean that humans mate randomly as to that gene. Dr. Lewontin stated that, to the contrary, individuals may form endogamous groups based on religion, ethnicity, and geography. If genetic substructure exists between these groups with respect to VNTRs, then mating is not truly random. Dr. Lewontin claimed that because no studies have examined genetic substructure for VNTRs in Caucasians, it is necessary to assume that substructure exists because analogous studies involving blood type (non-VNTR) genes show there is substantial substructure within European Caucasians. Therefore, Dr. Lewontin concluded that until more is known, it is entirely inappropriate to use one data base for all Caucasians and to use the product rule to calculate an allele pattern's frequency. In short, Dr. Lewontin believes that an accurate estimate of allele frequency cannot be made until extensive studies of VNTR frequencies in ethnic, religious, and geographic subgroups are completed. Dr. Lewontin also testified that because the extent to which substructure exists is unknown, it is impossible to know whether the FBI binning process is conservative. The respective testimonies of defense experts Dr. Nadeau and Dr. Mueller largely corroborate Dr. Lewontin's testimony.

Although Dr. Lewontin's testimony was, in many respects, insightful, the court does not find that it substantially undermines the FBI genotype frequency procedures as a whole. To be sure, the court finds Dr. Lewontin's observations concerning random mating and the possibility of genetic substructure persuasive, and to some extent Dr. Lewontin did discredit the government's experts who casually concluded that VNTRs must randomly occur throughout the population because individuals do not consciously consider VNTRs when they choose their mates. Nevertheless, the court finds that to the extent that substructure might exist for VNTRs within Caucasians, the FBI has sufficiently proven that it has compensated for this possibility by using conservative binning procedures.

Indeed, Dr. Kidd and Dr. Budowle concede that some substructure exists, but contend that the frequency differences for VNTRs between subgroups are insubstantial and thus are more than offset by the FBI's conservative fixed bin procedures. The bases for their conclusions consist primarily of personal observations of data collected from an array of ethnic groups from around the world. Dr. Kidd testified that he has looked at data from many subgroups, including Italians, Swedes, Irish, Amish, and mixed Europeans, and all have "very small differences" in allele frequencies. Further, Dr. Kidd stated that the differences in subgroups are "absolutely insignificant" to the method in which the FBI uses its Caucasian data base. Dr. Budowle testified that he has examined world-wide data from several populations, including German, Dutch, French, Lebanese, and Israeli, and that the frequencies of RFLPs for these groups are "amazingly similar." Finally, Dr. Budowle stated, in complete exasperation, that "there comes a point where you got to say, how many populations do we have to do to convince."

*261 While it is of some concern that neither Dr. Kidd nor Dr. Budowle cites any published studies to support these conclusions, the court finds their testimony credible and convincing.[21] Moreover, the study largely relied on by Dr. Lewontin to show significant substructuring of Rh-blood typing genes was discredited by Dr. Conneally, a human population geneticist, who testified that the study is unreliable because it was undertaken in 1947 when the blood typing methods applied there were inaccurate. The court finds that to the extent that substructure may exist, it is very unlikely to be substantial for VNTRs. Consequently, at least with regard to the FBI methods, the existence of some substructure does not significantly affect the accuracy of VNTR frequencies.[22] In other words, the court concludes that it is highly unlikely that the FBI's frequency estimate of a specific genotype across four or five loci would be lower (prejudicial to the defendant) than the actual frequency of that genotype if in fact substructure existed and a less conservative fixed bin system was employed.
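The product rule at the center of this dispute can be illustrated with a short sketch. The allele frequencies and function names are hypothetical. The two-band term 2pq is the standard Hardy-Weinberg expression for a heterozygous locus, and the single-band term 2p follows the FBI practice described in note 6. Multiplying across loci presumes that the loci are statistically independent, which is the very assumption the defense experts contest.

```python
import math


def locus_frequency(p, q=None):
    """Estimated frequency of the band pattern at one locus.

    Two bands with allele frequencies p and q: 2pq.
    Single band with allele frequency p: 2p (not p**2), per note 6.
    """
    if q is None:
        return 2 * p
    return 2 * p * q


def genotype_frequency(loci):
    """Product rule: multiply the per-locus frequencies together."""
    return math.prod(locus_frequency(*alleles) for alleles in loci)


# Hypothetical binned allele frequencies for a three-locus profile:
# two double-banded loci and one single-banded locus.
profile = [(0.10, 0.20), (0.05, 0.10), (0.20,)]

freq = genotype_frequency(profile)
print(f"estimated genotype frequency: {freq:.6f}")
print(f"roughly 1 in {round(1 / freq):,}")
```

Because the per-locus terms are multiplied, any upward bias in the binned allele frequencies compounds across loci, which is why the conservatism of the binning procedure bears directly on the final estimate.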

The defendant also objects to the size and composition of the FBI data base. In particular, defense experts testified that a data base from 225 FBI agents is insufficient to reflect adequately the frequency of these alleles within the Caucasian population. The defendant's objections to sample size and composition, however, are largely undermined by the court's conclusion that the existence of substructure, if it exists at all, is minor and would not materially alter the FBI's conservative frequency calculation. According to Dr. Kidd, once it is determined that the alleles are randomly occurring throughout a targeted population, sample size can be decreased to as little as one hundred individuals. Moreover, Dr. Kidd testified that the composition of the data base may be less rigorous when the targeted genes or VNTRs occur randomly. At any rate, Dr. Kidd concluded that the FBI data base, which consists of Caucasian FBI recruits from around the country, provides an adequate representation of Caucasians in the United States for purposes of VNTR frequencies.

An issue that developed during the course of the proceedings concerned the FBI's rerunning of its data base. Apparently, the second running of the same 225 blood samples from FBI agents produced results that did not entirely correspond to the initial data base. Specifically, the FBI changed the number of size markers from 15 to 30 in order to provide a more precise measurement of base pairs. The FBI then ran the samples again to produce new autorads. In the second set of autorads, ten out of approximately 400 alleles did not provide a match with the first set of autorads within the plus or minus 2.5% matching criterion. In addition, after the second running of the samples, some alleles landed in different bins than those into which they had fallen during the first running. The defendant characterizes the differences between the two tests as "alarming" while the FBI describes the differences as so small that it is "remarkably good."
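The comparison between the two runs can be sketched as follows. The paired measurements and function names are hypothetical. The sketch assumes that two measurements "match" when they differ by no more than 2.5% of their average size; the FBI's actual rule may be defined differently.

```python
def sizes_match(first_bp, second_bp, tolerance=0.025):
    """True if two size measurements agree within the tolerance."""
    mean_bp = (first_bp + second_bp) / 2
    return abs(first_bp - second_bp) <= tolerance * mean_bp


def count_mismatches(pairs, tolerance=0.025):
    """Count allele pairs whose two measurements fail the criterion."""
    return sum(not sizes_match(a, b, tolerance) for a, b in pairs)


# Hypothetical (first run, second run) sizes in base pairs.
runs = [(1000, 1010), (2000, 2080), (1500, 1490), (3000, 3010)]
mismatches = count_mismatches(runs)
print(f"{mismatches} of {len(runs)} hypothetical alleles fail to match")

# The figures reported in the opinion: ten of approximately 400 alleles.
reported_fraction = 10 / 400
print(f"reported non-matching fraction: {reported_fraction:.1%}")
```

On the figures reported to the court, about one allele in forty fell outside the criterion between the two runs.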

The defendant's experts, particularly Dr. Nadeau, felt that because the FBI could not replicate its experiment, the procedure for performing RFLPs and determining genotype frequencies must be deemed unreliable. The FBI claims that it was not attempting to reproduce its experiment because each run was conducted under different conditions. Rather, Dr. Budowle testified that the FBI was attempting to obtain more *262 precise measurements by using a 30 band marker system rather than a 15 band system. Dr. Budowle testified that using different reference points will inevitably produce slightly different results. Dr. Conneally characterized the expanded number of size markers as analogous to a new ruler that is graduated to a finer degree and thus produces a more reliable measurement. The court agrees with Dr. Budowle and Dr. Conneally and finds the FBI position well-supported; thus, the court does not believe that the results obtained after the second running of the samples discredit the FBI procedures. Indeed, because the autorads produced for this case employ 30 band markers, the defendant might very well have attempted to impeach the FBI procedures if the FBI had not resubjected the sample population used to calculate his genotype frequency to the same degree of precision.

The court also finds that the FBI has sufficiently established that its procedures for determining genotype frequency were carefully employed in this case. In fact, as previously discussed, the FBI completely disregarded the results from one probe because the alleles were in too rare an area of the gel to provide a statistically significant frequency estimate. Moreover, in this particular case, the defendant proffered no evidence that there was a likelihood that he belonged to an ethnic, religious, or geographic subgroup that raised the possibility that his genotype occurs more frequently than the FBI's estimate.[23]

In short, the court finds that the FBI's genotype frequency calculation procedures have an exceptionally low potential for underestimating the actual frequency of a genotype. Moreover, the court finds that the FBI has devised strict standards and has followed them in this particular analysis. The court further concludes that the genotype frequency calculations are largely mathematical, involving no subjective judgments, and that they therefore do not lend themselves to abuse in specific applications. The court also finds that the FBI has employed "fail-safe" characteristics in its fixed bin approach that err on the side of higher genotype frequencies and thus redound to the benefit of the defendant. In light of these findings, the court believes that the FBI's method for calculating genotype frequencies is reliable and accurate.[24] Although DNA profiling conducted by the FBI lab may very well become more precise in the future, the court harbors few doubts regarding its current accuracy.

B. Rule 403 Analysis

As the court has indicated, the reliability of DNA profiling must be weighed against the danger of unfair prejudice and misleading the jury.[25] Fed.R.Evid. 403. Inherent in this balancing test is a court's obligation to require a proportionally higher showing of reliability when there is an increased likelihood of misleading the jury or unfairly prejudicing the defendant.

There is no doubt a risk whenever scientific evidence is admitted that the jury will regard it with an "aura of mystic infallibility." Williams, 583 F.2d at 1199. Arguably, DNA profiling is particularly capable, in more ways than one, of lulling a jury into slumbering at its post and not rigorously sifting the evidence. See Note, The Dark Side of DNA Profiling: Unreliable Scientific Evidence Meets the Criminal Defendant, 42 Stan.L.Rev. 465, 466, 511-17 (1990) (arguing that juries have no ability to weigh DNA profile evidence). This contention, however, undervalues the combined ability of cross-examination, opposing expert witnesses, and limiting instructions to counteract the hazards of DNA profile evidence. See Williams, 583 *263 F.2d at 1199-1200. Moreover, as with the voice spectrographs held admissible in Williams, the jury can visually inspect the autorads to compare not only the bands of the defendant with the bands produced from the forensic sample, but to contrast their clarity and respective positions with the bands produced from the victim's DNA.[26] Id. at 1199.

The court does not believe that a jury will be awed into complete submission by DNA profile technology. To the extent a jury will be impressed, however, the United States has sufficiently established that the current reliability and accuracy of DNA profiling justifies an aura of amazement. That DNA profiling is a remarkable advancement in forensic science, however, does not preclude it from being presented to a jury. Although substructure is arguably the weakest link of the DNA profiling chain, the court believes the debate over the existence of ethnic, religious, or geographic substructure for VNTRs is comprehensible to most lay people. Moreover, a jury is certainly capable of and at liberty to reject the testimony in which Dr. Kidd and Dr. Budowle stated that substructure, if it exists, is insubstantial. Thus, the most controversial of all the premises underlying DNA profiling is also the most likely to receive proper scrutiny by the jury.

In conclusion, given the high degree of individualization that DNA profiling provides, there is no doubt that the evidence would have a substantial impact on a jury's verdict. Nonetheless, the government has established that DNA profiling is highly reliable and such reliability outweighs the increased potential for DNA profiling to unfairly prejudice the defendant or mislead or confuse the jury. The defendant's motion to prohibit the introduction of DNA profile evidence is thus DENIED.

SO ORDERED.

NOTES

[1] At the outset, the court notes that the attorneys and expert witnesses for both parties have done stellar jobs in presenting this complex scientific procedure in a comprehensible manner.

[2] Non-nucleated cells such as mature red blood cells have no DNA. This phenomenon does not, however, prevent DNA typing of blood because white blood cells contain a nucleus and thus DNA.

[3] No one cell, however, uses the entire "blueprint." Cells located in different parts of the human body read only those segments of the DNA necessary to perform their respective functions.

[4] During the course of the hearings, witnesses and lawyers have referred to the RFLPs by various names, including VNTR genes, alleles, genes, fragments, and loci.

[5] Single bands are not always homozygous because alleles may migrate to the outer edge of the gel and are thus not displayed on the autorad. Secondly, two alleles may in fact be different sizes but are so close together on the gel that the two bands appear as one.

[6] For a single band autorad, the FBI takes the frequency of the allele and multiplies it by two. The FBI does not assume a single band is homozygous and thus does not square the frequency of that allele before it is multiplied by two. The FBI contends that this is another conservative factor built into its calculations. Defense expert Dr. Lewontin pointed out, however, that this calculation has no effect in the majority of cases, including this one, because almost 90% of VNTRs are double-banded.

[7] The court recognizes that at least two cases suggest that the Second Circuit Court of Appeals is less than harmonious on this issue. In United States v. Torniero, 735 F.2d 725, 731 n. 9 (2d Cir.1984), cert. denied, 469 U.S. 1110, 105 S. Ct. 788, 83 L. Ed. 2d 782 (1985), for example, the court, while expressly declining to decide which legal standard applied to the admissibility of scientific evidence, failed to acknowledge that Williams had decided the issue for this circuit. And in United States v. McBride, 786 F.2d 45, 49 (2d Cir.1986), the court stated that Fed.R.Evid. 702 specifically applies Frye by "requiring that specialized knowledge `assist the trier of fact to understand the evidence or to determine a fact in issue.'" The McBride court concluded that psychiatry enjoyed general acceptance in the field of medicine and thus the trial court had erred in basing its decision, in part, on the infancy of psychiatry. Id. at 50-51. The court nowhere mentioned Williams. Implicit in the court's analysis is that general acceptance in the relevant scientific community is a prerequisite to admission.

Despite the mixed signals emanating from McBride and Torniero, we will continue to apply Williams until directed otherwise. While this court takes the view that general acceptance within the particular scientific community is a factor to weigh, we do not believe general acceptance is either necessary or sufficient in and of itself to establish admissibility. See United States v. Downing, 753 F.2d 1224, 1237 (3d Cir.1985). See also J. Weinstein and M. Berger, 3 Weinstein's Evidence ¶ 702[03] (1988).

[8] In application these factors tend to overlap.

[9] Despite the court's judgment that the government's burden of establishing reliability cannot be ascribed a specific standard of proof, we recognize that this issue is not at all settled. Thus, the court notes that all of the factual findings in this case at the very least satisfy the clear and convincing evidence standard.

[10] See, e.g., State v. Pennington, 327 N.C. 89, 393 S.E.2d 847 (1990) (applying relevancy test); Kelly v. State, 792 S.W.2d 579 (Tex.Ct.App.1990) (applying Frye); Glover v. State, 787 S.W.2d 544 (Tex.Ct.App.1990) (applying Frye); Caldwell v. State, 260 Ga. 278, 393 S.E.2d 436 (1990) (applying relevancy test); Spencer v. Commonwealth, 240 Va. 78, 393 S.E.2d 609 (1990) (DNA evidence satisfies Frye or relevancy test); State v. Ford, 392 S.E.2d 781 (S.C.1990) (applying less restrictive formulation than Frye); Spencer v. Commonwealth, 238 Va. 275, 384 S.E.2d 775 (1989) (applying relevancy); State v. Schwartz, 447 N.W.2d 422 (Minn.1989) (applying modified Frye); Andrews v. State, 533 So. 2d 841 (Fla.Dist. Ct.App.1988) (applying relevancy test but stating that DNA evidence would still be admissible if Frye applied); People v. Castro, 144 Misc. 2d 956, 545 N.Y.S.2d 985 (Sup.Ct.1989) (applying Frye); People v. Wesley, 140 Misc. 2d 306, 533 N.Y.S.2d 643 (Albany County Ct.1988) (applying Frye); People v. Shi Fu Huang, 145 Misc. 2d 513, 546 N.Y.S.2d 920 (Nassau County Ct.1989) (applying Frye).

[11] See Spencer, 384 S.E.2d 775; Andrews, 533 So. 2d 841; Schwartz, 447 N.W.2d 422.

[12] See Castro, 545 N.Y.S.2d at 997-98; Schwartz, 447 N.W.2d at 428.

[13] See State v. Wheeler, No. C89-0901CR (Wash. County.Cir.Ct., Or. Mar. 8, 1990) (Lund, J.) (DNA profiling evidence inadmissible under relevancy test) (hearing transcript).

[14] State v. Pennell, 1989 WL 167430 at 11 (Del. Super.Ct. Nov. 6, 1989) (Gebelein, J.).

[15] The defense attempts to discredit the proficiency tests because the FBI printer incorrectly printed the results of one of the tests. Lawrence Presley testified that this error was caused by a computer glitch that did not reflect the actual test results which were correctly displayed on the screen. Having been the frustrated victim of a computer glitch or two itself, the court finds this testimony reasonable.

[16] The United States contends that the particular results in this case are irrelevant to a determination of reliability and thus to the question of admissibility. Rather, the United States claims that any flaws that may exist in the application of DNA profiling in this case may be exposed through cross-examination and thus are issues of weight for the jury to resolve. The court disagrees. As long as a scientific technique is considered novel, evidence establishing that the technique was not meticulously applied tends also to show that the technique is not conducive to reliable application; thus, specific application is certainly relevant. Furthermore, the third reliability factor espoused in Williams, 583 F.2d at 1199, concerning "the care and concern with which a scientific technique has been employed," strongly suggests that the results in the particular case have a strong bearing on reliability. At least one judge and scholar supports this interpretation of the third Williams factor. See McCormick, Scientific Evidence: Defining a New Approach to Admissibility, 67 Iowa L.Rev. 879, 911-12 (1982). See generally Castro, 545 N.Y.S.2d at 985 (DNA identification inadmissible because of flawed application of techniques even though generally accepted under Frye); Schwartz, 447 N.W.2d at 422 (DNA profiling generally accepted but inadmissible when specific test results do not comport with standards).

[17] It is misleading, however, to label DNA profiling or typing as DNA fingerprinting because such a characterization not only grossly oversimplifies the technical aspects of RFLP analysis, but also misrepresents the meaning of a genotype match. No two fingerprints are known to match unless they are from the same person. In contrast, an RFLP genotype across four or five loci can, in theory, be duplicated, though the frequency of such an occurrence is exceedingly low. If technology is developed to examine all three million base pairs that differ in sequence among individuals, the fingerprinting analogy would be more accurate.

[18] See Castro, 545 N.Y.S.2d at 990 n. 3, for Judge Sheindlin's extensive list of publications concerning DNA typing.

[19] See, e.g., Lander, DNA Fingerprinting on Trial, 339 Nature 501 (1989); Mayersak, DNA Fingerprinting Problems for Resolution, 36 Med. Trial Tech. Q. 441 (1990); Thompson & Ford, DNA Typing: Acceptance and Weight of the Genetic Identification Tests, 75 Va.L.Rev. 45 (1989).

[20] In other DNA profiling cases, much debate centered on whether the population was in Hardy-Weinberg equilibrium, recognized as a prerequisite to using the product rule. See Castro, 545 N.Y.S.2d at 992; Wesley, 533 N.Y.S.2d at 656-67. Recently, however, there has been general agreement that Hardy-Weinberg is a poor test for substructuring, at least with the sample sizes involved here. In this case, government experts Dr. Budowle and Dr. Conneally are in apparent agreement with defense experts Dr. Lewontin and Dr. Nadeau over the shortcomings of relying on Hardy-Weinberg equilibrium. In light of this consensus, it is unnecessary — and this court happily declines — to blaze a trail through this thicket of true homozygosity versus single bands. Similarly, whether there is a physical linkage between alleles of different loci is not an issue. Defense expert Dr. Mueller, a population geneticist and evolutionary biologist, observed in his paper entitled Population Genetics of Hypervariable Human DNA that the FBI's loci are all on different genes. Physical linkage, however, should be contrasted with statistical linkage or association between alleles of different loci, which is contested in this case.

[21] The court notes that Dr. Kidd is the director of the Human Gene Mapping Library, an international organization that maintains a computer data base containing information on gene locations. Dr. Kidd also oversees the DNA committee, a subcomponent of the Human Gene Mapping Library, through which he has observed data on the frequencies and character of all known DNA polymorphisms occurring within different populations throughout the world. In light of his background and persuasive testimony, the court finds Dr. Kidd's observations on RFLP frequencies particularly credible and authoritative.

[22] Undoubtedly, the FBI's substructure assumptions would be less impeachable if there were a longer period of peer review that included published studies on substructure for VNTRs. However, the court does not view the lack thereof as fatal to admissibility under Williams.

[23] This observation should not be interpreted as placing the burden of proving inadmissibility on the defendant. The court merely finds it probative that not one of the defendant's experts testified that the defendant belonged to a subgroup that raised a likelihood that he was prejudiced by the FBI data base. After all, it is the defendant who has insisted that the application and results of DNA profiling in this case are highly relevant to a determination of admissibility.

[24] Although the court finds the genotype frequency methods employed by the FBI highly reliable and accurate, we cannot say at the present time that they are generally accepted within the population genetic or human population genetic scientific communities.

[25] The probativeness of DNA profiling evidence in this case is obvious and undisputed.

[26] To be sure, there are more technical steps involved in producing an autorad that the jury is not privy to than are involved in producing a voice spectrograph and therefore the analogy to a voice spectrograph is not flawless; nonetheless, the autorads do allow the jury to observe visually at least part of the RFLP process.